In e-commerce, the notion “the more, the merrier” does not always hold true. A growing number of products, customers, and data often signifies prosperity, but it can also introduce challenges, particularly around platform performance and infrastructure costs. Managing such scenarios calls for careful, sustainable strategies. In our work with an artwork online store, we encountered a situation that exemplified these intricacies.
The initial request from the customer, an artwork online store owner
Most e-commerce businesses operate through a website or application that facilitates the buying and selling of goods or services. A typical online shop carries a limited number of items and rarely runs into challenges related to performance or infrastructure costs. However, when a platform serves numerous tenants, the dynamics change. These platforms, termed “Multi-Tenant,” can host millions of unique products.
Handling such vast data volumes can unveil numerous optimization opportunities, and this article explores one such case. Similar to platforms like Shopify, our client runs an e-commerce solution (an artwork online store) supporting around 10,000 active websites. Since its founding, the platform has accumulated about 4 million products, leading to substantial costs on its Cloudinary account. The cost increase didn’t occur suddenly; it grew steadily over the years.
In our attempts to address the escalating costs of the artwork online store, we focused on optimizing the Cloudinary storage. We came up with two primary strategies:
- Derived Assets Cleanup. We looked into derived assets, a Cloudinary feature that creates transformed versions of original files. Since our application frequently involves scenarios where buyers apply these transformations, it was a logical starting point. Cloudinary offers a built-in feature to purge derived assets that haven’t been accessed over a specific period. This alone saved about 8% of Cloudinary storage. As part of our routine, we now remove derived assets that haven’t been accessed in the past three months.
- Cleaning Deleted Database Products’ Assets. Over time, we removed many products from our platform. However, due to a code-level defect, the deleted products’ assets sporadically persisted in Cloudinary storage, incurring unnecessary costs. Our strategy was to identify and eliminate these orphaned assets, ensuring we only paid for relevant content. This pivotal improvement saved approximately 20% on asset storage.
Together, these essential steps led to a combined savings of approximately 28% on asset storage. While the outcomes of these measures were positive, our quest for further optimization continued.
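The orphaned-asset cleanup boils down to a set difference: compare the asset IDs held in Cloudinary against the asset IDs referenced by live products, and delete whatever remains. Here is a minimal, framework-free sketch of that idea; in production the Cloudinary side would be paged through the Admin API, and the function name below is illustrative, not our real code:

```ruby
require 'set'

# Given the public IDs of all assets stored in Cloudinary and the public IDs
# still referenced by products in the database, return the orphans, i.e. the
# assets we are paying for but no longer need.
def orphaned_public_ids(cloudinary_ids, product_ids_in_db)
  cloudinary_ids.to_set - product_ids_in_db.to_set
end
```

In the real cleanup, each orphan in the resulting set would then be deleted through the Cloudinary Admin API in batches.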
Brief domain description
Understanding our artwork online store platform’s core: it is designed for photographers and painters, offering them a space to upload, manage, and market their unique creations. Uploaded artworks are not only sold as digital goods (such as NFTs); they can also be turned into physical products through various methods such as:
- Prints on photo paper;
- Artistic woodcuts;
- Merchandise prints.
Now, diving a bit deeper into the functionalities, our platform offers two distinct types of product uploading:
- Website Owner Products. This feature allows website owners to upload their artworks and place them within specific sections of their website;
- Customer-Side Upload. Certain websites on our platform allow end customers to upload their images for customized product fulfillment. Example: a user uploads a photo of their pet to make a unique design on a T-shirt.
The “Customer-Side Upload” flow quickly became a prime area for deeper investigation: about 2 million products, or 50% of the entire product repository, came from this method. But was it necessary to retain them all? While we couldn’t simply remove everything in one database transaction, this segment was low-hanging fruit from a performance-improvement perspective, and it warranted examination from a business point of view.
Upon examining the codebase, analyzing the historical data of approximately 800k orders since 2014, and consulting with the Product Owner, we discerned several customer behavior patterns:
- Typically, customers purchase their self-uploaded products within a day;
- Products that were not purchased within the first week seldom found buyers later on;
- We must retain customer-uploaded products that were ordered in the artwork online store, regardless of how long ago the order was placed. This is crucial because their images or any supplemental information might be essential for fulfillment, and they also hold value for the website owners.
Coming up with a cleanup plan
We decided to concentrate on new incoming products rather than removing old customer uploads. This was the safer approach: it allowed us to test the hypothesis on a limited number of items without risking large-scale errors. The agreed-upon procedure is as follows:
- Initialization: When a product is uploaded, we trigger a background job that is set to wait for one month;
- Evaluation: After one month, we determine if the product has been purchased;
- Action: If the product remains unpurchased, we delete it, ensuring efficient use of our storage.
Appendix A: Pseudocode for Product Deletion Process
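Below is a minimal, framework-free Ruby sketch of the three-step flow above. In production the evaluation runs as a delayed background job (e.g. Sidekiq or Active Job) scheduled one month after upload; the `Product` struct and `CustomerUploadCleanup` class are illustrative names, not our real code:

```ruby
RETENTION_PERIOD = 30 * 24 * 60 * 60 # one month, in seconds

# A stand-in for the real ActiveRecord model.
Product = Struct.new(:id, :uploaded_at, :purchased, :deleted, keyword_init: true)

class CustomerUploadCleanup
  # Evaluation step: invoked by the delayed job one month after upload.
  def self.perform(product, now: Time.now)
    # Too early: the job should not have fired yet; do nothing.
    return unless now - product.uploaded_at >= RETENTION_PERIOD
    # Ordered products are always retained.
    return if product.purchased

    # In production this would also remove the Cloudinary assets
    # and the related database records.
    product.deleted = true
  end
end
```

The job is idempotent: re-running it on a purchased or already-deleted product changes nothing, which makes retries safe.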
After thoroughly testing the feature in our staging environment, we released it in production. We then patiently awaited feedback to determine if customers encountered any issues with the updated workflow.
This approach did not negatively impact our customers, allowing us to confidently proceed to the next major step: removing old uploaded products that meet the specified criteria.
Considering the significant system load associated with product deletions, we formulated the following plan:
- Job Creation: Design a delayed job that assesses whether a product qualifies for deletion. If the criteria are met, it initiates the removal service (Refer to Appendix B);
- Batch Processing: Establish another delayed job that handles product batches. This job schedules the product batch deletion tasks, spacing them out to mitigate performance issues (Refer to Appendix C);
- Rake Task: Develop a rake task that lets us estimate the timing intervals and the number of products per batch for the Batch Processing job, helping us identify the best settings;
- Scheduling: Once we determine the best delay, initiate the rake task at the calculated intervals.
After experimenting with various input parameters for our implemented rake task for the artwork online store, we found an optimal setting: a removal rate of 5,000 products per hour. Given the initial volume of data, the entire process of identifying and removing outdated products would take about one month.
To guarantee we didn’t introduce any critical issues during the removal, we began tracking several essential metrics daily: the number of database products, the number of product-related database entities, the Cloudinary storage space used, and the number of Cloudinary resources.
Appendix B: Pseudocode for Product Deletion Job Creation for the Artwork Online Store
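A hedged sketch of the per-product deletion job described in the plan. It encodes the business rules from our analysis: owner-side products are never touched, and customer uploads that were ever ordered are retained. The class name, hash keys, and the in-memory `orders_index` are illustrative stand-ins for the real delayed job and database queries:

```ruby
class ProductDeletionJob
  # Assess whether a product qualifies for deletion; if so, the real job
  # would invoke the removal service (DB records + Cloudinary assets).
  # orders_index: any collection answering include?(product_id),
  # standing in for a lookup against the orders table.
  def self.perform(product, orders_index)
    # Website-owner products stay regardless of age.
    return :kept unless product[:source] == :customer_upload
    # Customer uploads that were ever ordered stay, however old.
    return :kept if orders_index.include?(product[:id])

    # Removal service call would go here.
    :deleted
  end
end
```

Returning a status symbol rather than acting directly keeps the sketch testable; the production job triggered the removal service at the point marked above.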
Share the results
Having set the stage with our thorough testing and daily monitoring routine, we now focus on the outcomes of our efforts. The process went smoothly: the set-up restrictions allowed us to remove significant amounts of data and not experience any performance issues. Once all products were processed, we received the following results:
- Cloudinary storage was reduced from about 50 TB to around 40 TB, a saving of about 20%, even though products continued to be uploaded throughout the process;
- Roughly 2 million products were removed, leading to a 50% downsizing of the products table;
- A similar 50% reduction in products-related database entities resulted in the deletion of 200 million records.
Beyond the primary goal of optimizing Cloudinary, we achieved substantial improvements at the database level. We assessed the loading times of our most frequented pages, and the average duration exhibited an impressive speed-up of approximately 25%.
Appendix C: Pseudocode for Scheduling Product Batches for Deletion
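The batch scheduler can be sketched as follows: walk the candidate product IDs in fixed-size slices and schedule each slice with a growing delay, so the deletion rate is capped. With the illustrative constants below, 500 products every 360 seconds works out to the 5,000-products-per-hour rate we settled on; in production each batch would be enqueued as a delayed job rather than returned as a hash:

```ruby
BATCH_SIZE = 500 # products per scheduled batch (illustrative)
INTERVAL   = 360 # seconds between batches => ~5,000 products/hour

# Returns a schedule: which IDs to process and how far in the future.
# In production each entry would instead enqueue a delayed job, e.g.
# ProductBatchDeletionJob.set(wait: delay).perform_later(batch).
def schedule_batches(product_ids)
  product_ids.each_slice(BATCH_SIZE).each_with_index.map do |batch, i|
    { ids: batch, run_in: i * INTERVAL }
  end
end
```

Spacing the batches out this way is what let us tune the removal rate from the rake task: changing `BATCH_SIZE` and `INTERVAL` directly changes products-per-hour without touching the job logic.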
By diving into the domain’s internals and considering our customers’ buying experience, we optimized Cloudinary storage and significantly improved database performance. This experience reaffirmed a well-known principle: meaningful business outcomes often arise from a deep understanding of domain specifics, and many impactful technical solutions require only simple implementations.
Indeed, our journey echoed many of the principles outlined in Eric Evans’ recognized work – “Domain-Driven Design: Tackling Complexity in the Heart of Software.” Evans emphasizes the importance of deeply understanding the problem domain, collaborating closely with experts, and modeling solutions that align closely with the domain’s realities. Our experiences served as practical proof of these DDD principles. While the technical nuances of software can be intricate, they truly shine when grounded in a clear, well-understood domain context. It’s an enduring reminder that, sometimes, the key to innovative solutions or successful performance optimization lies in revisiting foundational principles and applying them with renewed insight.
In closing, while new approaches can be invaluable, the power of deep domain knowledge paired with strategic action is transformative. True mastery often involves optimizing what’s already known rather than perpetually seeking the new.