📝 Guest post: Data Labeling and Its Role in E-commerce Today – Recent Use Cases*
In TheSequence Guest Post, our partners explain in detail which machine learning (ML) challenges they help solve. In this post, Toloka’s team offers an overview of data labeling use cases in e-commerce.

AI and e-commerce are now inextricably intertwined: unbeknownst to most, the latter can no longer maintain a competitive edge without the former. According to Statista, over 70% of surveyed e-commerce business executives in Europe and North America believe AI to be the main “can’t-do-without” mechanism for modern online retail.

Data labeling: use cases

Of course, no AI is possible without relevant, accurately labeled data. Work on your ML model as much as you like, but without the right data, there’s only so much your model can do. Because of this, data labeling has become the bedrock of AI, and thus also of e-commerce. Crowdsourcing, considered the quickest and most cost-effective data labeling method that also scales, is how much of this data is labeled today, which is in part what makes the alliance between AI and e-commerce so vibrant. With that in mind, let’s look in more detail at what data labeling is doing to improve e-commerce through these recent use cases:
AliExpress is one of the world’s e-commerce leaders. However, serious localization problems arose when AliExpress attempted to translate its platform into Russian, which resulted in numerous inaccurate product descriptions. With a giant catalog, the company needed a quick and reliable solution. Toloka offered an innovative crowd-based solution: instead of using machine translation (MT) or computer-assisted translation (CAT) combined with human-validated translations, as is normally the case, Toloka’s crowd performers were asked to provide their own translations whenever applicable. Each newly suggested version became a fixed multiple-choice option that the next crowd performer had to either choose, and hence approve, or replace with their own version. The cycle continued until the performers as a group had no further improvements to suggest. The end result proved to be affordable and, above all, very effective.
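The approve-or-replace cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Toloka's actual pipeline: the `refine_translation` function, the `Performer` callback type, and the stopping rule of three consecutive approvals are all hypothetical, since the post says only that the cycle ran until no further improvements were suggested.

```python
from typing import Callable, Optional, Sequence

# Hypothetical performer: sees the source text and the current translation,
# returns None to approve it or a new string to replace it.
Performer = Callable[[str, str], Optional[str]]


def refine_translation(
    source: str,
    initial: str,
    performers: Sequence[Performer],
    approvals_needed: int = 3,  # assumed stopping rule, not from the post
) -> str:
    """Pass the current translation from performer to performer until
    enough consecutive performers approve it without changes."""
    current = initial
    streak = 0
    for performer in performers:
        suggestion = performer(source, current)
        if suggestion is None or suggestion == current:
            streak += 1  # approved as-is
        else:
            current = suggestion  # replaced; approvals start over
            streak = 0
        if streak >= approvals_needed:
            break
    return current


# Simulated performers (made-up behavior for the example).
def improves_wording(source: str, current: str) -> Optional[str]:
    better = "Хлопковая футболка"
    return None if current == better else better


def approves(source: str, current: str) -> Optional[str]:
    return None


result = refine_translation(
    "Cotton T-shirt",
    "Хлопок футболка",
    [improves_wording, approves, approves, approves],
)
print(result)
```

Resetting the approval streak after every replacement means a new version has to win over several performers in a row before it is accepted, which is one simple way to model "no further improvements to suggest".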
Ozon’s inventory rivals that of AliExpress, with more than 9 million SKUs across 24 different categories. Toloka’s task was to evaluate the quality of the company’s search and determine the most effective product ranking model for the cataloged items. The performers rated search engine results from best to worst in order to identify filter issues on the website and provide a fine-grained analysis from a UI and UX perspective. As a result, Ozon’s search engine is more powerful today than it has ever been, according to Ozon’s tech department.
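One common way to turn crowd ratings of search results into a comparison between ranking models is normalized discounted cumulative gain (NDCG). The post does not say which metric Ozon or Toloka used, so the sketch below is an assumption chosen for illustration; the model names and relevance grades are made up.

```python
import math
from statistics import mean


def dcg(grades):
    """Discounted cumulative gain: higher grades count more near the top."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(grades))


def ndcg(grades):
    """DCG divided by the DCG of the ideal (best-first) ordering."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0


# Made-up crowd relevance grades (0 = irrelevant, 3 = perfect), listed in the
# order each hypothetical ranking model returned results for two queries.
crowd_grades = {
    "model_a": [[3, 2, 1], [2, 2, 0]],
    "model_b": [[1, 2, 3], [0, 2, 2]],
}

scores = {
    name: mean(ndcg(query) for query in queries)
    for name, queries in crowd_grades.items()
}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

Here `model_a` places the highest-rated items first for both queries, so it receives the higher average score.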
Yandex.Market is an e-commerce marketplace with over one million items available in its product catalog. When the company needed to tune its product recommendation engine, they developed a data labeling pipeline in Toloka to get the data they needed. An effective recommender system needs vast amounts of labeled data to support its ML model. Yandex.Market started out by using automated solutions to train their recommendation model, but the algorithm was not performing well enough. They developed a new strategy using the Toloka platform:
After integrating Toloka, the accuracy of the Yandex.Market recommender system went from a modest 40% to 90% overall, while recall rose from 20% to a solid 74% for accessories and 90% for related items. Recommendations also stay up to date: now that the pipeline is in place, it is quick and easy to get new labels and retrain the system whenever a new category of products is introduced. The marketplace is currently looking into using Toloka to boost other aspects of its business.
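For readers less familiar with the metrics quoted above, here is a minimal sketch of how accuracy and per-category recall can be computed when crowd labels serve as ground truth. The tuple layout, the function names, and the toy data are assumptions made for this example; they are not Yandex.Market's data or evaluation code.

```python
from collections import defaultdict


def accuracy(examples):
    """Share of item pairs where the model's decision matches the crowd label."""
    matches = sum(1 for _, label, recommended in examples if label == recommended)
    return matches / len(examples)


def recall_by_category(examples):
    """Per category: relevant pairs the model recommended / all relevant pairs."""
    relevant = defaultdict(int)
    hits = defaultdict(int)
    for category, label, recommended in examples:
        if label:
            relevant[category] += 1
            if recommended:
                hits[category] += 1
    return {category: hits[category] / relevant[category] for category in relevant}


# Made-up examples: (category, crowd says relevant, model recommended it).
examples = [
    ("accessories", True, True),
    ("accessories", True, True),
    ("accessories", True, True),
    ("accessories", True, False),
    ("accessories", False, False),
    ("related", True, True),
    ("related", True, False),
    ("related", False, True),
]

print(accuracy(examples))            # 0.625
print(recall_by_category(examples))  # {'accessories': 0.75, 'related': 0.5}
```

Reporting recall per category, as the post does for accessories and related items, shows where the recommender still misses relevant products even when overall accuracy looks high.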
Neatsy is a relatively new but already popular app that e-commerce marketplaces and D2C manufacturers use to let their customers 3D-scan their feet and find the best-fitting shoes. To make this feature a reality, the app’s 3D scanner, which relies on a neural network, needed over 50,000 labeled images to get better at separating human feet from the floor. Toloka was called into action again, and 3 weeks later the job was done. Neatsy confirmed that the app’s time to market was accelerated significantly, while its 3D scanner became 12% more accurate. The app is now trusted by famous brands like Nike, Reebok, Adidas, Puma, and Vans, among others.

Conclusion

Looking at these examples, it becomes clear that AI will continue to play a huge and rapidly growing role in online retail. As data labeling techniques, crowdsourcing in particular, become more time- and cost-effective, more scalable, and better at providing high-quality results, many more business opportunities will open up. The AI and e-commerce markets will expand in parallel, feeding off each other to the point where e-commerce becomes inseparable from the machine intelligence at its core. So it’s becoming increasingly important to stay ahead of the AI curve with high-performance data to grow any e-commerce business. Learn more about intelligent data solutions to transform e-commerce from Toloka.

*This post was written by Toloka’s team. We thank Toloka for their ongoing support of TheSequence.