📝 Guest post: Data Labeling and Its Role in E-commerce Today – Recent Use Cases*
In TheSequence Guest Post, our partners explain in detail what machine learning (ML) challenges they help deal with. In this post, Toloka’s team offers you an insightful overview of the data labeling use cases in e-commerce.
AI and e-commerce are now inextricably intertwined – unbeknownst to most, the latter can no longer maintain a competitive edge without the former. According to Statista, over 70% of surveyed e-commerce business executives in Europe and North America believe AI to be the main “can’t-do-without” mechanism for modern online retail.
Data labeling: use cases
Of course, no AI is possible without relevant, accurately labeled data. Work on your ML model as much as you like, but without the right data, there’s only so much your model can do. Because of this, data labeling has become the bedrock of AI, and thus also of e-commerce. Crowdsourcing, considered the quickest and most cost-effective scalable method of data labeling, is how much of that data is labeled today, which is in part what makes the AI-ecommerce alliance so vibrant. With that in mind, let’s look in more detail at how data labeling is improving e-commerce through these recent use cases:
AliExpress is one of the world’s e-commerce leaders. However, serious localization problems arose when AliExpress attempted to translate its platform into Russian, which resulted in numerous inaccurate product descriptions. With a giant catalog, the company needed a quick and reliable solution. Toloka offered an innovative crowd-based solution: instead of using machine translation (MT) or computer-assisted translation (CAT) combined with human-validated translations, as is normally the case, Toloka’s crowd performers were asked to provide their own translations whenever applicable. Each newly suggested version became a fixed multiple-choice option that the next crowd performer had to either select (and thereby approve) or replace with their own version. The cycle continued until the performers as a group had no further improvements to suggest. The end result proved to be affordable and, above all, very effective.
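The approve-or-replace cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Toloka's actual pipeline: the names `refine_translation` and `patience` are hypothetical, each performer is modelled as a plain function, and "no further improvements to suggest" is approximated as a fixed number of consecutive approvals. The toy strings are in English purely for readability.

```python
from typing import Callable, Iterable, Optional

# Hypothetical model of a crowd performer: given the source text and the
# current best translation, return a replacement, or None to approve as-is.
Performer = Callable[[str, str], Optional[str]]


def refine_translation(
    source: str,
    initial: str,
    performers: Iterable[Performer],
    patience: int = 3,
) -> str:
    """Pass the current translation from performer to performer.

    Assumption: the cycle stops once `patience` performers in a row approve
    the current version without changing it (or the performers run out).
    """
    current = initial
    approvals = 0
    for review in performers:
        suggestion = review(source, current)
        if suggestion is None or suggestion == current:
            approvals += 1  # the performer chose the existing option
            if approvals >= patience:
                break
        else:
            current = suggestion  # the new version becomes the fixed option
            approvals = 0
    return current


# Toy performers for illustration only.
def fixes_typo(source: str, current: str) -> Optional[str]:
    return current.replace("cottn", "cotton") if "cottn" in current else None


def approves(source: str, current: str) -> Optional[str]:
    return None


result = refine_translation(
    "source product title",
    "cottn t-shirt",
    [fixes_typo, approves, approves, approves],
)
print(result)
```

In practice the stopping rule and the number of performers per item would be tuned to budget and quality targets; the sketch only shows the shape of the loop.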
Ozon’s inventory rivals that of AliExpress and offers more than 9 million SKUs across 24 different categories. The task Toloka was given had to do with evaluating the quality of the company’s search and determining the most effective product ranking model for the cataloged items. The performers had to rate search engine results from best to worst in order to identify filter issues on the website and provide a fine-grained analysis from the perspective of UI and UX. As a result, Ozon’s search engine is more powerful today than it’s ever been, notes Ozon’s tech department.
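One simple way to combine several performers' best-to-worst orderings into a single reference ranking is to average each item's position. The post does not say how Ozon or Toloka aggregated the ratings, so the sketch below is an assumption for illustration: `consensus_ranking` is a hypothetical helper, and it presumes every performer ranks the same set of results.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def consensus_ranking(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Order items by the sum of their positions across all performers.

    A lower total position means performers placed the item closer to the top.
    Ties are broken alphabetically so the output is deterministic.
    """
    totals: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        for position, item in enumerate(ranking):
            totals[item] += position
    return sorted(totals, key=lambda item: (totals[item], item))


# Hypothetical orderings of three search results by three performers.
performer_rankings = [
    ["running shoe", "trail shoe", "sandal"],
    ["trail shoe", "running shoe", "sandal"],
    ["running shoe", "sandal", "trail shoe"],
]
consensus = consensus_ranking(performer_rankings)
print(consensus)
```

A consensus ordering like this could then serve as the reference against which each candidate ranking model's output is compared.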
Yandex.Market is an e-commerce marketplace with over one million items available in its product catalog. When the company needed to tune its product recommendation engine, they developed a data labeling pipeline in Toloka to get the data they needed. An effective recommender system needs vast amounts of labeled data to support its ML model. Yandex.Market started out by using automated solutions to train their recommendation model, but the algorithm was not performing well enough. They developed a new strategy using the Toloka platform:
After integrating Toloka, the accuracy of the Yandex.Market recommender system went from a modest 40% to 90% overall, while recall rose from 20% to a solid 74% for accessories and 90% for related items. Recommendations stay up to date — now that the pipeline is in place, it is quick and easy to get new labels and retrain the system whenever a new category of products is introduced. The marketplace is currently looking into using Toloka to boost other aspects of their business.
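For readers less familiar with the metrics quoted above, the sketch below shows how such figures are typically computed from crowd-labeled data. The post does not describe Yandex.Market's evaluation method, so this is an assumption: "accuracy" is treated here as precision (the share of shown recommendations judged relevant), recall is the share of relevant items the recommender actually surfaces, and all item names are made up.

```python
from typing import Iterable, Tuple


def precision_recall(
    recommended: Iterable[str], relevant: Iterable[str]
) -> Tuple[float, float]:
    """Return (precision, recall) for one set of recommendations."""
    recommended_set, relevant_set = set(recommended), set(relevant)
    hits = recommended_set & relevant_set
    precision = len(hits) / len(recommended_set) if recommended_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: accessories that crowd performers labeled as relevant
# for a phone, versus what the recommender actually showed.
crowd_relevant = {"case", "charger", "screen_protector", "earbuds", "stand"}
model_output = ["case", "charger", "earbuds", "socks"]

precision, recall = precision_recall(model_output, crowd_relevant)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here three of the four shown items are relevant (precision 0.75) and three of the five relevant items are shown (recall 0.60).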
Neatsy is a relatively new but already popular app that e-commerce marketplaces and D2C manufacturers use to let their customers 3D-scan their feet and find the best-fitting shoes. To make this feature a reality, the app’s 3D scanner, which relies on a neural network, needed over 50,000 labeled images to learn to separate human feet from the floor more reliably. Toloka was called into action again, and three weeks later the job was done. Neatsy confirmed that the app’s time to market was accelerated significantly, while its 3D scanner became 12% more accurate. The app is now trusted by famous brands like Nike, Reebok, Adidas, Puma, and Vans, among others.
Conclusion
Looking at these examples, it becomes clear that AI will continue to play a huge role in online retail, and at an ever-increasing pace. As data-labeling techniques, crowdsourcing in particular, become ever more time- and cost-effective, scalable, and capable of providing high-quality results, many more business opportunities will open up. The AI and e-commerce markets will expand in parallel, feeding off each other to the point where e-commerce becomes inseparable from the machine intelligence at its core. So it’s becoming increasingly important to stay ahead of the AI curve with high-performance data to grow any e-commerce business. Learn more about intelligent data solutions to transform e-commerce from Toloka.
*This post was written by Toloka’s team. We thank Toloka for their ongoing support of TheSequence.