🎙 Hyun Kim, CEO of Superb AI, on Challenges with Data Labeling in Computer Vision
It’s so inspiring to learn from practitioners and thinkers. The experience gained by researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight and inspiration.

👤 Quick bio / Hyun Kim
Hyun Kim (HK): I am the co-founder and CEO of Superb AI, an ML DataOps platform that helps computer vision teams automate and manage the full data pipeline: from ingestion and labeling to data quality assessment and delivery. I initially studied Biomedical Engineering and Electrical Engineering at Duke but shifted from genetic engineering to robotics and deep learning. I then pursued a Ph.D. in Computer Science at Duke with a focus on robotics and deep learning, but ended up taking leave to immerse myself further in the world of AI R&D at a corporate research lab. During this time, I started to experience the bottlenecks and obstacles that many companies still face to this day: data labeling and management were very manual, and the available solutions were nowhere near sufficient.

🛠 ML Work
HK: When you look at some of the amazing AI technologies being developed in research and compare them to the current array of public applications of AI, one starts to wonder what can be done to accelerate the adoption of cutting-edge AI in real-world environments. That is, in essence, the vision and mission of Superb AI. We still feel, to this day, that a big reason deep tech has not permeated faster is the cumbersome nature of data operations. Data is what will continue to drive AI development and deployment, and there is still a lot of uncertainty about how to build highly efficient data pipelines from start to finish. We aim to change that for the better.
HK: First, I’d like to preface this with a quick comparison between unstructured and structured data. For structured or non-computer-vision use cases, it is relatively easy to automate labeling: you can hand-design a set of rules or heuristics that ultimately define labeling functions and/or “weak classifiers” for auto-labeling. We have seen some amazing developments in this space through programmatic labeling, for example.

However, this approach cannot be applied to computer vision, because it is impractical to hand-design rules for object detection. There is simply too much visual variance within the same object class to cover with a set of rules. Definitions and labeling specifications also differ across organizations, even for the same object class. For example, even in a very well-known use case such as autonomous driving, different companies will have organizational nuances in their labeling heuristics: some teams will want to include side mirrors, some are building use cases that need to include open trunks, and so on. The real-world environment constantly changes, and the dependencies and requirements of these detection models should change with it.

Video adds another layer of challenges because each object needs to be tracked across frames. We have seen some naive approaches to this, such as linear interpolation. However, because objects typically do not move at constant speed, interpolation does not satisfy the requirements of most use cases, and teams often spend more time QA’ing the errors it produces. Rather than using linear interpolation, our video labeling AI automatically tracks objects across frames.
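To make the criticism concrete, here is a minimal sketch of the naive linear-interpolation approach (function and variable names are illustrative, not any product’s API). It bakes in exactly the constant-velocity assumption that real objects violate:

```python
def interpolate_box(box_a, box_b, t):
    """Linearly interpolate two (x, y, w, h) boxes at fraction t in [0, 1]."""
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))

def fill_frames(key_a, key_b, box_a, box_b):
    """Generate boxes for frames strictly between two labeled keyframes."""
    span = key_b - key_a
    return {
        frame: interpolate_box(box_a, box_b, (frame - key_a) / span)
        for frame in range(key_a + 1, key_b)
    }

# A car labeled at frame 0 and frame 4; frames 1-3 are inferred, assuming
# the object moved at constant speed along a straight line -- which is
# precisely what fails when the car brakes, turns, or is occluded.
boxes = fill_frames(0, 4, (10.0, 20.0, 50.0, 30.0), (30.0, 20.0, 50.0, 30.0))
```

Every box this produces between keyframes is a guess that a human then has to verify, which is why the QA cost of this approach grows with the keyframe spacing.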
Especially in circumstances where objects appear and disappear between frames, our AI is able to track those objects, and our platform makes it almost too easy to track those instances as one single object. We also provide the ability to customize our AI based on organizational requirements. We’ve seen cases where teams need to define temporal variables, such as a maximum number of frames after an object disappears before the platform assigns it a new ID. Beyond images and video, we also see a lot of unique challenges with other popular data types, such as 3D point clouds. Of course, we will be releasing automation tools to help clients build better 3D detection models faster, but as with everything else, we are assessing the challenges that 3D point clouds bring and approaching them in a way specific to that use case.
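The temporal rule described above can be sketched in a few lines (a hypothetical illustration of such a customization, not the platform’s actual API): an object that reappears within `max_gap` frames keeps its track ID, while a longer absence starts a new track.

```python
def assign_ids(detections, max_gap=5):
    """Assign track IDs to (frame_index, object_key) detections.

    detections: list of (frame_index, object_key) pairs sorted by frame,
    where object_key is whatever the matcher uses to say "same object".
    A disappearance longer than max_gap frames forces a new track ID.
    """
    last_seen = {}   # object_key -> (last_frame, track_id)
    next_id = 0
    tracks = []
    for frame, key in detections:
        if key in last_seen and frame - last_seen[key][0] <= max_gap:
            track_id = last_seen[key][1]   # short gap: same object
        else:
            track_id = next_id             # long gap (or first sighting): new track
            next_id += 1
        last_seen[key] = (frame, track_id)
        tracks.append((frame, key, track_id))
    return tracks

# A car visible at frames 0-2, briefly occluded, back at frame 4 (same ID),
# then gone until frame 20 (gap > max_gap, so it becomes a new track).
out = assign_ids([(0, "car"), (1, "car"), (2, "car"), (4, "car"), (20, "car")])
```

The point of exposing `max_gap` as a knob is exactly the organizational nuance mentioned above: different teams draw the “same object vs. new object” line differently.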
HK: Our core focus has been, and will continue to be, automation. The terms AI and automation have been overused, even in our industry, and there is a lot of skepticism in the market as a result. It’s a shame, because we feel intelligent automation is the key to the computer vision industry taking its next leap. So market education, or re-education I should say, is a tangential challenge we face today.

Our automation journey started with labeling, which is currently our flagship product. The core tech behind our customizable auto-labels uses few-shot and transfer learning, allowing teams to train a very capable auto-label for both simple and complex use cases with just 100 data points or images. We also layer on uncertainty-estimation AI to help teams quickly identify labels for QA, because an auto-label is only as good as its QA process. We released this about a year ago, and as expected, there has been great feedback on how labeling automation has helped address cold-start problems, edge-case labeling, economies of scale, you name it.

But we are entering the next phase of automation, which is around data quality and curation. It’s not enough anymore to just label a large dataset and brute-force it into model training. Teams need to know not only that data quality is assured, but which labels are most beneficial for their model, how to find more of these high-quality and relevant labels, and most importantly, how to iterate faster. This piece of data preparation is going to be a critical pillar for any DataOps program for computer vision teams and, yes, we will be introducing some exciting new products in this realm that use cutting-edge automation technology.
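One common form of uncertainty estimation (shown here as a generic illustration, not necessarily Superb AI’s method) is to rank auto-labels by the entropy of the model’s class probabilities and send the most uncertain predictions to human QA first:

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def qa_queue(auto_labels):
    """Sort auto-labels so the most uncertain predictions are reviewed first.

    auto_labels: list of (label_id, class_probabilities) pairs.
    """
    return sorted(auto_labels, key=lambda item: entropy(item[1]), reverse=True)

labels = [
    ("img_001", [0.98, 0.01, 0.01]),  # confident prediction
    ("img_002", [0.40, 0.35, 0.25]),  # ambiguous -> should be reviewed first
    ("img_003", [0.70, 0.20, 0.10]),
]
queue = qa_queue(labels)
```

This is what “an auto-label is only as good as its QA process” cashes out to in practice: human attention is the scarce resource, so it goes where the model is least sure.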
HK: We think data curation is massively underrated and something that many teams do not currently do, and if they do, not very efficiently. There are a couple of reasons for this, the main one being that data curation is a very time-consuming step. Curation, at its essence, should enable ML teams to understand the collected data, identify important subsets and edge cases, and assemble custom training datasets to feed back into their models. This is oversimplified, but the importance of curation cannot be emphasized enough. A sophisticated curation workflow will, in my opinion, guide us to an era of less dependence on large volumes of data and shift the focus to data quality. This is where the power of iteration will be unlocked.
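One simple curation heuristic (an illustration of the idea, not the company’s actual method) is greedy farthest-point sampling over image embeddings: repeatedly pick the sample farthest from everything already selected, so the curated subset covers the dataset’s variety instead of duplicating its most common scenes:

```python
def dist(a, b):
    """Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def farthest_point_sample(embeddings, k):
    """Greedily pick k indices whose embeddings are maximally spread out."""
    selected = [0]  # seed with the first sample
    while len(selected) < k:
        # For each candidate, measure distance to its nearest selected point,
        # then take the candidate whose nearest neighbor is farthest away.
        best = max(
            (i for i in range(len(embeddings)) if i not in selected),
            key=lambda i: min(dist(embeddings[i], embeddings[j]) for j in selected),
        )
        selected.append(best)
    return selected

# Four embeddings: three near-duplicates around the origin and one outlier.
emb = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 10.0)]
picked = farthest_point_sample(emb, 2)
```

The outlier at `(10.0, 10.0)` is picked over either near-duplicate, which is the point: a small, diverse training set can beat a large, redundant one.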
HK: Within computer vision, using symbolic representations for automated data labeling requires a few prerequisites, such as heavy data pre-processing and leaning on ML/AI methods. Basically, you won’t be able to encode rules such as “if pixel color is X, then classify as Y” as a base. However, deep learning models can produce powerful features and embeddings that can serve as metadata to which teams can apply rules and heuristics. So, in short, yes, rule-based or heuristic-based auto-labeling for computer vision may work, but a foundational piece of deep learning will be required. These are concepts our R&D team has been exploring for quite some time, and we should soon be able to determine whether productizing them is viable.
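As a concrete (and entirely hypothetical) illustration of rules layered on deep features: compute a class “prototype” embedding from a handful of labeled examples, then apply a simple threshold rule in embedding space, where such rules are tractable, rather than in pixel space, where they are not:

```python
def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def prototype(embeddings):
    """Mean embedding of a few labeled examples of one class."""
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(len(embeddings[0]))]

def rule_label(embedding, proto, threshold=0.9):
    """Heuristic rule: auto-label if similar enough to the prototype,
    otherwise defer to a human reviewer."""
    return "auto" if cosine(embedding, proto) >= threshold else "human_review"

# Toy 2-D embeddings standing in for what a deep model would produce.
car_proto = prototype([[1.0, 0.1], [0.9, 0.0], [1.1, 0.2]])
decision = rule_label([1.0, 0.1], car_proto)
```

The rule itself stays as simple as the structured-data case; the deep model does the hard work of mapping pixels into a space where “if similar to X, then Y” is meaningful.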
HK: I love this question because we have been using self-supervised learning in our stack for quite some time. In both research and the real world, it has become evident that models trained with self-supervised learning are more robust and perform better than those pre-trained only on labeled data. In addition, being able to intelligently handle large amounts of unlabeled data will become mission-critical, especially as teams continue to emphasize economies of scale when building and deploying computer vision systems. More specifically: allowing teams to quickly and accurately understand gaps in datasets, identify what to collect more of and what is causing poor model performance, and bridge the gap between model and data observability. These are all critical components of curating large volumes of unlabeled data, and core pieces of our upcoming Curation product.

💥 Miscellaneous – a set of rapid-fire questions
HK: Simpson’s Paradox. I remember being baffled when I first learned about it as a high school student in my statistics class. It also reminds me to be very careful and unbiased when interpreting data.
HK: The Turing Test was a simple and elegant way for us to conceptualize AI and build a human-like conversational AI. And I think we’re close to passing the Turing Test with state-of-the-art works like GPT-3. But practically, AGI should be able to do much more than just trick a human evaluator – it should be able to do everything. I had to do a bit of research, but there are clever alternatives like the Wozniak Test, where a robot makes coffee in a stranger’s home. It’s a funny test, but a true AGI should be able to pass a mixture of all these alternative tests!
HK: Assuming the person already has the math and statistics background, I’d recommend Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville. I’d also recommend PRML (Pattern Recognition and Machine Learning) by Christopher Bishop, but I’ve seen people (including myself, to be honest!) find Deep Learning focused books more interesting than those on classical machine learning.
HK: Probably not. But I think we’re getting better at approximating NP problems with deep learning, so how’s P ≈ NP? :)