🎙 Hyun Kim, CEO of Superb AI, on Challenges with Data Labeling in Computer Vision
Was this email forwarded to you? Sign up here

It’s inspiring to learn from practitioners and thinkers. The experience gained by researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight. Share this interview if you like it. No subscription is needed.

👤 Quick bio / Hyun Kim
Hyun Kim (HK): I am the co-founder and CEO of Superb AI, an ML DataOps platform that helps computer vision teams automate and manage the full data pipeline, from ingestion and labeling to data quality assessment and delivery. I initially studied Biomedical Engineering and Electrical Engineering at Duke but shifted from genetic engineering to robotics and deep learning. I then pursued a Ph.D. in computer science at Duke, focusing on robotics and deep learning, but took leave to immerse myself further in AI R&D at a corporate research lab. During this time, I started to experience the bottlenecks that many companies still face to this day: data labeling and management were very manual, and the available solutions were nowhere near sufficient.

🛠 ML Work
HK: When you look at some of the amazing AI technologies being developed in research and compare them to the current array of public AI applications, you start to wonder what can be done to accelerate the adoption of cutting-edge AI in real-world environments. That is, in essence, the vision and mission of Superb AI. We still feel, to this day, that a big reason deep tech has not permeated faster is the cumbersome nature of data operations. Data is what will continue to drive AI development and deployment, and there is still a lot of uncertainty about how to build highly efficient data pipelines from start to finish. We aim to change that for the better.
HK: First, I’d like to preface this with a quick comparison between unstructured and structured data. For structured or non-computer-vision use cases, it’s relatively easy to automate labeling. This can take the form of hand-designing a set of rules or heuristics that ultimately define labeling functions and/or “weak classifiers” for auto-labeling. We have seen some amazing developments in this space through programmatic labeling, for example.

However, this approach cannot be applied to computer vision, because it’s impractical to hand-design rules for object detection. There is massive visual variance within the same object class that no set of rules can cover. On top of that, the definitions and labeling specifications differ between organizations, even for the same object class. For example, even in a very well-known use case such as autonomous driving, different companies will have their own nuances around data labeling: some teams will want to include side mirrors, some are building use cases that need to include open trunks, and so on. The real-world environment constantly changes, and the dependencies and requirements of these detection models must change with it.

Video adds another layer of challenge, because each object needs to be tracked across frames. We have seen some naive approaches here, such as linear interpolation. However, because objects typically do not move at constant speeds, interpolation does not satisfy the requirements of most use cases, and it often leads to more time spent QA’ing the resulting errors. Rather than using linear interpolation, our video labeling AI automatically tracks objects across frames.
Especially in circumstances where objects appear and disappear between frames, our AI is able to track those objects, and our platform makes it easy to track those instances as one single object. We also provide the ability to customize our AI to organizational requirements; we’ve seen cases where teams need to define temporal variables, such as the maximum number of frames after an object disappears before the platform assigns it a new ID. Beyond images and video, we also see unique challenges with other popular data types, such as 3D point clouds. We will be releasing automation tools that help clients build better 3D detection models faster, but like everything else, we are assessing the challenges that 3D point clouds bring and approaching them in a way specific to this use case.
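To make the interpolation problem concrete, here is a minimal sketch (an illustration, not Superb AI's actual implementation) of how linear interpolation fills in bounding boxes between two labeled keyframes, and why its constant-velocity assumption breaks down:

```python
def interpolate_box(box_a, box_b, t):
    """Linearly interpolate two (x, y, w, h) boxes at fraction t in [0, 1]."""
    return tuple(a + (b - a) * t for a, b in zip(box_a, box_b))

# A car labeled at two keyframes 10 frames apart.
start = (100.0, 50.0, 40.0, 20.0)  # frame 0
end   = (300.0, 50.0, 40.0, 20.0)  # frame 10

# Interpolation assumes constant velocity, so frame 5 lands exactly halfway
# (x = 200). If the car actually accelerated and was at x = 260 by frame 5,
# the interpolated label is off by 60 pixels and has to be caught in QA.
mid = interpolate_box(start, end, 0.5)
```

Every frame between keyframes inherits this constant-speed error, which is why a learned tracker that follows the object's actual motion tends to produce fewer labels needing correction.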
HK: Our core focus has been, and will continue to be, automation. That said, the terms AI and automation have been overused, even in our industry, and there is a lot of skepticism in the market because of it. It’s a shame, because we feel intelligent automation is the key to the computer vision industry taking its next leap. So market education, or re-education I should say, is a tangential challenge we face today.

Our automation journey started with labeling, which is currently our flagship product. The core tech behind our customizable auto-labels uses few-shot and transfer learning, allowing teams to train a very capable auto-label for both simple and complex use cases with just 100 data points or images. We also layer on uncertainty estimation to help teams quickly identify labels for QA, because an auto-label is only as good as its QA process. We released this about a year ago, and as expected, there has been great feedback on how labeling automation has helped address cold-start problems, edge-case labeling, economies of scale, you name it.

But we are entering the next phase of automation, which is around data quality and curation. It’s not enough anymore to label a large dataset and brute-force it into model training. Teams need to go beyond ensuring data quality: which labels are most beneficial for their model, how to find more of these high-quality and relevant labels, and most importantly, how to iterate faster. This piece of data preparation is going to be a critical pillar of any DataOps program for computer vision teams and, yes, we will be introducing some exciting new products in this realm that use cutting-edge automation technology.
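As a rough illustration of the uncertainty-estimation idea (the names and scoring below are hypothetical, not Superb AI's API), one could rank auto-labeled items for QA by the entropy of their predicted class probabilities, so reviewers see the least certain labels first:

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def qa_priority(auto_labels):
    """Sort auto-labeled items so the least certain ones are reviewed first."""
    return sorted(auto_labels, key=lambda item: entropy(item["probs"]), reverse=True)

labels = [
    {"id": "img_001", "probs": [0.98, 0.01, 0.01]},  # confident prediction
    {"id": "img_002", "probs": [0.40, 0.35, 0.25]},  # uncertain -> QA first
    {"id": "img_003", "probs": [0.70, 0.20, 0.10]},
]
queue = qa_priority(labels)  # img_002 comes first, img_001 last
```

The design point is the one made above: human QA time is the scarce resource, so an uncertainty score lets a team spend it on the labels most likely to be wrong instead of sampling at random.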
HK: We think data curation is massively underrated and something that many teams either do not do or do not do efficiently. The main reason is that curation is a very time-consuming step. Curation, at its essence, should enable ML teams to understand the collected data, identify important subsets and edge cases, and assemble custom training datasets to feed back into their models. This is oversimplified, but the importance of curation cannot be emphasized enough. A sophisticated curation workflow will, in my opinion, guide us to an era of less dependence on large volumes of data and a shift in focus to data quality. This is where the power of iteration will be unlocked.
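One way to sketch the "identify important subsets and edge cases" step (an illustrative technique, not necessarily Superb AI's approach) is greedy farthest-point sampling over image embeddings: instead of drawing a training subset from the densest clusters, pick items maximally spread out in embedding space so rare regions are represented too:

```python
def dist(a, b):
    """Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def farthest_point_sample(embeddings, k):
    """Greedy k-center selection: repeatedly add the item farthest from
    everything selected so far, covering edge-case regions of the space."""
    selected = [0]  # start from an arbitrary item
    while len(selected) < k:
        next_i = max(
            (i for i in range(len(embeddings)) if i not in selected),
            key=lambda i: min(dist(embeddings[i], embeddings[j]) for j in selected),
        )
        selected.append(next_i)
    return selected

# Three near-duplicates around the origin plus two outliers: the outliers
# (indices 2 and 4) are picked instead of redundant near-duplicates.
embs = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.2, 0.1), (10.0, 0.0)]
picked = farthest_point_sample(embs, 3)
```

This is the oversimplified version of curation described above: a small, deliberately diverse subset can train a better model than a much larger but redundant one.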
HK: Within computer vision, using symbolic representations for automated data labeling requires a few prerequisites to be addressed, such as heavy data pre-processing and leaning on ML/AI methods. Basically, you won’t be able to encode rules such as “if pixel color is X, then classify as Y” as a base. However, deep learning models can produce powerful features and embeddings that can serve as metadata to which teams can apply rules and heuristics. So, in short, yes, rule-based or heuristic-based auto-labeling for computer vision may work, but a foundational piece of deep learning will be required. These are concepts our R&D team has been exploring for quite some time, and we should soon be able to determine whether productizing them is viable.
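A minimal sketch of this idea, with hypothetical names throughout: a deep model produces an embedding for each image, and a heuristic rule sits on top, assigning the nearest class prototype's label only when similarity clears a threshold and routing everything else to human review:

```python
def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def auto_label(embedding, prototypes, threshold=0.9):
    """Heuristic rule over deep embeddings: label with the most similar
    class prototype if similarity clears the threshold, else defer."""
    best_cls, best_sim = max(
        ((cls, cosine(embedding, proto)) for cls, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )
    return best_cls if best_sim >= threshold else "needs_review"

# Toy 2-D "embeddings"; real prototypes would be mean embeddings per class.
prototypes = {"car": (1.0, 0.0), "person": (0.0, 1.0)}
```

The rule itself stays simple and inspectable, but it only works because the deep model has already collapsed the visual variance of each class into a compact embedding, which is the "foundational piece of deep learning" referred to above.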
HK: I love this question, because we have been using self-supervised learning in our stack for quite some time. In both research and the real world, it has become evident that models trained with self-supervised learning are more robust and perform better than those pre-trained only on labeled data. In addition, being able to intelligently handle large amounts of unlabeled data will become mission-critical, especially as teams continue to emphasize economies of scale when building and deploying computer vision systems. More specifically, allowing teams to quickly and accurately understand gaps in datasets, identify what to collect more of, pinpoint what is causing poor model performance, and bridge the gap between model and data observability: these are all critical components of curating large volumes of unlabeled data, and core pieces of our upcoming Curation product.

💥 Miscellaneous – a set of rapid-fire questions
HK: Simpson’s Paradox. I remember being baffled when I first learned about it as a high school student in my statistics class. It also reminds me to be very careful and unbiased when interpreting data.
HK: The Turing Test was a simple and elegant way for us to conceptualize AI and build a human-like conversational AI. And I think we’re close to passing the Turing Test with state-of-the-art works like GPT-3. But practically, AGI should be able to do much more than just trick a human evaluator – it should be able to do everything. I had to do a bit of research, but there are clever alternatives like the Wozniak Test, where a robot makes coffee in a stranger’s home. It’s a funny test, but a true AGI should be able to pass a mixture of all these alternative tests!
HK: Assuming the person already has the math and statistics background, I’d recommend Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville. I’d also recommend PRML (Pattern Recognition and Machine Learning) by Christopher Bishop, but I’ve seen people (including myself, to be honest!) find Deep Learning focused books more interesting than those on classical machine learning.
HK: Probably not. But I think we’re getting better at approximating NP problems with deep learning, so how’s P ≈ NP? :)