🎙Yinhan Liu/CTO of BirchAI about applying ML in the healthcare industry
It’s so inspiring to learn from practitioners. Getting to know the experience gained by researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight and inspiration. Share this interview if you find it enriching. No subscription is needed.
👤 Quick bio / Yinhan Liu
Yinhan Liu (YL): I started my undergrad as a Chemical Engineering major and added a math major – not focused at all on CS. I didn’t get my start in the field until I took an ML class during my first semester of grad school, which inspired me to spend a lot of personal time reading AI-related papers. I eventually made my way to Facebook AI Research, where I had the opportunity to work with some great people at an important time in NLP history. But, while I enjoyed the research side of things, I wanted to have a more direct impact on people. So, I decided to co-found BirchAI at AI2 with trusted colleagues I had known for 5 to 10 years. I’m now its CTO, leading Engineering and Science.
🛠 ML Work
YL: BirchAI is focused on applying AI to complex audio processes in healthcare – an area that Sumant (COO), Kevin (CEO), and I have been thinking about for a long time. There’s much more beyond this, but our initial focus is on automating complex After Call Work in healthcare call centers – think of a patient calling in about an issue with her pacemaker. The healthcare industry faces several related business challenges that drive our ML challenges. For example, humans vary a lot in terms of how they understand, classify, and summarize detailed healthcare conversations. For BirchAI, that means that if the data is labeled at all, it is usually labeled poorly. We have developed effective workarounds that have allowed us to achieve very high accuracy at scale. That leads us to another point: the notion of “Explainable Human”. Many customers initially maintain that their call center teams already achieve consistency and accuracy of 98 or 99%. Invariably, we see that this is not true. Companies think they know how employees are doing the work. But that belief is based on crude, low-volume, manual sampling methods of Quality Assessment that fail to capture the semantic richness of conversations at scale and how that dialogue should be characterized. The BirchAI product highlights this variance and gives us the means to drive and maintain consistency and accuracy at a previously unattainable scale. Healthcare companies spend tens of billions of dollars trying to address these questions – we are addressing them at scale.
YL: Our first challenge is that our data is not labeled – and large-scale pre-trained models do not work out of the box. We have built a complex AI-based pipeline to label data we use to train at scale and then reach a high degree of accuracy. Another challenge is that these problems cannot be met with a single module – so we’ve used a multi-modal approach to create a robust pipeline of models for our product.
YL: Previous NLP technology was essentially as developed as it was going to get, yet it was not accurate or robust enough to meet customer needs for most healthcare use cases. As a result, many processes are still done manually. But pre-trained models with a transformer architecture now provide a higher performance starting point, and there is much more to be discovered. We intimately understand those opportunities in areas like voice and document AI, and we are actively exploiting those to build game-changing products in healthcare.
YL:
1. The first big problem we needed to overcome was Speech to Text – we have found that the off-the-shelf APIs do not provide a good enough input for our downstream models. That’s why we built our own STT that consistently outperforms the other STT models we can see. Of course, we will continue to improve this model, which is flexible enough to allow that.
2. Another problem has been how to optimize our models. We are not a consulting shop – we are a product company. How do we maximize production performance with the fewest possible models? For example, we have a large medical device customer with a single, high-quality dialogue summarization model working across four different products. We are starting to deploy that and are excited to see how we can extend that across all their products.
3. At the core of our capabilities has been the ability to use AI itself to create high-quality, large-scale, labeled data. This is similar in concept to back-translation, where we use AI models to create labels at scale, and we then train other AI models using those labels and other data. We’ve had great success with this and see many possibilities for the approach.
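The idea of using one model to label data for another can be illustrated with a toy example. The sketch below is not BirchAI's pipeline, which the interview does not describe in detail. It assumes a generic pseudo-labeling setup: a "teacher" (here just a hypothetical keyword matcher) labels unlabeled call snippets, only its confident labels are kept, and a small bag-of-words "student" is trained on them. All names, keyword lists, sample sentences, and the 0.8 threshold are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical keyword lists standing in for a "teacher" model.
TEACHER_KEYWORDS = {
    "device_issue": {"pacemaker", "battery", "alarm", "beeping"},
    "billing": {"invoice", "charge", "refund", "payment"},
}


def teacher_label(text):
    """Return (label, confidence) from keyword overlap, or (None, 0.0) if nothing matches."""
    tokens = set(text.lower().split())
    hits = {label: len(tokens & words) for label, words in TEACHER_KEYWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return None, 0.0
    best = max(hits, key=hits.get)
    return best, hits[best] / total


def build_pseudo_labels(texts, min_confidence=0.8):
    """Keep only the examples the teacher labels with high confidence."""
    labeled = []
    for text in texts:
        label, confidence = teacher_label(text)
        if label is not None and confidence >= min_confidence:
            labeled.append((text, label))
    return labeled


def train_student(labeled):
    """Count word frequencies per label (a tiny bag-of-words student)."""
    counts = defaultdict(Counter)
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts


def student_predict(counts, text):
    """Pick the label whose training words overlap most with the text."""
    tokens = text.lower().split()
    scores = {label: sum(c[t] for t in tokens) for label, c in counts.items()}
    return max(scores, key=scores.get)


# Invented, unlabeled call snippets.
unlabeled_calls = [
    "my pacemaker alarm keeps beeping at night",
    "the pacemaker battery warning came on at night",
    "i need a refund for a duplicate charge on my invoice",
    "there is a wrong payment on my invoice this month",
    "pacemaker battery charge question",  # mixed signals: teacher confidence is below 0.8
]

pseudo_labeled = build_pseudo_labels(unlabeled_calls)
student = train_student(pseudo_labeled)

# This sentence contains no teacher keyword, so only the student can classify it.
prediction = student_predict(student, "the warning came on again this month at night")
print(len(pseudo_labeled), prediction)
```

The confidence filter is the key design choice in this sketch: noisy teacher labels are dropped rather than passed on, and the student can then pick up context words the teacher never looked at.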
YL: It’s not enough to create a huge model that uses massive amounts of expensive computing to create a result out in dev. We’ve been lucky enough to recruit a great founding engineer, Gaurav Shegokar, who really understands how to optimize inference time and accuracy – or infrastructure and performance. That blend of traditional software and AI at scale is a key characteristic we look for in new engineering hires.
💥 Miscellaneous – a set of rapid-fire questions
Twin Primes.
Introduction to Statistical Learning. It tells you everything you need to get started!
We use a bit of a Turing Test approach when we show people our dialogue summaries. After more than 50 interactions, more people identify our BirchAI-generated summary as created by a human than the correct one. So yes, I guess it is still relevant.
No, I don’t believe so.