Tomasz Tunguz - The Challenge of the AI Demo
Tomasz Tunguz, Venture Capitalist
The AI demo isn’t easy. Many of the major AI companies have demoed their AI systems, starting with pre-recorded demos & now pushing into live ones. They don’t always work. Multiply Murphy’s Law by a non-deterministic system & it’s not unreasonable to expect AI demos to nearly always hiccup.

Demo disruptions aren’t disasters. These systems are early & changing rapidly. A hiccup might suggest the system requires work & tuning, not that it faces a fundamental challenge.

But hiccups can be problematic in proofs of concept. Proofs of concept are extended demonstrations of the software. Well-structured PoCs align on success criteria at the outset. These criteria enable vendors & customers to agree on what success looks like.

Workflow proofs of concept are relatively straightforward. They are deterministic. Can I process a loan application in 5 minutes? Yes or no.

But as AI applications shift to selling outcomes, implicitly or explicitly, the PoC becomes a testing ground for those outcomes. Non-determinism means the PoC sometimes won’t produce the required wow moment. It also means the PoC criteria must be more flexible.

How does a buyer evaluate a probabilistic system? Do we compare it to human performance? Practitioners we’ve spoken with tell us that human labelers typically agree 60-70% of the time. Does an AI robot need to be as accurate as a human, assuming it will be much less expensive? Or will we expect more, as we do with self-driving cars?

If AI systems require human assistance, then the ROI of the system must include some human operating expense, whether explicit or implicit.

Some teams will want to benchmark systems in parallel to determine their relative performance. With most startups building atop existing models, & setting aside differences in fine-tuning, the ultimate performance should be relatively comparable, provided they use the same data sets. Will startups compete on access to different data sets?
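One way to picture a more flexible PoC criterion is as a success rate over repeated trials rather than a single yes-or-no run. The sketch below is only an illustration, not a method from any vendor or from the practitioners mentioned above: the function names, the trial counts, & the target rates are all hypothetical. It uses the standard Wilson score interval to ask whether the observed success rate clears an agreed target even after allowing for sampling noise.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a success rate (z=1.96 is roughly 95% confidence)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


def meets_criterion(successes: int, trials: int, target_rate: float) -> bool:
    """Hypothetical PoC rule: pass only if the interval's lower bound clears the target."""
    low, _ = wilson_interval(successes, trials)
    return low >= target_rate


# Illustrative numbers only: 70 successful runs out of 100 PoC tasks.
low, high = wilson_interval(70, 100)
print(f"observed 70%, plausible range {low:.1%} to {high:.1%}")
print("clears a 65% target:", meets_criterion(70, 100, 0.65))
```

Requiring the lower bound, rather than the raw rate, to clear the target is a conservative design choice: a vendor & buyer could just as reasonably agree on the raw rate, a larger number of trials, or a target pegged to a measured human baseline.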
Today, there are more questions than answers about how to sell AI agent systems. We’re hosting an event on the evening of Sep 10th in San Francisco to talk about some of these questions. Dave Morse, former CRO at Hebbia & VPS/VPCS at ScaleAI, will moderate & interview leaders in the space. If you’re interested in attending, see the details here.