Tomasz Tunguz - The Challenge of the AI Demo
Tomasz Tunguz, Venture Capitalist
The AI demo isn’t easy. Many of the major AI companies have demoed their AI systems, starting with pre-recorded demos & now pushing into live ones. They don’t always work. Multiply Murphy’s Law by a non-deterministic system & it’s not unreasonable to expect AI demos to nearly always hiccup.
Demo disruptions aren’t disasters. These systems are early & changing rapidly. A hiccup may suggest the system needs work & tuning rather than pointing to a fundamental problem. But hiccups can be problematic in proofs of concept.
Proofs of concept are extended demonstrations of the software. Well-structured PoCs align on success criteria at the outset. These criteria enable vendors & customers to agree on what success looks like.
Workflow proofs of concept are relatively straightforward. They are deterministic. Can I process a loan application in 5 minutes? Yes or no.
But as AI applications shift to selling outcomes, implicitly or explicitly, the PoC becomes a testing ground for those outcomes. Non-determinism means the PoC will sometimes fail to produce the required wow moment. It also means the PoC criteria must be more flexible.
How does a buyer evaluate a probabilistic system? Do we compare it to human performance? Practitioners we’ve spoken with tell us that human labelers typically agree 60-70% of the time. Does an AI robot need to be as accurate as a human, assuming it will be much less expensive? Or will we expect more, as we do with self-driving cars?
If AI systems require human assistance, then the ROI of the system must include some human operating expense, whether explicit or implicit.
Some teams will want to benchmark systems in parallel to determine their relative performance. With most startups building atop existing models, & setting aside differences in fine-tuning, the ultimate performance should be relatively comparable, provided they use the same data sets. Will startups compete on access to different data sets?
Today, there are more questions than answers about how to sell AI agent systems. We’re hosting an event on the evening of Sep 10th in San Francisco, moderated by Dave Morse, former CRO at Hebbia & VPS/VPCS at ScaleAI, where we’ll interview leaders in the space about some of these questions. If you’re interested in attending, see the details here.