Tomasz Tunguz - Agentic Systems' Sales Cycles
Tomasz Tunguz, Venture Capitalist
As software startups begin to sell agentic systems, the procurement process will change. Unlike classical software, where the application either meets the criteria (price, integration with other software, particular features) or doesn’t, agentic systems operate on a performance continuum.

Here’s a recent evaluation table for Codestral, Mistral’s open-source code generation AI. All of these benchmarks are machine-generated: HumanEval & HumanEvalFIM are not human testers but open-source projects that evaluate AI code.[1]

This type of evaluation works well for a broad sense of relative performance. But what if a business writes code in a particular language? Or with particular performance characteristics in mind? What if an AI-powered customer support agent needs to be able to manage very technical telecom queries? Or a marketing AI needs to be culturally sensitive to a particular region?

The generic tests probably won’t work, which translates to slower sales cycles as prospective buyers work to understand the system’s performance in their own context.

In addition, agentic systems in the future will operate for longer periods of time without human intervention. The greater the autonomy, the greater the potential for errors. Benchmarks may not be enough; buyers may want to see how the system performs in their own context over time.

Startups - as they always do - will find ways to accelerate the evaluation. They might develop their own standards much the way that OpenAI has, or partner with third parties to offer independent evaluations for particular use cases. Imagine a modern-day Gartner for Agentic Systems, a company that maintains a diverse pool of human evaluators & computer scientists skilled in the evaluation of various agentic products.

Alternatively, the most sophisticated organizations could create standards that then become broadly adopted. Banks could publish open-source standards for regulator-compliant customer support chatbots.
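To make the idea of evaluating a system "in the buyer's own context" concrete, here is a minimal sketch of what such a test might look like. It is an illustration under stated assumptions, not a description of how any vendor or buyer actually runs evaluations: the `Case` structure, the category labels, the pass/fail checks & the `toy_agent` stand-in are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    category: str                  # hypothetical label, e.g. "telecom" or "billing"
    prompt: str                    # input handed to the agent
    check: Callable[[str], bool]   # buyer-defined pass/fail rule for the reply


def pass_rates(agent: Callable[[str], str], cases: list[Case]) -> dict[str, float]:
    """Return the fraction of cases the agent passes in each category."""
    passed: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for case in cases:
        total[case.category] += 1
        if case.check(agent(case.prompt)):
            passed[case.category] += 1
    return {cat: passed[cat] / total[cat] for cat in total}


# Stand-in agent for illustration only; a real evaluation would call the vendor's system.
def toy_agent(prompt: str) -> str:
    return "Please restart your router." if "router" in prompt else "I don't know."


# A buyer's own test cases (assumed examples), grouped by the areas they care about.
cases = [
    Case("telecom", "My router keeps dropping the connection.",
         lambda out: "restart" in out.lower()),
    Case("telecom", "What is the MTU on my fiber line?",
         lambda out: "mtu" in out.lower()),
    Case("billing", "Why was I charged twice?",
         lambda out: "refund" in out.lower()),
]

rates = pass_rates(toy_agent, cases)
print(rates)
```

Breaking the score out by category is the point of the sketch: a system can look strong on a generic benchmark & still fail the narrow slice of queries a particular buyer cares about.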
This purchasing behavior does exist elsewhere. Backtesting is the norm in trading algorithms & marketing optimization. Within the most sophisticated security organizations, security labs exist to test machine learning-based security products & their performance before deploying them.

In certain cases, the business need will overwhelm the procurement process. This happens in classic software & it will happen with AI, but it’s rarer.

However the problem is solved, agentic systems will evolve the procurement process & startups will need to navigate it.

[1] OpenAI created both of these tests to measure the accuracy of its code generation model & they are now a standard for evaluating AI code generation models.