Open Source Models : What Can We Determine from Download Patterns?
Tomasz TunguzVenture Capitalist If you were forwarded this newsletter, and you'd like to receive it in the future, subscribe here. Open Source Models : What Can We Determine from Download Patterns?
Open source models have become a critical part of the AI landscape. I was curious about the trends in the open source ecosystem, so I analyzed HuggingFace data on the top 300 open source models, both by overall usage & also the top of the trending list. Open source models are governed by open source licenses. Similar to regular open source software, Apache & MIT dominate the licenses by model count. 76% of the top models choose one of these licenses. Apache is nearly twice as popular as MIT. But the concentration is greater when viewing the share by downloads. Models with Apache or MIT licenses represent 92% of downloaded models last month. Stability, Facebook, & Microsoft top the creator list of open source models by count. So does TheBloke, an engineer who quantizes (or compresses) open source models.
But the download data shows very different patterns. Meta’s models recorded 30% of downloads, driven by its word2vec model for speech recognition. Then OpenAI & Google not far behind.
The most popular models by downloads are models for training other models, called Fill-Mask models. Then speech recognition. Third is text classification (LLMs are very good at this.) Text generation is fifth. How about popularity? HuggingFace likes of a model are completely uncorrelated to downloads with an R^2 of 0.06. Overall, we can conclude more lax licenses dominate the top models. Meta, Google, Microsoft, Stability, & OpenAI are important players within the open source ecosystem. Speech is the most popular end-user application of open source models by downloads in the last month, superseded by testing - which makes sense given how many companies are building or testing LLMs. Given all the innovation in the space, in a quarter or two, this data might be very different. Who do you think will top the charts at the end of 2024? |
Older messages
The 10x Salesperson
Friday, January 12, 2024
Tomasz Tunguz Venture Capitalist If you were forwarded this newsletter, and you'd like to receive it in the future, subscribe here. The 10x Salesperson The 10x programmer - engineers whose
Gordian Knots in Software Engineering
Tuesday, January 9, 2024
Tomasz Tunguz Venture Capitalist If you were forwarded this newsletter, and you'd like to receive it in the future, subscribe here. Gordian Knots in Software Engineering Measuring engineering
Make Hay When the Sun Shines : Liquidity in Startup Exits
Monday, January 8, 2024
Tomasz Tunguz Venture Capitalist If you were forwarded this newsletter, and you'd like to receive it in the future, subscribe here. Make Hay When the Sun Shines : Liquidity in Startup Exits
2024 Predictions
Tuesday, January 2, 2024
Tomasz Tunguz Venture Capitalist If you were forwarded this newsletter, and you'd like to receive it in the future, subscribe here. 2024 Predictions Every year I make a list of predictions &
Why Startup M&A in 2024 Will Rebound
Thursday, December 21, 2023
Tomasz Tunguz Venture Capitalist If you were forwarded this newsletter, and you'd like to receive it in the future, subscribe here. Why Startup M&A in 2024 Will Rebound Earlier this week,
You Might Also Like
🚀 Ready to scale? Apply now for the TinySeed SaaS Accelerator
Friday, February 14, 2025
What could $120K+ in funding do for your business?
📂 How to find a technical cofounder
Friday, February 14, 2025
If you're a marketer looking to become a founder, this newsletter is for you. Starting a startup alone is hard. Very hard. Even as someone who learned to code, I still believe that the
AI Impact Curves
Friday, February 14, 2025
Tomasz Tunguz Venture Capitalist If you were forwarded this newsletter, and you'd like to receive it in the future, subscribe here. AI Impact Curves What is the impact of AI across different
15 Silicon Valley Startups Raised $302 Million - Week of February 10, 2025
Friday, February 14, 2025
💕 AI's Power Couple 💰 How Stablecoins Could Drive the Dollar 🚚 USPS Halts China Inbound Packages for 12 Hours 💲 No One Knows How to Price AI Tools 💰 Blackrock & G42 on Financing AI
The Rewrite and Hybrid Favoritism 🤫
Friday, February 14, 2025
Dogs, Yay. Humans, Nay͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
🦄 AI product creation marketplace
Friday, February 14, 2025
Arcade is an AI-powered platform and marketplace that lets you design and create custom products, like jewelry.
Crazy week
Friday, February 14, 2025
Crazy week. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
join me: 6 trends shaping the AI landscape in 2025
Friday, February 14, 2025
this is tomorrow Hi there, Isabelle here, Senior Editor & Analyst at CB Insights. Tomorrow, I'll be breaking down the biggest shifts in AI – from the M&A surge to the deals fueling the
Six Startups to Watch
Friday, February 14, 2025
AI wrappers, DNA sequencing, fintech super-apps, and more. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
How Will AI-Native Games Work? Well, Now We Know.
Friday, February 14, 2025
A Deep Dive Into Simcluster ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏