🎙 Brian Venturo/CoreWeave about GPU-first ML infrastructures
How cryptocurrency mining led the team to challenge “big 3” cloud providers
It’s inspiring to learn from practitioners. Getting to know the experience gained by researchers, engineers, and entrepreneurs doing real ML work is an excellent source of insight and inspiration. Share this interview if you find it enriching. No subscription is needed.
👤 Quick bio / Brian Venturo
Brian Venturo (BV): I’m Brian Venturo, Co-Founder and CTO of CoreWeave. Prior to CoreWeave, I spent over a decade building and running hedge funds focused on energy markets. In 2016, Mike Intrator (CEO), Brannin McBee (CSO), and I bought our first GPU and began experimenting with cryptocurrency mining. Over the next few years, as a hobby became our sole business focus, we built a large-scale infrastructure spanning seven facilities and inched closer towards our goal of building a cloud infrastructure that provided the world’s creators and innovators access to scalable infrastructure with approachable prices – something that the industry was largely missing. Machine learning and batch processing were the first high-performance computing use cases we served, and something that we wanted to support at scale as we continued building CoreWeave Cloud. I’m happy to announce that we just raised $50 million (some news for your Sunday Scope!) to accelerate the growth of the business.
🛠 ML Work
BV: When we began building CoreWeave Cloud, we set out to empower engineers and creators to access compute on demand, at massive scale, for GPU-accelerated use cases. We were all too familiar with the inflexibility and high cost of compute on legacy cloud providers and believed that we could help our clients create world-changing technology more effectively by removing barriers to scale. Machine learning and batch processing are classic examples of this – we're consistently blown away by what our clients can do when they’re able to train, iterate, fine-tune, serve models, and analyze data faster. The challenges that our clients face with the “big 3” cloud providers can be summarized across three themes:
BV: This is top of mind as we finished building our state-of-the-art NVIDIA A100 distributed training cluster this year. Our partners at Eleuther AI are currently using it to train GPT-NeoX-20B, which we expect to be the largest open-source language model when it’s completed later this year. Training – at any scale – is complex from a technical perspective, and for that reason, we feel that it’s really important to provide clients with options. A few examples include:
Possibly even more so than training, hardware selection has a huge impact on inference workloads, as performance-adjusted cost benchmarking becomes critically important for our clients serving models at scale. We recently released benchmarks across five GPU types for our managed inference service for Eleuther AI’s GPT-J-6B model.
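To make "performance-adjusted cost" concrete, here is a minimal sketch of how such a comparison could be computed. The GPU labels, hourly prices, and throughput figures are hypothetical placeholders, not CoreWeave's published benchmarks, and the metric (dollars per million requests at full utilization) is just one reasonable way to normalize cost by performance.

```python
from dataclasses import dataclass


@dataclass
class GpuBenchmark:
    name: str                # hypothetical label, not a real GPU SKU
    hourly_price_usd: float  # placeholder price per GPU-hour
    requests_per_sec: float  # placeholder measured inference throughput


def cost_per_million_requests(b: GpuBenchmark) -> float:
    """Dollars needed to serve one million requests at full utilization."""
    requests_per_hour = b.requests_per_sec * 3600
    return b.hourly_price_usd / requests_per_hour * 1_000_000


# Illustrative numbers only; substitute your own measurements.
benchmarks = [
    GpuBenchmark("gpu-small", 0.50, 2.0),
    GpuBenchmark("gpu-medium", 1.00, 5.0),
    GpuBenchmark("gpu-large", 2.00, 9.0),
]

# Cheapest per unit of work first, which is not necessarily cheapest per hour.
ranked = sorted(benchmarks, key=cost_per_million_requests)
for b in ranked:
    print(f"{b.name}: ${cost_per_million_requests(b):.2f} per million requests")
```

With these placeholder inputs, the mid-priced GPU comes out cheapest per request even though it costs twice as much per hour as the smallest one, which is the kind of result that makes per-workload benchmarking worthwhile.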
BV: My personal view is that the training market will become more fragmented from the model serving side over the next few years. I think we’re going to see a few large groups, whether they are private institutions or crowdsourced groups, training mega-scale models with a goal of either selling them to a large public cloud under a monopolistic arrangement or open-sourcing them for the world at large to use. I have concerns about the large cloud providers attempting to corner certain portions of the market with proprietary hardware for specific use cases and models that they own. For open-source models, I expect there to be a lot of smaller groups that need limited amounts of compute to fine-tune the models, but the largest demand is going to be for flexible compute to serve these models at scale. If I were to make a bet, it’s that flexible compute will continue to dominate the landscape given that it’s easier to source, use broadly, and build engineering teams to support it.
BV: We have countless conversations with clients who are looking to optimize for cost but haven’t optimized their models to fit in more economical GPUs. Sometimes, the team behind a project is so overwhelmed that they can’t find the time. Those conversations are where we collect data to inform our product roadmap and learn how we can be more helpful to clients in the future. It’s impossible for us to optimize every model serving pipeline, but I think there is an opportunity for us to create tools for clients to get a better “bang for their buck” at scale. We also see a lot of movement in this area from the framework developers. For example, TorchScript brought PyTorch up to the execution efficiency of TensorFlow saved models. Models that can be converted to NVIDIA TensorRT often gain substantial improvements in inference times. Clients who are able to invest the time – like AI Dungeon and Novel AI – often see massive improvements in performance-adjusted cost.
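As a rough illustration of what "fitting in a more economical GPU" can mean, the sketch below estimates the memory needed just to hold a model's weights at different numeric precisions. Everything here is an assumption for illustration: the parameter count is a round figure of about 6 billion, the 16 GiB GPU is hypothetical, and the 80% headroom rule is an arbitrary allowance for activations and runtime overhead. Reducing precision is only one common optimization and is not necessarily what the clients mentioned above did.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


def fits(num_params: int, dtype: str, gpu_memory_gib: float,
         headroom: float = 0.8) -> bool:
    """True if the weights fit within a fraction of GPU memory.

    The headroom fraction is an assumed allowance for activations and
    framework overhead, not a measured value.
    """
    return weight_memory_gib(num_params, dtype) <= gpu_memory_gib * headroom


num_params = 6_000_000_000  # assumed round figure for a ~6B-parameter model
gpu_memory_gib = 16.0       # hypothetical smaller, cheaper GPU

for dtype in BYTES_PER_PARAM:
    size = weight_memory_gib(num_params, dtype)
    verdict = "fits" if fits(num_params, dtype, gpu_memory_gib) else "does not fit"
    print(f"{dtype}: {size:.1f} GiB of weights, {verdict} in {gpu_memory_gib:.0f} GiB")
```

Under these assumptions the full-precision weights need a larger GPU, while the half-precision weights fit on the smaller one.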
BV: Regarding crossing the chasm you described, some teams are already there and looking for a software provider that delivers an out-of-the-box solution, taking care of hardware and infrastructure under the hood. There are a ton of interesting companies providing solutions for MLOps, a space that is absolutely exploding, and one you covered thoughtfully in TheSequence yesterday. I don’t think there’s a “one size fits all” solution here, nor is a potential solution to the problem – to the extent a problem exists – that specific. For larger, complex models, you are always going to want to do some hardware-specific tuning.
💥 Miscellaneous – a set of rapid-fire questions
Easy. Achilles and the Tortoise. Makes my mind shudder.
I am a believer in learning that the water is cold after jumping in. Learning through practice is all I’ve ever known.
Maybe. I do think that the basic imitation game in the Turing test can be overcome by an NLP model at some point in the not-too-distant future. NLP models can already hold a coherent conversation with a human. They are still, however, supercomputers generating answers based on what they have learned from humans. I do believe we need a deeper, non-language-based test to truly determine if an AI can actually think and draw conclusions on its own. Think something like the story in the movie Ex Machina.
I hope not, for Bitcoin’s sake.