📝 Guest post: How to set up MLOps at a reasonable scale: tips, tool stacks, and templates from companies that did
In TheSequence Guest Post, our partners explain what ML and AI challenges they help deal with. In this article, neptune.ai discusses how to set up MLOps at a reasonable scale: tips, tool stacks, and templates from companies that did.

We wrote about what MLOps at a reasonable scale is and why it is important to you. But the big question we didn’t talk about there was: how do reasonable scale companies actually set it up (and how should you do it)?

In this issue, we’ll go over resources to help you build a pragmatic MLOps stack that will work for your use case. Let’s start with some tips.

MLOps tips

Recently we interviewed a few ML practitioners about setting up MLOps.

“My number 1 tip is that MLOps is not a tool. It is not a product. It describes attempts to automate and simplify the process of building AI-related products and services. Therefore, spend time defining your process, then find tools and techniques that fit that process. For example, the process in a bank is wildly different from that of a tech startup. So the resulting MLOps practices and stacks end up being very different too.” – Phil Winder, CEO at Winder Research

So before everything, be pragmatic and think about your use case, your workflow, and your needs, not “industry best practices”.

No reasonable scale ML discussion is complete without Jacopo Tagliabue, Head of AI at Coveo, who coined the term. In his pivotal blog post, he suggests a mindset shift that we think is crucial (especially early in your MLOps journey):
You can watch him go deep into the subject in this Stanford Sys seminar video.

The third tip we want you to remember comes from Orr Shilon, ML engineering team lead at Lemonade. In this episode of the mlops.community podcast, he talks about platform thinking. He says that his team’s focus on automation and on pragmatically leveraging tools wherever possible was key to doing MLOps efficiently. With this approach, at one point, his team of two ML engineers managed to support the entire data science team of 20+ people. That is some infrastructure leverage.

Now, let’s look at example MLOps stacks!

MLOps tool stacks

Many tools play in several MLOps categories at once, so it is sometimes hard to understand who does what.
From our research into how reasonable scale teams set up their stacks, we found that pragmatic teams don’t do everything. They focus on what they actually need.

For example, the team over at Continuum Industries needed a lot of visibility into the testing and evaluation suites of their optimization algorithms. So they connected Neptune with GitHub Actions CI/CD to visualize and compare various test runs.
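To make the "compare test runs in CI" idea concrete, here is a minimal sketch of the comparison step, using only the Python standard library. It is not Continuum Industries' code and it does not use Neptune's API. The `RunResult` class, the `find_regressions` function, the metric names, and the `tolerance` value are all hypothetical, and we assume every metric is "higher is better".

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RunResult:
    """Metrics from one evaluation-suite run (hypothetical structure)."""
    run_id: str
    metrics: Dict[str, float]  # metric name -> value; assumed higher is better


def find_regressions(baseline: RunResult, candidate: RunResult,
                     tolerance: float = 0.01) -> Dict[str, float]:
    """Return the metrics where the candidate run dropped below the
    baseline by more than `tolerance`, mapped to the size of the drop."""
    regressions = {}
    for name, base_value in baseline.metrics.items():
        new_value = candidate.metrics.get(name)
        if new_value is None:
            continue  # metric missing from the candidate run; nothing to compare
        drop = base_value - new_value
        if drop > tolerance:
            regressions[name] = drop
    return regressions


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    baseline = RunResult("main", {"route_quality": 0.90, "constraint_pass_rate": 0.99})
    candidate = RunResult("pull-request", {"route_quality": 0.85, "constraint_pass_rate": 0.99})
    for metric, drop in find_regressions(baseline, candidate).items():
        print(f"{metric} regressed by {drop:.3f}")
```

In a real setup, a CI job would run the evaluation suite on every pull request, send the metrics to an experiment tracker, and could fail the build when a check like this reports regressions. The tracker then takes care of the visualization and side-by-side comparison.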
GreenSteam needed something that would work in a hybrid monolith-microservice environment. Because of their custom deployment needs, they decided to go with Argo pipelines for workflow orchestration and deploy things with FastAPI. Their swit:
Those teams didn’t solve everything deeply but pinpointed what they needed and did that very well.

If you’d like to see more examples of how teams set up their MLOps, Stephen Oladele, our Developer Advocate, did a great job researching and writing down the setups of 8 more companies. Also, if you want to go deeper, there is a Slack channel where people share and discuss their MLOps stacks. So if you’d like to see even more stacks:
Okay, stacks are great, but you probably want some templates, too.

MLOps templates

The best reasonable scale MLOps template comes from, you guessed it, Jacopo Tagliabue and collaborators. In this open-source GitHub repository, they put together an end-to-end (Metaflow-based) implementation of intent prediction and session recommendation. It shows how to connect the main pillars of MLOps into a working end-to-end system you can build on. It is an excellent starting point that lets you use the defaults or pick and choose tools for each component.

One more great resource worth mentioning is the MLOps Infrastructure Stack article. In that article, they explain how:
It comes with a nice graphical template from folks over at Valohai. They explain general considerations, tool categories, and example tool choices for each component. Overall a really good read.

What should you do next?

Okay, now use these resources and go build your MLOps stack! If you need some help, we’re putting together a resource where we:
Check it out and let us know what you think in the mlops.community Slack #neptune-ai channel.

*This post was written by the neptune.ai team. We thank neptune.ai for their ongoing support of TheSequence.