📝 Guest Post: Guide to Building an ML Platform*
In this guest post, Stephen Oladele, Developer Advocate and MLOps Technical Content Creator, together with neptune.ai, dives into the topic of building machine learning platforms and shares some best practices followed by the industry. Make sure to continue reading!

Machine learning (ML) platforms are increasingly seen as the way to consolidate all the components of the ML model lifecycle, from experimentation to production. These platforms not only provide your team with the tools and infrastructure they need to build and operate models at scale but also apply standard engineering and MLOps principles to all use cases.

However, there's a catch: understanding what makes a successful ML platform, and then building one, is no easy task. With a plethora of tools, frameworks, practices, and technologies available, it can be overwhelming to know where to begin. This guide is designed to help you navigate the process and understand the key factors that contribute to a successful machine learning platform.

What is a machine learning platform?

An ML platform standardizes the technology stack for your data team around best practices, reducing the incidental complexity of machine learning and better enabling capabilities for data science teams across projects and workflows.

Why are you building an ML platform? We ask this during product demos, user and support calls, and on our MLOps Live podcast. Generally, people say they do MLOps to make the development and maintenance of production machine learning seamless and efficient.

ML platforms should make machine learning operations (MLOps) easier at every stage of a machine learning project’s life cycle, from prototyping to production at scale, as the number of models in production that have a positive effect on the business grows from one or a few to tens, hundreds, or thousands.

An ML platform should be designed to:
But how do you do it?

MLOps best practices, learnings, and considerations from ML platform experts

We have distilled some of the best practices and learnings from ML platform teams into the following points.

Embrace iteration on your ML platform

As with any other software system, creating your ML platform shouldn't be a one-off task. As your business needs, infrastructure, teams, and workflows evolve, you should keep making changes to your ML platform.

Initially, you may not have a clear vision of what your ideal ML platform should look like. However, by building something that works and consistently improving it, you should be able to create a platform that supports your data scientists and provides business value.

Isaac Vidas, ML Platform Lead at Shopify, shared at Ray Summit 2022 that Shopify’s ML Platform had to go through three different iterations:

“Our ML platform has gone through three iterations in the past. The first iteration was built on an in-house PySpark solution. The second iteration was built as a wrapper around the Google AI Platform (Vertex AI), which ran as a managed service on Google Cloud. We reevaluated our machine learning platform last year based on various requirements gathered from our users and data scientists, as well as business goals. We decided to build the third iteration on top of open source tools and technologies around our platform goals, with a focus on scalability, fast iterations, and flexibility.”

Take, for example, Airbnb. They have built and iterated on their ML platform up to three times over the course of the entire project. The platform should evolve as the number of use cases your team solves increases.

Be transparent to your users about true infrastructure costs

Another good idea is to make sure that all of your data scientists can see the cost estimate for every job they run in their workspace. This could help them learn how to manage costs better and use resources efficiently.
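As a rough illustration of what a per-job cost estimate could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the machine type names, the hourly prices, and the `JobSpec`, `estimate_job_cost`, and `estimate_max_workspace_cost` names are hypothetical and do not describe any particular platform's implementation.

```python
from dataclasses import dataclass

# Hypothetical hourly prices per machine type, in USD. Real numbers would
# come from your cloud provider's billing data or an internal price list.
HOURLY_RATE_USD = {
    "cpu-small": 0.10,
    "cpu-large": 0.80,
    "gpu-single": 2.50,
}


@dataclass(frozen=True)
class JobSpec:
    """What a user requests when launching a job (illustrative fields only)."""
    machine_type: str
    num_machines: int
    expected_hours: float


def estimate_job_cost(job: JobSpec) -> float:
    """Estimated cost = hourly rate x number of machines x expected runtime."""
    rate = HOURLY_RATE_USD[job.machine_type]
    return round(rate * job.num_machines * job.expected_hours, 2)


def estimate_max_workspace_cost(machine_type: str, max_lifetime_hours: float) -> float:
    """Upper bound for a single-machine workspace with a known maximum lifetime."""
    return round(HOURLY_RATE_USD[machine_type] * max_lifetime_hours, 2)


if __name__ == "__main__":
    job = JobSpec(machine_type="gpu-single", num_machines=4, expected_hours=3.0)
    print(f"Estimated job cost: ${estimate_job_cost(job):.2f}")
    print(f"Max workspace cost: ${estimate_max_workspace_cost('cpu-large', 72):.2f}")
```

Showing a number like this next to the "run" button is the point of the practice: the estimate does not need to match the final bill exactly to help users compare options before they launch a job.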
“We recently included cost estimations (in every user workspace). This means the user is very familiar with the amount of money it takes to run their jobs. We can also have an estimation for the maximum workspace age cost, because we know the amount of time the workspace will run…”
- Isaac Vidas, ML Platform Lead at Shopify, at Ray Summit 2022

Documentation is important on and within your platform

Documentation is crucial for any software, including ML platforms. It should be intuitive and comprehensive to facilitate ease of use and adoption by your users. To ensure clarity, you can explicitly specify which parts of the platform are not yet perfected and make it easy for users to differentiate between errors due to their own workflows and those of the platform.

Quick-start guides and easy-to-read how-tos can aid in the successful adoption of the platform. Within the platform, it should also be easy for users to document their workflows. For instance, adding a notes section to the interface for the experiment management component could benefit data scientists.

Documentation should start from the architecture and design phases, which enables you to:
Tooling and standardization are key

Standardizing workflows and tools on your platform can increase team efficiency, enable the use of the same workflows for multiple projects, simplify the development and deployment of ML services, and improve collaboration. Learn more from Uber Engineering’s former senior software engineer, Achal Shah.

Be tool agnostic

Be tool-agnostic to facilitate faster adoption and cross-functional team collaboration. Integrating your platform with the organization's existing stack means users don't have to learn entirely new tools to be productive. Forcing them to start from scratch with unfamiliar tools is bound to be a lost cause.

Make your platform portable

Ensure that your platform is portable across different infrastructures, so that a platform that initially runs on the organization's own infrastructure layer is not difficult to move to a new one. Most open-source, end-to-end platforms are portable, and you can use their solutions or design principles as a guide to build your own platform.

What’s next?

Best practices are just the tip of the iceberg and only a small part of the full Guide to Building an ML Platform that you can find on Neptune’s MLOps Blog. It’s a huge resource that talks about:
There are also a ton of links to additional resources, like articles, podcasts, whitepapers, and more.

*This post was written by Stephen Oladele, Developer Advocate and MLOps Technical Content Creator, for neptune.ai. We thank neptune.ai for their ongoing support of TheSequence.