TheSequence - 🛍 Machine Learning at Shopify
📝 Editorial

Building large machine learning (ML) architectures remains unexplored territory for most companies. Despite the massive adoption of ML frameworks and platforms, most companies still apply ML in constrained and relatively small-scale scenarios. As an industry, we are still figuring out the best practices for ML infrastructures that can run large numbers of ML models from experimentation to production. Not surprisingly, the best inspiration for large-scale ML architectures comes from technology giants that are running some of the largest ML infrastructures in the world. Companies like Uber, LinkedIn, Meta, and Airbnb have been very transparent about the architectures they use to run ML workloads and have even open-sourced many of their components.

This week, we have another technology powerhouse to draw inspiration from: Shopify. A few days ago, Shopify published some details about Merlin, the platform powering its internal ML solutions. Merlin is based on a very modern architecture optimized for rapid experimentation and scale. At a high level, Merlin shares some similarities with architectures such as Uber’s Michelangelo or Airbnb’s Bighead, but it also has some unique characteristics. For instance, Merlin uses Ray as its fundamental engine for ML scalability. Merlin also uses Pano, a custom feature store that persists features and makes them available across all ML models. Another interesting area of innovation in Merlin is its native integration with notebook environments and the consistency of its project structure. Even though it remains closed-source, the initial details of the architecture can serve as inspiration to organizations building ML solutions at scale.

🔺🔻 TheSequence Scope, our Sunday edition with an overview of the industry’s developments, is free.
To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge. 🔺🔻

🗓 Next week in TheSequence Edge:

Edge#183: we explore data vs. model parallelism in distributed training, discuss how AI training scales, and give an overview of Microsoft DeepSpeed, a training framework powering some of the largest neural networks in the world.

Edge#184: we look inside DALL-E 2 and learn how OpenAI upgraded its supermodel that can generate artistic images from text.

Now, let’s review the most important developments in the AI industry this week.

🔎 ML Research

Converse
Salesforce Research published a paper detailing Converse, a framework for building modular, task-oriented chatbots → read more on the Salesforce Research blog

Zero-Shot Task-Oriented Dialogue
Google Research published two papers outlining methods for task-oriented conversational agents that can transfer knowledge across different tasks → read more on the Google Research blog

Contrastive Learning
Stanford University published a detailed blog post explaining the underpinnings of contrastive learning → read more on the Stanford University blog

Contrastive Learning on Image-Text Data
Google Research published a paper proposing a contrastive learning method that matches text to pretrained images in a way that can transfer knowledge across different tasks → read more on the Google Research blog

🛠 Real World ML

Shopify Merlin
Shopify published a blog post detailing Merlin, an internal platform that powers its ML pipelines → read more on the Shopify Engineering blog

Feathr
LinkedIn open-sourced Feathr, a feature store used in its internal ML applications → read more on the LinkedIn Engineering blog

Presto on Kafka
Uber published a blog post illustrating its architecture for running SQL queries using Presto over Kafka data streams → read more on the Uber Engineering blog

✏️ A Survey: Data Labeling for ML, part 4

Please take a very simple survey to help us prepare an article about data labeling. It will take about 2-3 minutes. As a thank you, we will send you a cheat sheet with 40+ free ML & data science books and courses! We appreciate your help.

🤖 Cool AI Tech Releases

MoViNets
TensorFlow open-sourced MoViNets, a collection of mobile-optimized video classification models → read more on the TensorFlow blog

💸 Money in AI

ML&AI
AI-powered: