TheSequence - 🔂 Edge#217: ML Testing Series – Recap
Last week we finished our mini-series about ML testing, one of the most critical elements of the ML model lifecycle. Here is a full recap so you can catch up on the topics we covered. As the proverb (and many ML people) says: Repetition is the mother of learning ;)

The essence of ML testing is to execute explicit checks that validate the behavior of an ML model. This approach contrasts with testing in traditional software applications. In a web or mobile application, developers supply the logic and the data, and tests confirm and fine-tune the system's resulting behavior. The cycle is different in ML, where a test starts with the expected behavior and the corresponding dataset, and the model's logic is the output.

Plenty of taxonomies can be used to organize ML testing techniques. A very general approach segments testing techniques into two main groups relative to the ML model lifecycle: pre-train tests, which run before a model is trained, and tests applied to trained models.
Typically, both types of tests should be incorporated into an MLOps pipeline, and tests should cover both code and data. Over the years, research has covered many different ML testing techniques widely. Examples include invariance tests, minimum functionality tests, directional tests and many others.

→ In Edge#209 (read it without a subscription): we explore how Uber backtests time-series forecasting models at scale; and discuss Deepchecks, an ML testing platform you should know about.
→ In Edge#211: we discuss what to test in ML models; explain how Meta uses A/B testing to improve Facebook's newsfeed algorithm; and explore Meta's Ax, a framework for A/B testing in PyTorch.
→ In Edge#213: we overview the fundamental types of tests to be applied to trained models; explain how Meta uses Bayesian Optimization to conduct better experiments in ML models; and explore TensorFlow's What-If Tool, one of the most commonly used testing tools in the machine learning space.
→ In Edge#215: we discuss Pre-Train Model Testing; overview the pillars of robust machine learning; and explore Great Expectations, one of the most complete data validation frameworks used in ML pipelines.

Next week we are going back to deep learning theory. Our next mini-series will cover a new generation of text-image models and their underlying techniques. Fascinating!
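To make the invariance, directional and minimum functionality tests named in this recap more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not code from any issue in the series: `sentiment_score` is a hypothetical keyword-based stand-in for a trained model, and the word lists, sentences and function names are invented for the example.

```python
# Hypothetical stand-in for a trained model: a keyword-based sentiment scorer.
# A real pipeline would call the actual model's predict function instead.
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"terrible", "bad", "hate"}


def sentiment_score(text: str) -> int:
    """Return (#positive words - #negative words) found in the text."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def check_invariance() -> bool:
    # Invariance test: a change that should not matter (swapping a name)
    # must leave the prediction unchanged.
    return sentiment_score("Alice had a great flight") == sentiment_score(
        "Bob had a great flight"
    )


def check_directional() -> bool:
    # Directional test: a change that should matter (adding a negative
    # clause) must move the prediction in the expected direction.
    base = "The flight was good"
    perturbed = base + ", but the food was terrible"
    return sentiment_score(perturbed) < sentiment_score(base)


def check_minimum_functionality() -> bool:
    # Minimum functionality test: simple, unambiguous cases the model
    # is expected to get right.
    cases = [("I love it", True), ("I hate it", False)]
    return all((sentiment_score(text) > 0) == is_positive for text, is_positive in cases)


if __name__ == "__main__":
    for check in (check_invariance, check_directional, check_minimum_functionality):
        print(check.__name__, "passed" if check() else "FAILED")
```

Each check starts from an expected behavior and a small dataset rather than from the model's internal logic, which is the pattern the recap describes. In an MLOps pipeline, checks like these could run as ordinary unit tests against each newly trained model.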