
🔂 Edge#217: ML Testing Series – Recap

Aug 16

Last week we finished our mini-series about ML testing, one of the most critical elements of the ML model lifecycle. Here is a full recap so you can catch up on the topics we covered. As the proverb (and many ML people) says: Repetition is the mother of learning ;)

The essence of ML testing is to execute explicit checks that validate the behavior of an ML model. This approach contrasts with testing in traditional software applications. In a web or mobile application, developers supply the logic and the data, and tests verify that the system produces the desired behavior. The cycle is reversed in ML: development starts with the expected behavior and a corresponding dataset, and the model’s logic is the output.

Plenty of taxonomies can be used to organize ML testing techniques. A very general approach segments testing techniques into two main groups relative to the ML model lifecycle:

Pre-Train Tests: Designed to catch problems before training starts, which helps optimize the training workflow.
Post-Train Tests: The most important type of test in ML. Post-train tests are designed to check the behavior of trained ML models.
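As a rough illustration of the pre-train group, the sketch below runs a few data checks before any training happens. The specific checks (split overlap, label set, feature range), the function names, and the toy rows are all assumptions made for this example; they are not taken from the series or from any particular library.

```python
from typing import Sequence, Set, Tuple

# Hypothetical row format for this sketch: (feature vector, label).
Row = Tuple[Tuple[float, ...], int]


def check_no_leakage(train: Sequence[Row], test: Sequence[Row]) -> bool:
    """Return True if no feature vector appears in both splits."""
    train_features = {features for features, _ in train}
    return all(features not in train_features for features, _ in test)


def check_labels(rows: Sequence[Row], allowed: Set[int]) -> bool:
    """Return True if every label belongs to the allowed set."""
    return all(label in allowed for _, label in rows)


def check_feature_range(rows: Sequence[Row], low: float, high: float) -> bool:
    """Return True if every feature value lies in [low, high]."""
    return all(low <= value <= high for features, _ in rows for value in features)


# Toy data, invented for illustration only.
train = [((0.1, 0.9), 1), ((0.8, 0.2), 0), ((0.4, 0.6), 1)]
test = [((0.3, 0.7), 1), ((0.9, 0.1), 0)]

# Run the checks before spending any compute on training.
assert check_no_leakage(train, test)
assert check_labels(train + test, allowed={0, 1})
assert check_feature_range(train + test, low=0.0, high=1.0)
```

Checks like these are cheap, so they can run on every pipeline execution and stop a training job early when the data is malformed.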

Diagram. Image Credit: https://www.jeremyjordan.me/testing-ml/

Typically, both types of tests should be incorporated into an MLOps pipeline, and they should cover both code and data. Over the years, research has covered many ML testing techniques, including invariance tests, minimum functionality tests, and directional tests.
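To make the three techniques named above more concrete, here is a minimal sketch of each one written as a plain assertion. The keyword-based `sentiment_score` function is a hypothetical stand-in for a trained model, and the word lists and sentences are invented for illustration; a real test suite would call the actual model instead.

```python
# Hypothetical stand-in for a trained sentiment model (assumption for this sketch).
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"terrible", "bad", "hate"}


def sentiment_score(text: str) -> int:
    """Count positive words minus negative words in the text."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def test_invariance() -> None:
    # Changing an irrelevant detail (a name) should not change the prediction.
    first = sentiment_score("Mark had a great flight")
    second = sentiment_score("Anna had a great flight")
    assert first == second


def test_directional() -> None:
    # Adding a negative clause should push the score down, not up.
    base = sentiment_score("The flight was good")
    worse = sentiment_score("The flight was good but the food was terrible")
    assert worse < base


def test_minimum_functionality() -> None:
    # Simple, unambiguous inputs that any usable model should get right.
    assert sentiment_score("I love it") > 0
    assert sentiment_score("I hate it") < 0


if __name__ == "__main__":
    test_invariance()
    test_directional()
    test_minimum_functionality()
```

Because each test is an ordinary function with assertions, a test runner such as pytest could collect them, which is one way to wire post-train checks into a pipeline.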


→ In Edge#209 (read it without a subscription): we explore how Uber backtests time-series forecasting models at scale; and discuss Deepchecks, an ML testing platform you should know about.

→ In Edge#211: we discuss what to test in ML models; explain how Meta uses A/B testing to improve Facebook’s newsfeed algorithm; and explore Meta’s Ax, a framework for A/B testing in PyTorch.

→ In Edge#213: we overview the fundamental types of tests to be applied to trained models; explain how Meta uses Bayesian Optimization to conduct better experiments in ML models; and explore TensorFlow’s What-If Tool, one of the most commonly used testing tools in the machine learning space.

→ In Edge#215: we discuss Pre-Train Model Testing; overview the pillars of robust machine learning; and explore Great Expectations, one of the most complete data validation frameworks used in ML pipelines.

Next week we are going back to deep learning theory. Our next mini-series will cover a new generation of text-image models and their underlying techniques. Fascinating!


