The Sequence Chat: Hugging Face's Lewis Tunstall on ZEPHYR, RLHF and LLM Innovation
One of the creators of ZEPHYR discusses ideas and lessons learned building LLMs at scale.

Quick bio

Lewis Tunstall is a Machine Learning Engineer on the research team at Hugging Face and co-author of the bestselling book “NLP with Transformers”. He has previously built machine learning-powered applications for start-ups and enterprises in the domains of natural language processing, topological data analysis, and time series. He holds a PhD in Theoretical Physics, was a 2010 Fulbright Scholar, and has held research positions in Australia, the USA, and Switzerland. His current work focuses on building tools and recipes to align language models with human and AI preferences through techniques like reinforcement learning.
My path to working in AI is somewhat unconventional and began when I was wrapping up a postdoc in theoretical particle physics around 2016. At the time, a friend of mine was studying algorithms to estimate the background for proton collisions at the Large Hadron Collider, and one day he showed me a script of TensorFlow code that trained a neural network to classify these events. I was surprised to learn that a few lines of code could outperform features that had been carefully designed by physicists over many years. This sparked my curiosity, and I started poking around trying to understand what this deep learning stuff was all about. Since I didn’t have much programming experience (theorists only need pen and paper!), I teamed up with a few physics friends to enter a Kaggle competition on predicting Russian housing prices. This was a great learning experience and taught me a lot about Python and XGBoost -- in those days, most Kaggle competitions were tabular! I had so much fun tinkering with code and data that I decided to pivot from academia to industry and haven’t looked back. Currently I am a machine learning engineer on the research team at Hugging Face, where I focus on aligning language models to follow human instructions via techniques like Reinforcement Learning from Human Feedback (RLHF).

🛠 AI Work
Zephyr was inspired by two trends which emerged in the AI community over the last few months. On the one hand, people figured out that you could fine-tune a pretty good chat model by distilling a dataset of conversations from more capable models like GPT-3.5 or GPT-4. This meant you could skip the costly human annotation step altogether and focus on generating data for specific tasks like coding or function calling. In parallel, many researchers were exploring simpler alternatives to RLHF, which is the alignment technique behind ChatGPT and Claude. A team at Stanford proposed a novel technique called Direct Preference Optimization (DPO), which removed reinforcement learning entirely from the alignment process and required far less compute to run. We thought it was interesting to combine these ideas and apply DPO to a dataset called UltraFeedback, which contains a diverse set of model responses that are ranked by GPT-4 according to criteria like helpfulness. The result was Zephyr 7B, which was a surprisingly capable model for its size.
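The core idea behind DPO can be sketched compactly. The loss below is a minimal, illustrative implementation for a single preference pair; the function name and scalar inputs are assumptions for the sketch, not the actual Zephyr training code, which operates on batches of token-level log-probabilities.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of a response under
    the trainable policy or the frozen reference model; beta controls
    how far the policy may drift from the reference.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response over the rejected one, relative to the reference.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Negative log-sigmoid of the margin: small when the policy already
    # ranks the chosen response above the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When policy and reference agree exactly, the loss sits at log(2).
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
# When the policy favors the chosen response more than the reference
# does, the loss drops below log(2).
improved = dpo_loss(-10.0, -14.0, -12.0, -13.0)
```

Because the "reward" is expressed directly through the log-probability ratios, no separate reward model or reinforcement learning loop is needed, which is why DPO is so much cheaper to run than PPO-style RLHF.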
When Mistral 7B was released, we knew from various benchmarks that it was the best base model at the 7B parameter scale, which is great for fine-tuning because you can iterate fast and even run the models on your laptop! And in our initial experiments, we found that Mistral chat models were far more fluent and capable than previous iterations we’d trained with Llama2 and Falcon. However, as I write this, the latest release from Mistral is Mixtral 8x7B, which appears to be the first open model to truly match the performance of GPT-3.5. It seems likely that a clever mix of fine-tuning and data distillation will produce a whole new set of capable chat models built on Mixtral, which is a very exciting development in the community.
Most alignment techniques for language models involve two steps: first you teach a base model to follow instructions, then you optimize the model according to a set of ranked preferences using techniques like reinforcement learning or DPO. In the case of Zephyr, we first fine-tuned Mistral 7B on a dataset called UltraChat, which simulates millions of conversations between two GPT-3.5 models. However, we found that the resulting model had an annoying personality (i.e. it would often refuse to answer simple commands), so we heavily filtered the dataset to focus on helpful responses. We then took this model and optimized it with DPO on the UltraFeedback dataset I referred to earlier. Now, evaluating chat models is a tricky business; the gold standard is human evaluation, which is very costly to perform. Instead, we adopted what is now becoming a common practice: evaluating chat models with GPT-4. Although this method has various flaws, it does provide a decent proxy for human evaluation, and we used the popular MT-Bench and AlpacaEval benchmarks to guide our experiments.
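The filtering step described above can be sketched as a simple keyword heuristic over assistant turns. This is a hypothetical illustration — the marker list and conversation format here are assumptions, and the actual UltraChat filtering was more involved:

```python
# Phrases that often signal a refusal-style assistant response.
# This list is illustrative, not the one used for Zephyr.
REFUSAL_MARKERS = (
    "i'm sorry",
    "as an ai language model",
    "i cannot",
)

def is_helpful(conversation):
    """Keep a conversation only if no assistant turn looks like a refusal."""
    for turn in conversation:
        if turn["role"] != "assistant":
            continue
        text = turn["content"].lower()
        if any(marker in text for marker in REFUSAL_MARKERS):
            return False
    return True

dialogues = [
    [{"role": "user", "content": "Write a haiku about autumn."},
     {"role": "assistant", "content": "Leaves drift on the pond."}],
    [{"role": "user", "content": "Summarize this article."},
     {"role": "assistant",
      "content": "I'm sorry, but as an AI language model I cannot do that."}],
]

# Only the first dialogue survives the filter.
helpful = [d for d in dialogues if is_helpful(d)]
```

Filtering the supervised fine-tuning data this way nudges the model toward answering rather than hedging, before any preference optimization is applied.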
Earlier in the year, we had actually experimented with collecting human feedback from a data vendor, but found the process was both time consuming and costly to oversee. Based on this experience, we felt AI feedback was a more accessible route for both our small team and as a means to popularize a method that the community could also adopt.
InstructGPT was trained in a few ways that differ from Zephyr. For one, the InstructGPT datasets were single-turn human-annotated instructions, while Zephyr was trained on a large corpus of synthetic multi-turn dialogues. Another difference is that InstructGPT was aligned along various axes like helpfulness, honesty, and harmlessness, which often leads to a tension between the model’s capabilities and its tendency to hedge answers. By contrast, we focused on training Zephyr for helpfulness, which tends to be what the community enjoys about open chat models.
Haha, with the current rate of progress it’s hard enough to predict one week into the future! But if I have to look into a crystal ball, then my current best guess is that we’ll see synthetic data become an integral part of how we fine-tune and pretrain language models. It’s also pretty clear that multimodality is the next frontier, both to instill new capabilities in models, but also as a potent source of new data from images, audio, and video. Figuring out how to align these models to a set of preferences across multiple modalities will take some tinkering to work out but is certainly a fun challenge!
Although there are now quite a few technical books covering transformers, our book was written with AI developers in mind, which means we focus on explaining the concepts through code you can run on Google Colab. Our book is also perhaps the only one to cover pretraining a language model in depth, which was rather prescient since we wrote it a year before the open LLM revolution kicked off. Thom Wolf is also a co-author, so where better to learn transformers than from the person who created the Hugging Face Transformers library?

💥 Miscellaneous – a set of rapid-fire questions
As a former physicist, I find applications of deep learning to accelerate scientific discovery to be especially exciting! Chris Bishop has a wonderful lecture on this topic where he frames AI as the “fifth paradigm” of science, with a focus on using AI to accelerate numerical simulations for complex systems like the weather. If I wasn’t so busy playing with LLMs, I would likely be working in this field.
My favorite mathematician is John von Neumann, mostly because I didn't really understand quantum mechanics until I read his excellent textbook on the subject.