Edge 458: From Pre-training to Post-training. Inside the Amazing Tülu 3 Framework
A major release by AI2 that includes the major components needed to build post-training pipelines.

An interesting trend taking place in the generative AI space is the shift from pretraining to post-training. Most of the emphasis in the first wave of large foundation models was on pretraining recipes, but that seems to be changing rapidly as it has become far simpler to train models on raw, internet-scale datasets. Given the rapid shift, we currently lack solid post-training frameworks that meet the standards of production-ready models. This is the focus of a new release by the team at Allen AI: a framework called Tülu 3.

Tülu 3 represents a significant stride in the domain of open-source large language models (LLMs). It distinguishes itself by placing a paramount emphasis on post-training techniques that refine pre-trained LLMs and unlock a broader array of capabilities. The post-training process, carefully designed and openly shared, lies at the heart of Tülu 3's value proposition. This process aims to bridge the gap between open post-training recipes and closed ones, which are often shrouded in secrecy, and to propel the open-source community towards state-of-the-art performance.
Older messages
Edge 456: Inside the Toughest Math Benchmark Ever Built
Thursday, December 19, 2024
FrontierMath pushes the boundaries of mathematical reasoning in foundation models.
The Most Amazing Week in Gen AI Releases
Thursday, December 19, 2024
OpenAI, Google, Microsoft, Cohere and others shipped new models.
📽 Webinar: How To Maximize Model Accuracy
Thursday, December 19, 2024
Struggling to keep your production ML models accurate without an endless budget?
Edge 457: Can we Distill Specific Knowledge in LLMs? An Intro to Attention-Based Distillation
Thursday, December 19, 2024
One of the most interesting distillation techniques for foundation models.
The Sequence Chat: Can AI Solve The Riemann Hypothesis? Some Ideas About the Progress and Limitations of AI in Sci…
Thursday, December 19, 2024
AI has proven that it can help advance scientific fields, but how far can that go and what are the pragmatic limitations?