Edge 432: NVIDIA Created Minitron by Distilling Llama 3.1
Was this email forwarded to you? Sign up here Edge 432: NVIDIA Created Minitron by Distilling Llama 3.1The two resulting models of 8B and parameters respectively highlight the potential of distillation.We are regularly dazzled by the advancements in large language models(LLMs) particularly the ones with a massive number of parameters. However, executing 70B+ parameter models for inference results cost prohibited for most organizations. As a result, we have seen a growing influence of smaller language models(SLMs) that make it more cost effective to execute inference workloads. However, there is not always possible to pretrain SLMs from scratch as there are major challenges in terms of data collection, pretraining pipelines and many others. A popular alternative have been to start with larger LLMs and distill them to smaller models. Pruning and distillation are two of the most popular techniques in this area. Recently, NVIDIA released two models called Minitron-8B and Minitron-4B based on distilled versions of Llama 3.1–450B. Minitron focuses on reducing the size of AI models through pruning and distillation, making them more efficient without sacrificing too much accuracy. Pruning reduces a model’s size by either cutting layers (depth pruning) or removing neurons, attention heads, or embedding channels (width pruning). To recover some lost accuracy, retraining is often necessary after pruning. How did they do it?... Subscribe to TheSequence to unlock the rest.Become a paying subscriber of TheSequence to get access to this post and other subscriber-only content. A subscription gets you:
|
Older messages
Edge 431: Meet the Multimodal State Space Models
Tuesday, September 17, 2024
Extending SSMs behind language. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Some Non-Obvious Points About OpenAI 01
Sunday, September 15, 2024
Plus some major funding rounds by World Labs and Glean , Mistral's new release and more. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 430: Learn About The AI Scientist, The Model that can Conduct Long Term Scientific Experimentation
Thursday, September 12, 2024
The framework combines different generative AI models to streamline scientific research from idea to paper. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
The Sequence Chat: Lewis Tunstall, Hugging Face, On Building the Model that Won the AI Math Olympiad
Thursday, September 12, 2024
Details about NuminaMath, its architecture, training process and even things that didn't work. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 429: MambaByte and the Idea of Tokenization-Free SSMs
Tuesday, September 10, 2024
Can SSMs operated on raw data instead of tokens? ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
You Might Also Like
Daily Coding Problem: Problem #1646 [Medium]
Monday, December 23, 2024
Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Facebook. Write a function that rotates a list by k elements. For example, [1, 2, 3, 4,
GCP Newsletter #430
Monday, December 23, 2024
Welcome to issue #430 December 23rd, 2024 News Event Official Blog Calling all devs: Code the future of baseball with Google Cloud and MLB - Google Cloud and MLB are hosting a hackathon where
⏯️ Make a Holiday Guest Profile for Your Streaming Services — What Is Linux Mint?
Monday, December 23, 2024
Also: I Played the Worst Mobile Games So You Don't Have To, and More! How-To Geek Logo December 23, 2024 Did You Know The giant splashes of color that make poinsettias a popular holiday decoration
Ranked | The Most Satisfying vs. Most Reliable Car Brands in 2024 🚙
Monday, December 23, 2024
The most reliable car brands are rarely the most satisfying to own, according to recent Consumer Reports survey data. View Online | Subscribe | Download Our App Presented by: Find the megatrends
Bitcoin Enthusiasts Are Letting Altcoins Pass by
Monday, December 23, 2024
Top Tech Content sent at Noon! Boost Your Article on HackerNoon for $159.99! Read this email in your browser How are you, @newsletterest1? 🪐 What's happening in tech today, December 23, 2024? The
Last Minute Gifts from Walmart
Monday, December 23, 2024
ZDNET ZDNET Sponsored Message In Partnership with Walmart December 23, 2024 exclusive offer Walmart Last-minute gifts from Walmart Shop Now Walmart The tech you've been wishing for–at everyday low
15 ways AI saved me weeks of work in 2024
Monday, December 23, 2024
ZDNET's product of the year; Windows 11 24H2 bug list updated -- ZDNET ZDNET Tech Today - US December 23, 2024 AI applications on various devices. 15 surprising ways I used AI to save me weeks of
Distributed Locking: A Practical Guide
Monday, December 23, 2024
If you're wondering how and when distributed locking can be useful, here's the practical guide. I explained why distributed locking is needed in real-world scenarios. Explored how popular tools
⚡ THN Weekly Recap: Top Cybersecurity Threats, Tools and Tips
Monday, December 23, 2024
Your one-stop-source for last week's top cybersecurity headlines. The Hacker News THN Weekly Recap The online world never takes a break, and this week shows why. From ransomware creators being
⚙️ OpenA(G)I?
Monday, December 23, 2024
Plus: The Genesis Project