Edge 432: NVIDIA Created Minitron by Distilling Llama 3.1
Was this email forwarded to you? Sign up here Edge 432: NVIDIA Created Minitron by Distilling Llama 3.1The two resulting models of 8B and parameters respectively highlight the potential of distillation.We are regularly dazzled by the advancements in large language models(LLMs) particularly the ones with a massive number of parameters. However, executing 70B+ parameter models for inference results cost prohibited for most organizations. As a result, we have seen a growing influence of smaller language models(SLMs) that make it more cost effective to execute inference workloads. However, there is not always possible to pretrain SLMs from scratch as there are major challenges in terms of data collection, pretraining pipelines and many others. A popular alternative have been to start with larger LLMs and distill them to smaller models. Pruning and distillation are two of the most popular techniques in this area. Recently, NVIDIA released two models called Minitron-8B and Minitron-4B based on distilled versions of Llama 3.1–450B. Minitron focuses on reducing the size of AI models through pruning and distillation, making them more efficient without sacrificing too much accuracy. Pruning reduces a model’s size by either cutting layers (depth pruning) or removing neurons, attention heads, or embedding channels (width pruning). To recover some lost accuracy, retraining is often necessary after pruning. How did they do it?... Subscribe to TheSequence to unlock the rest.Become a paying subscriber of TheSequence to get access to this post and other subscriber-only content. A subscription gets you:
|
Older messages
Edge 431: Meet the Multimodal State Space Models
Tuesday, September 17, 2024
Extending SSMs behind language. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Some Non-Obvious Points About OpenAI 01
Sunday, September 15, 2024
Plus some major funding rounds by World Labs and Glean , Mistral's new release and more. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 430: Learn About The AI Scientist, The Model that can Conduct Long Term Scientific Experimentation
Thursday, September 12, 2024
The framework combines different generative AI models to streamline scientific research from idea to paper. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
The Sequence Chat: Lewis Tunstall, Hugging Face, On Building the Model that Won the AI Math Olympiad
Thursday, September 12, 2024
Details about NuminaMath, its architecture, training process and even things that didn't work. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 429: MambaByte and the Idea of Tokenization-Free SSMs
Tuesday, September 10, 2024
Can SSMs operated on raw data instead of tokens? ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
You Might Also Like
From Oil to AI: The Middle East's Rise as a New AI and Tech Hub - Sync #485
Sunday, September 22, 2024
Plus: Google to flag AI-generated images; new AI laws in California; Cruise is back in Bay Area; drones to fly between hospitals in London; 23andMe independent board directors resigns; and more! ͏ ͏ ͏
From Oil to AI: The Middle East's Rise as a New AI and Tech Hub - Sync #485
Sunday, September 22, 2024
Plus: Google to flag AI-generated images; new AI laws in California; Cruise is back in Bay Area; drones to fly between hospitals in London; 23andMe independent board directors resigns; and more! ͏ ͏ ͏
Laravel 11.23, Pest v3, Laravel Herd, and more! №531
Sunday, September 22, 2024
Your Laravel week in review ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
This Week's Daily Tip Roundup
Sunday, September 22, 2024
Missed some of this week's tips? No problem. We've compiled all of them here in one convenient place for you to enjoy. Happy learning! iPhoneLife Logo View In Browser Your Tip of the Day is
The Big Bucks in Gen AI Investments
Sunday, September 22, 2024
Two massive strategic VC funds were announced this week. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Flexible AirTags/Mental time travel/Memorable audible books
Sunday, September 22, 2024
Recomendo - issue #429 ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Kotlin Weekly #425
Sunday, September 22, 2024
ISSUE #425 22th of September 2024 Announcements Kotlin Unit Testing Survey JetBrains is running a survey to understand how they can enhance your experience with writing unit tests. If you are
👋 Goodbye to the Launcher That Changed How I Use Android — Switching From Google to Proton
Saturday, September 21, 2024
Also: I Love These Digital Notetaking Features, and More! How-To Geek Logo September 21, 2024 Did You Know Andy Warhol, an American artist best known as the leading figure of the pop art movement in
⚙️ Make beautiful presentations with Gamma
Saturday, September 21, 2024
Up your presentation game with Gamma
Daily Coding Problem: Problem #1563 [Medium]
Saturday, September 21, 2024
Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Facebook. There is an N by M matrix of zeroes. Given N and M, write a function to count