Edge 429: MambaByte and the Idea of Tokenization-Free SSMs
Was this email forwarded to you? Sign up here Edge 429: MambaByte and the Idea of Tokenization-Free SSMsCan SSMs operated on raw data instead of tokens?In this issue:
💡 ML Concept of the Day: Tokenization-Free SSMs with MambaByteTokenizers are one of the key components of transformer models. The core idea of tokenizers is to provide a structured syntactic understanding by creating encodings that represent words, subwords or characters. Tokenization helps transformers to not have to learn this structure from the ground up but introduced challenges such as processing long sequences, hallucinations based on the token structure, the memory scaling limitations and, obviously, the pre-processing overhead required to build those tokenizers. The main alternative have been to build models that operate on raw text directly but those haven’t been particularly successful. State Space Models(SSMs) offer a viable alternative to traditional transformer models with a fixed memory and efficient decoding mechanisms. MambaByte is one of the most interesting methods building on those ideas by proposing a token-free SSM based on the Mamba architecture that can directly operate on raw data. Instead of bre4aking inputs into tokens, MambaByte treats it as a continuous stream of data which leads to richer semantic interactions... Subscribe to TheSequence to unlock the rest.Become a paying subscriber of TheSequence to get access to this post and other subscriber-only content. A subscription gets you:
|
Older messages
Sakana AI
Sunday, September 8, 2024
A new $100 million round for the creators of The AI Scientist ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 428: Inside PrompPoet: Character.ai's Framework for Prompt Engineering
Thursday, September 5, 2024
The open source framework abstracts the core building blocks for prompt creation, optimization and management. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 427: Jamba Combines SSMs, Transformers and MOEs in a Single Model
Tuesday, September 3, 2024
Can a hybrid design outperform each one of the baseline architectures? ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Cerebras Inference and the Challenges of Challenging NVIDIA’s Dominance
Sunday, September 1, 2024
Why does NVIDIA remains virtually unchallenged in the AI chip market? ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
📝 Guest Post: Will Retrieval Augmented Generation (RAG) Be Killed by Long-Context LLMs?*
Friday, August 30, 2024
Pursuing innovation and supremacy in AI shows no signs of slowing down. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
You Might Also Like
Apache Tomcat Vulnerability CVE-2024-56337 Exposes Servers to RCE Attacks
Tuesday, December 24, 2024
THN Daily Updates Newsletter cover The Data Science Handbook, 2nd Edition ($60.00 Value) FREE for a Limited Time Practical, accessible guide to becoming a data scientist, updated to include the latest
Edge 459: Quantization Plus Distillation
Tuesday, December 24, 2024
Some insights into quantized distillation ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Prepare for a Lifetime of Adventure with Rosetta Stone
Tuesday, December 24, 2024
The Perfect Gift For Every Traveler on Your List Rosetta Stone makes it easy to connect with the world in a whole new way. With a Lifetime Unlimited plan, users can access 25 languages to prepare for
Tuesday Triage #232
Tuesday, December 24, 2024
Your weekly crème de la crème of the Internet is here! The 232nd edition featuring fish traps, little Mussolinis, and volvelles. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Elastic Community Newsletter
Tuesday, December 24, 2024
Check out the latest from the Elastic Community ㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤ ㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤ ㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤㅤ elastic | Search. Observe. Protect community-newsletter-header-img.png
Daily Coding Problem: Problem #1646 [Medium]
Monday, December 23, 2024
Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Facebook. Write a function that rotates a list by k elements. For example, [1, 2, 3, 4,
GCP Newsletter #430
Monday, December 23, 2024
Welcome to issue #430 December 23rd, 2024 News Event Official Blog Calling all devs: Code the future of baseball with Google Cloud and MLB - Google Cloud and MLB are hosting a hackathon where
⏯️ Make a Holiday Guest Profile for Your Streaming Services — What Is Linux Mint?
Monday, December 23, 2024
Also: I Played the Worst Mobile Games So You Don't Have To, and More! How-To Geek Logo December 23, 2024 Did You Know The giant splashes of color that make poinsettias a popular holiday decoration
Ranked | The Most Satisfying vs. Most Reliable Car Brands in 2024 🚙
Monday, December 23, 2024
The most reliable car brands are rarely the most satisfying to own, according to recent Consumer Reports survey data. View Online | Subscribe | Download Our App Presented by: Find the megatrends
Bitcoin Enthusiasts Are Letting Altcoins Pass by
Monday, December 23, 2024
Top Tech Content sent at Noon! Boost Your Article on HackerNoon for $159.99! Read this email in your browser How are you, @newsletterest1? 🪐 What's happening in tech today, December 23, 2024? The