Edge 429: MambaByte and the Idea of Tokenization-Free SSMs
Can SSMs operate on raw data instead of tokens?
💡 ML Concept of the Day: Tokenization-Free SSMs with MambaByte

Tokenizers are one of the key components of transformer models. The core idea of a tokenizer is to provide a structured syntactic understanding of text by creating encodings that represent words, subwords, or characters. Tokenization spares transformers from having to learn this structure from the ground up, but it introduces challenges of its own: difficulty processing long sequences, hallucinations tied to the token structure, memory scaling limitations and, obviously, the pre-processing overhead required to build the tokenizers. The main alternative has been to build models that operate directly on raw text, but those haven't been particularly successful. State Space Models (SSMs) offer a viable alternative to traditional transformer models, with a fixed-size memory and efficient decoding mechanisms. MambaByte is one of the most interesting methods building on those ideas: it proposes a token-free SSM, based on the Mamba architecture, that can operate directly on raw data. Instead of breaking inputs into tokens, MambaByte treats the input as a continuous stream of bytes, which leads to richer semantic interactions...
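To make the contrast between token-level and byte-level inputs concrete, here is a minimal sketch in Python. It is not MambaByte's implementation: Mamba uses learned, input-dependent (selective) parameters and vector-valued states, while this toy uses a fixed scalar recurrence. The helper names `to_bytes` and `ssm_scan`, and the values of `a`, `b`, and `c`, are assumptions chosen only for illustration. The sketch shows that a byte stream needs no tokenizer or vocabulary, that it is much longer than a word-level split of the same text, and that a state space recurrence carries a fixed-size state no matter how long the stream gets.

```python
def to_bytes(text: str) -> list[int]:
    """Encode text as raw UTF-8 byte values (0-255); no tokenizer or vocabulary."""
    return list(text.encode("utf-8"))


def ssm_scan(inputs, a=0.9, b=0.1, c=1.0):
    """Toy scalar linear state space recurrence (hypothetical parameters):
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    The state h is a single number, so memory stays constant
    regardless of how many inputs are processed.
    """
    h = 0.0
    outputs = []
    for x in inputs:
        h = a * h + b * x
        outputs.append(c * h)
    return outputs


text = "MambaByte reads bytes"

# Crude stand-in for a tokenizer: split on whitespace.
word_pieces = text.split()

# Byte-level view: one input per UTF-8 byte.
byte_ids = to_bytes(text)

# Scale bytes to [0, 1] and run them through the toy recurrence.
scaled = [b / 255.0 for b in byte_ids]
ys = ssm_scan(scaled)

print("word-level length:", len(word_pieces))
print("byte-level length:", len(byte_ids))
print("outputs produced: ", len(ys))
```

The byte sequence is several times longer than the word-level split, which is why a model with a fixed-size state and cheap per-step decoding is attractive for this setting.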