Edge 429: MambaByte and the Idea of Tokenization-Free SSMs
Can SSMs operate on raw data instead of tokens?
💡 ML Concept of the Day: Tokenization-Free SSMs with MambaByte

Tokenizers are one of the key components of transformer models. Their core idea is to provide a structured syntactic understanding by creating encodings that represent words, subwords, or characters. Tokenization spares transformers from having to learn this structure from the ground up, but it introduces challenges such as processing long sequences, hallucinations tied to the token structure, memory scaling limitations and, obviously, the pre-processing overhead required to build the tokenizers. The main alternative has been to build models that operate directly on raw text, but those haven’t been particularly successful. State space models (SSMs) offer a viable alternative to traditional transformer models, with fixed-size memory and efficient decoding. MambaByte is one of the most interesting methods building on those ideas: it proposes a token-free SSM, based on the Mamba architecture, that operates directly on raw data. Instead of breaking inputs into tokens, MambaByte treats them as a continuous stream of data, which leads to richer semantic interactions...
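To make the contrast with tokenization concrete, here is a minimal Python sketch of the two ideas in play: feeding a model raw UTF-8 bytes instead of token IDs, and scanning that byte stream with a recurrence whose state has a fixed size. This is a toy illustration and not the MambaByte implementation. The function names (`text_to_bytes`, `bytes_to_text`, `run_toy_ssm`), the state size, and the decay and gain constants are all hypothetical, and the sketch leaves out the input-dependent (selective) parameters that Mamba learns.

```python
# Illustrative sketch only: byte-level inputs plus a toy linear state-space scan.
# Every name and constant below is an assumption made for this example.

def text_to_bytes(text: str) -> list[int]:
    """Encode text as UTF-8 byte values (0-255); no tokenizer or learned vocabulary."""
    return list(text.encode("utf-8"))


def bytes_to_text(byte_ids: list[int]) -> str:
    """Invert text_to_bytes by decoding the byte values back to a string."""
    return bytes(byte_ids).decode("utf-8")


def run_toy_ssm(byte_ids: list[int], state_size: int = 4) -> list[float]:
    """Scan the byte stream with h_t = a * h_{t-1} + b * x_t per state channel.

    The state keeps `state_size` numbers no matter how long the input is,
    which is the fixed-memory property the text attributes to SSMs.
    """
    decay = [0.5 + 0.1 * i for i in range(state_size)]  # arbitrary a_i in (0, 1)
    gain = [1.0 / (i + 1) for i in range(state_size)]   # arbitrary b_i
    state = [0.0] * state_size
    for byte_value in byte_ids:
        x = byte_value / 255.0  # scale the raw byte to [0, 1]
        state = [a * h + b * x for a, h, b in zip(decay, state, gain)]
    return state


byte_ids = text_to_bytes("naïve")
print(byte_ids)       # [110, 97, 195, 175, 118, 101]: "ï" takes two bytes
print(len(byte_ids))  # 6 bytes for 5 characters
final_state = run_toy_ssm(byte_ids)
print(len(final_state))  # 4, independent of the input length
```

The trade-off the sketch hints at: byte sequences are longer than token sequences for the same text, so a model that reads bytes benefits from a per-step cost and memory footprint that do not grow with sequence length.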