Edge 429: MambaByte and the Idea of Tokenization-Free SSMs
Can SSMs operate on raw data instead of tokens?
💡 ML Concept of the Day: Tokenization-Free SSMs with MambaByte

Tokenizers are one of the key components of transformer models. Their core idea is to give the model a structured, syntactic view of text by creating encodings that represent words, subwords, or characters. Tokenization spares transformers from having to learn this structure from the ground up, but it introduces challenges of its own: processing long sequences, hallucinations rooted in the token structure, memory scaling limitations and, obviously, the pre-processing overhead required to build the tokenizers. The main alternative has been to build models that operate directly on raw text, but those haven't been particularly successful.

State Space Models (SSMs) offer a viable alternative to traditional transformer models, with fixed memory and efficient decoding mechanisms. MambaByte is one of the most interesting methods building on those ideas: it proposes a token-free SSM, based on the Mamba architecture, that operates directly on raw data. Instead of breaking inputs into tokens, MambaByte treats them as a continuous stream of data, which leads to richer semantic interactions...
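To make the byte-level idea concrete, here is a minimal Python sketch of what "operating on raw data" can look like at the input side: text becomes a sequence of UTF-8 byte values (a fixed vocabulary of 256 symbols) with no tokenizer to train. The function names are hypothetical and chosen for illustration. The `toy_ssm_scan` recurrence is a deliberately simplified scalar stand-in, not the Mamba or MambaByte architecture; it is only meant to show that a recurrent state can stay the same size however long the byte stream gets.

```python
def text_to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return one integer in [0, 255] per byte."""
    return list(text.encode("utf-8"))


def byte_ids_to_text(ids: list[int]) -> str:
    """Invert text_to_byte_ids; lossless for any valid UTF-8 byte sequence."""
    return bytes(ids).decode("utf-8")


def toy_ssm_scan(ids: list[int], a: float = 0.9, b: float = 0.1) -> float:
    """Scan the byte stream with the recurrence h_t = a * h_{t-1} + b * x_t.

    Assumption: a and b are fixed scalars picked for illustration. A real
    SSM layer uses learned (and, in Mamba, input-dependent) parameters and
    a vector-valued state. The state h here is a single float, so memory
    use does not grow with the length of the input.
    """
    h = 0.0
    for x in ids:
        h = a * h + b * (x / 255.0)  # scale each byte into [0, 1]
    return h


sample = "Tokenization-free models read raw bytes."
ids = text_to_byte_ids(sample)

# Byte sequences are much longer than word-level splits of the same text,
# which is why efficient long-sequence processing matters for this approach.
print(len(sample.split()), "whitespace-separated words")
print(len(ids), "bytes")
print(byte_ids_to_text(ids) == sample)
print(toy_ssm_scan(ids))
```

The trade-off the sketch surfaces is the one described above: dropping the tokenizer removes the pre-processing step and keeps the vocabulary tiny, at the cost of much longer input sequences for the model to handle.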