Edge 406: Inside OpenAI's Recent Breakthroughs in GPT-4 Interpretability
Was this email forwarded to you? Sign up here Edge 406: Inside OpenAI's Recent Breakthroughs in GPT-4 InterpretabilityA new method helps to extract interpretable concepts from large models like GPT-4.Interpretability is one of the crown jewels of modern generative AI. The workings of large frontier models remain largely mysterious compared to other human-made systems. While previous generations of ML saw a boom in interpretability tools and frameworks, most of those techniques have become impractical when applied to massively large neural network. From that perspective, solving interpretability for generative is going to require new methods and potential breakthroughs. A few weeks ago, Anthropic published some research about their work in identifying concepts in LLMs. More recently, OpenAI published a super interesting paper about their work on identifying interpretable features in GPT-4 using a quite novel technique. To interpret LLMs, identifying useful building blocks for their computations is essential. However, the activations within an LLM often display unpredictable patterns, seemingly representing multiple concepts simultaneously. These activations are also densely packed, meaning each activation is constantly engaged with every input. In reality, concepts are usually sparse, with only a few being relevant in any given context. This reality underpins the use of sparse autoencoders, which help identify a few crucial “features” within the network that contribute to any given output. These features exhibit sparse activation patterns, aligning naturally with concepts that humans can easily understand, even without explicit interpretability incentives... Subscribe to TheSequence to unlock the rest.Become a paying subscriber of TheSequence to get access to this post and other subscriber-only content. A subscription gets you:
|
Older messages
Edge 407: LLMs with Infininite Context Windows? Short-Term Memory and Autonomous Agents
Tuesday, June 25, 2024
The role of context windows in LLMs ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
📽 [Virtual Talk] Powering millions of real-time rankings at GetYourGuide
Monday, June 24, 2024
Hi there, Curious about how GetYourGuide, a leading online marketplace for travel excursions, delivers millions of personalized rankings daily, adapting to users' preferences in real time? Join us
Beyond OpenAI: Apple’s On-Device AI Strategy
Sunday, June 23, 2024
Plus a new super coder model, Meta's new AI releases, DeepMind's video-to-audio models and much more. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 404: Inside Anthropic's Dictionary Learning, A Breakthrough in LLM Interpretability
Thursday, June 20, 2024
Arguably one of the most important papers of 2024 ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
The Sequence Chat: Justin D. Harris - About Building Microsoft CoPilot
Wednesday, June 19, 2024
Quick bio This is your second interview at The Sequence. Please tell us a bit about yourself. Your background, current role and how did you get started in AI? I grew up in the suburbs of Montreal and I
You Might Also Like
Aspire Deployment: Course Updates (coming soon)
Wednesday, October 23, 2024
Hey, it's Milan. Just wanted to share something I'm working on as we're getting closer to the .NET 9 release. I'm working on a brand new chapter for my courses about integrating .NET
📟 Turning Old Tech Into Keychains — How to Use Android's Theft Protection Feature
Tuesday, October 22, 2024
Also: Modern Video Games Are Too Easy, and More! How-To Geek Logo October 22, 2024 Did You Know When Galoob released the "Game Genie" product in the 1990s to allow players on the Nintendo
Unlock Python's Pattern Matching, Combinatoric Iterators, SSH Scripting, and More
Tuesday, October 22, 2024
Structural Pattern Matching in Python #652 – OCTOBER 22, 2024 VIEW IN BROWSER The PyCoder's Weekly Logo Structural Pattern Matching in Python In this tutorial, you'll learn how to harness the
Daily Coding Problem: Problem #1586 [Hard]
Tuesday, October 22, 2024
Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Airbnb. An 8-puzzle is a game played on a 3 x 3 board of tiles, with the ninth tile
Mapped | The Home Price-to-Income Ratio of Large U.S. Cities 🏘️
Tuesday, October 22, 2024
The top five large US cities have a home price-to-income ratio more than double the national average of 4.7. View Online | Subscribe | Download Our App Presented by Hinrich Foundation NEW REPORT:
Ushering In
Tuesday, October 22, 2024
Netflix's Theatrical Strategy • Blade Runner vs. Elon Musk • Disney vs. App Store • Anthropic's AI PC Control • AirPods Hearing Boost Ushering In Netflix's Theatrical Strategy • Blade
Speeding up with SIMD and Go assembly
Tuesday, October 22, 2024
Plus some Go code generation magic, test parallelism, and working with Excel spreadsheets. | #528 — October 22, 2024 Unsub | Web Version Together with Ardan Labs Go Weekly A Taste of Go Code Generator
LW 155 - Optimizing Shopify Themes for Long Product Descriptions
Tuesday, October 22, 2024
Optimizing Shopify Themes for Long Product Descriptions Shopify Development news and articles
Secure Your Election 2024 eBook at the Best Value Today ⏰
Tuesday, October 22, 2024
Stay informed with our visual guide to the US Presidential Election—exclusively for VC+ members, along with additional updates. View email in browser Now Available: The Visual Guide to the US Election
Startups of The Year: How To Vote
Tuesday, October 22, 2024
Top Tech Content sent at Noon! How the world collects web data Read this email in your browser How are you, @newsletterest1? 🪐 What's happening in tech today, October 22, 2024? The HackerNoon