Edge 406: Inside OpenAI's Recent Breakthroughs in GPT-4 Interpretability
A new method helps to extract interpretable concepts from large models like GPT-4.

Interpretability is one of the crown jewels of modern generative AI. The workings of large frontier models remain largely mysterious compared to other human-made systems. While previous generations of ML saw a boom in interpretability tools and frameworks, most of those techniques become impractical when applied to massively large neural networks. From that perspective, solving interpretability for generative AI is going to require new methods and potential breakthroughs. A few weeks ago, Anthropic published research on identifying concepts in LLMs. More recently, OpenAI published a very interesting paper on identifying interpretable features in GPT-4 using a quite novel technique.

To interpret LLMs, identifying useful building blocks for their computations is essential. However, the activations within an LLM often display unpredictable patterns, seemingly representing multiple concepts simultaneously. These activations are also dense, meaning each activation fires on every input. Real-world concepts, by contrast, are usually sparse, with only a few being relevant in any given context. This observation underpins the use of sparse autoencoders, which help identify a small number of crucial “features” within the network that contribute to any given output. These features exhibit sparse activation patterns, aligning naturally with concepts that humans can easily understand, even without explicit interpretability incentives...
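To make the sparse autoencoder idea above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method from the OpenAI paper: the weights are random rather than trained, the decoder is simply the transpose of the encoder, and sparsity is enforced by keeping only the `k` largest feature values (an L1 penalty on the features is another common choice). All names (`encode`, `decode`, `top_k_sparsify`, `d_model`, `n_features`) are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def top_k_sparsify(values, k):
    """Keep the k largest entries (clamped at zero) and zero out the rest."""
    keep = set(sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k])
    return [max(v, 0.0) if i in keep else 0.0 for i, v in enumerate(values)]


def encode(activation, enc_weights, k):
    """Map a dense activation vector to a wider, mostly-zero feature vector."""
    return top_k_sparsify(matvec(enc_weights, activation), k)


def decode(features, dec_weights):
    """Reconstruct the dense activation from the sparse features."""
    return matvec(dec_weights, features)


# Assumption: toy sizes and random, untrained weights, for illustration only.
rng = random.Random(0)
d_model, n_features, k = 8, 32, 4
enc_weights = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_features)]
# Assumption: tied weights, so the decoder is the transpose of the encoder.
dec_weights = [[enc_weights[j][i] for j in range(n_features)] for i in range(d_model)]

activation = [rng.gauss(0, 1) for _ in range(d_model)]  # stand-in for one model activation
features = encode(activation, enc_weights, k)
reconstruction = decode(features, dec_weights)

active = [i for i, f in enumerate(features) if f != 0.0]
print(f"{len(active)} of {n_features} features active: {active}")
```

In a trained autoencoder the weights would be fit so that `reconstruction` stays close to `activation`. The few features that remain active for a given input are the candidates one would then inspect for a human-readable meaning.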