The Sequence Chat: Oren Etzioni – Allen AI, About Advancing Research in Foundation Models
An AI legend discusses cutting-edge research in foundation models.

Dr. Oren Etzioni is also a Venture Partner at the Madrona Venture Group and a Technical Director at the AI2 Incubator. He was the Founding CEO of the Allen Institute for AI. His awards include AAAI Fellow and Seattle's Geek of the Year. He founded several startups, including Farecast (acquired by Microsoft). Etzioni has written over 200 technical papers, garnering several awards, including the ACL Test of Time Award in 2022. He has also authored commentary for The New York Times, Harvard Business Review, and Nature.

Quick bio
I was a professor for most of my career, focused on AI, NLP, and Web search. I launched the Allen Institute for AI (AI2) in 2014 for the late Paul Allen, and it has grown to 250+ people and over $100M in annual funding. I've always been fascinated by startups, having launched several AI-based startups over the years. AI2 has also created and spun out an incubator whose startups are approaching $1B in total valuation across financing rounds and acquisitions.

🛠 AI Work
Recently the PRIOR team at AI2 released Unified-IO, the first neural model to perform a large and diverse set of AI tasks spanning classical computer vision, image synthesis, vision-and-language, and natural language processing (NLP). We are continuing work in this area and will have a new version of this model available in the near future. We are also investing in generative language models with our new initiative AI2 OLMo, a uniquely open language model intended to benefit the research community by providing access and education around all aspects of model creation.
This work out of AI2's Mosaic common sense AI team was inspired by the dual-process theory of human cognition featured in the well-known book Thinking, Fast and Slow by Daniel Kahneman. This theory proposes two distinct human thinking systems: one characterized by rapid and intuitive thought, and another that emphasizes analytical and deliberate reasoning.

Our ultimate goal is to tackle intricate real-world problems using AI models in a much more cost-effective manner than is currently possible. By building a model framework that integrates both of these approaches, we thought we might be able to optimize an AI agent's potential for planning complex interactive tasks while minimizing the cost of its reasoning. Many tasks do not require deliberate, detailed analysis to perform successfully, so by leveraging a "fast"-thinking approach when appropriate, we thought we could improve the model's speed and performance overall.

We called the framework we designed "SwiftSage." The "Swift" module is an encoder-decoder language model designed to quickly process short-term memory content such as previous actions, current observations, and the environment state, simulating the rapid, intuitive character of fast thinking. The second module, "Sage," represents the deliberate, slow mode of thinking by harnessing the power of large language models (LLMs) like GPT-4. A heuristic algorithm plays a crucial role in the system by determining when the framework should activate or deactivate the Sage module.

Our intuitions about this approach were correct: thanks to its dual-system design for fast and slow thinking, SwiftSage dramatically reduces the number of tokens necessary for each action in LLM inference, making it more cost-effective and efficient than the next best system.
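The control flow described above can be sketched in a few lines of Python. This is a minimal illustration, not AI2's implementation: the function names (`swift_propose`, `sage_plan`, `act`), the confidence score, and the threshold are all stand-ins I've invented; in the real SwiftSage, the fast module is a trained encoder-decoder model and the slow module is an LLM call.

```python
def swift_propose(observation, history):
    """Fast module (stand-in): cheap, intuitive action proposal plus a
    confidence score. A toy heuristic: repeat the last successful action."""
    if history and history[-1]["succeeded"]:
        return history[-1]["action"], 0.9
    return "explore", 0.3

def sage_plan(observation, history):
    """Slow module (stand-in): expensive, deliberate planning.
    In SwiftSage this would be an LLM call such as GPT-4."""
    return "open door with key"

def act(observation, history, threshold=0.5):
    """The heuristic switch: invoke the Sage module only when the
    Swift module's confidence falls below a threshold."""
    action, confidence = swift_propose(observation, history)
    if confidence < threshold:
        action = sage_plan(observation, history)
    return action
```

The key cost-saving idea is visible in `act`: most steps are handled by the cheap fast path, and tokens are spent on the LLM only when the heuristic detects the agent is stuck or uncertain.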
AI2 OLMo is a language model currently under development at AI2 that is being deliberately designed to support research by providing access to every element of the system we create, from data to training and evaluation code to model weights. We recognize a real dearth of accessible, understandable language models which is holding the AI research community back from understanding and advancing this critical new technology. The most powerful language models today are released by for-profit organizations and with limited insight into the data and methods used to create them; we aim to change that with OLMo. Our goal is to democratize access to systems like these and advance their development and safety for everyone.
While the performance of modern LLMs is stunning, it is hard to tell how they arrive at their answers, and whether their internal reasoning even makes sense. (There have been numerous cases where LLMs' poor reasoning and hallucinations led people into trouble.) We have been developing techniques (e.g., Entailer, Reflex) for uncovering an LLM's "beliefs" and how its answers follow from them via systematic chains of reasoning, which can then be viewed by a user; in other words, these techniques give users a view of the "mental model" the LLM has about the current problem. This allows us to spot and reject answers derived from faulty chains of reasoning or faulty beliefs, helping engender trust in the model's answers. It also paves the way for users to correct any erroneous LLM beliefs that were uncovered: because users can now see the relevant model beliefs, they can teach the system when it goes wrong and improve it over time.

In addition, modern LLMs still struggle with complex tasks that require multiple, specialized steps to be chained together, for example mathematics (where different math operations are needed) or other data-manipulation queries, e.g., "What do the third letters of the words in 'John Greg Calvin Melville Rhon' spell?" (answer: "hello"). While LLMs struggle to answer such questions directly, they are proficient at breaking a complex task into smaller tasks, a process called task decomposition. And in many cases, the LLM itself can solve those smaller tasks. We have recently developed a technique called Decomposed Prompting to control this process: the LLM repeatedly decomposes problems where it is stuck and solves the simpler pieces when it is not, putting a new class of previously unsolvable problems within reach.
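The third-letters example shows why decomposition helps: the hard query factors into trivially solvable subtasks. The sketch below uses plain Python functions where Decomposed Prompting would use LLM prompts for both the decomposition and the sub-solvers, so it only illustrates the control structure, not the actual prompting technique.

```python
def third_letter(word: str) -> str:
    """Sub-solver for one simple subtask: the third letter of a word."""
    return word[2]

def solve(query: str) -> str:
    """Decompose the query into one subtask per word, solve each
    subtask, then recombine the partial answers."""
    words = query.split()
    return "".join(third_letter(w) for w in words)

print(solve("John Greg Calvin Melville Rhon"))  # -> hello
```

An LLM asked the original question directly often fails, yet it can reliably perform each step here ("split into words", "give the third letter of 'Calvin'"); Decomposed Prompting orchestrates exactly that recursion with prompts instead of code.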
Beaker was first launched internally at AI2 in 2017 as an experimentation platform for AI2 researchers to run jobs on the cloud and organize their experiments. Beaker made it substantially easier to manage cloud instances and run large-scale jobs over thousands of nodes, as well as to make sense of the large volume of experiments at AI2. With the rise of deep learning, Beaker evolved to primarily support GPU jobs and manage workloads across our dedicated GPU cluster.

💥 Miscellaneous – a set of rapid-fire questions
Building software agents that utilize tools and learn on the user’s behalf.
It would certainly be a huge benefit to humanity as we fight climate change, the next pandemic, and more.
The transformer is a simple method that is highly scalable. I do believe we will see new architectures in the next three years, but I will be a bit coy in predicting which one is next.
I think we will see balance, just as we've seen in operating systems between Windows and Linux. There will be both open and closed models.