One AI for Navigating Any 3D Environment
A very impressive new model created by Google DeepMind is able to follow language instructions in any 3D environment.
📝 Editorial: SIMA, One AI for Navigating Any 3D Environment
Video games have long served as some of the best environments for training AI agents. Since their early days, AI labs like OpenAI and DeepMind have built agents that excel at video games such as Atari titles, Dota 2, StarCraft, and many others. The principles behind many of these agents have been applied in areas such as embodied AI, self-driving cars, and other domains that require taking action in different environments. However, most AI breakthroughs in 3D game environments have been constrained to one game or a small number of games. Building that type of AI is hard enough, but imagine if we could build agents that understand many gaming worlds at once and follow instructions like a human player.
Last week, Google DeepMind unveiled their work on the Scalable Instructable Multiworld Agent (SIMA). The goal of the project was to develop instructable agents that can interact with any 3D environment just like a human, by following simple language instructions. This might not seem like a big deal until we consider that the standard way to communicate instructions has been through very expensive reinforcement learning models. Language is the most powerful yet simple abstraction for communicating instructions about the world or, in this case, a 3D virtual world. The magic of SIMA is its ability to translate those abstract instructions into the mouse and keyboard actions used to navigate an environment.
DeepMind trained SIMA using a large dataset of gameplay and the corresponding mouse and keyboard interactions. By using language instructions, SIMA was able to master actions in 3D environments it hadn’t seen before. Just as LLMs can apply language across many domains, SIMA can apply actions across many environments.
Models like SIMA can have profound implications for embodied AI environments, simulations, and many other settings in which agents need to carry out physical tasks. Another impressive achievement by DeepMind.
🔎 ML Research
SIMA: Google DeepMind published a paper introducing the Scalable Instructable Multiworld Agent (SIMA), a generalist agent for 3D virtual environments. SIMA is able to understand many 3D worlds and carry out tasks within them.
Chain Of Table: Google Research published a paper detailing Chain-of-Table, an LLM reasoning method optimized for table understanding tasks. The method transforms tables into smaller, more manageable segments that LLMs can use to orchestrate different tasks.
DeepSeek-VL: DeepSeek-AI published a paper detailing DeepSeek-VL, an open-source vision-language model optimized for real-world applications. Specifically, DeepSeek-VL excels in areas such as screenshot understanding, OCR, charts, and other data structures common in vision-language apps.
LLMs and Graph Data: Google Research published a paper detailing a method to teach LLMs to reason through graph data. The paper also includes GraphQA, a benchmark designed to evaluate LLMs on graph reasoning tasks.
DocFormer v2: Amazon Science published a paper detailing DocFormerv2, a transformer-based model optimized for understanding documents. DocFormerv2 can make sense of visual and textual information in a way that mimics human reasoning.
🤖 Cool AI Tech Releases
Command-R: Cohere released Command R, a model optimized for RAG and tool usage.
Claude 3 Haiku: Anthropic released Claude 3 Haiku, a faster and more affordable version of its marquee model.
🛠 Real World ML
Meta’s Gen AI Infrastructure: Meta shares some details about the compute infrastructure used in their gen AI workloads.
Inside Einstein: Salesforce discusses how the Einstein platform manages data and AI workloads.
ML Infra at Netflix: Netflix shares some details about its Metaflow platform and the complementary integrations used in its ML workloads.
LLM-Reviews at Yelp: Yelp details its LLM-based architecture for detecting inappropriate language in reviews.
Semantic Search at Walmart: Walmart Global Tech shares some details about their semantic search architecture.