One AI for Navigating Any 3D Environment
A very impressive new model created by Google DeepMind is able to follow language instructions in any 3D environment.
📝 Editorial: SIMA, One AI for Navigating Any 3D Environment

Video games have long served as some of the best environments for training AI agents. Since their early days, AI labs like OpenAI and DeepMind have built agents that excel at video games such as Atari, Dota 2, StarCraft, and many others. The principles behind many of these agents have been applied in areas such as embodied AI, self-driving cars, and other domains that require taking action in different environments. However, most AI breakthroughs in 3D game environments have been constrained to one or a small number of games. Building that type of AI is really hard, but imagine if we could build agents that understand many gaming worlds at once and follow instructions like a human player.

Last week, Google DeepMind unveiled their work on the Scalable, Instructable, Multiworld Agent (SIMA). The goal of the project was to develop instructable agents that can interact with any 3D environment just like a human by following simple language instructions. This might not seem like a big deal until we consider that the standard way to communicate instructions has been through very expensive reinforcement learning models. Language is the most powerful yet simple abstraction for communicating instructions about the world or, in this case, a 3D virtual world.

The magic of SIMA is its ability to translate those abstract instructions into the mouse and keyboard actions used to navigate an environment. DeepMind trained SIMA using a large dataset of gameplay and the corresponding mouse and keyboard interactions. By using language instructions, SIMA was able to master actions in 3D environments it hadn’t seen before. Just as LLMs can apply language across many domains, SIMA can apply actions across many environments.
Models like SIMA can have profound implications for embodied AI environments, simulations, and many other settings in which agents need to carry out physical tasks. Another impressive achievement by DeepMind.

🔎 ML Research

SIMA
Google DeepMind published a paper introducing Scalable Instructable Multiworld Agent (SIMA), a generalist agent for 3D virtual environments. SIMA is able to understand many 3D worlds and carry out tasks within them.

Chain Of Table
Google Research published a paper detailing Chain-of-Table, an LLM reasoning method optimized for table understanding tasks. The method transforms tables into smaller and more manageable segments that LLMs can use to orchestrate different tasks.

DeepSeek-VL
DeepSeek-AI published a paper detailing DeepSeek-VL, an open-source vision-language model optimized for real-world applications. Specifically, DeepSeek-VL excels in areas such as screenshot understanding, OCR, charts, and other specific data structures common in vision-language apps.

LLMs and Graph Data
Google Research published a paper detailing a method to teach LLMs to reason through graph data. The paper also includes GraphQA, a benchmark designed to evaluate LLMs on graph reasoning tasks.

DocFormer v2
Amazon Science published a paper detailing DocFormerv2, a transformer-based model optimized for understanding documents. DocFormerv2 can make sense of visual and textual information in a way that mimics human reasoning.

🤖 Cool AI Tech Releases

Command-R
Cohere released Command R, a model optimized for RAG and tool usage.

Claude 3 Haiku
Anthropic released Claude 3 Haiku, a faster and more affordable version of its marquee model.

🛠 Real World ML

Meta’s Gen AI Infrastructure
Meta shares some details about the compute infrastructure used in their gen AI workloads.

Inside Einstein
Salesforce discusses how the Einstein platform manages data and AI workloads.

ML Infra at Netflix
Netflix shares some details about its Metaflow platform and complementary integrations used in its ML workloads.

LLM-Reviews at Yelp
Yelp details the LLM-based architecture it uses to detect inappropriate language in reviews.

Semantic Search at Walmart
Walmart Global Tech shares some details about their semantic search architecture.