The Sequence Radar #496: Microsoft Muse Can Generate Entire Games After Watching You Play
The new AI model represents a milestone in gameplay ideation.

Next Week in The Sequence:

Our series about RAG continues with a deep dive into GraphRAG, which was recently created by Microsoft. Speaking of Microsoft, we discuss the rStar-Math technique for improving math reasoning in LLMs. The engineering section dives into Composio, which has become one of the most popular stacks for integrating tools with LLMs. Finally, the opinion section discusses whether we are seeing a renaissance in reinforcement learning, or not ;)

📝 Editorial: Microsoft Muse Can Generate Entire Games After Watching You Play

Games have played a monumental role in the evolution of AI. From creating training environments to simulating real-world conditions, games are powerful catalysts for AI learning. A new field known as world action models is rapidly emerging to combine games and AI. Microsoft just released exciting research in this area: a model that can create games after watching human players. Sounds crazy? Let’s discuss.

Microsoft Research has introduced Muse, a generative AI model designed to support gameplay ideation by generating both game visuals and controller actions. Known as a World and Human Action Model (WHAM), Muse acts as a digital collaborator that understands and extends video game dynamics using real human gameplay data. This initiative reflects a growing trend of integrating AI to enhance, not replace, the creative process within the gaming industry. Muse is trained on human gameplay data from the Xbox game Bleeding Edge, learning from over 1 billion images and controller actions, equivalent to more than seven years of continuous gameplay.
By analyzing this extensive dataset, Muse can generate complex gameplay sequences that remain consistent and engaging over extended periods. Its architecture enables it to predict how a game might evolve from an initial sequence, capturing the interplay of visuals, player actions, and game physics.

A standout feature of Muse is its interactive prototype, the WHAM Demonstrator. This tool lets developers load visual prompts and generate multiple gameplay continuations, allowing hands-on exploration of different scenarios. This iterative approach helps developers quickly visualize, tweak, and refine gameplay concepts.

To ensure Muse’s output meets the demands of game development, Microsoft Research evaluates the model across three key characteristics: consistency of the generated gameplay, diversity of the generated sequences, and persistency of user modifications.
With Muse, Microsoft is paving the way for a future where AI serves as a creative partner, expanding the boundaries of what’s possible in game design while keeping human creativity at the forefront.

🔎 AI Research

SWE-Lancer

In "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering", OpenAI introduces SWE-Lancer, a benchmark designed to evaluate model performance in real-world, freelance software engineering tasks, mapping model capabilities to actual monetary value and assessing complex, full-stack software engineering and management skills. SWE-Lancer Diamond is a public evaluation split containing $500,800 worth of tasks.

The AI Co-Scientist

Researchers from Google Cloud AI Research, Google Research, Google DeepMind, Houston Methodist, Sequome, Fleming Initiative and Imperial College London, and Stanford University introduce an AI co-scientist, a multi-agent system built on Gemini 2.0, designed to assist scientists in generating novel hypotheses and research proposals. The system uses a generate, debate, and evolve approach to improve hypothesis quality, with validations in drug repurposing, novel target discovery, and explaining mechanisms of bacterial evolution and anti-microbial resistance.

Muse

In the paper "World and Human Action Models towards gameplay ideation", researchers from Microsoft Research introduce Muse, a new generative model called World and Human Action Model (WHAM). Muse is designed to generate consistent and diverse gameplay sequences and persist user modifications, which are identified as critical for creative ideation in game development.

Autellix

In the paper "Autellix: An Efficient Serving Engine for LLM Agents as General Programs", researchers from UC Berkeley, Google DeepMind, and Shanghai Jiao Tong University introduce Autellix, an LLM serving system designed to minimize end-to-end latencies by treating programs as first-class citizens.
Autellix intercepts LLM calls, enriches schedulers with program-level context, and uses scheduling algorithms to preempt and prioritize calls based on previously completed calls, improving throughput by 4-15x compared to state-of-the-art systems.

Can Small Models Reason?

In the paper "Small Models Struggle to Learn from Strong Reasoners", researchers from the University of Washington, Carnegie Mellon University, and Western Washington University find that small language models (≤3B parameters) do not consistently benefit from long chain-of-thought (CoT) reasoning or distillation from larger models. The paper introduces Mix Distillation, a strategy that balances reasoning complexity by combining long and short CoT examples, which significantly improves small model reasoning performance.

Qwen2.5-VL

In the paper "Qwen2.5-VL Technical Report", researchers from Alibaba Group introduce Qwen2.5-VL, the latest flagship model of the Qwen vision-language series, which demonstrates advancements in visual recognition, object localization, document parsing, and long-video comprehension. The model introduces dynamic resolution processing and absolute time encoding, allowing it to process images of varying sizes and videos of extended durations with second-level event localization.

🤖 AI Tech Releases

Grok 3

xAI unveiled Grok 3, its most advanced model, with impressive benchmark results.

SmolVLM2

Hugging Face open sourced SmolVLM2, a video foundation model that can run on small devices.

PaliGemma 2 mix

Google released PaliGemma 2 mix, a vision language model built on its Gemma family.

📡 AI Radar