Sakana AI

A new $100 million round for the creators of The AI Scientist

Sep 8

Next Week in The Sequence:

Edge 429: Our series about state space models (SSMs) continues with an exploration of MambaByte, including its original paper. We also discuss the MindsDB platform for building AI systems.
Edge 430: We dive into The AI Scientist, an agent for scientific experimentation.

📝 Editorial: Sakana AI

A few weeks ago, we discussed an interesting AI agent called "The AI Scientist," which was able to conduct complex experiments over the long term. The AI Scientist was created by Sakana AI, one of the most innovative AI labs in the world, which just announced a $100 million Series A funding round this week from marquee investors, including NVIDIA, Khosla Ventures, and NEA.

Two fundamental aspects make Sakana AI stand out. First is its target market: Sakana AI is strategically focused on Japan. The world's leading economies are beginning to realize the importance of building world-class AI labs that develop models optimized for local knowledge. Japan is emerging as a key target, presenting a strong alternative to China in Asia. Unsurprisingly, larger competitors such as OpenAI and Cohere are also expanding their operations in the country.

The second distinguishing feature of Sakana AI is its architecture for foundation models. While most large AI labs are continuing to push the scaling limits of transformers to build larger and more capable models, Sakana AI is experimenting with novel architectural paradigms to develop smaller, more efficient models. The AI Scientist is built on a unique neurosymbolic architecture that combines large language models (LLMs) with more traditional methods. Since its inception, Sakana AI has been vocal about its intention to create AI models based on evolutionary dynamics and collaboration between different expert models. Some of their early models have provided a glimpse into this approach.

Competing with major AI labs has become an almost impossible challenge for startups. However, Sakana AI’s focus on a specific geographic region and smaller models might give it a crucial competitive edge. For now, their innovative models are challenging some of the conventional wisdom in the broader AI landscape.

🔎 ML Research

OLMoE

Allen AI published a paper detailing OLMoE, a fully open source mixture-of-experts (MoE) model. Specifically, they expand on the OLMo architecture to build OLMoE-1B-7B and OLMoE-1B-7B-Instruct, two MoE models trained on over 5 trillion tokens → Read more.

AlphaProteo

Google DeepMind published a paper introducing AlphaProteo, a family of ML models for protein design. AlphaProteo can generate protein binders with 3 to 300 times better binding affinities to target molecules than existing methods → Read more.

Agent Q

Researchers from Stanford University and MultiOn published a paper detailing Agent Q, a framework for building web agents that can plan and self-heal. Agent Q combines Monte Carlo Tree Search, reinforcement learning and self-critique to build agents that interact with web environments → Read more.

DPPO

Researchers from Princeton University, MIT and Carnegie Mellon University published a paper discussing Diffusion Policy Policy Optimization (DPPO), a framework for fine-tuning diffusion-based policies. DPPO excels in continuous control and robot learning tasks by applying policy gradient methods, a popular family of reinforcement learning algorithms → Read more.

Evaluating LLM Jailbreaking

Berkeley AI Research (BAIR) published a paper proposing a technique to evaluate LLM jailbreaking methods. The paper introduces StrongREJECT, a benchmark for evaluating the robustness of jailbreaking methods in LLMs → Read more.

High Throughput, Long-Context Inference

Together AI published a paper presenting a speculative decoding technique to increase throughput in the long-context, large-batch regime. The paper introduces two new algorithms, MagicDec and Adaptive Sequoia Trees, to improve inference over large context windows → Read more.

🤖 AI Tech Releases

xLAM

Salesforce open sourced xLAM, a series of LLMs optimized for function calling and agentic tasks → Read more.

Claude Enterprise

Anthropic released an enterprise version of its marquee model → Read more.

Reflection 70B

HyperWrite AI open sourced Reflection 70B, a Llama-based model that tops several benchmark leaderboards → Read more.

📡 AI Radar

Safe Superintelligence, the startup created by OpenAI's former chief scientist Ilya Sutskever, raised $1 billion.
Sakana AI raised $100 million to expand its AI research lab in Japan.
Mayfield launched a $100 million AI incubator for building AI teammates.
Spark Labs raised $50 million for a new AI fund.
You.com raised $50 million for building an AI productivity engine.
All Hands AI raised $5 million for building open source agents for developers.
xAI brought online the Colossus cluster with over 100,000 GPUs.
Stability AI models are now available on Amazon Bedrock.
Fastn raised $2.6 million for its agent platform for building composable applications.
Paradigm AI raised $2 million to use agents to automate spreadsheet tasks.
Salesforce acquired data security startup Own for $1.9 billion.
Roblox teased a new generative AI project.
