The Single-Algorithm AI Chip

Plus tremendous funding activity in generative AI startups.

Jun 30

[Image: an AI chip etched with the word 'Transformer', mounted on a motherboard in a high-tech lab. Created using DALL-E]

Next Week in The Sequence:

Edge 409: We dive into long-term memory in autonomous agents. The research section reviews Microsoft LONGMEM reference architecture for long-term memory in LLMs. We also provide an introduction to the super popular Pinecone vector database.
Edge 410: We dive into VTC, a super innovative method from UC Berkeley and Stanford for fair LLM serving.

📝 Editorial: The Single-Algorithm AI Chip

The dominance of the transformer architecture in generative AI represents a pivotal moment for the AI chip industry. This revolution has sparked a renaissance in chip design, propelling NVIDIA to become one of the world's most valuable companies and fueling substantial funding for new AI chip startups. The demand for AI-based hardware seems limitless, driven not only by the rapid pace of AI advancements but also by the slow evolution of AI model architectures beyond transformers.

Simply put, transformer dominance as the preferred architecture in generative AI is the best thing to have happened to the AI chip industry. The rationale is clear: when most AI software innovation centers around a single architecture, it becomes logical for AI chip manufacturers to optimize for that paradigm. Given that AI chip production cycles are significantly longer than software development cycles, such optimization is only feasible if model architectures remain stable for years. Conversely, constant changes in architecture paradigms would render AI chip optimization impractical and economically unviable.

Last week provided a notable example of this market dynamic between AI chips and software: Etched, a new AI chip startup, secured $120 million in funding to develop chips specialized in transformer architectures. Etched's chip, Sohu, is reportedly capable of processing 500,000 tokens per second when running a Llama 70B model, surpassing NVIDIA's Blackwell (B200) GPUs in speed and cost efficiency. Sohu's specialization in a single algorithm allows for a streamlined logic flow, leaving room for more math blocks on the chip and achieving an impressive 90% FLOPS utilization.
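
To put those figures in perspective, here is a rough back-of-envelope sketch in Python. It assumes the common rule of thumb that a dense transformer needs about 2 FLOPs per parameter per generated token in the forward pass, and it ignores attention cost over long contexts, batching effects, and numeric precision. The function names are hypothetical and the result is an illustration of the arithmetic, not Etched's own methodology.

```python
# Back-of-envelope sketch: how much compute do the quoted figures imply?
# Assumption: forward-pass FLOPs per token ~= 2 * parameter count (dense model).
# This is a rule of thumb, not a figure from Etched.


def forward_flops_per_token(num_params: float) -> float:
    """Approximate forward-pass FLOPs needed to produce one token."""
    return 2.0 * num_params


def achieved_flops_per_second(tokens_per_second: float, num_params: float) -> float:
    """Useful FLOP/s delivered at a given token throughput."""
    return tokens_per_second * forward_flops_per_token(num_params)


def implied_peak_flops(achieved: float, utilization: float) -> float:
    """Peak FLOP/s the hardware would need at a given utilization (0 < u <= 1)."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return achieved / utilization


# Figures quoted in the editorial: 500,000 tokens/s, Llama 70B, 90% utilization.
achieved = achieved_flops_per_second(500_000, 70e9)
peak = implied_peak_flops(achieved, 0.90)

print(f"Useful compute: {achieved / 1e15:.1f} PFLOP/s")
print(f"Implied peak:   {peak / 1e15:.1f} PFLOP/s")
```

Under these assumptions, the quoted throughput corresponds to tens of petaFLOP/s of useful work for whatever hardware unit the 500,000 tokens per second figure refers to. That helps explain why utilization, rather than raw peak FLOPS, is the number a single-algorithm chip tries to maximize.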

The dominance of transformer architecture empowers startups like Etched to optimize chip designs to compete effectively with established industry giants. The greatest paradox of the AI chip renaissance lies in the fact that innovation is spurred not by rapid AI evolution, but by its deliberate pace.

🌝 Recommended – Finally: Instant, accurate, low-cost GenAI evaluations

Why are Fortune 500 companies everywhere switching to Galileo Luna for enterprise GenAI evaluations?

97% cheaper, 11x faster, and 18% more accurate than GPT-3.5
No ground truth data set needed
Customizable for your specific evaluation requirements

🔎 ML Research

FineWeb

Hugging Face published a paper detailing how they built FineWeb, one of the largest open source datasets for LLM pretraining ever built. FineWeb boasts an impressive 15 trillion tokens from 96 Common Crawl snapshots —> Read more.

Agent Symbolic Learning

Researchers from AIWaves published a paper introducing agent symbolic learning, a technique that lets agents improve themselves. The core idea is to draw a parallel between an agent pipeline and a neural net and use symbolic optimizers to improve the agent network —> Read more.

APIGen

Salesforce Research published a paper introducing APIGen, a pipeline designed to synthesize function-calling datasets. Models with 7B parameters trained on APIGen data achieved state-of-the-art results on function-calling benchmarks —> Read more.

MISeD

Google Research published a paper introducing Meeting Information Seeking Dialogs (MISeD), a dataset focused on meeting transcripts. MISeD targets the retrieval of factual information from meeting transcripts, which is a notoriously difficult task —> Read more.

Olympic Arena

Researchers from the Generative AI Research Lab at Shanghai Jiao Tong University published a paper detailing the results of the Olympic Arena superintelligence benchmark. Olympic Arena was designed to evaluate models across many disciplines and modalities —> Read more.

Exams for RAG Pipelines

Amazon Science published a paper discussing a technique to evaluate the accuracy of RAG applications. The method mimics an exam generation process based on item response theory —> Read more.

🤖 Cool AI Tech Releases

MLflow at SageMaker

Amazon is launching support for MLflow in its SageMaker platform —> Read more.

Multimodal Arena

Chatbot Arena just added support for multimodal models —> Read more.

Meta LLM Compiler

Meta AI open sourced its LLM Compiler, a family of Code Llama based models with compiler and optimization capabilities —> Read more.

Character Calls

Character AI introduced Character Calls, a voice interaction experience with Characters —> Read more.

🛠 Real World AI

Incident Response at Meta

Meta shares some details about their usage of generative AI for incident response —> Read more.

ETA at Lyft

Lyft discusses the ML techniques used to ensure estimated time of arrival (ETA) reliability for riders —> Read more.

📡AI Radar

Stability AI raised a new round of funding and appointed a new CEO.
Orby AI raised $30 million to build large action models.
Day.ai raised $4 million from Sequoia to build an AI-first CRM.
Axelera raised $68 million for edge AI chips.
Etched raised $120 million to build transformer specialized chips.
Emergence raised $97.2 million for its agent platform.
EvolutionaryScale raised $142 million and launched a new AI model for protein discovery.
Iconic VC firm Kleiner Perkins raised $2 billion in new funds to back startups leveraging generative AI for growth.
AI-ecommerce platform Daydream secured $50 million in new funding.
illumex raised $13 million for its data governance infrastructure for generative AI —> Read more.
SoftBank invested in Perplexity at a $3 billion valuation.
Hebbia, which uses generative AI to search large documents, raised a $100 million Series B.
Oracle announced a series of in-database LLM capabilities.
SoftBank formed a joint venture with Tempus to invest in healthcare AI.
Andrew Ng is raising $120 million for his next AI fund.
AI low-code platform Creatio raised $200 million in new funding.
Nubank acquired AI-banking platform Hyperplane.
Dappier raised $2 million to build an LLM content marketplace.
Substrate raised $8 million for its modular AI platform.
