
Four New Major Open Source Foundation Models in a Week

DBRX, Grok 1.5, Samba-CoE and Jamba are all bringing unique innovations to open source generative AI.

Mar 31

Next Week in The Sequence:

Edge 383: Our new series continues with a deep dive into the core capabilities of autonomous agents. We review a very famous paper about agents simulating human behavior and we dive into the Crew AI framework.
Edge 384: We dive into Genie, Google DeepMind’s model that can generate interactive games from text!

📝 Editorial: Four New Major Open Source Foundation Models in a Week

Open source generative AI is experiencing tremendous momentum, and last week was a major example of this with the release of four major foundation models. By open source, we refer to the weights of the models and not the training datasets or processes. At this time, it's fair to say that the model weights are where most companies draw the line between open source and closed source. Many purists do not consider this true open source, but in a field evolving as rapidly as generative AI, preserving a level of competitive advantage is essential for any company. Let’s just say that the nature of open source is being reimagined for generative AI.

The fast pace of generative AI also makes the open source race even more fascinating. Last week, we witnessed the release of four major open source models, each innovative in its own way:

DBRX: Databricks released DBRX, a new model based on a mixture-of-experts architecture. DBRX contains 16 expert sub-models and dynamically selects the four most relevant for each token.
Grok 1.5: Elon Musk’s X.ai open-sourced Grok 1.5. The new release boasts a 128k context window and impressive reasoning capabilities.
Samba-CoE 0.2: SambaNova announced Samba-CoE v0.2, which shows impressive performance at 330 tokens per second. SambaNova claims the model outperforms DBRX, Mistral, and Grok.
Jamba: AI21 Labs open-sourced Jamba, which combines transformers with the increasingly popular structured state space model (SSM) architecture that powers models like Mamba. The SSM architecture gives Jamba very strong context length capabilities, which is evident in the initial benchmarks.
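
To make the mixture-of-experts idea mentioned for DBRX more concrete, here is a minimal toy sketch of top-k routing with 16 experts and 4 selected per token. It is not DBRX's implementation: the scalar "experts", the random router scores, and the softmax renormalization over the selected experts are all assumptions made for illustration, and the names route_token and moe_layer are hypothetical.

```python
import math
import random

NUM_EXPERTS = 16  # number of experts, as described for DBRX in the text
TOP_K = 4         # experts selected per token, as described in the text


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs. Renormalizing the weights over
    only the selected experts is an assumption of this sketch.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(token, experts, router_logits, top_k=TOP_K):
    """Combine the outputs of the selected experts, weighted by the router."""
    routes = route_token(router_logits, top_k)
    return sum(w * experts[i](token) for i, w in routes)


# Toy experts (hypothetical): expert i just scales its input by (i + 1).
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]

# Stand-in router scores; a real router computes these from the token itself.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

routes = route_token(logits)
output = moe_layer(1.0, experts, logits)
print(routes)
print(output)
```

Only the four selected experts run for a given token, so per-token compute stays well below what running all 16 would cost.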

Regardless of where you fall in the commercial vs. open source debate in generative AI, it is undeniable that the latter will play a major role in the mainstream adoption of this technology. This week shows how strong the momentum in open source generative AI is.

With just one week left until apply() ‘24, the premier virtual conference for engineers mastering AI and ML, we wanted to remind you to secure your spot before it's too late!

Date: Wednesday, April 3 / 9:00AM – 5:00PM PT / Virtual

At apply(), our goal is to provide you with the tools and insights you need to conquer AI and ML challenges at production scale. With speakers from LangChain, Meta, Pinterest, Vanguard, Visa, Samsung, NextDoor, and many more in the lineup, this year's event promises to be our best yet. Be sure to join live for the chance to win swag or a giveaway prize!

🔎 ML Research

Can LLMs Explore?

Researchers from Microsoft and Carnegie Mellon University published a paper examining whether LLMs can engage in exploration, an ability typically associated with reinforcement learning models. The research describes environments such as multi-armed bandits in prompts and tests whether LLMs can explore the environment when choosing actions —> Read more.

Tnt-LLM

Microsoft Research published a paper introducing Tnt-LLM, an LLM framework that generates and predicts task labels with minimal user involvement. Tnt-LLM is actively used to discover user intent in Microsoft Copilot —> Read more.

AutoBNN

Google Research published research on and open sourced AutoBNN, a JAX framework for interpretable time series forecasting models. AutoBNN’s core idea is to combine the interpretability of traditional time series models with the scalability of neural networks in a single architecture —> Read more.

SaLEM

Amazon Science published a paper introducing SaLEM (for salient-layers editing model), a method for editing layers in an LLM. SaLEM’s key contribution is that it can actually select the layers to be edited automatically —> Read more.

SAFE

Google DeepMind published a paper presenting Search-Augmented Factuality Evaluator (SAFE), a method for factual evaluation in LLMs using synthetic data. SAFE breaks down a long LLM response into individual facts and evaluates the accuracy of each one —> Read more.

🤖 Cool AI Tech Releases

DBRX

Databricks released DBRX, a new state-of-the-art open source LLM —> Read more.

Jamba

AI21 Labs open sourced Jamba, a new model that augments the structured state space model (SSM) architecture with elements of the transformer architecture —> Read more.

Samba CoE v0.2

SambaNova previewed the performance of Samba CoE v0.2, a new version of Samba-1 that scored very high across many benchmarks —> Read more.

Grok 1.5

X.ai released Grok 1.5 with improved reasoning capabilities and a larger context length —> Read more.

Voice Engine

OpenAI published some details about Voice Engine, a new model for creating custom voices —> Read more.

🛠 Real World ML

Video Content Moderation at Yelp

Yelp discusses the ML architecture powering its video content moderation solution —> Read more.

📡AI Radar

DeepMind’s co-founder Demis Hassabis received a Knighthood for his services to AI.
Amazon completed its $4 billion investment in Anthropic.
AI startup studio Super{set} raised $90 million to build new AI companies.
AI data privacy company Skyflow raised $30 million in new funding.
AI risk management platform ValidMind raised $8.1 million.
Eliyan raised a $60 million Series B for its chiplet interconnect technology.
ActiveLoop raised $11 million to manage multimodal datasets for generative AI.
StealthMole raised $7 million for its dark web intelligence platform.
SydeLabs raised $2.5 million for its generative AI security platform.
Lightning AI released Thunder, a new compiler for LLM pretraining.
Cloud data observability platform Observe raised $115 million in a Series B.
Foundational raised $8 million for its AI data quality platform.
