Meta's Coding Language Model Bet

On Sundays, The Sequence Scope brings a summary of the most important research papers, technology releases, and VC funding deals in the artificial intelligence space.

Aug 27

Next Week in The Sequence:

Edge 321: Our series about advanced methods in foundation models starts diving into the fascinating concept of memory. We review Microsoft Research’s paper on long-term memory in LLMs and the increasingly popular Pinecone vector database.
The Sequence Chat brings another killer interview.
Edge 322: Reviews one of the most groundbreaking papers of the last few months, in which Google and Stanford University researchers show how to use generative AI to simulate human behavior.

Go subscribe!

📝 Editorial: Meta's Coding Language Model Bet

Coding holds a central position in the race to dominate generative AI. Since the release of OpenAI’s Codex, not to mention GPT-4, the pursuit of supremacy in coding large language models (LLMs) has drawn in models from Amazon, Salesforce, Hugging Face, and innovative startups like Replit. The latest addition to this collection comes from Meta, which unveiled its highly anticipated Code Llama model just last week. By 'unveiled,' I mean they have made it open source. In line with the pro-open-source approach that has earned them immense popularity in the AI community, Meta has published versions of Code Llama on GitHub under a license with minimal restrictions for both commercial and research use cases.

But what exactly is Code Llama? As the name suggests, the model is a fine-tuned version of the popular Llama 2 model using coding datasets. The model is offered in three versions with 7B, 13B, and 34B parameters, respectively. Furthermore, the release includes two variations of the model:

Code Llama Python: An even more specialized iteration of Code Llama, fine-tuned on 100B tokens of Python code.
Code Llama Instruct: A version of Code Llama designed to follow instructions, optimized based on feedback from human annotators concerning detailed input-output mappings.

Even the smallest version of Code Llama can run on a single GPU and process up to 100,000 tokens of code input, significantly enhancing accessibility.

The introduction of Code Llama signals once again that Meta is determined to be a strong contender in the generative AI space. With a unique AI research talent pool under Yann LeCun, a culture of engineering, and a commitment to open-source AI, Meta stands out as one of the driving forces shaping the generative AI market. Llama 2 and Code Llama have been incredibly well received by the AI community. Could an 'Image Llama' be next?

📺 Webinar: Create better features for your ML models

Getting high-quality data and transforming them into features for your machine learning models is one of the biggest challenges in ML. Join Tecton CEO Mike Del Balso for this webinar to learn how teams can use feature engineering frameworks to simplify the development of features.

REGISTER FOR WEBINAR

🔎 ML Research

Code Llama

Meta AI Research published a paper detailing Code Llama, a language model for code generation. The release includes the base model plus variations optimized for Python and instruction following —> Read more.

SeamlessM4T

Meta AI published a paper unveiling SeamlessM4T, a multilingual, multitask model for speech and text. SeamlessM4T enables translation and transcription for text and speech across 100 languages —> Read more.

Visual Information Seeking in LLMs

Google Research published a paper introducing Autonomous Visual Information Seeking (AVIS) in LLMs. The method extends LLMs with computer vision, web search and image search tools to automate visual information seeking tasks —> Read more.

Synthetic Labeled Image Generation

Amazon Science published a paper introducing HandsOff, a method that eliminates the need for annotation of synthetic images. HandsOff uses GANs to produce a large number of synthetic images with the corresponding labels —> Read more.

🤖 Cool AI Tech Releases

GPT 3.5 Fine-Tuning

OpenAI enabled fine-tuning capabilities on GPT 3.5 —> Read more.

Multilingual v2

ElevenLabs came out of beta announcing Eleven Multilingual v2, a text-to-speech model supporting over 30 languages —> Read more.

SQLCoder

Defog open sourced SQLCoder, an LLM for converting natural language to SQL queries —> Read more.

Hugging Face-AutoGPTQ

Hugging Face unveiled an integration with the AutoGPTQ library that enables efficient quantization of models —> Read more.

🛠 Real World ML

Offline LLM Inference at ByteDance

Scale AI published details of the architecture powering offline LLM inference at ByteDance —> Read more.

Embeddings at LinkedIn

LinkedIn discusses the architecture used to manage embeddings in their homepage feed system —> Read more.

Real Time Semantic Search at Walmart

Walmart Global Tech describes a framework for scalable semantic search across millions of documents —> Read more.

Recommender Systems at Walmart

Walmart Global Tech discusses the explore-exploit techniques used in their large scale recommender systems —> Read more.

📡AI Radar

Hugging Face achieved a $4 billion valuation in a round led by Salesforce Ventures.
Modular, the company behind the Mojo programming language for AI infrastructure, raised $100 million in new funding.
Ikigai Labs raised $25 million to enable generative AI in tabular datasets.
NVIDIA delivered incredibly strong earnings results driven by AI chip sales.
AI biotech startup Genesis Therapeutics raised $200 million.
Twilio launched new AI tools for customer data.
AI chip company Arm filed for what could be a record-setting IPO.
IBM is leveraging generative AI to modernize COBOL.
AI writing tool Lex raised $2.5 million.
AI tool for creators Wand.app raised $4.2 million.
AI video creation startup Irreverent Labs raised a new round of funding led by Samsung Next.
