TheSequence - The Llama 2 Effect
Sundays, The Sequence Scope brings a summary of the most important research papers, technology releases, and VC funding deals in the artificial intelligence space.
📝 Editorial: The Llama 2 Effect

The debate between open source and closed source foundation models is as interesting as ever, and the open source space has found an unlikely champion: Meta. The “accidental leak” of the weights of the Llama model sparked a tremendous level of innovation in open source foundation models, triggering the creation of models such as Vicuna, Koala, Red Pajama, MPT, Alpaca, Gorilla, and many others.

Last week, Meta announced the open-source release and commercial availability of Llama 2, along with a distribution partnership with none other than Microsoft. Llama 2 was trained on a dataset over 40% larger than its predecessor's, using 2 trillion pretraining tokens. The model was released in three main versions with 7B, 13B, and 70B parameters, respectively. Another solid improvement was the use of reinforcement learning from human feedback (RLHF) with proximal policy optimization (PPO) to improve the usefulness of the responses. The model was evaluated across many LLM benchmarks and performed very strongly relative to the recent generation of open-source LLMs.

And then there is the partnership with Microsoft. As part of their strategic alliance, Microsoft announced support for Llama 2 on Azure and Windows. The Azure support includes the ability to deploy and fine-tune all versions of Llama 2 from the Azure AI Model Catalog. The Windows support enables the local execution of Llama 2 models using DirectML.

Beyond the initial set of capabilities, Microsoft’s endorsement of Llama 2 is a strong validation of the viability of open source foundation models. Together with Databricks’ acquisition of MosaicML and the recent funding rounds by companies like Stability AI, this event signals to the market that open source foundation models are a force to be reckoned with.

The Llama effect was about unlocking innovation in the open-source LLM space.
The Llama 2 effect is about robustness and commercial readiness at the highest level.

💡 Report: State of Applied Machine Learning 2023

We surveyed over 1700 ML practitioners for this inaugural report on the state of applied machine learning. It provides a comprehensive overview of applied ML and shares the challenges and opportunities in the space, along with common trends across a diverse set of ML initiatives. Download the full report for key findings, recommendations, and a deeper dive into the trends that will shape the future of applied ML!

🔎 ML Research

CM3leon
Meta AI Research published a paper introducing CM3leon, a text-to-image and image-to-text foundation model. CM3leon was trained with a large-scale retrieval-augmented pre-training stage followed by a multitask supervised fine-tuning (SFT) stage, and it achieves state-of-the-art results in both modalities.

Diffusion Model Fine-Tuning with RL
Researchers from the Berkeley AI Research (BAIR) lab published a paper detailing a reinforcement learning method for fine-tuning diffusion models. The method fine-tunes Stable Diffusion on different objectives such as image compressibility, human-perceived aesthetic quality, and prompt-image alignment.

SimPer
Google Research published a paper detailing SimPer, a self-supervised model for periodic data. SimPer uses contrastive learning to learn temporal properties of periodic targets.

Consistent Reasoning in LLMs
Amazon Science published a paper outlining a new chain-of-thought reasoning method for LLMs. The core idea is to use a teacher-student model that leverages knowledge distillation on question-answer pairs to improve the reasoning chain.

FlashAttention-2
Researchers from Stanford University and Princeton published a paper introducing FlashAttention-2, an IO-aware attention mechanism.
FlashAttention-2 builds on its predecessor by adding several optimizations that reduce the number of non-matmul FLOPs and better parallelize the attention computation.

🤖 Cool AI Tech Releases

Llama 2
Meta AI released Llama 2, the next version of their marquee LLM, now with commercial support.

ChatGPT Custom Instructions
OpenAI released ChatGPT Custom Instructions, which allow users to set preferences that ChatGPT should consider when producing outputs.

MPT-7B-8k
MosaicML unveiled MPT-7B-8k, a new LLM with an 8k context window.

🛠 Real World ML

Prompt Engineering at GitHub
The GitHub engineering team discusses prompt engineering best practices.

Time Series Analysis at Pinterest
The Pinterest engineering team shares some details about their architecture and techniques for time series analysis.

📡 AI Radar