TheSequence - Don't Overlook China's Open Source LLMs
📝 Editorial: Don't Overlook China's Open Source LLMs

If you visit the open LLM leaderboard today, you might encounter an unfamiliar model at the top of the charts: Smaug-72B. Open-sourced by Abacus AI, this model is a fine-tuned version of another model, Qwen-72B, which Alibaba released a few months ago. The Qwen family of open-source LLMs has scored very high on several of the top open-source benchmarks and is one of the latest examples of Chinese innovation in the open-source generative AI space. While open-source LLMs are typically associated with Western models like LLaMA or Mistral, the pace of high-quality releases from China is nothing short of remarkable. Here are a few examples:
Smaug was technically developed by an American company, but as a fine-tuned version of a Chinese model. From what I can tell, most open-source Chinese LLMs share strong architectural commonalities with models like Llama or Mistral; there hasn't been any major innovation from an architectural standpoint. Nonetheless, the quality is undeniable. While many skeptics of open-source generative AI regularly cite China as a major concern, they fail to recognize the contributions that Chinese research labs and startups will make to the space. It will be interesting to see how regulation shapes the evolution of open-source LLMs in China and Western countries. For now, don't overlook the Chinese open-source LLMs. They are very impressive.

🎥 Watch Now: Building Plaid’s ML Fraud Detection Application

Want to learn about Plaid’s ML platform journey? In this on-demand recording, Plaid Software Engineer Renault Young shares the technical challenges the team faced, how they set up the data foundations they needed to start building an ML platform, what they used to look for patterns in transaction data in real time, and more. Today, Signal is Plaid’s biggest ML application and analyzes 1000+ risk factors per ACH transaction. The on-demand recording is now available for you to watch and share with your colleagues!

🔎 ML Research

Specialized SLMs
Apple Research published a paper evaluating small language model architectures based on inference, specialization, and training budgets. The paper evaluates different architectures, such as hyper-networks or mixture of experts, to achieve different levels of specialization under budget constraints → Read more.

Chain-of-Abstraction
Meta AI Research published a paper detailing Chain of Abstraction (CoA), a method that combines reasoning and tool learning in LLMs. CoA creates abstract placeholders in reasoning chains and then fills them with specific knowledge using tools → Read more.
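To make the placeholder-then-fill idea concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's implementation: the `[left op right = name]` placeholder syntax, the `fill_chain` function, and the toy calculator "tool" are all assumptions made for this example, and the abstract chain is hard-coded rather than generated by an LLM.

```python
import operator
import re

# Hypothetical "tool": a tiny calculator covering four binary operators.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

# Matches placeholders of the assumed form "[left op right = name]".
PLACEHOLDER = re.compile(r"\[(\w+(?:\.\d+)?) ([-+*/]) (\w+(?:\.\d+)?) = (\w+)\]")


def fill_chain(abstract_chain: str) -> str:
    """Replace each abstract placeholder with a value computed by the tool."""
    values: dict[str, float] = {}

    def resolve(token: str) -> float:
        # A token is either an earlier placeholder name or a numeric literal.
        return values[token] if token in values else float(token)

    def substitute(match: re.Match) -> str:
        left, op, right, name = match.groups()
        values[name] = OPS[op](resolve(left), resolve(right))
        return f"{values[name]:g}"

    # re.sub visits matches left to right, so later placeholders can
    # reference names defined by earlier ones.
    return PLACEHOLDER.sub(substitute, abstract_chain)


# In CoA the abstract chain would come from the model; here it is hard-coded.
chain = "Total sold is [20 + 35 = y1]. Remaining stock is [90 - y1 = y2]."
print(fill_chain(chain))
```

The point of the split is that the reasoning structure (which quantities depend on which) is separated from the domain-specific computation, which is delegated to a tool.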
Mastering Chess Without Search
Researchers from Google DeepMind published a paper proposing a 270 million parameter transformer model that was able to play chess at a grandmaster level. The model challenges traditional approaches to chess that relied on explicit search and complex heuristics → Read more.

Self-Discover
Google DeepMind published a paper introducing Self-Discover, a framework to tackle complex reasoning problems with LLMs. The framework includes reasoning modules such as critical and step-by-step thinking, as well as the building blocks to compose those modules into sophisticated reasoning chains → Read more.

AI Controller Interface
Microsoft Research released a prototype of AI Controller Interface (AICI), a framework to implement controllers that constrain the outputs of LLMs. AICI’s architecture allows custom logic blocks to run during the token decoding process while still maintaining the state of the LLM → Read more.

🤖 Cool AI Tech Releases

Smaug-72B
Abacus AI released Smaug-72B, which sits at the top of the open LLM leaderboard → Read more.

Gemini Advanced
Google rebranded Bard as Gemini and introduced Gemini Advanced with native integration for Google Docs and Gmail → Read more.

TensorFlow GNN
Google released TensorFlow GNN, a new framework for graph neural networks in TensorFlow → Read more.

Imagen 2
Google released Imagen 2, its powerful text-to-image model, across several of its AI products → Read more.

SVD 1.1
Stability AI announced the release of SVD 1.1, a new version of its video generation model optimized for consistency → Read more.

📡 AI Radar