Generative Audio Models Just Had a Great Week
Three major generative audio models were released in the last seven days.
📝 Editorial: Generative Audio Models Just Had a Great Week

Audio is rapidly becoming one of the most important frontiers in generative AI, and several factors contribute to its rapid evolution. Technically, generative audio poses a fundamentally simpler problem than video or 3D, which leads to faster iteration in research and implementation. From a model standpoint, many of the techniques that pioneered text-to-image generation, such as diffusion, are quite applicable to audio. From a market perspective, there is a rich set of audio datasets that can be used to train new models, and the impact on industries such as media, robotics, or home automation can be quite immediate.

The pace of innovation in generative audio is accelerating. Just a few days ago, OpenAI shared some details about Voice Engine, a new model for synthetic voice generation, and last week brought several major releases related to generative audio.
To these innovations, add more established players such as ElevenLabs, which has been pushing the boundaries of generative audio for years. Generative audio might be inching towards its ChatGPT moment faster than we think.

🔎 ML Research

Jamba
Following its open source release (which we covered last week), AI21 published the paper detailing Jamba. The model presents a unique Transformer-Mamba MoE architecture that looks to leverage the best of both approaches.

Many-Shot Jailbreaking
Anthropic published a paper detailing many-shot jailbreaking, a technique that can bypass traditional guardrails in LLMs, including Claude. The technique exploits the large context windows of LLMs, which allow an attacker to place malicious text at different positions in the prompt.

Gecko
Google DeepMind published a paper discussing Gecko, a highly efficient text embedding model that can achieve knowledge generalization with relatively small data. Gecko uses a very simple architecture that distills knowledge from LLMs into a retriever.

MambaMixer
Researchers from Cornell University published a paper outlining MambaMixer, an architecture block that could be added to State Space Models (SSMs). MambaMixer uses a dual selection mechanism that mixes information across tokens and channels.

Mixture-of-Depths
Google DeepMind published a paper detailing a technique for allocating compute to specific positions in a sequence instead of the entire sequence. The approach optimizes compute in transformer models by capping the number of tokens that can participate in the attention layers.

ReFT
Researchers from Stanford University published a paper introducing representation fine-tuning (ReFT), a technique that looks to edit representations in LLMs via fine-tuning. ReFT banks on the idea that representations encode rich semantic information, which can lead to more effective fine-tuning.
🤖 Cool AI Tech Releases

Stable Audio 2.0
Stability AI released Stable Audio 2.0, which can generate musical tracks up to three minutes long.

Universal-1
AssemblyAI launched Universal-1, its multilingual speech-to-text model.

Command R+
Cohere introduced Command R+, the new version of its LLM optimized for RAG and tool usage.

OpenAI Custom Models
OpenAI announced enhancements to its training API as well as new mechanisms for building custom models.

Resemble Enhance
Resemble AI released Resemble Enhance, a high-quality speech super-resolution model.

🛠 Real World ML

Text-to-SQL at Pinterest
The Pinterest team shared details about their use of text-to-SQL models for analytics workflows.

ML Lifecycle Management at Salesforce
Salesforce discussed details of its ML Console for managing the lifecycle of internal ML workloads.