TheSequence - Meta Gets Into AI Video Generation
Movie Gen promises to generate high-fidelity videos with synchronized audio.

Next Week in The Sequence:
📝 Editorial: Meta Gets Into AI Video Generation

I rarely write back-to-back editorials about the same company, but Meta has left me no choice. After announcing an impressive number of AI releases last week, Meta AI has just unveiled its latest work in video and audio generation: Movie Gen. Open-source generative video has long been considered a challenging space due to the high cost of pretraining models.

At its core, Movie Gen is a new suite of generative AI models from Meta for creating and editing media, including images, video, and audio, using text prompts. It represents the culmination of Meta’s prior work in generative AI, combining and improving upon elements from projects like Make-A-Scene and the LLaMA image foundation models. Unlike previous models that targeted specific modalities, Movie Gen allows for fine-grained control across all of them, representing a significant leap forward in generative AI for media.

One of Movie Gen’s key strengths is its ability to perform varied tasks across different modalities. It can generate videos from scratch using text prompts, create personalized videos by integrating a user’s image with text descriptions, and precisely edit existing videos using text commands. Additionally, Movie Gen includes an audio generation model capable of producing realistic sound effects, background music, and ambient sounds synchronized with video content. The Movie Gen research paper is fascinating, and we’ll be discussing more details in The Sequence Edge over the next few weeks.

💎 We recommend
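The task taxonomy described above (text-to-video generation, personalized video from a reference image, text-driven editing, and synchronized audio) can be sketched as a simple request schema. To be clear, Meta has not published a public API for Movie Gen; the class, field, and task names below are purely illustrative assumptions, not Meta's actual interface.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical request schema illustrating Movie Gen's task types.
# All names here are assumptions for illustration, not Meta's API.
@dataclass
class MovieGenRequest:
    prompt: str                            # text description of the desired scene
    task: str = "text_to_video"            # or "personalized", "edit"
    reference_image: Optional[str] = None  # user photo, for personalized video
    source_video: Optional[str] = None     # existing clip, for edit tasks
    with_audio: bool = True                # synchronized sound effects / music

    def validate(self) -> bool:
        # Personalized generation needs a reference image;
        # editing needs a source video to modify.
        if self.task == "personalized" and self.reference_image is None:
            return False
        if self.task == "edit" and self.source_video is None:
            return False
        return True

req = MovieGenRequest(
    prompt="a koala surfing at sunset",
    task="personalized",
    reference_image="me.jpg",
)
print(req.validate())  # True: the personalized task has its reference image
```

The point of the sketch is just that each task type conditions generation on different inputs (text only, text plus a user image, or text plus an existing video), with audio synthesis layered on top.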
Don't miss GenAI Productionize 2.0 – the premier conference for GenAI application development, featuring AI experts from leading brands, startups, and research labs!

🔎 ML Research

Movie Gen
Meta AI published a paper introducing Movie Gen, a new set of foundation models for video and audio generation. Movie Gen can generate 1080p HD videos with synchronized audio and includes capabilities such as video editing —> Read more.

MM1.5
Apple Research published a paper unveiling MM1.5, a new family of multimodal LLMs ranging from 1B to 30B parameters. The new models build upon their MM1 predecessor, which incorporated quite a few modalities during model training —> Read more.

ComfyGen
Researchers from NVIDIA and Tel Aviv University published a paper detailing ComfyGen, a technique for adapting text-to-image workflows to each user prompt. The method combines two LLMs that, respectively, learn from user preference data and select the appropriate workflow —> Read more.

LLM Reasoning Study
Researchers from Google DeepMind and Mila published a paper studying the reasoning capabilities of different LLMs, with surprising results. The paper uses grade-school math problem-solving tasks as the core benchmark and showcases major gaps in LLMs across different model sizes —> Read more.

Cross Capabilities in LLMs
Meta AI and researchers from the University of Illinois published a paper studying the different types of abilities of LLMs across different tasks, a notion they call cross capabilities. The paper also introduces CROSSEVAL, a benchmark for evaluating the cross capabilities of LLMs —> Read more.

Embodied-RAG
Researchers from Carnegie Mellon University (CMU) published a paper introducing Embodied-RAG, a memory method for both navigation and language generation in embodied agents. Embodied-RAG handles semantic resolutions across different environments —> Read more.
LLaVA-Critic
ByteDance Research published a paper introducing LLaVA-Critic, a multimodal LLM designed to evaluate multimodal tasks. LLaVA-Critic is trained on a large instruction dataset for evaluation across different scenarios —> Read more.

🤖 AI Tech Releases

Liquid Foundation Models
Liquid AI released their first set of foundation models, based on a non-transformer architecture —> Read more.

Black Forest Labs API
Black Forest Labs, the image generation lab powering xAI's Grok image capabilities, unveiled a new API —> Read more.

Digital Twin Catalog
Meta AI released the Digital Twin Catalog, a new dataset for 3D object reconstruction —> Read more.

Data Formulator
Microsoft open sourced the next version of Data Formulator, a project for designing chart interfaces using language —> Read more.

Canvas
OpenAI announced Canvas, a new interface for interacting with ChatGPT —> Read more.

🛠 Real World AI

AI at Amazon Pharmacy
Amazon details some of the AI methods used to process prescriptions for Amazon Pharmacy customers —> Read more.

📡 AI Radar
You’re on the free list for TheSequence Scope and TheSequence Chat. For the full experience, become a paying subscriber to TheSequence Edge. Trusted by thousands of subscribers from the leading AI labs and universities.