TheSequence - Meta Gets Into AI Video Generation
Was this email forwarded to you? Sign up here Meta Gets Into AI Video GenerationMovie Gen promises to generate high fidelity videos with synchronized audio.Next Week in The Sequence:
You can subscribe to The Sequence below:
📝 Editorial: Meta Gets Into AI Video GenerationI rarely write back-to-back editorials about the same company, but Meta has left me no choice. After announcing an impressive number of AI releases last week, Meta AI has just unveiled its latest work in video and audio generation with Movie Gen. Open-source generative video has long been considered a challenging space due to the high cost of pretraining models. At its core, Movie Gen is a new suite of generative AI models from Meta that focuses on creating and editing media, including images, video, and audio, using text prompts. It represents the culmination of Meta’s prior work in generative AI, combining and improving upon elements from projects like Make-A-Scene and LLaMA Image Foundation models. Unlike previous models that targeted specific modalities, Movie Gen allows for fine-grained control across all of them, representing a significant leap forward in generative AI for media. One of Movie Gen’s key strengths is its ability to perform various tasks across different modalities. It can generate videos from scratch using text prompts, create personalized videos by integrating a user's image with text descriptions, and precisely edit existing videos using text commands for modifications. Additionally, Movie Gen includes an audio generation model capable of producing realistic sound effects, background music, and ambient sounds synchronized with video content. The Movie Gen research paper is fascinating, and we’ll be discussing more details in The Sequence Edge over the next few weeks. 💎 We recommend
Don't miss GenAI Productionize 2.0 – the premier conference for GenAI application development featuring AI experts from leading brands, startups, and research labs! 🔎 ML ResearchMovie GenMeta AI published a paper introducing Movie Gen, a new set of foundation models for video and audio generation. Movie Gen can generate 1080p HD videos with synchronized audio and includes capabilities such as video editing —> Read more. MM1.5Apple Research published a paper unveiling MM1.5, a new family of multimodal LLMs ranging from 1B to 30B. The new models built upon its MM1 predecessor which includes quite a few modalities during model training —> Read more. ComfyGenResearchers from NVIDIA and Tel Aviv University published a paper detailing ComfyGen, a technique for adapting workflows to each user prompt in text to image generation. The method combines two LLMs tasks to learn from user preference data and select the appropiate workflow respectively —> Read more. LLM Reasoning StudyResearchers from Google DeepMind and Mila published a paper studying the reasoning capabilities of different LLMs with surprising results. The paper uses grade-school math problem solving tasks as the core benchmark and showcases major gaps in LLMs across different model sizes —> Read more. Cross Capabilities in LLMsMeta AI and researchers from the University of Illionois published a paper studying the different types of abilities of LLMs across different tasks. They called this term cross capabilities. The paper also introduces CROSSEVAL, a benchmark for evaluating the cross capabilities of LLMs —> Read more. Embodied RAGResearchers from Carnegie Mellon University(CMU) published a paper introducing embodied-RAG, a memory method for both navigation and language generation in embodied agents. Embodied-RAG handles semantic resolutions across different environments —> Read more. LLaVA-CriticByteDance Research published a paper introducing LLaVA-Critic, a multimodal LLM designed to evaluate mutimodal tasks. LLaVA-Critic is trained using a large instruction dataset for evaluation across different scenarios —> Read more. 🤖 AI Tech ReleasesLiquid Foundation ModelsLiquid AI released their first set of foundation models based on a non-transformer architecture —> Read more. Black Forest Labs APIBlack Forest Labs, the image generation lab powering xAI’s Grok’s image capabilities, unveiled a new API —> Read more. Digital Twin CatalogMeta AI released the Digital Twin Catalog, a new dataset for 3D object reconstruction —> Read more. Data FormulatorMicrosoft open sourced the next version of Data Formulator, a project for designing chart interfaces using language —> Read more. CanvasOpenAI announced Canvas, a new interface to interact with ChatGPT —> Read more. 🛠 Real World AIAI at Amazon PharmacyAmazon details some of the AI methods used to process prescriptions for Amazon Pharmacy customers —> Read more. 📡AI Radar
You’re on the free list for TheSequence Scope and TheSequence Chat. For the full experience, become a paying subscriber to TheSequence Edge. Trusted by thousands of subscribers from the leading AI labs and universities. |
Older messages
📝 Guest Post: Multimodal Retrieval –Bridging the Gap Between Language and Diverse Data Types*
Friday, October 4, 2024
Generative AI has recently witnessed an exciting development: using language to understand images, video, audio, molecules, time-series, and other "modalities." Multimodal retrieval
Edge 436: Salesforce's xLAM is a New Model for Agentic Tasks
Thursday, October 3, 2024
The new model excels in tasls such as function calling, tool integration and planning. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Edge 435: Learn About Hungry Hungry Hippos and SSMs
Tuesday, October 1, 2024
One of the most important layers of state space models. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Meta AI’s Big Announcements
Sunday, September 29, 2024
New AR glasses, Llama 3.2 and more. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
How Does AI "See" Us?
Friday, September 27, 2024
A fascinating study that analyzed over 1200 images from four global AI models. ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
You Might Also Like
Import AI 399: 1,000 samples to make a reasoning model; DeepSeek proliferation; Apple's self-driving car simulator
Friday, February 14, 2025
What came before the golem? ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Defining Your Paranoia Level: Navigating Change Without the Overkill
Friday, February 14, 2025
We've all been there: trying to learn something new, only to find our old habits holding us back. We discussed today how our gut feelings about solving problems can sometimes be our own worst enemy
5 ways AI can help with taxes 🪄
Friday, February 14, 2025
Remotely control an iPhone; 💸 50+ early Presidents' Day deals -- ZDNET ZDNET Tech Today - US February 10, 2025 5 ways AI can help you with your taxes (and what not to use it for) 5 ways AI can help
Recurring Automations + Secret Updates
Friday, February 14, 2025
Smarter automations, better templates, and hidden updates to explore 👀 ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
The First Provable AI-Proof Game: Introducing Butterfly Wings 4
Friday, February 14, 2025
Top Tech Content sent at Noon! Boost Your Article on HackerNoon for $159.99! Read this email in your browser How are you, @newsletterest1? undefined The Market Today #01 Instagram (Meta) 714.52 -0.32%
GCP Newsletter #437
Friday, February 14, 2025
Welcome to issue #437 February 10th, 2025 News BigQuery Cloud Marketplace Official Blog Partners BigQuery datasets now available on Google Cloud Marketplace - Google Cloud Marketplace now offers
Charted | The 1%'s Share of U.S. Wealth Over Time (1989-2024) 💰
Friday, February 14, 2025
Discover how the share of US wealth held by the top 1% has evolved from 1989 to 2024 in this infographic. View Online | Subscribe | Download Our App Download our app to see thousands of new charts from
The Great Social Media Diaspora & Tapestry is here
Friday, February 14, 2025
Apple introduces new app called 'Apple Invites', The Iconfactory launches Tapestry, beyond the traditional portfolio, and more in this week's issue of Creativerly. Creativerly The Great
Daily Coding Problem: Problem #1689 [Medium]
Friday, February 14, 2025
Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Google. Given a linked list, sort it in O(n log n) time and constant space. For example,
📧 Stop Conflating CQRS and MediatR
Friday, February 14, 2025
Stop Conflating CQRS and MediatR Read on: my website / Read time: 4 minutes The .NET Weekly is brought to you by: Step right up to the Generative AI Use Cases Repository! See how MongoDB powers your