TheSequence - 📹 🤖 Transformers for Video
📝 Editorial

Transformers are widely acknowledged as the most important development in deep learning architectures of the last decade. Their impact on natural language understanding (NLU) tasks has stretched the imagination of even the most hard-core believers in neural networks. In recent years, we have seen steady contributions of transformers to domains such as computer vision, though mostly in image tasks such as classification. Now transformer architectures are expanding into a new frontier: video intelligence.

The idea of using transformers for video intelligence tasks makes a lot of sense. Typically, video intelligence techniques require large amounts of labeled data to understand the actions taking place in a video. Transformers excel at learning from unlabeled datasets, and there are a lot of videos available on the internet to learn from. Just like in NLU tasks, transformer models could be pretrained on large sets of unlabeled videos and fine-tuned for specific tasks.

Last week, OpenAI unveiled its work on video pretraining (VPT) models. This type of model adapts the principles of transformers to video intelligence tasks. To push the boundaries, OpenAI pretrained VPT on Minecraft videos, and the model was able to master tasks that previously required large training pipelines built on techniques such as reinforcement learning, which has produced some of the best results in video intelligence tasks in recent years.

With GPT-3, OpenAI established something of a gold standard for transformers in NLU tasks. They followed up with DALL-E and DALL-E 2, which apply transformers to combined image and language tasks. VPT seems to be their first major step in extending this work into the area of video intelligence. Maybe VPT is the foundation for OpenAI’s new supermodel.

🔺🔻 TheSequence Scope – our Sunday edition with the industry’s development overview – is free.
To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:
Edge#203: we explain what Graph Recurrent Neural Networks are, discuss GNNs on Dynamic Graphs, and explore DeepMind’s Jraph, a GNN library for JAX.
Edge#204: we deep dive into Imagen, Google’s impressive text-to-image alternative to OpenAI’s DALL-E 2.

Now, let’s review the most important developments in the AI industry this week.

🔎 ML Research

Mastering Minecraft with Video Pretraining
OpenAI published a paper detailing video pretraining (VPT), a semi-supervised imitation learning method that was able to learn to play Minecraft from unlabeled datasets →read more on OpenAI blog

QML Improvements
AI labs from Google, Microsoft, Caltech, Harvard and others collaborated on quantum ML (QML) techniques that show tangible improvements over their classical counterparts →read more on Google Research blog

Swin Transformer Improvements
Microsoft Research published details about improvements to Swin Transformer, its 3 billion parameter computer vision model →read more on Microsoft Research blog

GODEL
Microsoft Research published a paper detailing GODEL, a pretrained language model that also leverages external datasets, allowing it to focus on specific tasks or engage in open-ended conversation →read more on Microsoft Research blog

📌 Event: June 29th – Arize:Observe Unstructured
Only three days left to register for Arize:Observe Unstructured. This free, virtual event on Wednesday features an all-star lineup of speakers, including speakers from OpenAI and Hugging Face, the creator of UMAP, and more! Register now.
🤖 Cool AI Tech Releases

GitHub Copilot GA
GitHub’s AI-based pair programming agent reached general availability →read more on GitHub blog

TorchGeo
PyTorch open-sourced TorchGeo, a library for processing geospatial data in ML models →read more on PyTorch blog

🛠 Real World ML

PyTorch at Disney
Disney Media & Entertainment Distribution (DMED) detailed the PyTorch architecture used for activity recognition across video, audio, and text datasets →read more on PyTorch blog

💸 Money in AI
Acquisitions