TheSequence - 🗣🗣🗣 No Language Left Behind
📝 Editorial

Natural language understanding (NLU) is the area of deep learning that has seen the most impressive breakthroughs in recent years. However, most of the large-scale NLU models that impressed us are typically optimized for a small set of high-resource languages. NLU models that exhibit remarkable performance in areas such as question answering, text completion and machine translation in languages like English, Spanish or French struggle when applied to hundreds of dialects that lack large training datasets. The result is growing inequality in which segments of the world population can benefit from high-quality NLU solutions. This disparity is even more apparent for languages spoken outside Europe and North America.

Extending NLU research to low-resource languages is a known challenge in the space. One of the most impressive achievements of recent years came last week from Meta AI with the release of the No Language Left Behind (NLLB)-200 model. This single neural network can translate text across 200 different languages with state-of-the-art results. To train NLLB-200, Meta AI used a two-step curriculum approach in which knowledge acquired during high-resource language training epochs was applied to low-resource languages. The result was a massive 54 billion parameter model that had to be trained on Meta’s new Research SuperCluster (RSC) supercomputer.

Together with NLLB-200, Meta AI open-sourced the FLORES-200 dataset for evaluating machine translation models. Meta AI also provides $200,000 in grants to non-profit organizations building applications that use NLLB-200. Altogether, NLLB-200 represents one of the most impressive milestones ever achieved in machine translation for low-resource languages.

🔺🔻 TheSequence Scope, our Sunday edition with an overview of the industry’s developments, is free.
To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:
Edge#207: we summarize our graph neural networks (GNNs) series.
Edge#208: we explore Google Brain’s Minerva, which can solve complex mathematical and scientific problems using step-by-step reasoning.

Now, let’s review the most important developments in the AI industry this week

🔎 ML Research

Translating Across 200 Languages
Meta AI published a paper detailing a new model that can perform high-quality translations across 200 languages →read more on Meta AI blog

Director, a Hierarchical RL Agent
Google Research published a paper detailing Director, a hierarchical reinforcement learning agent that can learn hierarchical behaviors from raw pixels →read more on Google Research blog

Joint Image-Text Representations
Amazon Research published a paper presenting a model for aligning features in image and text datasets →read more on Amazon Research blog

Disfluency Speech Detection
Google Research published a paper detailing a BERT-like model that can detect disfluency in natural speech →read more on Google Research blog

☝️ We Recommend: Try the Real-Time Database for Continuously Changing Data
You can now enroll in Molecula’s 7-day Cloud trial (without installation or infrastructure management) or install FeatureBase in your own environment to meet your needs (no credit card required) →See which trial experience is right for you

🤖 Cool AI Tech Releases

PyTorch 1.12
A new release of PyTorch is available with capabilities such as TorchArrow for batch data preprocessing, a functional API for modules, and many others →read more on PyTorch blog

🛠 Real World ML

Anomaly Detection at Walmart
Walmart details the ML architecture used for anomaly detection in its e-commerce infrastructure →read more on the Walmart Tech Labs blog

Uber Spark Architecture
Uber discusses some of the updates for data shuffling in its Spark architecture →read more on Uber Engineering blog

💸 Money in AI
Acquisitions