🗣🗣🗣Another Amazing Week for Large Language Models

Weekly news digest curated by the industry insiders

Aug 7

📝 Editorial

Natural language understanding (NLU) has been, by far, the fastest growing area of deep learning. Regularly, we read about massive NLU models reaching new milestones across different language tasks. This week, we had a fresh taste of the progress with models published by Meta AI and Alexa AI.

In Edge#3, we covered Meta’s release of BlenderBot, a chatbot that could converse about almost any topic. The magic of BlenderBot is its ability to rapidly mine the internet and incorporate domain knowledge into conversations, making the interactions more natural. BlenderBot is also able to collect feedback and upgrade itself. This week, Meta AI open-sourced BlenderBot 3, a new 175 billion parameter version that achieves over 30% improvement compared to its predecessors across different conversational tasks. Meta AI also released a live demo of BlenderBot 3, allowing users to interact with the chatbot and contribute to its training.

Amazon’s Alexa AI team is another AI lab that has been pushing the boundaries of NLU models. That is not surprising considering that Alexa-powered devices are one of the world’s most active AI conversational environments. This week, Alexa AI unveiled AlexaTM, a 20 billion parameter model that uses few-shot learning to master tasks in new languages with just a few training examples. AlexaTM topped GPT-3 in many tasks in low-resource languages.

The pace of progress in NLU research is astonishing and never boring. The models released this week by Meta AI and Alexa AI stretch our imagination of where the next frontiers for NLU models lie.

🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:

Edge#215: we discuss Pre-Train Model Testing; overview the pillars of robust machine learning; explore Great Expectations.

Edge#216: we overview Gato, DeepMind’s new Super Model that can generalize across multiple tasks on different domains.

Now, let’s review the most important developments in the AI industry this week:

🔎 ML Research

Multi-Domain Neural Architecture Search

Google Research published a paper about a multi-path neural architecture search technique that creates a unified architecture across multiple domains →read more on the Google Research blog

AlexaTM

Amazon Research unveiled AlexaTM, a 20 billion parameter model that achieves state-of-the-art performance in several few-shot learning language benchmarks →read more on the Amazon Research blog

ViTDet

Meta AI published a paper detailing ViTDet, a hierarchical vision transformer optimized for detecting uncommon object classes →read more on the Meta AI blog

Enhancing Backpropagation

Google Research published a paper introducing a new technique to train neural networks improving upon the iconic backpropagation algorithm →read more on the Google Research blog

🤖 Cool AI Tech Releases

BlenderBot 3

Meta AI released BlenderBot 3, a 175 billion parameter chatbot that can converse about almost any topic →read more on the Meta AI blog

auton-survival

Carnegie Mellon University open-sourced auton-survival, a framework for counterfactual estimation, regression, and evaluation of time-to-event data →read more on the Carnegie Mellon University blog

🛠 Real World ML

Pricing at Lyft

Lyft unveiled some details about the data and ML infrastructure used for pricing in its transportation marketplace →read more on the Lyft Engineering blog

💸 Money in AI

AI-powered

Service experience solution Aisera raised $90 million in a Series D funding round led by Goldman Sachs Asset Management and Thoma Bravo. Hiring in India, the US, and Greece.
Decision-making platform Arena raised a $32 million Series A round led by Initialized and Goldcrest Capital. Hiring in New York, US.
Intellectual property (IP) protection platform MarqVision raised $20 million in a Series A funding round led by DST Global Partners and Atinum Investment. Hiring in Seoul/South Korea.
Product search engine Vetted raised a $14 million Series A investment round led by Insight Partners. Hiring remote.

Acquisition

NLP company Re:infer was acquired by RPA company UiPath. The terms of the deal were not disclosed. Hiring in London/UK (remote).
API observability company Seekret was acquired by monitoring platform Datadog. The terms of the deal were not disclosed. Hiring in Tel Aviv/Israel.


