🍱 The Text-to-Image Synthesis Revolution

Weekly news digest curated by the industry insiders

Aug 21

📝 Editorial

Next week, we will start a new series about text-to-image synthesis models. Over the last year, this deep learning discipline has seen an astonishing level of progress. You have probably heard of OpenAI's DALL-E 2, but plenty of other impressive text-to-image generation models have been created in the last few months. Google has come up with models like Imagen and Parti; Meta has done amazing work with Make-A-Scene; OpenAI created GLIDE and, of course, DALL-E 2. All these models push the boundaries of text-to-image synthesis in ways that challenge human imagination. However, the innovation is coming not only from the big AI labs but also from startups in the space. MidJourney is a text-to-image synthesis model created by a relatively small startup, and its artistic quality is quite often superior to that of models created by big AI incumbents. Just this week, AI startup Stability AI released a new model known as Stable Diffusion, which shows impressive performance.

The text-to-image synthesis revolution has been catalyzed by the progress in language models over the last few years. The fascinating thing about text-to-image synthesis is that it immediately appeals to graphic artists and mainstream audiences. Art is the most important materialization of human creativity and imagination and, for years, has been considered one of the boundaries between machine and human intelligence. Now text-to-image synthesis models are crossing that boundary, offering visual evidence that sparks the debate over whether AI can show creativity and imagination. Regardless, it is pretty clear that, these days, text-to-image synthesis has surpassed natural language understanding as the field that dominates the headlines in AI. The next few months will likely bring fascinating developments to this nascent field.

🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:

Edge#219: we start the new series about text-to-image models; discuss CLIP, a neural network that learns image representations from natural language supervision; explore Hugging Face’s CLIP implementation.

Edge#220: we deep dive into Meta AI’s Make-A-Scene, which pushes the boundaries of AI art synthesis.

Now, let’s review the most important developments in the AI industry this week

🔎 ML Research

AI Agent Agency

DeepMind published a fascinating paper that describes a causal modeling method for better understanding the incentives of AI agents and explains how to tailor training based on that knowledge →read more

Distributed GNN Training

Amazon Research published a paper proposing a distributed training approach for graph neural networks →read more

Language for Robots

Google Research published a paper proposing a model that leverages advanced language models to allow robots to follow instructions in the physical world →read more

Hyperparameter Tuning and Transformers

Google Research published a paper detailing OptFormer, one of the first Transformer-based frameworks for hyperparameter tuning →read more

✏️ Data Labeling Survey

How do you work with data properly when preparing it? What are the best labeling methods and tools for ML solutions today? We keep learning from the experience of the engineers and entrepreneurs behind the leading data labeling solutions: Toloka, Superb AI, Label Studio, and more.

Please take a simple survey to help us prepare an article about data labeling. It will take about 2-3 minutes.

TAKE THE SURVEY

🤖 Cool AI Tech Releases

Stable Diffusion

AI startup Stability AI launched Stable Diffusion, a text-to-image synthesis model based on latent diffusion techniques →read more

Cloudera Data Lakehouse

Cloudera announced the release of CDP One, a data-lakehouse-as-a-service solution with integrated storage, compute, and ML capabilities →read more

New TorchVision APIs

PyTorch added new APIs to its TorchVision framework for listing and initializing models and weights →read more

🛠 Real World ML

NY Times Paywall

The NY Times unveiled some details of the ML it uses to make its paywall smarter →read more

💸 Money in AI

Data processor provider Pliops raised a $100 million Series D funding round led by Koch Disruptive Technologies (KDT). Hiring in Israel.
Conversational AI startup Modulate raised $30 million in a Series A funding round led by Lakestar. Hiring in Cambridge, MA/US.
AGI startup Keen Technologies raised a $20 million round, led by Nat Friedman and Daniel Gross.
AIOps company BigPanda raised $20 million in an extension of its Series E round, with contributions from UBS Next and Wells Fargo Strategic Capital. Hiring in Israel, the US, and Europe.
Cloud infrastructure optimization company Sync Computing raised $15.5 million in Series A funding led by Costanoa Ventures. Hiring remote.
Customer-facing analytics service Explo raised $12 million in Series A led by Craft Ventures. Hiring in San Francisco and New York/US.
