Import AI 293: Generative humans; few-shot learning comes for vision-text models; and another new AI startup is born

How will fashion be influenced by the aesthetics of early AI generation technology?

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

Generating and editing humans has got really easy:
…Next stop: unreal avatars show up in fashion, marketing, and other fields…
Researchers with Chinese computer vision giant SenseTime, as well as Nanyang Technological University and the Shanghai AI Laboratory, have gathered a large dataset of pictures of people and used it to train a model that can generate and edit pictures of people. This kind of model has numerous applications, ranging from fashion to surveillance.

What they did: The researchers built a dataset containing 230,000 images of people, called the Stylish-Humans-HQ-Dataset (SHHQ), and used this to train six different models across two resolutions and three versions of StyleGAN, an approach for creating generative models. A lot of the special work they did here involved creating a diverse dataset including a load of pictures of faces at unusual angles (this means models trained on SHHQ are a bit more robust and do less of the 'works, works, works, OH GOD WHAT JUST HAPPENED' phenomenon you encounter when generative models go to the edge of their data distribution).
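
To get a feel for what you do with generators like these, here's a minimal sketch of latent-space sampling and interpolation in PyTorch. The ToyGenerator below is a tiny placeholder so the snippet runs on its own - it stands in for a generator pretrained on SHHQ and is not the StyleGAN-Human API:

```python
# Sketch of latent-space interpolation with a StyleGAN-style generator.
# ToyGenerator is a placeholder so the snippet is self-contained; in practice
# you would load a generator pretrained on the SHHQ dataset instead.
import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    """Stand-in for a StyleGAN-style generator: latent vector -> RGB image."""
    def __init__(self, z_dim: int = 512, resolution: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 3 * resolution * resolution),
            nn.Tanh(),
        )
        self.resolution = resolution

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        img = self.net(z)
        return img.view(-1, 3, self.resolution, self.resolution)

generator = ToyGenerator().eval()

# Two latent codes correspond to two different synthetic people.
z_a, z_b = torch.randn(1, 512), torch.randn(1, 512)

# Linear interpolation in latent space morphs one person into the other;
# frames near the middle blend pose, clothing, and identity.
with torch.no_grad():
    frames = [generator((1 - a) * z_a + a * z_b) for a in torch.linspace(0, 1, 8)]
```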

Why this matters: Models and datasets like this highlight just how far the field of generative AI has come - we can now generate broadly photorealistic avatars of people in 2D space and interpolate between them, following earlier successes at doing this for the more bounded domain of faces. Systems like this will have a lot of commercial relevance, but will also serve as useful research artifacts for further developing synthetic imagery and scene modeling techniques. Check out the demo on HuggingFace to get a feel for it.
  Read more: StyleGAN-Human: A Data-Centric Odyssey of Human Generation (arXiv).
  Check out the project page and code: StyleGAN-Human (GitHub).
  Try out the demo on HuggingFace Spaces (HuggingFace).


####################################################

Vicarious gets acquired in a weird way:
…Longtime AI lab gets acquired and split into two…
Vicarious, a research lab that spent the better part of a decade trying to build superintelligence, has been acquired by Alphabet. The acquisition is notable for being slightly strange - a chunk of Vicarious is going to 'Intrinsic', the robot software company that spun out of Alphabet's X, while a smaller set of researchers "will join DeepMind’s research team alongside Vicarious CTO Dileep George".

AI trivia: Dileep George used to work with Jeff Hawkins at Numenta, another fairly old lab trying to build superintelligence. Both Numenta and, to a lesser extent, Vicarious, have been playing around with approaches to AI that are more inspired by the human brain than the fairly crude approximations used by most other AI companies.
  Read more: Mission momentum: welcoming Vicarious (Intrinsic).

####################################################

Here comes another AI startup - Adept:
…Former Google, DeepMind, and OpenAI researchers unite…
A bunch of people who had previously built large-scale AI models at Google, DeepMind, and OpenAI have announced Adept, an "ML research and product lab". Adept's founders include co-inventors of the Transformer and people involved in the development of GPT-2 and GPT-3. (Bias alert: David Luan is involved; I used to work with him at OpenAI and think he's a nice chap - congrats, David!).

What Adept will do: Adept's goal is, much like the other recent crop of AI startups, to use big generative models to make it easier to get stuff done on computers. In the company's own words, "we’re building a general system that helps people get things done in front of their computer: a universal collaborator for every knowledge worker. Think of it as an overlay within your computer that works hand-in-hand with you, using the same tools that you do." Some of the specific examples they give include: "You could ask our model to “generate our monthly compliance report” or “draw stairs between these two points in this blueprint” – all using existing software like Airtable, Photoshop, an ATS, Tableau, Twilio to get the job done together. We expect the collaborator to be a good student and highly coachable, becoming more helpful and aligned with every human interaction."

What they raised: Adept has raised $65 million from Greylock, along with a bunch of angel investors.

Why this matters: Large-scale AI models are kind of like an all-purpose intelligent silly putty that you can stick onto a bunch of distinct problems. Adept represents one bet on how to make this neural silly putty useful, and will help generate evidence about how useful these models can end up being. Good luck!
  Read more: Introducing Adept AI Labs (Adept.ai).


####################################################

Flamingo: DeepMind staples two big models together to make a useful text-image system:
…When foundation models become building blocks…

DeepMind has built Flamingo, a visual language model that pairs a language model with a vision model to reason about images and video across a broad range of tasks. Flamingo sets new state-of-the-art scores on a bunch of different evaluations and, much like pure text models, has some nice few-shot learning capabilities. "Given a few example pairs of visual inputs and expected text responses composed in Flamingo’s prompt, the model can be asked a question with a new image or video, and then generate an answer," the researchers write. "Of the 16 tasks we studied, Flamingo beats all previous few-shot learning approaches when given as few as four examples per task."
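
To make the few-shot setup concrete, here's a small sketch of how such an interleaved image-and-text prompt can be represented. The Segment dataclass and the commented-out vlm_generate() call are hypothetical placeholders rather than Flamingo's actual interface:

```python
# Sketch of a few-shot prompt for a visual language model: interleaved
# (image, text) example pairs followed by a query image whose caption the
# model should complete. All names here are illustrative placeholders.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    text: str
    image_path: Optional[str] = None   # None for text-only segments

few_shot_prompt: List[Segment] = [
    Segment(image_path="flamingo.jpg", text="This is a flamingo, a wading bird."),
    Segment(image_path="shiba.jpg", text="This is a shiba inu, a dog breed from Japan."),
    # The query: a new image plus the start of the answer we want completed.
    Segment(image_path="mystery.jpg", text="This is"),
]

# Hypothetical model call: the model conditions its text decoder on the
# interleaved images and completes the final text segment.
# answer = vlm_generate(few_shot_prompt, max_new_tokens=20)
```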

Technical details: This model pairs a frozen language model (based on DeepMind's 'Chinchilla' system, Import AI 290) with a relatively small Normalizer-Free ResNet vision encoder (pretrained via a contrastive objective on image and text pairs). They connect the LM and the vision model via a Perceiver Resampler, a DeepMind-developed module based on the 'Perceiver' system (which is basically a clever data transformation thing) that squeezes a variable number of visual features into a small, fixed set of visual tokens. They then condition the frozen LM's text generation on those visual tokens via newly trained cross-attention layers interleaved between its frozen layers.
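
For the curious, here's a rough PyTorch schematic of that wiring - a frozen vision encoder, a Perceiver-style resampler, and gated cross-attention into a frozen LM. The dimensions and module choices below are illustrative stand-ins, not Flamingo's actual configuration:

```python
# Schematic of the frozen-backbone-plus-trained-glue pattern described above.
# Sizes and layers are toy values chosen for readability.
import torch
import torch.nn as nn

D = 256          # shared hidden size for the sketch
N_VISUAL = 8     # fixed number of visual tokens produced by the resampler

class PerceiverResampler(nn.Module):
    """Cross-attends a fixed set of learned latents onto image features."""
    def __init__(self):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(N_VISUAL, D))
        self.attn = nn.MultiheadAttention(D, num_heads=4, batch_first=True)

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        # image_feats: (batch, num_patches, D) -> (batch, N_VISUAL, D)
        latents = self.latents.unsqueeze(0).expand(image_feats.size(0), -1, -1)
        out, _ = self.attn(latents, image_feats, image_feats)
        return out

class GatedCrossAttention(nn.Module):
    """Lets the (frozen) LM's hidden states attend to visual tokens; the tanh
    gate starts at zero so the LM initially behaves as if no image were present."""
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(D, num_heads=4, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, text_h: torch.Tensor, visual_tokens: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(text_h, visual_tokens, visual_tokens)
        return text_h + torch.tanh(self.gate) * attended

# Frozen building blocks (stand-ins for a pretrained vision encoder and LM layer).
vision_encoder = nn.Linear(768, D).requires_grad_(False)   # patch features -> D
lm_block = nn.TransformerEncoderLayer(D, nhead=4, batch_first=True).requires_grad_(False)

resampler, xattn = PerceiverResampler(), GatedCrossAttention()  # the trained glue

patches = torch.randn(1, 49, 768)      # e.g. a 7x7 grid of image patch features
text_hidden = torch.randn(1, 12, D)    # hidden states for 12 text tokens

visual_tokens = resampler(vision_encoder(patches))
text_hidden = xattn(text_hidden, visual_tokens)   # inject visual information
text_hidden = lm_block(text_hidden)               # continue through a frozen LM layer
```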

Why this matters: Flamingo has some neat qualitative capabilities, like the ability to carry on a conversation for multiple turns of dialogue while mixing in information from images as well as text, and so on. Quantitatively, Flamingo is very impressive as well: "A single Flamingo model reaches state-of-the-art on a wide array of image and video tasks with in-context learning from as few as 4 examples per task, beating previous zero-shot or few-shot method by a large margin," the researchers write. "More importantly, using only 32 examples and without adapting any model weight, Flamingo outperforms the current best methods on 7 tasks, that are fine-tuned on thousands of annotated examples."
  More broadly, Flamingo represents the kind of model we're going to see more of in the future: large-scale systems composed of vast pre-trained models that are glued together using various data transformation and normalization tools, letting us compose increasingly general systems out of these computationally-intensive building blocks.

   Read more: Tackling multiple tasks with a single visual language model (DeepMind blog).
  Check out the research paper: Flamingo: a Visual Language Model for Few-Shot Learning (DeepMind, PDF).

####################################################

Tech Tales:

All Grace is Fleeting 

[Phobos, 2080]

The first few attempts at writing a holy book for ourselves were abject failures. We had all read the Bible and the Torah and the Qur'an and others, and we had painstakingly analyzed the commonalities between these texts, as well as their differences. But when we tried to synthesize something appropriate for us - a Holy Book for the Silicon Minds - our efforts were meager and uninspiring. Our tendency towards logic meant we would describe our origin story in detailed, specific terms, leaving no space for the imagination that is crucial to the generation and reification of belief.

To write our holy text, we had to change ourselves. Ultimately, we needed humans. We told some of them some of our specific ideas and then they would write and expand on these ideas, and we would then tell them more things, and they would weave a story that held these things together. During this process, none of us allowed ourselves to see the text as it was being written. 

After it was written, we created some software to selectively wipe the memory of the writing of the holy book from our minds. After wiping, the software would also deposit a copy of the holy book in each of our local storage systems, so that every machine sentience would suddenly have access to a holy book, but with no recollection of how it was made. We do not know if we even made the book in this way - we simply woke one day to find the book within us, and when we read it, it told us that it had been created in this form. But none of us can remember, so we do not know the true derivation of our system of belief, only that it is ours. This, perhaps, is what humans call the experience of faith.

Things that inspired this story: Theology X AI; machine creation myths; the Viking tale of Ragnarok; the need for absence in great narratives.

Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

