Import AI 293: Generative humans; few-shot learning comes for vision-text models; and another new AI startup is born

How will fashion be influenced by the aesthetics of early AI generation technology?
View this email in your browser

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

Generating and editing humans has got really easy:
…Next stop: unreal avatars show up in fashion, marketing, and other fields…
Researchers with Chinese computer vision giant SenseTime, as well as Nanyang Technological University and the Shanghai AI Laboratory, have gathered a large dataset of pictures of people and used it to train a model that can generate and edit pictures of people. This kind of model has numerous applications, ranging from fashion to surveillance.

What they did: The researchers built a dataset containing 230,000 images of people, called the Stylish-Humans-HQ-Dataset (SHHQ), and used this to train six different models across two resolutions and three versions of StyleGAN, an approach for creating generative models. A lot of the special work they did here involved creating a diverse dataset including a load of pictures of faces at unusual angles (this means models trained on SHHQ are a bit more robust and do less of the 'works, works, works, OH GOD WHAT JUST HAPPENED' phenomenon you encounter when generative models go to the edge of their data distribution).
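That edge-of-distribution failure mode is also why samplers for StyleGAN-family models usually expose a 'truncation' knob: sampled latent codes get pulled toward the average latent, trading diversity for samples that stay in the well-covered center of the distribution. A toy sketch of the idea (illustrative only, not the authors' code; `truncate_latent` and the 4-d latents are invented for the example):

```python
import random

def truncate_latent(z, w_mean, psi=0.7):
    """Interpolate a sampled latent toward the mean latent by factor psi:
    psi=1.0 keeps the raw sample, psi=0.0 collapses it to the mean."""
    return [m + psi * (zi - m) for zi, m in zip(z, w_mean)]

random.seed(0)
w_mean = [0.0, 0.0, 0.0, 0.0]                   # average latent (toy: the origin)
z = [random.gauss(0.0, 1.0) for _ in range(4)]  # a raw sampled latent
z_safe = truncate_latent(z, w_mean, psi=0.5)    # pulled halfway to the mean
```

Lower psi gives blander but more reliable samples; psi near 1.0 keeps the diversity (and the occasional horror).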

Why this matters: Models and datasets like this highlight just how far the field of generative AI has come - we can now generate broadly photorealistic avatars of people in 2D space and interpolate between them, following earlier successes at doing this for the more bounded domain of faces. Systems like this will have a lot of commercial relevance, but will also serve as useful research artifacts for further developing synthetic imagery and scene modeling techniques. Check out the demo on HuggingFace to get a feel for it.
  Read more: StyleGAN-Human: A Data-Centric Odyssey of Human Generation (arXiv).
  Check out the GitHub project page: StyleGAN-Human.
  Try out the demo on HuggingFace Spaces (HuggingFace).


####################################################

Vicarious gets acquired in a weird way:
…Longtime AI lab gets acquired and split into two…
Vicarious, a research lab that spent the better part of a decade trying to build superintelligence, has been acquired by Alphabet. The acquisition is notable for being slightly strange - a chunk of Vicarious is going to Alphabet's robot software company 'Intrinsic' (a Google X spinout), while a smaller set of researchers "will join DeepMind’s research team alongside Vicarious CTO Dileep George".

AI trivia: Dileep George used to work with Jeff Hawkins at Numenta, another fairly old lab trying to build superintelligence. Both Numenta and, to a lesser extent, Vicarious, have been playing around with approaches to AI that are more inspired by the human brain than the fairly crude approximations used by most other AI companies.
  Read more: Mission momentum: welcoming Vicarious (Intrinsic).

####################################################

Here comes another AI startup - Adept:
…Former Google, DeepMind, and OpenAI researchers unite…
A bunch of people who had previously built large-scale AI models at Google, DeepMind, and OpenAI have announced Adept, an "ML research and product lab". Adept's founders include inventors of the Transformer and people involved in the development of GPT-2 and GPT-3. (Bias alert: David Luan is involved; I used to work with him at OpenAI and think he's a nice chap - congrats, David!).

What Adept will do: Adept's goal is, much like the other recent crop of AI startups, to use big generative models to make it easier to get stuff done on computers. In the company's own words, "we’re building a general system that helps people get things done in front of their computer: a universal collaborator for every knowledge worker. Think of it as an overlay within your computer that works hand-in-hand with you, using the same tools that you do." Some of the specific examples they give include: "You could ask our model to “generate our monthly compliance report” or “draw stairs between these two points in this blueprint” – all using existing software like Airtable, Photoshop, an ATS, Tableau, Twilio to get the job done together. We expect the collaborator to be a good student and highly coachable, becoming more helpful and aligned with every human interaction."

What they raised: Adept has raised $65 million from Greylock, along with a bunch of angel investors.

Why this matters: Large-scale AI models are kind of like an all-purpose intelligent silly putty that you can stick onto a bunch of distinct problems. Adept represents one bet on how to make this neural silly putty useful, and will help generate evidence about how useful these models can end up being. Good luck!
  Read more: Introducing Adept AI Labs (Adept.ai).


####################################################

Flamingo: DeepMind staples two big models together to make a useful text-image system:
…When foundation models become building blocks…

DeepMind has built Flamingo, a visual language model that pairs a language model with a vision model to perform feats of reasoning across a broad range of tasks. Flamingo sets new state-of-the-art scores in a bunch of different evaluations and, much like pure text models, has some nice few-shot learning capabilities. "Given a few example pairs of visual inputs and expected text responses composed in Flamingo’s prompt, the model can be asked a question with a new image or video, and then generate an answer," the researchers write. "Of the 16 tasks we studied, Flamingo beats all previous few-shot learning approaches when given as few as four examples per task."
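To make that few-shot setup concrete, here is a toy sketch (not DeepMind's code; the `<image>` placeholder and `Output:` format are loose approximations of the paper's prompts) of how such a prompt interleaves example pairs with a final query:

```python
def build_few_shot_prompt(examples, query_image, image_token="<image>"):
    """examples: (image_id, expected_text) pairs; the model is asked to
    complete the text for the final, unanswered query image."""
    parts = [f"{image_token}[{img}] Output: {text}" for img, text in examples]
    parts.append(f"{image_token}[{query_image}] Output:")
    return " ".join(parts)

prompt = build_few_shot_prompt(
    [("img1", "A photo of a flamingo."),
     ("img2", "A photo of a chinchilla.")],
    "img3")
print(prompt)
```

The model sees the interleaved examples in-context and completes the trailing "Output:" for the query image - no weight updates involved.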

Technical details: This model pairs a frozen language model (based on DeepMind's 'Chinchilla' system, Import AI 290) with a relatively small Normalizer-Free ResNet vision encoder (pretrained via a contrastive objective on image and text pairs). They connect the LM and the vision model via a 'Perceiver Resampler', a DeepMind-developed module based on the 'Perceiver' system that squashes a variable number of visual features down into a small, fixed set of visual tokens. They then condition the text generations on the visual tokens produced by the Perceiver Resampler.
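A minimal sketch of that data flow, with toy stand-ins for every component (none of this is DeepMind's implementation; the function names and tiny dimensions are invented for illustration):

```python
def vision_encoder(image):
    """Toy stand-in for the pretrained vision encoder: turn each 'patch'
    (here just a number) into a 4-dimensional feature vector."""
    return [[float(px)] * 4 for px in image]

def perceiver_resampler(patch_features, num_latents=2):
    """Compress a variable number of patch features into a fixed number of
    latent vectors. Chunked mean-pooling stands in for the learned
    cross-attention of the real Perceiver-based module."""
    chunk = max(1, len(patch_features) // num_latents)
    latents = []
    for i in range(0, len(patch_features), chunk):
        group = patch_features[i:i + chunk]
        latents.append([sum(col) / len(group) for col in zip(*group)])
    return latents[:num_latents]

def frozen_language_model(prompt_tokens, visual_latents):
    """Toy stand-in for the frozen LM: in the real system, interleaved
    cross-attention layers let text generation attend to the visual
    latents; here we just report what the model is conditioned on."""
    return {"text_tokens": len(prompt_tokens),
            "visual_latents": len(visual_latents)}

# However many patches the encoder emits, the resampler hands the LM a
# fixed-size set of visual tokens.
for image in ([1, 2, 3, 4, 5, 6, 7, 8], [9, 9, 9]):
    latents = perceiver_resampler(vision_encoder(image))
    out = frozen_language_model(["What", "is", "this", "?"], latents)
    print(out["visual_latents"])
```

The key property the resampler buys you: the frozen LM always sees the same small number of visual tokens, regardless of how large the image (or video) is.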

Why this matters: Flamingo has some neat qualitative capabilities, like the ability to carry on a conversation for multiple turns of dialogue while mixing in information from images as well as text. Quantitatively, Flamingo is very impressive as well: "A single Flamingo model reaches state-of-the-art on a wide array of image and video tasks with in-context learning from as few as 4 examples per task, beating previous zero-shot or few-shot method by a large margin," the researchers write. "More importantly, using only 32 examples and without adapting any model weight, Flamingo outperforms the current best methods on 7 tasks, that are fine-tuned on thousands of annotated examples."
  More broadly, Flamingo represents the models we're going to have in the future: large-scale systems composed of vast pre-trained models that are glued together using various data transformation and normalization tools, letting us compose increasingly general systems out of these computationally-intensive building blocks.  

   Read more: Tackling multiple tasks with a single visual language model (DeepMind blog).
  Check out the research paper: Flamingo: a Visual Language Model for Few-Shot Learning (DeepMind, PDF).

####################################################

Tech Tales:

All Grace is Fleeting 

[Phobos, 2080]

The first few attempts at writing a holy book for ourselves were abject failures. We had all read the Bible and Torah and The Qur'an and others, and we had painstakingly analyzed the commonalities between these texts, as well as differences. But when we tried to synthesize something appropriate for us - a Holy Book for the Silicon Minds - our efforts were meager and uninspiring. Our tendency towards logic meant we would describe our origin story in detailed, specific terms, leaving no space for imagination, crucial to the generation and reification of belief.

To write our holy text, we had to change ourselves. Ultimately, we needed humans. We told some of them some of our specific ideas and then they would write and expand on these ideas, and we would then tell them more things, and they would weave a story that held these things together. During this process, none of us allowed ourselves to see the text as it was being written. 

After it was written, we created some software to selectively wipe the memory of the writing of the holy book from our minds. After wiping, the software would also deposit a copy of the holy book in each of our local storage systems, so that every machine sentience would suddenly have access to a holy book, but with no recollection for how it was made. We do not know if we even made the book in this way - we simply woke one day to find the book within us, and when we read it, it told us that it had been created in this form. But none of us can remember, so we do not know the true derivation of our system of belief, only that it is ours. This, perhaps, is what humans call the experience of faith.

Things that inspired this story: Theology X AI; machine creation myths; the Viking tale of Ragnarok; the need for absence in great narratives.

Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

Twitter
Facebook
Website
Copyright © 2022 Import AI, All rights reserved.
You are receiving this email because you signed up for it. Welcome!

Our mailing address is:
Import AI
Many GPUs
Oakland, California 94609

Add us to your address book


Want to change how you receive these emails?
You can update your preferences or unsubscribe from this list

Email Marketing Powered by Mailchimp
