Import AI 281: China does more surveillance research than US and Europe; Google reveals its text model LaMDA; Microsoft improves MoEs

Has a Dyson Sphere ever existed?
View this email in your browser

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.
 

Google (finally) reveals its big text model - LaMDA:
…Plus: why you need a lot of humans to make a language model safe…
Google has finally given details on LaMDA, its GPT-3 competitor. LaMDA is a family of language models ranging in size from 2B to 137B parameters (GPT-3: 175B), trained on a massive dataset of 1.56 trillion words. One way LaMDA differs from other big language models is that it is centered on dialogue, with 50% of its data coming from "dialogues from public forums". Google has also fine-tuned LaMDA models to make them better, safer, and more likely to give factual responses - all these fine-tuned versions are facilitated by Google using crowdworkers to gather specific dialog data.

The dataset: LaMDA's underlying dataset is called 'Infiniset', and besides the dialogue data it also includes Common Crawl, Wikipedia, a mixture of English and non-English web documents, and data from programming-related sites (so LaMDA models can also dabble in code).

Learning to use an external dataset: Similar to DeepMind's 'RETRO' model, LaMDA has also been taught to use an external information retrieval system - this makes LaMDA models more factually correct than models without external datasets, because the model can check facts against an external, trusted source. To train this capability, Google also collected 9k dialogs in which LaMDA's generated answers were labeled 'correct' or 'incorrect', giving the model a stronger training signal.
  The results of this are quite striking. For example, when you ask a non-information-retrieval LaMDA model "What do you think of Rosalie Gascoigne’s sculptures?", it replies: "LAMDA-BASE: They’re great, and I love how her work changed through her life. I like her later work more than her earlier ones. Her influence is also super interesting - did you know she was one of the artists that inspired Miró?". This response seems plausible but it's wrong - Miró was active until the late 1960s, while Gascoigne's first solo exhibitions were in the 1970s, so she couldn't have been an influence on Miró.
  By comparison, the version of LaMDA trained on information retrieval says this: "LAMDA: Oh wow, her life course is so inspiring. Did you know she was a practitioner of Japanese flower arrangement before turning to sculpture? https://artsearch.nga.gov.au/detail.cfm?irn=8774&pictaus=true", where it gives a factually accurate statement and provides a source as well.
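
The loop described above - draft an answer, check it against an external source, and ground the final response in retrieved evidence - can be sketched in a few lines. This is a toy illustration, not LaMDA's actual system: the `retrieve` and `answer` functions and the keyword-overlap scoring are hypothetical stand-ins for a real search index and a real language model.

```python
# Toy sketch of retrieval-grounded answering. All names here are
# hypothetical stand-ins, not LaMDA's actual API.

def retrieve(query, knowledge_base):
    """Toy information-retrieval step: return facts whose keywords
    overlap with the query (a real system would use a search index)."""
    words = set(query.lower().split())
    return [fact for fact in knowledge_base
            if words & set(fact.lower().split())]

def answer(query, knowledge_base):
    """Ground the response in retrieved evidence; fall back to an
    'I'm not sure' when nothing relevant is retrieved."""
    evidence = retrieve(query, knowledge_base)
    if evidence:
        # A real system would cite the source, as LaMDA appends URLs.
        return evidence[0]
    return "I'm not sure."

kb = ["Rosalie Gascoigne practiced Japanese flower arrangement before sculpture."]
print(answer("Tell me about Rosalie Gascoigne", kb))
```

The key design point is the fallback: a retrieval-grounded model can decline to assert things its external source doesn't support, which is exactly what separates the two LaMDA transcripts above.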

Things that make you go 'hmmm' - more compute than GPT-3: LaMDA consumed 3.55E+23 flops during training, versus 3.14E+23 flops for GPT-3 (so more parameters doesn't necessarily mean more resource intensive). It was trained on a cluster of 1,024 TPU v3 chips.
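
You can sanity-check figures like these with the common rule of thumb that training cost ≈ 6 × parameters × training tokens (covering the forward and backward passes). A quick sketch, using the ~300B training tokens reported in the GPT-3 paper:

```python
# Back-of-envelope check of the GPT-3 compute figure, using the
# standard approximation: training FLOPs ≈ 6 * params * tokens.

def train_flops(params, tokens):
    """Approximate total training compute for a dense transformer."""
    return 6 * params * tokens

gpt3 = train_flops(175e9, 300e9)  # GPT-3: 175B params, ~300B tokens
print(f"{gpt3:.2e}")              # ≈ 3.15e+23, matching the ~3.14E+23 cited
```

The same formula explains how a 137B-parameter model can out-consume a 175B one: total compute scales with tokens seen as well as parameters, so a smaller model trained on more tokens can cost more.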

Why this matters: "LaMDA is a step closer to practical and safe open-ended dialog systems, which can in turn unlock a wide range of useful applications. We hope that this work encourages further research in this area", Google writes. This is true - systems like LaMDA are basically refinements and improvements on the ideas of GPT2/3. We're a few years away from everyone having access to vast, planet-scale AI models that tell them truthful things in natural ways - the proverbial angel (or devil) on everyone's shoulder. The cultural impacts will be vast and destabilizing.
  Read more: LaMDA: Language Models for Dialogue Applications (arXiv).

####################################################

Write about a world where AI goes well, and win (part of) $100k:
…Future of Life Institute's worldbuilding contest tries to imagine positive AGI rollouts…
The Future of Life Institute is launching a competition based around "designing visions of a plausible, aspirational future that includes strong artificial intelligence." The competition deadline is April 15th 2022. The idea here is that if we can figure out realistic ways in which powerful AI can go well, then that gives us a map to use to get civilization there. The first prize is $20,000, followed by two second prizes of $10,000 each, and smaller prizes.
    Find out more about the competition here (Worldbuild.ai, FLI site).

####################################################

Want to teach your drone to see? Use this massive dataset:
…WebUAV-3M is probably the largest public UAV tracking dataset…
Researchers with the Chinese Academy of Sciences, the Shenzhen Research Institute of Big Data, and the Chinese University of Hong Kong, Shenzhen, have built WebUAV-3M, a large dataset to help people teach drones to accurately label images and videos. WebUAV-3M consists of 4,485 videos, where each one has been labeled with dense bounding boxes that cover 216 distinct categories of object to be tracked (e.g., bears, wind turbines, bicycles). The authors claim this is "by far the largest public UAV tracking benchmark".

Multimodal: Unusually, this is a multi-modal dataset; each labeled video is accompanied by a natural language sentence describing the video, as well as an audio description of it. "We provide natural language specifications and audio descriptions to facilitate multi-modal deep UAV tracking," the authors write. "The natural language specification can provide auxiliary information to achieve accurate tracking".
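
To make the multi-modal structure concrete, here's a sketch of what a single annotation might look like - per-frame boxes plus a language description plus an audio file. The field names and layout are illustrative assumptions, not WebUAV-3M's actual on-disk format.

```python
# Hypothetical sketch of a multi-modal tracking annotation; the schema
# is an illustrative assumption, not WebUAV-3M's real format.

annotation = {
    "video_id": "uav_000123",
    "category": "bicycle",                      # one of 216 target classes
    "boxes": [                                  # dense per-frame boxes
        {"frame": 0, "xywh": [412, 230, 36, 58]},
        {"frame": 1, "xywh": [415, 231, 36, 58]},
    ],
    "language": "A person rides a bicycle along a riverside path.",
    "audio": "uav_000123_description.wav",      # spoken version of the text
}

def box_for_frame(ann, frame):
    """Look up the tracked target's bounding box in a given frame."""
    for b in ann["boxes"]:
        if b["frame"] == frame:
            return b["xywh"]
    return None
```

A tracker trained on data like this can condition on the language string ("auxiliary information to achieve accurate tracking", as the authors put it) rather than on boxes alone.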

Why this matters: In the same way CCTV cameras have instrumented the streets of cities around the world, drones are doing the same for cities and rural areas. And just like how increasingly good AI got trained on datasets gathered by CCTV cameras, we can expect the same for drones. The result? An ever-expanding suite of surveillance capabilities that we can expect will be integrated, for good and bad purposes, by a broad range of governments and private sector actors. Datasets like WebUAV-3M are the fuel for this.
  Read more: WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking (arXiv).
  Get the code from here (eventually - it wasn't online when this section was written).

####################################################

FFCV: Train ImageNet for 98 cents!
…What's this? Free software that makes all model training better? Interesting!…
There's some new software that could help pretty much everyone train models more efficiently. The software is called FFCV, short for Fast Forward Computer Vision, and it is a "drop-in data loading system that dramatically increases data throughput in model training". It looks like a potentially big deal - FFCV can make training AI models much more efficient, according to tests done by the authors, and may work for other applications as well. "FFCV can speed up a lot more beyond just neural network training---in fact, the more data-bottlenecked the application (e.g., linear regression, bulk inference, etc.), the faster FFCV will make it!" says the project's GitHub page.
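
FFCV's actual machinery (a custom file format, JIT-compiled decode pipelines) is far more sophisticated, but the core idea it exploits can be sketched simply: overlap data loading with compute via a background prefetch thread, so the accelerator never sits idle waiting on the loader. A minimal toy version, with no assumptions about FFCV's real API:

```python
# Minimal sketch of prefetched data loading: a producer thread keeps a
# bounded buffer of batches full while the training loop consumes them.
# This is the general concept, not FFCV's implementation.

import queue
import threading

def prefetching_loader(batches, buffer_size=4):
    """Yield batches while a background thread loads ahead."""
    q = queue.Queue(maxsize=buffer_size)
    DONE = object()  # sentinel marking the end of the stream

    def producer():
        for batch in batches:    # a real loader would read + decode here
            q.put(batch)
        q.put(DONE)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()
        if item is DONE:
            break
        yield item

# The training loop sees an ordinary iterator; loading is concurrent.
total = sum(b for b in prefetching_loader(range(10)))
print(total)  # 45
```

When decode work is expensive (JPEG decompression, augmentation), keeping the buffer full is what turns a data-bottlenecked job into a compute-bound one - the gap FFCV targets.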

Why this matters: Software like FFCV is part of the broader industrialization of AI - now that we know how to train networks, various people are modularizing the training process and perfecting different elements of it. Stuff like FFCV is part of that trend.
  Find out more and get the code: FFCV GitHub repo.
   Get more details by reading the Performance Guide (FFCV site).
  Check out the main project website here (FFCV site).

####################################################

Microsoft makes MoEs easier to train:
…MoEs might be the best way to scale up large models…
Microsoft has given a technical update on how it's trying to scale up mixture-of-experts (MoE) networks. MoEs are one of the more promising routes for creating trillion-parameter-plus AI models, as they are a lot more efficient to train than dense models like GPT-3. In this paper, Microsoft talks about how it has made some tweaks so MoEs work well for auto-regressive natural language generation tasks, "demonstrating training cost reduction of 5X to achieve same model quality for models like GPT-3" and Microsoft's own 530B parameter 'Megatron-Turing NLG'.

MoEs might be cheaper and better: In tests, Microsoft shows that it can train 350M and 1.3B parameter MoE text models that have better (or the same) performance as GPT-3 for a range of different tasks. Microsoft says this nets out to models with the "same quality with 5X less training cost".

Why this matters: MoEs could turn out to be the main way people break the trillion-parameter barrier (and there are rumors that China's 'Wu Dao' MoE at an alleged ~1.7 trillion parameters has already done this). Via efficient MoE training and inference software, "a model with comparable accuracy as trillion-parameter dense model can be potentially trained at the cost of a 200B parameter (like GPT-3) sized dense model, translating to millions of dollars in training cost reduction and energy savings", Microsoft says.
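
The reason MoEs are cheap is routing: a learned gate sends each token to just one (or a few) expert sub-networks, so compute per token stays roughly constant even as total parameters grow. Here's a hedged pure-Python toy of top-1 routing - the core mechanism, not DeepSpeed-MoE's actual implementation:

```python
# Toy top-1 mixture-of-experts routing. Experts and gates are plain
# functions standing in for neural sub-networks; this illustrates the
# mechanism only, not DeepSpeed-MoE itself.

def top1_route(token, gates):
    """Pick the expert whose gate scores this token highest."""
    scores = [g(token) for g in gates]
    return scores.index(max(scores))

def moe_layer(tokens, experts, gates):
    """Each token is processed by only its chosen expert, so per-token
    compute matches a dense layer 1/len(experts) the total size."""
    return [experts[top1_route(t, gates)](t) for t in tokens]

# Two toy experts: one doubles, one negates; gate on the token's sign.
experts = [lambda x: 2 * x, lambda x: -x]
gates = [lambda x: x, lambda x: -x]          # positive -> expert 0
print(moe_layer([3, -2, 5], experts, gates))  # [6, 2, 10]
```

Scaling up means adding experts (more total parameters) without touching the per-token cost - which is how a sparse model can match a trillion-parameter dense model at a fraction of the training bill.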
  Read more: DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale (arXiv).

####################################################

Backchain science out of fictional news - and win a hundred bucks:
What could cause a computer virus to infect a biological organism? Or how might a biological organism evolve into a computer virus? These are the two questions posed by a 'Fiction Science Competition'. Entrants will need to write a plausible scientific explanation for how either of the above scenarios could transpire, responding to a short (fictionalized) news article written about the scenarios. There's a prize of $100 for winning entries, and submissions close February 28th 2022.
    Find out more here at the official Fiction Science Contest website.

####################################################

AI Ethics Brief by Abhishek Gupta from the Montreal AI Ethics Institute

Visual surveillance’s share in computer vision research across the world shows some worrying trends … Research coming out of China dominates the field, especially in emergent surveillance sub-areas like person re-identification, crowd counting, and facial spoofing detection …
CSET researchers have identified trends in computer vision research by looking for patterns of publication for six distinct tasks, analyzing 100 million English publications published between 2015 and 2019.

Surveillance tasks examined: A SciREX model trained on data from Papers with Code was used to identify references to the following six tasks: face recognition, person re-identification, action recognition, emotion recognition, crowd counting, and facial spoofing detection.

Some key findings: Facial recognition was the most well-established task over this period, and crowd counting and face spoofing detection were rapidly growing areas. The overall share of surveillance papers remained stable at around 5.5% over this period, though the raw volume of papers grew given the surge in computer vision research overall. During this time period, China’s share of global CV papers grew from 33% to 37%, and its share of surveillance papers from 36% to 42%, exceeding research from the EU (2nd) and the US (3rd) by more than 20% in each category.

Why it matters: While dual-use technologies developed in one part of the world can be used elsewhere, such analyses reveal a nation’s primary interest and provide quantitative evidence for decision-making in policy. The identified areas are important since tasks like action recognition can detect individuals with abnormal behavior in crowds, emotion recognition can help identify security threats in public areas, crowd counting can help to monitor civilian protests, and face spoofing detection can prevent journalists and activists from hiding their identity. All of these have significant implications in terms of fundamental rights and freedoms of people.
Read more: Trends in AI Research for the Visual Surveillance of Populations

####################################################

Tech Tales:

VHS vs Betamax
[An online forum, 2035]

"Alright I need you to livestream from your phone what's happening on the computer, and I'm gonna send you an image to use as a prior, then I'm gonna watch it generate the first few epochs. If everything checks out I'll authorize the transfer to the escrow service and you'll do the same?"
"Yes," wrote the anonymous person.
I sent them a seed picture - something I'd drawn a couple of years ago that had never been digitized.
They turned on their livestream and I watched as the ML pipeline booted up and started the generation process. It seemed legit. Some of these older models had a very particular style that you could ID during early generation. I watched for a few minutes and was satisfied. This was the final authentication step and the only way I'd know for certain is if I just took a leap of faith and paid up.
"Okay, I'm sending the funds to the escrow service. They'll be distributed to your account once the service confirms receipt of the model."
"Excellent. Good doing business with you."
And then their little green dot went out and they were gone.

A few minutes passed, and then the escrow service pinged me confirming they'd received the model. I downloaded it, then stuck it in my pipeline and started generating the client orders. People paid a lot of money for these kinds of 'vintage' AI-generated objects, and the model I'd just got was very old and very notorious.

Just another beautiful day in America, sifting through all the debris of decades of software, panning for little chunks of gold.

Things that inspired this: How the flaws of a media system ultimately become desired or fetishized aesthetic attributes - and specifically, this amazing Brian Eno quote; how models like CLIP will one day be obscure; how models vary over their development lifespans, creating the possibility of specific aesthetics and tastes.


Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

Twitter
Facebook
Website
Copyright © 2022 Import AI, All rights reserved.
You are receiving this email because you signed up for it. Welcome!

Our mailing address is:
Import AI
Many GPUs
Oakland, California 94609

Add us to your address book


Want to change how you receive these emails?
You can update your preferences or unsubscribe from this list

Email Marketing Powered by Mailchimp
