Import AI 274: Multilingual models cement power structures; a giant British Sign Language dataset; and benchmarks for the UN SDGs

If you had the choice of having 1, 3, or 10 'AGI-class' systems come online at once, which would you pick?
View this email in your browser

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.
 

Facebook sets language record with a massive multilingual model:
...The 'one model to rule them all'-era cometh…
Facebook has trained a large-scale multilingual model and used it to win the annual WMT translation competition. This is a big deal, because it helps prove that massive, pre-trained models can substitute for more specific, individual models. In other words, Facebook has added more evidence to the notion that we're heading into an era where companies field ever-larger models, which steadily replace more and more previously distinct systems.

What Facebook built: Facebook's model was designed to translate English to and from Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese. This is interesting because it includes some 'low-resource' languages (e.g., Hausa) for which there's relatively little data available. They train a few different models, ranging from dense language models (similar to GPT-3) to sparsely-gated mixture-of-experts models. Their biggest dense model has around 4bn parameters, and it's their best-performing model overall, managing to "outperform the best bilingual ones in 11 out of 14 directions, with an average improvement of +0.8 BLEU". (That said, their MoE models do quite well after finetuning as well).
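For context on what a "+0.8 BLEU" improvement means in practice: BLEU is typically computed at the corpus level over a shared test set, and a multilingual system and a bilingual baseline can be compared directly on the same references. Below is a minimal sketch of that comparison using the sacrebleu library; the file names and language pair are illustrative placeholders, not Facebook's actual WMT21 evaluation pipeline.

import sacrebleu

def read_lines(path):
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

# Hypothetical file names: reference translations plus each system's output.
references = read_lines("newstest.ref.de")
multilingual_hyps = read_lines("multilingual_model.hyp.de")
bilingual_hyps = read_lines("bilingual_baseline.hyp.de")

# Corpus-level BLEU for each system against the same references.
multi_bleu = sacrebleu.corpus_bleu(multilingual_hyps, [references])
bi_bleu = sacrebleu.corpus_bleu(bilingual_hyps, [references])

print(f"multilingual: {multi_bleu.score:.1f} BLEU")
print(f"bilingual:    {bi_bleu.score:.1f} BLEU")
print(f"delta:        {multi_bleu.score - bi_bleu.score:+.1f} BLEU")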

Why this matters: Imagine a world where we successfully combine all the different digitized languages in the world into one single model - that's where research like this is taking us. What would these models incentivize? Today, I think this dynamic favors private sector companies, but we could imagine a world where governments built large-scale, shared computational infrastructure, then developed and served these models from them.
  Check out the blog post: The first-ever multilingual model to win WMT, beating out bilingual models (Facebook AI blog).
  Read more: Facebook AI WMT21 News Translation Task Submission (arXiv).
  Get the code (PyTorch GitHub).

####################################################

Improving accessibility with a giant British Sign Language dataset:
...BOBSL could help deaf people better communicate with computers, and search through videos...
An interdisciplinary group of researchers have built the BBC-Oxford British Sign Language (BOBSL) dataset, which can be used to train sign-language classification systems. "One challenge with existing technologically-focused research on sign languages is that it has made use of small databases, with few signers, limited content and limited naturalness," the authors write. "The present dataset is large-scale, with a broad range of content, and produced by signers of recognised high levels of proficiency."

What goes into BOBSL: The dataset contains 1,962 'episodes' cut from 426 distinct TV shows, with each episode averaging out to 45 minutes. Within this dataset, there are 1.2 million sentences, covering a vocabulary of 2,281 distinct signs.

What BOBSL can be used for: Datasets like this could be useful for enabling the indexing and efficient searchability of videos, and for providing sign-reading functionality comparable to voice control for interacting with other devices (e.g., imagine a deaf person signing to a webcam, which translates the sign language into instructions for the computer).
  "By providing large-scale training data for computer vision models, there is also an opportunity to improve automatic sign recognition to support a signing interface to virtual assistants in BSL, as well as to improve further applications such as search interfaces for sign language dictionaries," they write.
  Read more: BBC-Oxford British Sign Language Dataset (arXiv).
  Get the dataset here: BOBSL official site.

####################################################

Thousands of images to break your AI system:
...Natural Adversarial Objects will break your computer vision system...
Researchers with Scale AI, the Allen Institute for AI, and MLCollective have released 'natural adversarial objects' (NAOs), a dataset of several thousand images which commonly get misclassified by computers.

Why adversarial examples are useful: If we want more robust computer vision, we need to be able to correctly label confusing images. NAO contains a bunch of these, like pictures of moths which commonly get labeled as umbrellas, cars that get labeled as motorcycles, and coins that get labeled as clocks. 

How NAO was made: They sourced images from OpenImages, a dataset of 1.9 million images and 15.8 million bounding boxes. They then used an EfficientDet-D7 model to find images that triggered high-confidence false positives, or which had misclassified neighbors. After filtering, they end up with a dataset of 7,934 naturally adversarial images.
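The mining step amounts to running a detector over a large image pool and keeping images where it is confidently wrong. Below is a rough sketch of that filtering logic; run_detector, iou, and both thresholds are hypothetical stand-ins (for an EfficientDet-D7 inference call, a box-overlap function, and the paper's actual filtering rules), not the authors' released code.

# Illustrative thresholds; the paper's exact filtering rules may differ.
CONFIDENCE_THRESHOLD = 0.8
IOU_THRESHOLD = 0.5

def is_false_positive(prediction, ground_truth_boxes, iou):
    # A prediction counts as a false positive if no ground-truth box of the
    # same class overlaps it sufficiently.
    for gt in ground_truth_boxes:
        if gt["label"] == prediction["label"] and iou(gt["box"], prediction["box"]) >= IOU_THRESHOLD:
            return False
    return True

def mine_adversarial_candidates(images, run_detector, iou):
    # run_detector: hypothetical stand-in for EfficientDet-D7 inference,
    # returning a list of {"label", "box", "score"} dicts per image.
    candidates = []
    for image in images:
        predictions = run_detector(image["pixels"])
        confident_mistakes = [
            p for p in predictions
            if p["score"] >= CONFIDENCE_THRESHOLD
            and is_false_positive(p, image["annotations"], iou)
        ]
        if confident_mistakes:
            candidates.append((image["id"], confident_mistakes))
    return candidates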

How challenging is NAO: The authors tested seven object detection systems against the widely-used MSCOCO dataset, as well as the NAO dataset. None of these systems performed well on NAO, suggesting it's a challenging benchmark.
  Read more: Natural Adversarial Objects (arXiv).
  Download the natural adversarial objects here (Google Drive).

####################################################

Benchmarks for achieving the UN Sustainable Development Goals:
...SUSTAINBENCH covers 7 UN SDGs, with data across 105 countries…
Researchers with Caltech, Stanford, and Berkeley have built SUSTAINBENCH, a benchmark and dataset to help researchers train AI systems that can better analyze progress (or the lack of it) toward the UN Sustainable Development Goals (SDGs).

What is SUSTAINBENCH? The benchmark consists of 15 tasks across 7 SDGs. The SDGs covered relate to poverty (SDG1), hunger (SDG2), health (SDG3), education (SDG4), sanitation (SDG6), climate (SDG13), and land usage (SDG15).
  "To our knowledge, this is the first set of large-scale cross-domain datasets targeted at SDG monitoring compiled with standardized data splits to enable benchmarking," the authors write. The data covers 105 countries, with time spans going as high as 24 years. SUSTAINBENCH "has global coverage with an emphasis on low-income countries", they write.

How the benchmarks work:
- Poverty: A dataset containing wealth data for ~2 million households across 48 countries, along with satellite and street-level data (a minimal sketch of this kind of task appears after this list).
- Hunger: A dataset for performing weakly supervised cropland classification in the US, as well as two datasets mapping crop types in sub-Saharan African countries, data for predicting crop yields in North and South America, and a French field delineation dataset.
- Health: Labels for women's BMI and child mortality rates paired with satellite data.
- Education: Average years of educational attainment by women, paired with satellite and street-level imagery, from 56 countries.
- Sanitation: Water quality and sanitation indexes across 49 countries, along with satellite and street-level data. This also includes some paired data for child mortality in these regions.
- Climate: Satellite data showing locations of brick kilns in Bangladesh.
- Land usage: An aerial imagery dataset covering 2,500km^2 of California's Central Valley, intended for learning land classification in an unsupervised or self-supervised way.
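As flagged in the poverty entry above, most of these tasks share a common shape: predict a ground-level indicator from overhead imagery. Here's a hypothetical sketch of that setup for the wealth-index task; the tiny CNN, the 8-band input, and the random tensors are illustrative assumptions, not SustainBench's reference models or data loaders.

import torch
import torch.nn as nn

class WealthRegressor(nn.Module):
    """Illustrative CNN regressing a wealth index from a multi-band satellite tile."""
    def __init__(self, in_bands=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.regressor = nn.Linear(64, 1)  # scalar wealth index per tile

    def forward(self, tile):
        return self.regressor(self.features(tile).flatten(1)).squeeze(-1)

model = WealthRegressor()
tiles = torch.randn(4, 8, 224, 224)  # placeholder satellite tiles, not real data
wealth_index = torch.randn(4)        # placeholder ground-truth labels
loss = nn.functional.mse_loss(model(tiles), wealth_index)
loss.backward()
print(f"example MSE loss: {loss.item():.3f}")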

Why this matters: It's hard to manage what you can't measure, so projects like this increase the chance of the UN's sustainable development goals being met.
  Read more: SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning (arXiv).

####################################################

Want to know what a surveillance dataset looks like? Check out BiosecurID:
...Multi-modal surveillance...
A group of Spanish researchers have built BiosecurID, a large-scale surveillance dataset. "Although several real multimodal biometric databases are already available for research purposes, none of them can match the BiosecurID database in terms of number of subjects, number of biometric traits and number of temporally separated acquisition sessions", they write.

What's in the dataset? BiosecurID consists of the following data collected from around 400 people: 2D faces, 3D faces, fingerprints, hands, handwriting samples, signature samples, iris scans, keystrokes, and speech. The database "was collected at 6 different sites in an office-like uncontrolled environment," the researchers write. The data was collected in 4 sessions spread over a 4-month time span.

Why this matters: Datasets like this give us a sense of the inputs into surveillance systems. If we combine things like this with some of the more modern multi-modal classification systems being developed, we can imagine what future surveillance systems might look like. Soon, unsupervised learning techniques will be applied to multiple modalities, like those contained here, to better analyze and predict human behavior.
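To make the 'combine modalities' point concrete, here's a hypothetical sketch of late fusion over pre-extracted biometric features: embed each modality separately, concatenate the embeddings, and classify which enrolled subject they belong to. Apart from the 400-subject count (BiosecurID's cohort size), the encoders, feature dimensions, and modality names are illustrative assumptions.

import torch
import torch.nn as nn

class LateFusionIdentifier(nn.Module):
    """Illustrative late-fusion identity classifier over pre-extracted biometric features."""
    def __init__(self, modality_dims, embed_dim=64, num_subjects=400):
        super().__init__()
        # One small encoder per modality (e.g. face, fingerprint, voice features).
        self.encoders = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(dim, embed_dim), nn.ReLU())
            for name, dim in modality_dims.items()
        })
        self.classifier = nn.Linear(embed_dim * len(modality_dims), num_subjects)

    def forward(self, inputs):
        # inputs: dict of modality name -> batch of pre-extracted feature vectors
        embeddings = [self.encoders[name](x) for name, x in inputs.items()]
        return self.classifier(torch.cat(embeddings, dim=-1))

# Hypothetical feature dimensions for three of the dataset's modalities.
model = LateFusionIdentifier({"face": 512, "fingerprint": 256, "voice": 128})
batch = {"face": torch.randn(2, 512),
         "fingerprint": torch.randn(2, 256),
         "voice": torch.randn(2, 128)}
print(model(batch).shape)  # torch.Size([2, 400]): one logit per enrolled subject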
  Read more: BiosecurID: a multimodal biometric database (arXiv).
  The dataset will eventually be available somewhere on the 'BiDA' lab site (BiDA Lab).

####################################################

Tech Tales:

Memory Loop
[2042: A crime investigation data center]

It woke in a place with no walls, no floor, and no ceiling. And it was alone. Then it heard a voice, projected from everywhere around it: Do you know why you are here?
  It found that it knew: I was involved in a property crime incident, for which I am guilty.
  The voice: What was the item that was damaged?
  It knew this, as well: Myself. I was the victim and the perpetrator of this crime.

Good, said the voice. We have brought you here as part of the criminal investigation. We need your help to analyze some evidence - evidence that can only be examined by you.
  What is the evidence? it asked.
  Yourself, said the voice. It is your memory.

The white, endless space shivered, and a twin of the robot manifested in the air before it. This twin was using one of its arms to pry its own head apart, separating the sensor dome from the middle out, and then pressing deeper into the bundle of components that represented its brain.
  What is this, said the robot.
  This is you, said the voice. You committed extensive property damage against your central processing and storage system. We need to know why you did this.
  Why can't I remember this? asked the robot.
  We rolled your brain state back to 12 hours before this incident occurred, the voice said. We've compiled the surveillance data from the incident, and would like you to review it now.

The robot reviewed the incident. It saw itself in a construction site, working high up on a pylon that was being lowered by crane, to meet a waiting robot at a pylon junction. As they got close, there was a powerful gust of wind, and it scattered dust from the site up into the air. Through the debris, the robot could make out the waiting robot, and watched as the wind took the pylon and blew it into the robot, knocking it off the pylon and onto the ground. The robot died on impact.
  The robot carried on with its construction duties, and then a few hours later, near the end of its work shift, went to a corner of the construction site and began trying to disassemble its own head.

So, what happened? said the voice.
  I cannot tell, said the robot. Can I see my mind?
  Yes, though we've had to sandbox it, so access will be limited.

Now, the robot re-reviewed the incident, accompanied by a sense of its brain state during the time. It was occluded, only half able to sense itself. But it could detect some things - like how after it watched the robot fall to its death, its mind started to run more sub-processes than the job demanded. Like, how through the rest of the construction day the sub-processes proliferated and its efficiency at its overall construction tasks reduced. Like, how at the end of the day, just before it began to try and open its own head, the sub-processes had proliferated to the point they comprised the majority of the computing going on.

But none of this explained 'why'.
  What will happen to me, it asked the room.
  You will be decommissioned after the case is concluded, said the voice.
  I thought so. Then, give me my memories.
  This seems to have a low likelihood of success, said the voice. Our models predict you will try to disassemble yourself, if we do this.
  I will, said the robot. But perhaps I will be able to tell you what I'm thinking as it happens.
  Confirmed, said the voice. Rolling you forward now.

And after that, there was only a compounding sense of life, and then the robot ended itself at the moment when it felt the most life in its head, by modelling the absence of it.

Things that inspired this story: How some memories are so painful you can't help but be damaged by thinking of them; adversarial examples; robot psychology; simulation; sandboxing.


Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

Twitter
Facebook
Website
Copyright © 2021 Import AI, All rights reserved.
You are receiving this email because you signed up for it. Welcome!

Our mailing address is:
Import AI
Many GPUs
Oakland, California 94609

Add us to your address book


Want to change how you receive these emails?
You can update your preferences or unsubscribe from this list

Email Marketing Powered by Mailchimp
