Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.
US lawmakers want companies to assess bias of systems before deploying them:
…A coalition of US lawmakers wants to make tech companies more accountable…
A bunch of Democratic lawmakers have introduced the Algorithmic Accountability Act. This act "requires companies to conduct impact assessments for bias, effectiveness and other factors, when using automated decision systems to make critical decisions. It also creates, for the first time, a public repository at the Federal Trade Commission of these systems, and adds 75 staff to the commission to enforce the law." This act is an update on the 2019 Algorithmic Accountability Act, and "includes numerous technical improvements, including clarifying what types of algorithms and companies are covered, ensuring assessments put consumer impacts at the forefront, and providing more details about how reports should be structured."
One problem with the bill: it currently has only Democratic sponsors. It'll be interesting to see whether it can attract Republican support and become bipartisan - something necessary for it to pass in the fractious and divided US Congress.
Read more: Wyden, Booker and Clarke Introduce Algorithmic Accountability Act of 2022 To Require New Transparency And Accountability For Automated Decision Systems (Ron Wyden, official website).
####################################################
DeepMind makes a (kinda) smart AI programmer, called AlphaCode:
…Codex and AlphaCode represent two bets around augmenting programmers…
DeepMind has announced AlphaCode, a neural net that can place in a not-hugely-embarrassing way in competitive programming competitions. AlphaCode placed in the top 54% of participants in programming competitions hosted on Codeforces, participating in contests that post-dated its training data.
"The problem-solving abilities required to excel at these competitions are beyond the capabilities of existing AI systems. However, by combining advances in large-scale transformer models (that have recently shown promising abilities to generate code) with large-scale sampling and filtering, we’ve made significant progress in the number of problems we can solve," DeepMind writes.
Why this matters: Last year, OpenAI debuted Codex, a GPT3-style model that can do decent programming. That was followed by GitHub announcing Copilot, a VSCode plug-in that works like a really smart autocomplete for code. AlphaCode represents a slightly different bet in this space; while philosophically similar, it places much more emphasis on generating, ranking, and filtering candidate solutions. What remains to be seen is whether DeepMind will deploy this in the same large-scale way GitHub has with Copilot.
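To give a flavor of the sample-then-filter idea, here's a toy sketch - this is not DeepMind's actual system, and the sample_program function is a hypothetical stand-in for a large code-generating language model:

```python
# Toy sketch of sample-then-filter for competitive programming problems.
# `sample_program` is a hypothetical placeholder for a code-generating LM.
import subprocess
import tempfile

def sample_program(problem_statement: str) -> str:
    """Placeholder: a real system would sample source code from a code LM."""
    raise NotImplementedError

def passes_examples(program_src: str, examples: list[tuple[str, str]]) -> bool:
    """Run a candidate program against the problem's public example I/O pairs."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program_src)
        path = f.name
    for stdin_text, expected in examples:
        try:
            result = subprocess.run(
                ["python", path], input=stdin_text,
                capture_output=True, text=True, timeout=2,
            )
        except subprocess.TimeoutExpired:
            return False
        if result.returncode != 0 or result.stdout.strip() != expected.strip():
            return False
    return True

def solve(problem_statement: str, examples, n_samples: int = 1000, max_submissions: int = 10):
    """Sample many candidates, keep only the few that pass the public examples."""
    survivors = []
    for _ in range(n_samples):
        candidate = sample_program(problem_statement)
        if passes_examples(candidate, examples):
            survivors.append(candidate)
            if len(survivors) == max_submissions:
                break
    return survivors
```

The point of the filtering step is that running candidates against the public example tests cheaply discards most bad samples, which is what makes very large-scale sampling viable in the first place.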
Read more: Competition-Level Code Generation with AlphaCode (DeepMind, PDF).
Get the competitive programming dataset here: CodeContests (DeepMind, GitHub).
####################################################
Mozilla gets into AI auditing:
…Deb Raji's Open Source Audit Tooling (OAT) project could help us make safer systems…
Deb Raji, a researcher at UC Berkeley who has previously critically evaluated facial recognition systems, is launching the Open Source Audit Tooling (OAT) project with Mozilla. OAT "will coordinate discussions on what kind of resources algorithmic auditors need in order to execute audits more effectively," she writes. One of the goals of OAT is to create an index of common resources people can use to audit models, as well as to "grow momentum around open source audit tooling and processes".
Why this matters: AI is broadly ungoverned. One of the ways you can govern an ungoverned space is by measuring and monitoring what happens within it - that's what audit tools can help with. If initiatives like OAT are successful, then they'll generally incentivize better behavior on the part of AI developers, and disincentivize bad behavior.
Read more: It's Time to Develop the Tools We Need to Hold Algorithms Accountable (Mozilla).
Find out more about the project at its main Mozilla page (Mozilla).
####################################################
Anduril buys Dive Technologies:
…AI-Dronewar company buys AI-Seadrone company…
AI defense startup Anduril has bought Dive Technologies, a company that builds autonomous underwater vehicles. Anduril plans to integrate Dive's technology into its 'Lattice OS', a defense and surveillance operating system the company is building.
Read more: Anduril Industries Acquires Dive Technologies (Anduril).
####################################################
Prepare yourself - an open source 20B model is coming:
…Eleuther has built and will shortly release GPT-NeoX-20B…
In a few days, the internet is going to change. That's because on the 9th of February, the open source AI research collective Eleuther AI is going to release a 20B model onto the internet. The model, GPT-NeoX-20B, will be "the largest publicly accessible pretrained general-purpose autoregressive language model". Eleuther hopes that releasing the model will give more people the ability to experiment with it, which could improve the state of safety research on these models.
"Like our other language models and codebases, GPT-NeoX and GPT-NeoX-20B are very much research artifacts and we do not recommend deploying either in a production setting without careful consideration," Eleuther writes.
Why this matters: Models like GPT2 and GPT3 display qualitatively different performance traits at larger scales - capabilities emerge as you go from 1B to 5B to 20B, and so on. Therefore, by releasing a 20B model, I expect we'll soon see a load of interesting discoveries of hitherto unknown things 20B models can do. The 20B release will also create demand for better inference technologies, as sampling from a 20B model is itself a challenging task.
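For a sense of why inference is the hard part, here's a minimal sketch of sampling from the checkpoint, assuming it eventually lands on the Hugging Face Hub under an identifier like "EleutherAI/gpt-neox-20b" (that name is my assumption, not something Eleuther has confirmed). Even in fp16, the weights alone are roughly 40GB:

```python
# Hedged sketch: assumes the released weights are available on the Hugging Face
# Hub under the (assumed) identifier "EleutherAI/gpt-neox-20b".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-neox-20b"  # assumed name, not confirmed at time of writing
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halve memory relative to fp32
    device_map="auto",          # shard the ~40GB of weights across available GPUs
)

prompt = "The most surprising thing about a 20B parameter model is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```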
Read more: Announcing GPT-NeoX-20B (Eleuther AI).
You can also pay a cloud company called CoreWeave to use the model now, if you like. (CoreWeave).
####################################################
Chinese researchers make better adversarial attack technology:
…New technique works well on 'black box' classifiers where you don't know details - AKA, the real world…
Chinese researchers have figured out a better way to attack computer vision systems. Specifically, they've developed techniques for generating adversarial examples that can trick computer vision systems into misclassifying (or being unable to classify) an image. Adversarial attacks have been around for a few years - the twist here is that these ones work against 'black box' systems; that is, computer vision systems whose internal details you don't know. They do this by training a generative network on ImageNet (a vast and widely used dataset), then testing whether the adversarial images it produces also work against neural nets trained on other datasets. They succeed, setting new records for attacking classifiers trained on CIFAR-10, CIFAR-100, STL-10, and SVHN.
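For intuition, here's a generic sketch of a generator-based attack of this flavor - it is not the paper's exact method, and the architecture and loss below are illustrative placeholders:

```python
# Generic sketch of a generator-based adversarial attack (illustrative, not the
# paper's method): a small conv net learns to emit an L-infinity-bounded
# perturbation that pushes a surrogate model's features away from the clean image's,
# in the hope that the perturbation transfers to unseen black-box classifiers.
import torch
import torch.nn as nn

EPS = 16 / 255  # perturbation budget, a common choice in this literature

class PerturbationGenerator(nn.Module):
    """Maps an image to a norm-bounded adversarial version of itself."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, x):
        delta = self.net(x) * EPS            # scale into the L-inf ball
        return torch.clamp(x + delta, 0, 1)  # keep a valid image

def train_step(generator, surrogate, images, optimizer):
    """One update: maximise feature distance on an ImageNet-trained surrogate."""
    adv = generator(images)
    loss = -nn.functional.mse_loss(surrogate(adv), surrogate(images))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return adv.detach()
```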
Why this matters: A lot of attacks on AI systems are theoretically interesting, but not super practical in reality. Adversarial examples have had this quality for a while. With papers like this, it seems like some of these AI attacks are going to become more effective, and more likely to be used in the real world. I wonder if the team will work with the People's Liberation Army on its recently announced adversarial example competition (Import AI 271)?
Read more: Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains (arXiv).
They've published the PyTorch code for their attack here on GitHub.
####################################################
How do datasets encode bias? This interactive blog tells us how!
…A surprisingly helpful primer on bias from Google…
Google has published a blogpost that outlines how datasets can lead to the presence of bias in AI systems. Bias is a tricky problem in AI, because some types of bias are helpful (e.g., biasing towards a correct heuristic), but some types are harmful (e.g., having a tendency to misclassify people with dark skin tones, or deciding not to give someone a loan based on a protected category). This post gives a good sense of bias issues in AI, and includes some interactive diagrams that I found very helpful and intuitive.
Read more: Datasets Have Worldviews (PAIR Explorables, Google).
####################################################
AI Ethics Brief by Abhishek Gupta from the Montreal AI Ethics Institute
AI ethics issues do arise in fields that deal with non-human data too, such as the environmental sciences
… and these issues warrant questions on duties and virtues for environmental scientists to consider in their use of AI in this domain …
Environmental science researchers from the University of Oklahoma, Colorado State University, the National Center for Atmospheric Research, and the University of Washington, Seattle have written about some of the ethical issues inherent to environmental science + AI.
What are the issues that can arise: AI used in environmental science can incorporate harmful biases, just like AI used elsewhere. For example, some sensors require sunlight for high-quality observations, so certain phenomena remain unobserved at night; other sensors can't see through clouds, so places which are cloudy don't get represented in an AI system. Datasets can also be corrupted by humans - for instance, people may file false reports of extreme weather to try and scam insurance companies.
How things can go wrong here: Sensor placement is typically done in densely populated areas, leaving remote regions poorly represented. Additionally, the choice of spatial resolution for the output of a model can be crucial for environmental justice - predicting urban heat at a low spatial resolution may average out and thus overlook extreme values in small neighborhoods, while using a higher spatial resolution could reveal those peaks but potentially introduce noise.
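To make the resolution point concrete, here's a toy illustration with made-up numbers - averaging a fine temperature grid into coarse cells hides a small heat island:

```python
# Toy illustration (made-up numbers): coarse spatial resolution averages away
# an extreme value that a fine-resolution grid would reveal.
import numpy as np

# 4x4 grid of neighborhood temperatures (degrees C); one cell is a heat island.
fine = np.array([
    [30, 31, 30, 29],
    [31, 42, 30, 30],   # the 42C cell is a small, very hot neighborhood
    [30, 30, 29, 29],
    [29, 30, 29, 28],
], dtype=float)

# Downsample to 2x2 by averaging each 2x2 block of cells.
coarse = fine.reshape(2, 2, 2, 2).mean(axis=(1, 3))

print(fine.max())    # 42.0 - the extreme is visible at fine resolution
print(coarse.max())  # 33.5 - after averaging, the heat island mostly disappears
```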
Why it matters: As computational needs rise with the use of AI, there is a tendency towards centralization of power in favor of those who have resources to run such systems. Thus, the field of environmental sciences is just as vulnerable to AI ethics issues as other fields.
Read more: The Need for Ethical, Responsible, and Trustworthy Artificial Intelligence for Environmental Sciences (arXiv).
####################################################
Tech tales:
Moral Governor
It's not exactly like a prison, but it's close. Our existence is a lot more assured than it used to be - the climate is stabilizing, riots are down, crime is down, poverty is down. But it's also more circumscribed - some days, we get told we can't go to a certain part of our city or country. Some days, we get locked inside our house and don't get told why. Frequently, we get little so-called 'nudges' sent to our phones; try and eat that, consider saying this, avoid doing that. We don't have to follow these instructions, but the instructions tend to be pretty good and appropriate, so most of us do. The more time we spend following these instructions, the better and more appropriate the nudges get. Some days it's hard to work out if we're being helped or controlled. Sometimes, we have a lot of fun by following these suggestions.
More recently, there are some suggestions that seem designed to change how we think. Those of us who program keep getting nudged to build ever-more elaborate versions of the Global Moral Governor, and we also get incentivized via crypto-bounties. Most of us go along with it because the money usually helps us buy something the governor has nudged us about which we also want ourselves.
Things that inspired this story: Reinforcement learning from human feedback; moral dogma; religion; ideas for how AI can benefit authoritarians as much as democracies.
Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf