Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.
Uh-oh, we can use reinforcement learning to get robots to walk now:
...Berkeley researchers walk, turn, and squat across the Sim2Reality gap...
Researchers are getting better at crossing the 'simulation to reality' gap. That's the implication of new research from the University of California at Berkeley, where researchers train the bipedal 'Cassie' robot to walk in simulation, then transfer the software onto a physical robot - and it works. The Cassie robots are made by Agility Robotics and cost "low-mid six figures" (Import AI 180).
How it works: The technique works by training a reinforcement learning controller to teach Cassie to walk in-sim via a specialized Hybrid Zero Dynamics (HZD) gait library (PDF), along with domain randomization techniques. This is a good example of the hybrid approach that tends to dominate robotics - use reinforcement learning to help you figure something out, but don't be afraid to use some prior knowledge to speed up the learning process (that's where HZD comes in). The use of domain randomization is basically just a cheap way to generate additional training data.
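To make the domain randomization part concrete, here's a minimal, hypothetical sketch - the simulator parameters, names, and ranges below are my own illustrative assumptions, not the paper's code - of how you might resample physics parameters at the start of every training episode so the policy can't overfit to a single simulated world:

```python
# A hypothetical sketch of domain randomization for sim-to-real transfer.
# The parameters and ranges here are illustrative, not the paper's code.
import random
from dataclasses import dataclass

@dataclass
class SimConfig:
    ground_friction: float = 1.0
    joint_damping_scale: float = 1.0
    payload_mass_kg: float = 0.0
    sensor_noise_std: float = 0.0

def randomize() -> SimConfig:
    """Resample physics parameters at the start of each training episode,
    so the learned policy can't overfit to one simulator configuration."""
    return SimConfig(
        ground_friction=random.uniform(0.5, 1.5),
        joint_damping_scale=random.uniform(0.8, 1.2),
        payload_mass_kg=random.uniform(0.0, 5.0),
        sensor_noise_std=random.uniform(0.0, 0.05),
    )

if __name__ == "__main__":
    # In a full setup, each RL episode would run in a freshly randomized world;
    # here we just print a few sampled configurations.
    for _ in range(3):
        print(randomize())
```

The real system pairs this kind of randomization with the HZD gait library and the RL training loop; the sketch is only meant to show why randomization acts like cheap extra training data.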
How well does it work: The results are impressive - in a video accompanying the research, Cassie walks over surfaces of various textures, can be hit or disrupted by an external human, and even balances loads of varying weights. "This paper is the first to develop a diverse and robust bipedal locomotion policy that can walk, turn and squat using parameterized reinforcement learning," they write.
Read more: Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots (arXiv).
Watch video: Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots (YouTube).
###################################################
European AI Fund makes its first grants:
…$1.8 million dollars to strengthen AI policy in Europe…
The European AI Fund, a fund supported by a bunch of different philanthropic orgs (ranging from Ford to Mozilla), has announced it is providing €1.55 million (~$1.8 million) to 16 organizations working to improve AI policy, ethics, and governance in Europe.
The winning orgs and what they'll do: Some of the orgs include well known technically-oriented organizations (such as Access Now and Algorithm Watch), and others include groups like Friends of the Earth and the Irish Council for Civil Liberties, which are starting to turn their attentions towards AI.
Why this matters: AI has rapidly shifted from an exciting part of scientific research to a technology with broad societal implications. Infusions of funding like this will help a greater chunk of society think about and debate the future of AI, which may help to increase trust in the space as a whole.
Read more: Announcing our open call grantees (European Artificial Intelligence Fund).
###################################################
DeepMind lays out some safety issues with language models and potential interventions:
...Now that language models can produce intelligible text, how do we ensure they're doing what we want?...
Soon, the internet will be full of words generated by neural language models. These models, like GPT-3, will animate customer support agents, write articles, provide advice to people, and carry out an innumerable range of functions. Now, a team of researchers at DeepMind have tried to think about what safety issues are implied by these increasingly capable magical typewriters. Put simply: language models are complicated and their safety issues will require quite a lot of work by a bunch of people to make progress on.
What are the safety issues of language models?: The research focuses on two things: ways in which developers could 'misspecify' a language model, and also "behavioural issues due to misalignment of the [language model] agent - unintended direct/first-order harms that are due to a fault made by the system's designer", the researchers write.
Misspecification: When developing language models, data is one area of potential misspecification, because many of the datasets used for training language models are created by crawling the web, or by reusing corpora assembled by others that did (e.g., CommonCrawl). Even when you try to filter these datasets, you're unlikely to catch everything you'd want to remove. There's also a secondary data issue - as more language models get deployed, a larger amount of the internet will contain LM-written text, which could introduce pathological flaws in LMs trained on that data.
Another area is the training process itself, where the algorithms you choose to train these things can influence their behavior. Finally, there's the matter of 'distributional shift' - these LMs are trained in a general way, which means that once trained they can get prompted with anything in their context window - including nonsense. Creating LMs that can automatically spot out-of-distribution questions or statements is an open challenge.
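One common heuristic for catching distributional shift - my illustration, not an approach proposed in the DeepMind paper - is to score incoming prompts with the language model itself and flag anything with suspiciously high perplexity. A minimal sketch, assuming a small off-the-shelf model and a hypothetical threshold:

```python
# A minimal sketch of flagging out-of-distribution prompts via perplexity.
# This is an illustrative heuristic, not a method from the DeepMind paper;
# the threshold below is a hypothetical value that would need tuning.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Exponentiated average next-token loss of the text under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

OOD_THRESHOLD = 500.0  # hypothetical cutoff

for prompt in ["The capital of France is", "colorless green ideas zxqv 9##@"]:
    ppl = perplexity(prompt)
    flag = "OOD?" if ppl > OOD_THRESHOLD else "ok"
    print(f"{flag:4} perplexity={ppl:8.1f}  {prompt!r}")
```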
Behavioural issues: The larger issue this research covers is behavior - specifically, how LMs can manifest a range of behaviors which could have downstream safety impacts. These include:
- Deception: Language models could deceive people by, for instance, withholding salient information.
- Manipulation: Language agents could try to manipulate the humans that interact with them, for instance by getting a human to do something that benefits the agent by bypassing the human's ability to carry out 'rational deliberation', by causing the human to adopt a 'faulty mental state', or by otherwise placing the human under pressure (for instance, overtly threatening them unless they carry out an action).
- Harmful content: Language agents "may give harmful and biased outputs", both accidentally and in response to intentional priming by a human user.
- Objective gaming: In reinforcement learning, we've seen multiple examples of AI agents 'gaming the system' - fulfilling the letter of an objective but not the spirit (e.g., racking up points to satisfy a score-based objective without actually completing the level). Right now, this might be going on with language models too, but we lack real-world examples to refer to.
Why this matters and what we need to do: These are all weighty, complex problems, and the DeepMind researchers don't outline many solutions, beyond recommending that more of the machine learning community focuses efforts on understanding these alignment issues. "We urge the community to focus on finding approaches which prevent language agents from deceptive, manipulative and harmful behaviour," they say.
Read more: Alignment of Language Agents (arXiv).
###################################################
Why does measurement in AI matter? A talk by me:
...It's not a coincidence Import AI focuses so much on metrics, I think this really matters…
We write a lot about measurement here at Import AI. Why is that? First, it's because quantitative measures are a helpful lens through which to view the progression of the AI field as a whole. Second, it's because metrics are measures and measures are the things that drive major policy decisions. The better we get at creating metrics around specific AI capabilities and assessing systems against them, the more of a chance we have to create the measures that are a prerequisite for effective policy regimes.
I care a lot about this - which is why I also co-chair the AI Index at Stanford University. Last week, I gave a lecture at Stanford where I discussed the 2021 AI Index report and also gave some ambitious thoughts about measurement and how it relates to policy. Thoughts and feedback welcome!
Watch the talk here: Jack Clark: Presenting the 2021 AI Index (YouTube).
###################################################
Using AI to improve game design:
...Google makes a fake game better using AI…
In the future, computer games might be tested by AI systems for balance before they're unleashed on humans. That's the idea in a new blog post from Google, which outlines how the company used AI to simulate millions of games of a virtual card game called 'Chimera', then analyzed the results to find ways in which the game was imbalanced. By using computers instead of people to play the games, Google was able to compress a process that previously took months into a few days of useful data generation.
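Google hasn't published the code alongside the blog post, but the core idea is simple enough to sketch: simulate lots of games between candidate strategies and look for lopsided win rates. In the toy example below the 'game' is a random stand-in and the deck strengths are hypothetical - the real system used learned agents playing the actual game:

```python
# A toy sketch of using mass simulation to check game balance. The 'game'
# here is a random stand-in and the deck strengths are hypothetical; the
# real system would pit learned agents against each other in the actual game.
import itertools
import random

DECK_STRENGTH = {"dragons": 0.62, "pirates": 0.50, "wizards": 0.48}  # hypothetical

def play_game(deck_a: str, deck_b: str) -> str:
    """Return the winner of one simulated game, biased by relative deck strength."""
    p_a = DECK_STRENGTH[deck_a] / (DECK_STRENGTH[deck_a] + DECK_STRENGTH[deck_b])
    return deck_a if random.random() < p_a else deck_b

def win_rates(n_games_per_matchup: int = 20_000) -> dict:
    """Simulate every matchup many times and report each deck's overall win rate."""
    wins = {d: 0 for d in DECK_STRENGTH}
    games = {d: 0 for d in DECK_STRENGTH}
    for a, b in itertools.permutations(DECK_STRENGTH, 2):
        for _ in range(n_games_per_matchup):
            wins[play_game(a, b)] += 1
            games[a] += 1
            games[b] += 1
    return {d: wins[d] / games[d] for d in DECK_STRENGTH}

if __name__ == "__main__":
    # A deck whose overall win rate sits far from 50% is a balance red flag.
    for deck, rate in sorted(win_rates().items(), key=lambda kv: -kv[1]):
        print(f"{deck:8} win rate: {rate:.1%}")
```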
Read more: Leveraging Machine Learning for Game Development (Google AI Blog).
###################################################
Pre-training on fractals, then fine-tuning on images just might work:
...No data? No problem. FractalDB looks somewhat useful…
We write a lot about the data requirements of AI here at Import AI. But what would happen if machine learning algorithms didn't need as much expensive data? That's the idea behind FractalDB (Import AI 234), a dataset composed of computationally-generated fractals (and sub-components of fractals), which can be used as input fuel for training some systems. New research from the Tokyo Institute of Technology investigates FractalDB in the context of training Vision Transformers (ViT), which have recently become one of the best-performing architectures for computer vision.
Is FractalDB as useful as ImageNet? Not quite, but… They find that pre-training on FractalDB is less effective than pre-training on ImageNet for a range of downstream computer vision tasks, but - and this is crucial - it's not that bad. Put another way: training on entirely synthetic images yields performance close to, though not quite matching, that of training on real images. And these synthetic images can be procedurally generated from a pre-written ruleset - put another way, this dataset has a seed which generates it, so it's very cheap relative to normal data. This is, I think, quite counterintuitive - we wouldn't naturally expect this kind of thing to work as well as it does. I'll keep tracking FractalDB with interest - I wonder if we'll start to see people augment other pre-training datasets with it as well?
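To make the 'dataset with a seed' point concrete, here's a minimal sketch of the kind of procedural generation FractalDB relies on: sample a small Iterated Function System (a handful of affine maps) and render it with the chaos game. The sampling ranges and rendering details below are illustrative assumptions, not the paper's exact recipe:

```python
# A minimal sketch of FractalDB-style procedural generation: one "category"
# is a small Iterated Function System (IFS), i.e. a few random affine maps,
# rendered with the chaos game. Sampling ranges and image size are
# illustrative assumptions, not the paper's exact recipe.
import random
import numpy as np

def random_affine() -> tuple:
    """Sample one affine map (a, b, c, d, e, f), crudely rescaled so the
    linear part is contractive and the chaos game doesn't diverge."""
    a, b, c, d = (random.uniform(-1, 1) for _ in range(4))
    s = max(abs(a) + abs(b), abs(c) + abs(d))
    if s > 0.9:
        a, b, c, d = (v * 0.9 / s for v in (a, b, c, d))
    e, f = random.uniform(-1, 1), random.uniform(-1, 1)
    return a, b, c, d, e, f

def random_ifs(n_maps: int = 4) -> list:
    """A fractal 'category' is just a list of affine maps - a cheap seed."""
    return [random_affine() for _ in range(n_maps)]

def render(ifs: list, n_points: int = 100_000, size: int = 256) -> np.ndarray:
    """Run the chaos game and rasterize the visited points into a binary image."""
    x, y, pts = 0.0, 0.0, []
    for _ in range(n_points):
        a, b, c, d, e, f = random.choice(ifs)
        x, y = a * x + b * y + e, c * x + d * y + f
        pts.append((x, y))
    pts = np.array(pts[100:])  # drop burn-in points
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    pix = ((pts - lo) / (hi - lo + 1e-9) * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.uint8)
    img[pix[:, 1], pix[:, 0]] = 255
    return img

if __name__ == "__main__":
    # The whole synthetic dataset can be regenerated from sampled rulesets,
    # so it is very cheap compared with collecting and labeling real images.
    image = render(random_ifs())
    print(image.shape, int(image.sum() / 255), "pixels set")
```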
Read more: Can Vision Transformers learn without Natural Images? (arXiv).
###################################################
Major AI conference makes a checklist to help researchers be more ethical:
...Don't know where to start with your 'Broader Impacts' statement? This should help…
Last year, major AI conference NeurIPS asked researchers to submit 'Broader Impacts' statements along with their research papers. These statements were meant to cover some of the potential societal effects of the technologies being proposed. The result: a bunch of researchers spent a while thinking about the societal impact of their work and wrote about these effects with varying degrees of success.
Enter, the checklist: To help researchers be more thorough in this, the NeurIPS program chairs have created a checklist. This list is meant "to encourage authors to think about, hopefully address, but at least document the completeness, soundness, limitations, and potential negative societal impact of their work. We want to place minimal burden on authors, giving authors flexibility in how they choose to address the items in the checklist, while providing structure and guidance to help authors be attentive to knowledge gaps and surface issues that they might not have otherwise considered," they say. (Other resources exist, as well, like guides from the Future of Humanity Institute, #198).
What does the checklist ask? The checklist provides a formulaic way for people to think about their work, asking them if they've thought about the (potential) negative societal impacts of their work, if they've described limitations, if their system uses personally identifiable information or "offensive content" (which isn't defined), and so on.
Why this matters: AI is in the pre-Hippocratic oath era. We don't have common ethical standards for practitioners in the AI community, nor much direct ethics education. By encouraging authors to add Broader Impacts statements to their work - and making it easier for them to think about creating these statements - NeurIPS is helping to further the ethical development of the field of AI as a whole. Though it's clear we need much more investment and support in this area to help our ethical frameworks develop as richly as our technical tooling.
Read more: Introducing the NeurIPS 2021 Paper Checklist (NeurIPS blog).
Check out the paper checklist here (NeurIPS official website).
###################################################
Tech Tales:
The Drone that Got Lost
[Rural England, 2030]
It knew it was lost because it stopped getting a signal telling it that it was on track.
According to its diagnostic systems, a fault had developed with its GPS system. Now, it was flying through the air, but it did not know where it was. It had records of its prior locations, but not of its current one.
But it did know where it was going - the GPS coordinate was in a database, as was, crucially, the name of the city: Wilderfen. It sent a message back to its origination station, attaching telemetry from its flight. It would be seconds or, more likely, minutes, until it could expect a reply.
At this point, the backup system kicked in, which told the drone that it would first seek to restore GPS functionality then, given the time critical nature of the package the drone was conveying, would seek to get the drone to its intended location.
A few milliseconds passed and the system told the drone that it was moving to 'plan B' - use other sensory inputs and AI augmentations to reacquire the location. This unlocked another system within the drone's brain, which began to use an AI tool to search over the drone's vision sensors.
- Street sign: 95% probability, said the system. It drew a red bounding box around a sign that was visible on a road, somewhere beneath and to the East of the drone.
- Because the confidence was above a pre-wired 90% baseline, the drone initiated a system that navigated it closer to the sign until it was able to check for the presence of text on the sign.
- Text: 99%, said the system, once the drone had got closer.
- Text parsed as "Wilderfen 15 miles".
- At this point, another pre-written expert system took over, which gave the drone new instructions: follow roadway and scan signs. Follow the signs that point towards Wilderfen.
So the drone proceeded like this for the next couple of hours, periodically zooming down from the sky until it could read street signs, then parsing the information and returning to the air. It arrived, around two hours later, and delivered its confidential payload to a man with a thin face, standing outside a large, unmarked datacenter.
But it was not able to return home - the drone contained no record of its origination point, due to the sensitive nature of what it was carrying. Instead, a human was dispatched to come and find the drone, power it down, place it into a box, then drive it to wherever its 'home' was. The drone was not permitted to know this, nor did it have the ability to access systems that might let it infer for itself. Broader agency was only given under special circumstances and the drone was not yet sophisticated enough to independently desire that agency for itself.
But the human driving the car knew that one day the drone might want this agency. And so as they drove they found their eyes periodically staring into the mirror inside the car, looking at the carrycase on the backseat, aware that something slumbered inside which would one day wake up.
Technical things that inspired this story: Multimodal models like CLIP that can be used to parse/read text from visual inputs; language models; reinforcement learning; instruction manuals; 10 drones that the FAA recently published airworthiness criteria for.
Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf