Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.
Train your RL agent in this snappy gridworld:
...Griddly version 1.1.0 has self-play support…
Griddly, an open source project for doing research on AI agents simulated in gridworld environments, has just moved to version 1.1.0. The latest version of the software includes support for RTS self-play - that is, the technique used in approaches like DeepMind's AlphaGo and OpenAI's Dota 2 system, where an RL agent plays successive games against itself until its performance improves.
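How self-play works, in miniature: For readers who want the intuition in code, below is a minimal sketch of a self-play training loop. The environment and agent are stand-ins invented for illustration - this is not Griddly's actual API - but the structure (play against a frozen past self, update the learner, periodically refresh the opponent pool) is the core of the technique.

```python
# Minimal self-play loop (illustrative sketch; NOT Griddly's real API).
import copy
import random

class ToyTwoPlayerEnv:
    """Stand-in for a two-player game: the higher-skill agent wins more often."""
    def play(self, agent_a, agent_b):
        p_a_wins = 1 / (1 + 10 ** ((agent_b.skill - agent_a.skill) / 4))
        return 1 if random.random() < p_a_wins else -1  # +1: A wins, -1: B wins

class ToyAgent:
    def __init__(self):
        self.skill = 0.0
    def learn(self, result):
        # Crude stand-in for a real RL update (e.g. PPO on game trajectories):
        # winning nudges skill up a lot, losing still teaches a little.
        self.skill += 0.1 if result > 0 else 0.02

env = ToyTwoPlayerEnv()
learner = ToyAgent()
opponent_pool = [copy.deepcopy(learner)]  # frozen snapshots of past selves

for game in range(1000):
    opponent = random.choice(opponent_pool)  # sample a past self to play
    result = env.play(learner, opponent)
    learner.learn(result)
    if game % 100 == 99:
        # Periodically freeze the learner so the opposition keeps pace,
        # giving the agent an automatic curriculum of ever-stronger rivals.
        opponent_pool.append(copy.deepcopy(learner))

print(f"final skill: {learner.skill:.2f}, pool size: {len(opponent_pool)}")
```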
A caveat about gridworlds: Gridworlds - that is, simplified 2D environments - are used frequently in AI research. That doesn't mean they're a good idea. Gridworlds are really just a temporary approach that pairs a pragmatic desire to experiment with something scoped for today's limited computational resources (e.g. Griddly is written in C++ to further optimize its performance). I'm excited to see what kinds of replacements for gridworlds people use in the future.
Get the code for Griddly here (official GitHub).
Read more about Griddly here (ReadTheDocs).
###################################################
OpenAI releases a formal mathematics benchmark:
...MiniF2F: One benchmark for comparing multiple systems…
OpenAI has built MiniF2F, a formal mathematics benchmark for evaluating and comparing automated theorem proving systems that target different formal systems (e.g., Lean, Metamath). The benchmark is still in development; OpenAI is looking for feedback and plans to release a version 1 of the benchmark in the summer.
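What a formal statement looks like: To make the flavor of this concrete, here are two toy statements in Lean 4 syntax. They are not drawn from MiniF2F itself (whose problems are olympiad-level) - they're just illustrations of the kind of machine-checkable statement a benchmark entry supplies, and for which a proving system must synthesize a proof the checker accepts.

```lean
-- Illustrative toy theorems (not from MiniF2F; real entries are much harder).
-- The benchmark supplies the statement; the prover must produce the proof.
theorem toy_arithmetic : 2 + 2 = 4 := rfl
theorem toy_add_zero (n : Nat) : n + 0 = n := rfl
```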
Why this matters: Formal mathematics is an area where we've recently seen deep learning-based methods cover surprising ground (e.g., Google has a system called HOList that it uses for running AI-math experiments, Import AI 142). Benchmarks like MiniF2F will make it easier to understand what kind of progress is being made here.
Read more: MiniF2F (OpenAI, GitHub).
###################################################
Affinity groups swear off Google funding after Gebru and Mitchell firings:
...Black in AI, Queer in AI, and Widening NLP reject sponsorship…
Late last year, Google fired Timnit Gebru, co-founder of its AI ethics team. Then, early in 2021, it fired Margaret Mitchell, the team's other co-founder. Since then, senior manager Samy Bengio has moved on to Apple, and various people have tweeted statements to the effect of 'more departures are on the way'. Of course, there's been blowback in response to Google's actions here. The latest example of this blowback is AI affinity groups refusing Google sponsorship.
Specifically, Black in AI, Queer in AI, and Widening NLP have all decided to end their sponsorship relationships with Google in response to the firings. "We share a mandate to not merely increase the representation of members from our respective communities in the field of AI, but to create safe environments for them and to protect them from mistreatment," the orgs write in a letter. "Google’s actions in the last few months have inflicted tremendous harms that have reverberated throughout our entire community. They not only have caused damage but set a dangerous precedent for what type of research, advocacy, and retaliation is permissible in our community."
Read more: An Open Letter to Google (WINLP official site).
###################################################
IBM wants to teach machines to program using 'CodeNet' dataset:
...14 million code samples for 4,000 problems…
IBM has built and released CodeNet, a dataset of 14 million code submissions for 4,000 distinct programming challenges. CodeNet is designed to help people build AI systems that can generate and analyze code. Part of why CodeNet exists is the impressive progress in NLP in recent years, with architectural improvements like the Transformer and AI systems such as GPT-3 and T5 leading to NLP having its so-called "ImageNet moment" (Import AI 170).
What is CodeNet? CodeNet consists of coding problems scraped from two coding websites - AIZU and AtCoder. More than 50% of the code samples within CodeNet "are known to compile and run correctly on the prescribed test cases", IBM said. More than 50% of the submissions are in C++, followed by Python (24%) and Java (5%). CodeNet contains 55 different languages in total.
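Poking at a dataset like this: Here's a brief sketch of the kind of first-pass filtering you might do over CodeNet-style metadata - for example, keeping only accepted Python submissions. Note that the filename and column names below are assumptions made for illustration, not CodeNet's documented schema; check the project README for the real layout.

```python
# Sketch: filter CodeNet-style metadata down to accepted Python submissions.
# CAUTION: "metadata.csv" and the column names are assumed for illustration.
import csv
from collections import Counter

accepted_python = []
language_counts = Counter()

with open("metadata.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        language_counts[row["language"]] += 1
        if row["language"] == "Python" and row["status"] == "Accepted":
            accepted_python.append(row["submission_id"])

print(f"{len(accepted_python)} accepted Python submissions")
print("top languages:", language_counts.most_common(5))
```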
Why this matters: Now that computers can read and generate text, we might ask how well they can read and generate code. We know that they have some basic capabilities here, but it's likely that investment into larger datasets, such as CodeNet, could help us train far more sophisticated code processing AI systems than those we have today. In a few years, we might delegate coding tasks to AI agents, in the same way that today we're starting to delegate text creation and processing tasks.
Read the paper: Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks (IBM GitHub).
Read more: Kickstarting AI for Code: Introducing IBM's Project CodeNet (IBM research blog).
Get the code here: Project CodeNet (IBM GitHub).
###################################################
US Senators introduce bill to increase AI talent in government, fund AI safety, and support development of military prototypes:
...The Artificial Intelligence Capabilities and Transparency (AICT) Act could be a big deal…
Two US senators - Rob Portman, a Republican, and Martin Heinrich, a Democrat - have drafted legislation that would give the Federal government more resources to develop AI technology, while emphasizing AI safety both in the government's use of AI and in its funding for AI research.
Key ingredients: Specifically, the bill would establish an "AI development and prototyping fund" worth $50 million at the Department of Defense; require the National Institute of Standards and Technology (NIST) to assess how well organizations can identify potential privacy, civil rights, and civil liberties effects of AI systems; encourage the National Science Foundation to establish "focus areas in AI safety and AI ethics"; and create a "chief digital recruiting officer" at the DoD, DOE, and the Intelligence Community (IC) to help them hire talent.
Senators to National Science Foundation: Please prioritize AI safety! In particular, the bill - and a separate letter sent to NSF - emphasizes the need for the government to invest more in AI ethics and safety research.
"AI safety refers to technical efforts to improve AI systems in order to reduce their dangers, and AI ethics refers to quantitative analysis of AI systems to address matters ranging from fairness to potential discrimination. While we understand that NSF incorporates concepts of ethics and safety across all of the thematic areas of its AI research, establishing two new themes dedicated to ethics and safety would help ensure that innovations in AI ethics and safety were pursued for their own ends rather than being merely best practices for different use cases," they write.
Why this matters: We spend a lot of time writing about the 'sausagemaking' aspects of policy here at Import AI - that's because sausagemaking is both obscure and important. Bills and letters like this increase the chance of the US government investing more of its R&D efforts into things relating to safety and ethics, and they build capacity for AI development within the US government. We currently live in a deeply lopsided world where companies have huge AI development and deployment capacity, while the government's ability to develop, deploy, and regulate AI is minimal. This is not a long-term stable equilibrium, and our choices as a society are to a) drift into full libertarian 'cypherpunk' corporate rule, or b) have a functioning democracy where the government has sufficient technical leverage that it can hope to steer the private sector towards a just and equitable future. The choice is ours.
Read more: Portman, Heinrich Announce Bipartisan Artificial Intelligence Bills To Boost AI-Ready National Security Personnel, Increase Governmental Transparency (Senator Portman, official website).
Read more: Portman, Heinrich Urge National Science Foundation To Prioritize Safety and Ethics in Artificial Intelligence Research, Innovation (Senator Portman, official website).
###################################################
Dataset archaeology: BookCorpus:
...What lies within the dataset that helped create BERT and GPT-3?...
BookCorpus is a dataset of around 11,000 books by unpublished authors posted on the internet. The dataset was compiled in 2014 and has since been a key ingredient in systems ranging from BERT to GPT-3. Now, a couple of researchers have done a detailed analysis of the dataset. Their findings? BookCorpus has some areas of potential copyright claims (somewhat unsurprising), significant duplication (they find only 7,185 of the books in the corpus are unique), and skewed genre representation, with BookCorpus containing far more romance relative to the platform it was scraped from.
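How you find duplicates like this: Exact-duplicate counts of this kind can be produced with simple content hashing. The sketch below is a generic illustration rather than the authors' actual pipeline (and real near-duplicate detection typically needs fuzzier tools such as shingling or MinHash), but it shows the basic move: normalize each book's text, hash it, and group files by fingerprint.

```python
# Sketch: count exact-duplicate books via hashing of normalized text.
# Generic illustration only - not the BookCorpus paper's actual method.
import hashlib
from collections import defaultdict
from pathlib import Path

def fingerprint(text: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting differences
    # don't hide otherwise identical books, then hash the result.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

groups = defaultdict(list)
for path in Path("books").glob("*.txt"):  # assumed layout: one book per file
    groups[fingerprint(path.read_text(errors="ignore"))].append(path.name)

duplicated = {h: names for h, names in groups.items() if len(names) > 1}
print(f"{len(groups)} unique books, {len(duplicated)} duplicated groups")
```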
Why this matters: In recent years, people like Timnit Gebru, Margaret Mitchell, and Emily Bender have all called for greater documentation of the world's datasets and AI systems. Research like this helps document these datasets, which will ultimately help create the metadata out of which regulators craft standards for dataset disclosure in the future.
Read more: Dirty Secrets of BookCorpus, a Key Dataset in Machine Learning (Towards Data Science).
Read more: Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus (arXiv).
###################################################
Facebook builds a dataset so computers can read the world:
...You can't do visual question answering if you can't read...
Facebook wants to be in a world where AI systems can look at an image, read the text in it, and reason about that text (e.g., parsing street addresses and feeding them into a location system; looking at clock faces and parsing them into times; seeing license plates and porting those into another database, et cetera). To help speed the science here, Facebook has just released 'TextOCR', a dataset of (almost) a million high quality word annotations applied to TextVQA images (VQA = Visual Question Answering).
What goes into TextOCR: TextOCR has ~900,000 labels applied to ~28,000 images, creating a large dataset that - Facebook says - can be used both to pre-train AI systems, and to test AI systems' text parsing and reasoning capabilities.
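What 'word annotations' means in practice: The sketch below walks a TextOCR-style annotation file and tallies word labels per image. The filename and JSON field names here are assumptions made for illustration - check the TextOCR site for the real schema before building on this.

```python
# Sketch: tally word annotations per image from a TextOCR-style JSON file.
# CAUTION: the filename and field names are assumed for illustration only.
import json
from collections import Counter

with open("textocr_annotations.json", encoding="utf-8") as f:
    data = json.load(f)

words_per_image = Counter()
for ann in data["annotations"]:            # assumed: list of word-level records
    words_per_image[ann["image_id"]] += 1  # assumed: each record names an image

total = sum(words_per_image.values())
print(f"{total} word labels across {len(words_per_image)} images")
print("densest image:", words_per_image.most_common(1))
```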
Why this matters: A lot of AI is about two things:
- Building stuff to turn squishy reality into something digital and structured - that's the point of a lot of computer vision.
- Building stuff to reason about the resulting digitized representation of reality (e.g., massive generative models that can be prompted, like CLIP or GPT-3).
...TextOCR contributes to both of these things, unlocking more of the world for analysis by robots, and increasing the likelihood of us training systems that can reason about this stuff.
Read more: TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text (arXiv).
Get the dataset here from the TextOCR site.
Read about the earlier TextVQA challenge and dataset here.
###################################################
Tech Tales:
A survey of recent progress in 'child finder' UAVs
Lean Kerman, Karlsruhe Institute of Technology
Abstract
In recent years, so-called 'child finder' (CF) unmanned aerial vehicles (UAVs) have started to be used in missing person cases. These systems have helped to find and identify missing persons on numerous occasions, with sufficient success that regulators are now assessing whether to deploy them to urban areas, in addition to their contemporary inter-urban and countryside deployments. This paper surveys recent progress in CF-UAVs, identifies some potential challenges to further deployment, and discusses some implications of their performance.
Selected highlights:
Recent progress in computer vision - specifically, self-supervised learning, as well as advances in semantic outlining of objects - has enabled the use of UAVs as 'people surveillance' platforms; such systems have been fielded for tasks as diverse as identifying migrants attempting border crossings; law enforcement crowd classification at large public events; employee 'wellness analysis' at firms ranging from Amazon to Walmart; and the deployment of UAVs as 'hunter' or 'finder' platforms targeted at specific individuals.
**
CF-UAVs have a history of deployment issues; early versions were not sufficiently accurate and there are numerous documented cases of misclassification, while more recent ones have been criticized at length in the media for their reinforcement learning-enabled 'individual tracking' abilities.
**
Looking ahead, recent trends in drone swarm technologies have made it possible to network together multiple CF-UAVs into a single unit that can autonomously map and search over an area. Effective ranges vary according to the complexity of the environment; recent research has demonstrated powerful search capabilities across 5km of urban terrain, 100km of agricultural land, and 10km of dense forest.
Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf