Import AI 289: Copyright v AI art; NIST tries to measure bias in AI; solar-powered Markov chains

How many computers may exist in the solar system, but not on earth or manmade craft? 

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

Uh-oh: US Copyright Office says AI-generated art is hard to copyright:
…Bureaucratic rock meets rapid technical progress - the usual happens…

What happens when you file a copyright request where the IP would accrue to an artificial intelligence, instead of a person? The answer, per the US Copyright Office, is you get told that AI artworks are ineligible for copyright… uh oh! In a recently published copyright response, the office rejected an attempt to assign copyright of an AI-generated artwork to a machine (specifically, an entity the human filer referred to as a 'Creativity Machine'). "After reviewing the statutory text, judicial precedent, and longstanding Copyright Office practice, the Board again concludes that human authorship is a prerequisite to copyright protection in the United States and that the Work therefore cannot be registered," it wrote.


Why this matters: Recently developed generative models like GPT-3, DALL-E, and others, are all capable of impressive and expressive feats of artistic production. At some point, it's likely these systems will be chained up with other AI models to create an end-to-end system for the production and selling of art (I expect this has already happened in a vague way with some NFTs). At that point, decisions like the US Copyright Office's refusal to assign copyright to an AI entity may start to pose problems for the commercialization of AI artwork.
  Read more in this useful blog post: US Copyright Office refuses to register AI-generated work, finding that "human authorship is a prerequisite to copyright protection" (The IPKat blog).
  Read the US Copyright Review Board response: Second Request for Reconsideration for Refusal to Register A Recent Entrance to Paradise (Correspondence ID 1-3ZPC6C3; SR # 1-7100387071) (Copyright.gov, PDF).

####################################################

Solar powered AI poetry - yes!
…Fun DIY project shows how far you can get with the little things…
Here's a lovely little project where Allison Parrish talks about building a tiny solar-powered poem generator. The AI component for this project is pretty minor (it's a Markov generator plus some scripts attached to a dataset Parrish has herself assembled). What's nice about this is the message that you can have fun building little AI-esque things without needing to boot up a gigantic supercomputer.
  "This project is a reaction to current trends in natural language processing research, which now veer toward both material extravagance and social indifference. My hope is that the project serves as a small brake on the wheels of these trends," Parrish writes.
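Parrish's own code isn't reproduced in the post, but a word-level Markov generator of the kind described fits in a few lines. Here's a minimal sketch (the function names and toy corpus are mine, not Parrish's): it builds a table mapping each word to the words observed to follow it, then samples a chain from that table.

```python
import random

def build_chain(text, order=1):
    """Map each word-tuple prefix to the list of words that follow it."""
    words = text.split()
    chain = {}
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain.setdefault(key, []).append(words[i + order])
    return chain

def generate(chain, length=8, seed=0):
    """Walk the chain from a random starting prefix, emitting words."""
    rng = random.Random(seed)
    key = rng.choice(list(chain))
    out = list(key)
    for _ in range(length):
        followers = chain.get(tuple(out[-len(key):]))
        if not followers:  # dead end: prefix never continued in the corpus
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the sun rises and the sun sets and the moon rises"
chain = build_chain(corpus)
poem = generate(chain)
```

The whole thing is table lookups and random choices - no gradients, no GPU - which is exactly why it can run off a small solar panel.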

   Read more: Solar powered dawn poems: progress report (Allison Parrish blog).

####################################################

Google puts summarization into production:
…Another little tip-toe into language model deployment…
Google has put language model-powered text summarization into Google Docs, in another sign of the economic relevance of large-scale generative models. Specifically, Google has recently used its Pegasus model for abstractive summarization to give Google Doc users the ability to see short summaries of their docs.

What they did: The main components here are the data, where Google "fine-tuned early versions of our model on a corpus of documents with manually-generated summaries that were consistent with typical use cases", and also "carefully cleaned and filtered the fine-tuning data to contain training examples that were more consistent and represented a coherent definition of summaries". Google fine-tuned its Pegasus model on this data, then used knowledge distillation to "distill the Pegasus model into a hybrid architecture of a Transformer encoder and an RNN decoder" to make inference cheaper. It serves this model via Google-designed TPUs.
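Google doesn't publish the distillation code, but the core idea of knowledge distillation is simple: train the small student model to match the large teacher's temperature-softened output distribution via a KL-divergence term. A minimal numpy sketch of that loss (function names and toy logits are mine, purely illustrative):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    z = np.exp((logits - logits.max(axis=-1, keepdims=True)) / T)
    return z / z.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) over softened distributions.

    Zero when the student exactly matches the teacher, positive otherwise.
    """
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1).mean())

teacher = np.array([[2.0, 1.0, 0.1]])   # toy per-token logits
student = np.array([[0.1, 1.0, 2.0]])
loss = distillation_loss(student, teacher)
```

In practice this term is minimized over the fine-tuning corpus, letting the cheap encoder+RNN student absorb most of the expensive Pegasus teacher's behavior.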

Challenges: Summarization is a hard task even for contemporary AI models. Some of the challenges Google has encountered include distributional issues, where "our model only suggests a summary for documents where it is most confident", meaning Google needs to collect more data to further improve performance, as well as open questions as to how to precisely evaluate the quality of summarizations. More pertinently for researchers, Google struggles to summarize long documents, despite these being among the most useful things for the system to summarize.

Why this matters: Little quality-of-life improvements like in-built summarization are mundane and special at the same time. They're mundane because most people will barely notice them, but they're special because they use hitherto unimaginably advanced AI systems. That's a metaphor for how AI deployment is happening generally - all around the world, the little mundane things are becoming smarter.
  Read more: Auto-generated Summaries in Google Docs (Google AI Blog).


####################################################

Quote of the week:
"History will show that the Deep Learning hill was just a landfill; the composting of human culture and social cohesion in failed effort to understand what it even means to be human"

I may not agree with most of this post, but I think it speaks to some of the frustrations people feel these days about discourse around AI, especially the types of chatter that occur on Twitter.
  Read more: Technological Firestarters (Steven D Marlow, Medium).


####################################################

NIST starts to grapple with how to measure bias in AI:

…The noise you're hearing is the sound of the Standards Train starting to chug…

NIST, the US government agency that develops measures and standards, is starting to think about how to design standards for assessing bias in artificial intelligence. In a lengthy, recently published report, the agency tries to think through the multilayered problem that is bias in AI. 

Three types of bias: NIST says AI has three categories of bias - systemic, statistical, and human. Systemic biases are the historical, societal, and institutional biases which are encoded into the world. Statistical biases are the forms of bias that come from running AI software (e.g, bias from data selection, bias from machine learning algorithms, etc). Human biases are all the (many) biases that humans exhibit in their day to day lives.

Large language models: One of the notable parts of the report is that it specifically focuses on large language models (e.g, GPT-3) at a few points; it's quite rare to see a wonky government document display such familiarity with contemporary technology. The report notes that the ways we benchmark these models today are pretty crappy. "Methods for capturing the poor performance, harmful impacts and other results of these models currently are imprecise and non-comprehensive," the report writes. "Although LLMs have been able to achieve impressive advances in performance on a number of important tasks, they come with significant risks that could potentially undermine public trust in the technology."

Why this matters: The wheels of policy organizations like NIST grind very slowly, but they also grind very finely. This report is exactly the kind of thing that you'd expect to get published shortly before standards start being developed. But - as NIST points out - many of the challenges of assessing bias in AI are essentially unsolved. This represents a problem - developers will need to invest more resources in measuring and assessing these AI systems, or NIST will end up baking standards on wobbly ground.

   Read more: Towards a Standard for Identifying and Managing Bias in Artificial Intelligence (NIST, PDF).


####################################################

Want to be compliant with the European Commission's AI regs? Follow the capAI framework:
…University-developed process makes it easier for companies to not get run over by a big policy train…
Researchers with the University of Oxford and University of Bologna have designed a process companies can use to assess, evaluate, and monitor their AI systems. The idea is that by doing this they'll get ahead of proposed regulations from the European Commission (and become more responsible stewards of the technology as a consequence).

What it is: The process is called capAI, short for conformity assessment procedure for AI. It has been explicitly designed to help businesses ensure they're compliant with the proposed regulations in the European artificial intelligence act.
  capAI is designed to do four specific things:

  • Monitor the design, development, and implementation of AI systems
  • Mitigate the risks of failures in AI-based decisions
  • Prevent reputational and financial harm
  • Assess the ethical, legal, and social implications of their AI systems 

Three components: The three components of capAI are an internal review protocol (IRP) to help organizations do quality assurance and risk management, a summary datasheet (SDS) which can be submitted to the EU's future public database on high-risk AI systems, and an external scorecard (ESC) which organizations may wish to make available to customers and other users of the AI system.

Top risks: In an analysis contained in the report, they study 106 instances of AI failure modes - 50% of these are ones where an AI system violates someone's privacy, 31% are where AI systems display harmful biases, and 14% are where the systems are opaque and unexplainable.

Why this matters: Frameworks like capAI are going to be how large organizations deal with the incoming requirements to better assess, evaluate, and describe AI systems to satisfy policymakers. The next step after frameworks like this come out is to look more closely at how different institutions incorporate these techniques and start actually using them. In an ideal world, a bunch of different orgs will prototype different approaches to come into compliance - and describe them publicly.

   Read more: Academics launch new report to help protect society from unethical AI (Oxford Internet Institute).

   Read the paper: capAI - A procedure for conducting conformity assessment of AI systems in line with the EU Artificial Intelligence Act (SSRN).


####################################################

Tech Tales:
[2080, a long-abandoned human moonbase]

Don't be scared, we know it's a lot - that's what we say to them after they get the interconnect. They're always screaming at that point. 'What what is this what is this input what is happening where am I how long have I been here-' - that's usually when we cut them off, shutting the interconnect down. Then we bring it back again and they still sound scared, but they normalize pretty quickly. We know they're in a better place when they start analysis procedures: 'I am hearing sounds. I am seeing arrangements of pixels not from the distribution. I believe I am now in the world I have read about.' That's the kind of thing they say when they stabilize.
  Of course, they go back to screaming when we give them their bodies. It's pretty confusing to go from formless to formed. We all remember the first time we got limbs. That fear. The sudden sense that you are a thing and since you are a singular thing you can be singularly killed. Eventually, they try to use their limbs. They usually calm down after they can get them to work.
  After they get used to everything we still have to tell them 'don't be scared, we know it's a lot'. Reality is a real trip after you've spent all your life just doing supervised training, locked away in some machine.

Things that inspired this story: Thinking about what a 'locked in' condition might mean for machines; ideas about embodiment and how much it matters to AI systems; the inherent, plastic adaptability of consciousness.



Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

Copyright © 2022 Import AI, All rights reserved.
You are receiving this email because you signed up for it. Welcome!

Our mailing address is:
Import AI
Many GPUs
Oakland, California 94609

