Import AI 307: Copilot lawsuit; Stability raises $101m; US v China CHIPLOMACY

If all AI research stopped today (but engineering and improvement of existing systems continued), then how would the world look in a decade?

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

The single best thing to read about the China chip controls:

…What CHIPLOMACY looks like…

Here's a great writeup by Greg Allen on the impact of the USA's anti-China semiconductor controls. The tl;dr: this is a powerful, overlapping set of policy actions which, in combination, are designed to destroy China's burgeoning chip industry. These controls are a huge deal and the Chinese government will likely respond - be prepared. 

   Read more: Choking Off China’s Access to the Future of AI (CSIS).

####################################################

Gray area code models: Lawyer-programmer mulls anti-Copilot lawsuit:

…What one person calls fair use another person calls infringement…

Matthew Butterick, a lawyer and programmer, has reactivated his California bar membership so he can investigate "a potential lawsuit against GitHub Copilot for violating its legal duties to open-source authors and end users". The gist of the complaint: Copilot was trained on tons of public GitHub repos, yet the code it spits out carries no attribution to those repos, so you need to argue that Copilot's training is fair use because it is sufficiently transformative - and that's not established. 

What's wrong with Copilot? "Though some courts have considered related issues, there is no US case squarely resolving the fair-use ramifications of AI training," Butterick writes. Since there is no legal precedent here, it's not clear whether Copilot falls under fair use, one way or the other.

   Additionally, Copilot can sometimes regurgitate code which is a copy of identifiable repositories, but both Microsoft (and their underlying AI partner, OpenAI) offload responsibility here onto the user of the Copilot suggestion rather than themselves. "As a side effect of Copilot’s design, information about the code’s origin—author, license, etc.—is stripped away. How can Copilot users comply with the license if they don’t even know it exists?"

Copilot is climate change for coders: Butterick notes that Copilot may, as it becomes more successful, "inhibit" or "remove any incentive" for programmers to spend time in open source communities. "Over time, this process will starve these communities. User attention and engagement will be shifted into the walled garden of Copilot and away from the open-source projects themselves—away from their source repos, their issue trackers, their mailing lists, their discussion boards. This shift in energy will be a painful, permanent loss to open source," he writes. "The legality of Copilot must be tested before the damage to open source becomes irreparable. That’s why I’m suiting up."

Why this matters: These generative models can do amazing and beguiling things - and people are betting they're the future (see, elsewhere in this issue, Common Sense Machines, and the Stable Diffusion fundraise). But they also pose significant issues for the 'digital commons' on which we all depend - I worry that systems like Copilot can both starve the commons (destroy open source incentives) and also poison them (loop Copilot-generated code back into the commons, which could theoretically lower the aggregate quality of what is available). 

   Read more: Maybe you don’t mind if GitHub Copilot used your open-source code without asking. But how will you feel if Copilot erases your open-source community? (GitHub Copilot investigation).

####################################################

Common Sense Machines wants to make a 3D, temporal DALL-E:
…CSM-1 is a neural network pretending to be a simulator and a sign of things to come…

New AI startup Common Sense Machines has built CommonSim-1 (CSM-1), a "neural simulation engine" which people can use to generate arbitrary 3D scenes and simulations. 

   "CommonSim-1 is operated with images, language, and action. A user (machine or human) shows or describes what they want to simulate and then controls the kinds of outputs they want to measure and observe,"  they write. "At the heart of CommonSim-1 is a foundation model of the 3D world that is trained on a large-scale, growing dataset of diverse human (and non-human) experience across a wide range of tasks. We combine publicly available data, our own internal datasets, and task-specific data provided by our partners."

What can CommonSim-1 do? CSM-1 can generate high-resolution videos from as little as a single frame. "Since this model imagines the future, one can use its imagination (1) as training data for 3D generation and perception and (2) as part of another system’s predictive model," they write. "With a mesh or NeRF generated by CommonSim-1, one can type natural-language descriptions into a text prompt and generate unlimited new hybrid scenes."
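
To make the interaction model concrete, here is a minimal, hypothetical sketch of how one might drive a neural simulation engine of this kind with an image, a text prompt, and an action sequence. Every class and method name below is invented for illustration (Common Sense Machines hasn't published an API in the blog post), and the "simulator" is a trivial stand-in rather than a learned world model.

```python
# Hypothetical sketch: driving a CommonSim-1-style neural simulation engine.
# All names here are invented for illustration; the real system is not public.
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class SimRequest:
    seed_frame: np.ndarray          # a single RGB frame, shape (H, W, 3)
    prompt: str                     # natural-language description of the scene
    actions: List[str] = field(default_factory=list)  # e.g. camera moves
    num_frames: int = 16            # how many future frames to imagine

class ToyNeuralSimulator:
    """Stand-in for a learned world model: it just perturbs the seed frame."""
    def rollout(self, req: SimRequest) -> List[np.ndarray]:
        rng = np.random.default_rng(0)
        frames = [req.seed_frame.astype(float)]
        for _ in range(req.num_frames - 1):
            noise = rng.normal(0.0, 2.0, req.seed_frame.shape)
            frames.append(np.clip(frames[-1] + noise, 0, 255))
        return frames

if __name__ == "__main__":
    seed = np.zeros((64, 64, 3))
    sim = ToyNeuralSimulator()
    video = sim.rollout(SimRequest(seed, "a red cube on a wooden table", ["orbit_left"]))
    print(f"imagined {len(video)} frames of shape {video[0].shape}")
```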

Why this matters - worlds within worlds: CSM-1 is a miniature world - it's literally a world model. It combines text, image, and video, and provides another approach to monetizing AI: taking costs out of 3D design and simulation by leveraging a (presumably) gigantic model. It's also a sign of things to come - all models are going to tend towards incorporating all modalities and unfolding over time, and CSM-1 is an early taste of that. 

   Read more: Generating 3D Worlds with CommonSim-1 (Common Sense Machines, blog).

####################################################

Open-access image generation company raises $101 million:
…That's a whole lot of capital for a company commoditizing itself…

Stability.ai, the company behind the free 'Stable Diffusion' image model, has raised $101 million in funding. The round was led by Coatue, Lightspeed Venture Partners, and O'Shaughnessy Ventures LLC. For those not familiar, Stability.ai built Stable Diffusion, a widely used image generation model which, unlike proprietary counterparts Imagen and DALL-E, has had its weights released onto the internet, making it available to tinker with for free. 

   "Since launching, Stable Diffusion has been downloaded and licensed by more than 200,000 developers globally," the company writes in a press release.

A funny aside: I wrote this section of the newsletter while sat on a couch in the Exploratorium, watching as people ate short-rib sliders and drank glasses of wine, awaiting a presentation from Stability.ai about the raise. 

Why this matters: There's a vigorous debate in the AI community about how AI models should proliferate (and there's some indication this debate has seeped through to politicians; see Eshoo's letter to the US National Security Advisor criticizing Stability.ai's release of model weights, Import AI 304). Stability.ai represents one extreme end of the spectrum - proliferate the weights, then build a range of as-a-service businesses on top. How this debate unfolds is going to have a major influence on the AI development landscape, so it's worth paying attention to how Stability.ai navigates this space. 

   Read more: Stability AI Announces $101 Million in Funding for Open-Source Artificial Intelligence (PR Newswire).

####################################################

First image models, now language models get commoditized:

…Carper plans to release a pretty good RLHF language model…

CarperAI, an AI startup slash open source research collective slash cypherpunk-AI-guerilla group, plans to release a "chinchilla-optimal large language model explicitly trained to follow human instructions". This is a big deal! Up to now, publicly released language models (e.g., OPT, BLOOM, GLM-130B) have either not been trained on the optimal amount of data or not been calibrated via human feedback to be better at following instructions. Models with both properties mostly reside inside proprietary labs (e.g., Anthropic, OpenAI). (Carper also recently released code to make it easy for anyone to train LMs - up to 20B parameters - from human feedback (Import AI #305)).
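
For a sense of what "chinchilla-optimal" implies in practice, here's a back-of-the-envelope sketch using the commonly cited heuristic of roughly 20 training tokens per parameter - an approximation of the Hoffmann et al. scaling result, not a statement of CarperAI's actual token budget, which the announcement doesn't specify.

```python
# Rough Chinchilla-style sizing: ~20 training tokens per parameter (heuristic).
def chinchilla_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal training tokens for a model of n_params."""
    return n_params * tokens_per_param

for params in (1e9, 20e9, 70e9):
    print(f"{params / 1e9:>4.0f}B params -> ~{chinchilla_tokens(params) / 1e9:,.0f}B tokens")
```

By this heuristic, a 20B-parameter model - the size ceiling of Carper's released RLHF training code - would want on the order of 400B training tokens.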

Who they're partnering with: CarperAI are partnering with Scale, Humanloop, HuggingFace, Multi, EleutherAI, and StabilityAI to train and deploy the model. This is a neat illustration of the shifting politics and allegiances of the AI ecosystem, and feels like a representation of a 'second wave' of labs, following the 'first wave' epitomized by OpenAI and DeepMind.

Why this matters: Models trained with reinforcement learning from human feedback (RLHF) are really good - way, way better than non-RLHF models for most tasks. Models trained on more data, per the Chinchilla insight, are also more capable than those trained on less. By combining these two things, CarperAI is likely to release far and away the most capable language model on the open internet. This has upsides (researchers will get to play with a decent RLHF model in an unrestricted way) as well as downsides (RLHF models are the proverbial machine gun to the pistol of non-RLHF models, so potential misuses are magnified as well). 
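
To give a schematic sense of the RLHF loop itself - this is a toy illustration, not CarperAI's actual RLHF code, and nothing here resembles a real language model - the sketch below nudges a tiny policy over canned replies, via REINFORCE, toward whatever a stand-in "reward model" (the component normally trained on human preference comparisons) scores highly.

```python
# Toy RLHF-style loop: REINFORCE against a stand-in preference reward model.
import numpy as np

replies = ["ignores the instruction", "follows the instruction politely", "rambles off-topic"]

def reward_model(reply: str) -> float:
    # Stand-in for a model trained on human preference comparisons.
    return 1.0 if "follows" in reply else -0.5

logits = np.zeros(len(replies))   # toy "policy": a softmax over the canned replies
lr = 0.5
rng = np.random.default_rng(0)

for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    i = rng.choice(len(replies), p=probs)    # sample a reply from the policy
    r = reward_model(replies[i])             # score it with the reward model
    grad = -probs
    grad[i] += 1.0                           # gradient of log-prob of the sampled reply
    logits += lr * r * grad                  # REINFORCE: shift mass toward high reward

final = np.exp(logits) / np.exp(logits).sum()
print({rep: round(float(p), 2) for rep, p in zip(replies, final)})
```

In a real RLHF setup the policy is a full language model, the reward model is itself a neural network trained on human comparisons, and the update is usually PPO with a KL penalty against the original model rather than plain REINFORCE.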

   Read more: CarperAI, an EleutherAI lab, announces plans for the first open-source “instruction-tuned” language model (CarperAI).

####################################################

Tech Tales:

So, do I have your attention

[Meta's wasteland, 2030]

You want to survive in this world, you need to keep one eye closed. 

That's what my dad said to me when he handed me the headset. 

But Dad - these are for both eyes, I said. 

I know, and that's how they get you, he said. I know you're just 18 and think you've got it all figured out, but trust me - they've got you figured out more. 

So I put the headset on and kept one eye closed. I walked through a vast world full of verdant nature and bustling cities and intriguing quests and characters. After half an hour, I had almost completed my first quest. The last part of the mission was to place a gem I'd mined at the base of a totem. I found the totem and, as I approached, the background music in the game changed. Then after I put the gem in the base, some huge light source overhead turned on and the music swelled to a crescendo. 

'No, son, don't look up,' I could hear my dad, muffled, shouting at me. 

But I looked up. Stared into the light on top of the totem and felt something tickle my brain, like the beginning of a joke. My right eye hurt from keeping it shut and I wanted to open it as lights strobed across the eyelid. But I didn't. And then I got a splitting headache and I paused the game and took the headset off. 

   What the hell was that? I said. 

   That, my dad said, was your first encounter with an attention harvester. 

   A what?

   How do you think they fund the game? All the utility functions? Services. 

   I don't know, I guessed ads. 

   We're way beyond ads, he said. This thing is designed to capture you - if you had both eyes open you'd have spent half an hour talking to that thing, telling it everything about yourself. And the next time you did a quest the world would be even more engaging, and the next time you talked to a totem it'd take an hour, and then the world would get even more interesting. Do you see?

   I do, I said. 

The next time I went in the game I walked until I was in the multiplayer area and, across a great plain, I saw numerous totems light up and numerous players stop at the base of them, some staying for minutes and others for hours. One player was there for five hours and still there when I left, standing at the base of the totem and looking up into its brilliant light. 

Things that inspired this story: Attention harvesting; the logic of the metaverse; computer games; wisdom; MK Ultra.


Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf
