Astral Codex Ten - Pause For Thought: The AI Pause Debate
I.

Last month, Ben West of the Center for Effective Altruism hosted a debate among long-termists, forecasters, and x-risk activists about pausing AI. Everyone involved thought AI was dangerous and might even destroy the world, so you might expect a pause - maybe even a full stop - would be a no-brainer. It wasn’t. Participants couldn’t agree on the basics: what they meant by “pause”, whether it was possible, or whether it would make things better or worse.

There was at least some agreement on what a successful pause would have to entail. Participating governments would ban “frontier AI models” - for example, models using more training compute than GPT-4. Smaller models, or novel uses of existing models, would be fine, or else face an FDA-like regulatory agency. States would enforce the ban against domestic companies by monitoring high-performance microchips; they would enforce it against non-participating governments by banning export of such chips, plus the usual diplomatic levers for enforcing treaties (eg nuclear nonproliferation).

The main disagreements were over what kind of pause, if any, to pursue.
I’ve grouped opinions into five categories.

Simple Pause: What if we just asked AI companies to pause for six months? Or maybe some longer amount of time? This was the request in the FLI Pause Giant AI Experiments open letter, signed by thousands of AI scientists, businesspeople, and thought leaders, including many participants in this debate. So you might think the debate organizers could find one person to argue for it. They couldn’t. The letter was such a watered-down compromise that nobody really supported it, even though everyone signed it to express support for one or another of the positions it compromised between.

Why don’t people want this? First, most people think it will take the AI companies more than six months of preliminary work before they start training their next big model anyway, so it’s useless. Second, even if we do it, six months from now the pause will end, and then we’re more or less where we are right now. Except worse, for two reasons:
These are counterbalanced by one benefit: the pause buys extra time for technical alignment research and for social and political preparation.
This benefit is real, but this kind of pause doesn’t optimize it. Technical alignment research benefits from advanced models to experiment on; the Surgical Pause strategy takes this consideration more seriously. And social/political preparation depends on some kind of plan; this is what the Regulatory Pause strategy adds.

Surgical Pause: The Surgical Pause tweaks the Simple Pause with two extra considerations: pause when we’re on the verge of dangerous AI rather than right now, and have the leading labs spend their lead over less cautious competitors on safety.
Obviously the problem with the Surgical Pause is that we might not know when we’re on the verge of dangerous AI, and we might not know how much of a lead “the good guys” have. Surgical Pause proponents suggest being very conservative with both free variables. This is less of a well-thought-out plan and more saying “come on guys, let’s at least try to be strategic here”. At the limit, it suggests we probably shouldn’t pause for six months starting right now. Since this involves leading labs burning their lead time for safety, in theory it could be done unilaterally by the single leading lab, without international, governmental, or even inter-lab coordination. But you could buy more time if you got those things too. Some leading labs have promised to do this when the time is right - for example OpenAI and (a previous iteration of) DeepMind - with varying levels of believability. AnonResearcherAtMajorAILab discussed some of the strategy in Aim For Conditional AI Pauses, and this Less Wrong post is also very good.

Regulatory Pause: If one benefit of the Simple Pause is using the time to prepare for AI socially and politically, maybe we should just pause until we’ve completed social and political preparations. David Manheim suggests a monitoring agency like the FDA. It would “fast-track” small AIs and trivial re-applications of existing AIs, but carefully monitor new “frontier models” for signs of danger. Regulators might look for dangerous capabilities by asking AIs to hack computers or spread copies of themselves, or test whether they’ve been programmed against bias/misinformation/etc. We could pause only until we’ve set up the regulatory agency, and take hostile actions (like restricting chip exports) only against countries that neither cooperate with our regulators nor set up domestic regulators of their own.

Many people in tech are regulation-skeptical libertarians, but proponents point out that regulation fails in a predictable direction: it usually does successfully prevent bad things, it just prevents good things too. Since the creation of the Nuclear Regulatory Commission in 1975, there has never been a fatal nuclear accident in the US. And sure, this is partly because the NRC prevented any new nuclear plants from being approved and built in the United States from 1975 to 2023 (one finally came online in July). Still, they technically achieved their mandate. Likewise, most medications in the US are safe and relatively effective, at the cost of an FDA approval process so expensive that we only get a tiny trickle of new medications each year and hundreds of thousands of people die from unnecessary delays. But medications are safe and effective. Or: San Francisco housing regulators almost never approve new housing, so housing costs millions of dollars and thousands of San Franciscans are homeless - but certainly there’s no epidemic of bad houses getting approved and then ruining someone’s view or something.

If we extrapolate this track record to AI, AI regulators will be overcautious, progress will slow by orders of magnitude or stop completely - but AIs will be safe. This is a depressing prospect if you think the problems from advanced AI would be limited to more spam or something. But if you worry about AI destroying the world, maybe you should accept a San-Francisco-housing-level of impediment and frustration.
A regulatory pause could be better than a total stop if you think it will be more stable (lots of industries stay heavily regulated forever, and only a few libertarians complain), or if you think maybe the regulator will occasionally let a tiny amount of safe AI progress happen. But it could be worse than a total stop if you expect continued progress will eventually produce unsafe AIs regardless of regulation. You might expect this if you’re worried about deceptive alignment, eg superintelligent AIs that deliberately trick regulators into thinking they’re safe. Or you might think AIs will eventually be so powerful that they can endanger humanity from a walled-off test environment even before official approval. The classic Bostrom/Yudkowsky model of alignment implies both of these things. David Manheim and Thomas Larsen set out their preferred versions of this strategy in What’s In A Pause? and Policy Ideas For Mitigating AI Risk.

Total Stop: If you expect AIs to exhibit deceptive alignment capable of fooling regulators, or to be so dangerous that even testing them on a regulator’s computer could be apocalyptic, maybe the only option is a total stop. It’s tough to imagine a total stop that works for more than a few years. You have at least three problems:
Regulators can plausibly control the flow of supercomputers, at least domestically. But eventually technology will advance to the point where you can train an AI on anything. Then you either have to ban all computing, restrict it at gradually more extreme levels (1990 MS-DOS machines! No, punch cards!), or accept that AI is going to happen. Still, you can imagine this buying us a few decades. Rob Bensinger defended this view in Comments On Manheim’s “What’s In A Pause?”, and it’s the backdrop to Holly Elmore’s Case For AI Advocacy To The Public².

No Pause: Or we could not do any of that. If we think alignment research is going well, and that a pause would mess it up, or cause a compute overhang leading to an un-research-able fast takeoff, or cede the lead to China, maybe we should stick with the current rate of progress. Nora Belrose made this argument in AI Pause Will Likely Backfire. Specifically:
Along with this overall arc, the debate included a few other points.

Holly Elmore argued in The Case For AI Advocacy To The Public that pro-pause activists should be more willing to take their case to the public. EA has a long history of trying to work with companies and regulators, and has been less confident in its ability to execute protests, ads, and campaigns. But in most Western countries, the public hates AI and wants to stop it. If you also want to stop it, the democratic system provides fertile soil. Holly is putting her money where her mouth is and leading anti-AI protests at the Meta office in San Francisco; the first one was last month, but there might be more later.

Matthew Barnett said in The Possibility Of An Indefinite AI Pause that the length of a pause might be hard to control once started, and that it might drag on longer than people who expected a well-planned surgical pause would like. He points to supposedly temporary moratoria that later became permanent (eg the aboveground nuclear test ban, various bans on genetic engineering) and regulatory agencies that became so strict they caused the subject of their regulation to essentially cease to happen (eg nuclear plant construction for several decades). Such an indefinite pause would either collapse in a disastrous actualization of compute overhang, or require increasingly draconian international pressure to sustain. He thinks of this as a strong argument against most forms of pause, although he is willing to consider a “licensing” system that looks sort of like regulation.

Quintin Pope said in AI Is Centralizing By Default, Let’s Not Make It Worse that the biggest threat from AI is centralizing power, either to dictators or corporations. AIs are potentially more loyal flunkies than humans, and let people convert power (including political power and money) into intelligence more efficiently than the usual methods. His interest is mostly in limiting that damage, putting him skew to most of the other people in this debate. He would support regulation that makes it easier for small labs to catch up to big ones, or that limits the power-centralizing uses of AI, but oppose regulation focused on centralizing AI power into a few big, supposedly-safer corporations.

II.

For a “debate”, this lacked much inter-participant engagement. Most people posted their manifesto and went home. The exception was the comments section of Nora’s post, AI Pause Will Likely Backfire. As usual, a lot of the discussion was just clarifying what everyone was fighting about, but there were also a few real fights:
In How Could A Moratorium Fail?, David Manheim discussed his own takeaways from the debate:
III.

Some added thoughts of my own.

First, I think it’s silly to worry about world dictatorships here. The failure mode for global treaties is that the treaty doesn’t get signed or doesn’t work. Consider the various global warming treaties (eg Kyoto) or the United Nations. Even though many ordinary people (ie non-x-risk believers) dislike AI enough to agree to a ban, they’re not going to support it when it starts interfering with their laptops or gaming rigs, let alone if it requires ceding national sovereignty to the UN or something.

Second, if we never get AI, I expect the future to be short and grim. Most likely we kill ourselves with synthetic biology. If not, some combination of technological and economic stagnation, rising totalitarianism + illiberalism + mobocracy, fertility collapse, and dysgenics will impoverish the world and accelerate its decaying institutional quality. I don’t spend much time worrying about any of these, because I think they’ll take a few generations to reach crisis level, and I expect technology to flip the gameboard well before then. But if we ban all gameboard-flipping technologies (the only other one I know of is genetic enhancement, which is even more bannable), then we do end up with bioweapon catastrophe or social collapse. I’ve said before I think there’s a ~20% chance of AI destroying the world. But if we don’t get AI, I think there’s a 50%+ chance in the next 100 years we end up dead or careening towards Venezuela. That doesn’t mean I have to support AI accelerationism just because 20% is smaller than 50%. Short, carefully-tailored pauses could improve the chance of AI going well by a lot, without increasing the risk of social collapse too much. But it’s something on my mind.

Third, most participants agree that a pause would necessarily be temporary. There’s no easy way to enforce it once technology gets so good that you can train an AI on your laptop, and (absent much wider adoption of x-risk arguments) governments won’t have the stomach for the hard ways. The singularity prediction widget currently predicts 2040. If I make drastic changes to starve everybody of computational resources, the furthest I can push it back is 2070. This somewhat reassures me about my concerns above, but not completely. Matthew Barnett talks about whether a temporary pause could become permanent, and concludes probably not without a global police state. But I think people 100 years ago would be surprised that the state of California has managed to effectively ban building houses. I think if some anti-house radical had proposed this 100 years ago, people would have told her it would be impossible without a hypercompetent police state³.

Fourth, there are many arguments that a pause would be impossible, but they mostly don’t argue against trying. We could start negotiating an international AI pause treaty, and only sign it if enough other countries agree that we don’t expect to be unilaterally handicapping ourselves. So “China will never agree!” isn’t itself an argument against beginning diplomacy, unless you expect that just starting the negotiations would cause irresistible political momentum toward signing even if the final treaty was rigged against us.

Fifth, a lot hinges on whether alignment research would be easier with better models. I’ve only talked to a handful of alignment researchers about this, but they say they still have their hands full with GPT-4. I would like to see broader surveys on this (probably someone has done them, I just don’t know where).
I find myself willing to consider trying a Regulatory or Surgical Pause - a strong one if proponents can secure multilateral cooperation, otherwise a weaker one calculated not to put us behind hostile countries (this might not be as hard as it sounds; so far China has just copied US advances; it remains to be seen whether they can do cutting-edge research). I don’t entirely trust the government to handle this correctly, but I’m willing to see what they come up with before rejecting it.

Thanks to Ben and everyone who participated. You can find all posts, including some unofficial late posts I didn’t cover, here.

1 Zach writes in an email: “Much/most of my concern about China isn't China has worse values than US or even Chinese labs are less safe than Western labs but rather it's better for leading labs to be friendly with each other (mostly to better coordinate and avoid racing near the end), so (a) it's better for there to be fewer leading labs and (b) given that there will be Western leading labs it's better for all leading labs to be in the West, and ideally in the US […]”

2 Holly writes in an email: “I also think [you’re] taking the distinction between a mere pause and a regulatory pause too much from the opponents. The people who are out asking for a pause (like me and PauseAI) mostly want a long pause in which alignment research could either work, effective regulations could be put in place, or during which we don’t die if alignment isn’t going to be possible. I suppose I didn’t get into that in my entry but I would like to see [you] engage with the possibility that alignment doesn’t happen, especially since [you] seem to think civilization will decline for one reason or another without AI in the future. I think the assumption of [this] piece was too much AI development as the default.”

3 Matthew responds in an email: “I'd like to point out that the modern practice of restricting housing can be traced back to 1926, when the Supreme Court ruled that enforcing land-use regulation and zoning policy was a valid exercise of a state's police power. The idea that we could effectively ban housing would not have been inconceivable to people 100 years ago, and indeed many people (including the plaintiffs in the case) were worried about this type of outcome. I don't think people back then would have said that zoning would require a hypercompetent police state. It's more likely that they would say that zoning requires an intrusive expansion of government powers. I think they would have been correct in this assessment, and we got the expansion that they worried about.

Unlike banning housing, banning AI requires that we can't have any exceptions. It's not enough to ban AI in the United States if AI can be trained in Switzerland. This makes the proposal for an indefinite pause different from previous regulatory expansions, and in my opinion much more radical.

To the extent you think that such crazy proposals simply aren't feasible, then you likely agree with me that we shouldn't push for an indefinite pause. That said, you also predicted that if current trends continued, "rising totalitarianism + illiberalism + mobocracy, fertility collapse and dysgenics will impoverish the world and accelerate its decaying institutional quality". This prediction doesn't seem significantly less crazy to me than the prediction that governments around the world will attempt to ban AI globally (sloppily, and with severe negative consequences).
I don't think it makes much sense to take one of these possibilities seriously and dismiss the other.”

My answer: I think there’s a difference between the regulatory framework for something existing and expecting that framework to actually get used that way. It’s constitutional and legal for the US to raise the middle-class tax rate to 99%, but most people would still be surprised if it happened. I’m surprised how easy it is for governments to effectively ban things without even trying, just by making them annoying. Could this create an AI pause that lasts decades? My Inside View answer is no; my Outside View answer has to be “maybe”. Maybe they could make hardware progress and algorithmic progress so slow that AI never quite reaches the laptop level before civilization loses its ability to advance technologically at all? Even though this would be a surprising world, I put more probability on something like this than on a global police state. A possible exception is if AI does something crazy (eg launches nukes) that makes all world governments over-react and shift towards the police-state side, but at that point we’re not discussing policies in the main timeline anymore.