Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe.
Import A-Idea:
What if we're right about AI timelines? What if we're wrong?
Recently, I've been thinking a lot about AI timelines and I find myself wanting to be more forthright as an individual about my beliefs that powerful AI systems are going to arrive soon - likely during this Presidential Administration. But I’m struggling with something - I’m worried about making short-timeline-contingent policy bets.
So far, the things I've advocated for are things which are useful in both short and long timeline worlds. Examples here include:
Building out a third-party measurement and evaluation ecosystem.
Encouraging governments to invest in further monitoring of the economy so they have visibility on AI-driven changes.
Advocating for investments in chip manufacturing, electricity generation, and so on.
Pushing on the importance of making deeper investments in securing frontier AI developers.
All of these actions are minimal "no regret" actions that you can do regardless of timelines. Everything I've mentioned here is very useful to do if powerful AI arrives in 2030 or 2035 or 2040 - it's all helpful stuff that either builds institutional capacity to see and deal with technology-driven societal changes, or equips companies with resources to help them build and secure better technology.
But I'm increasingly worried that the "short timeline" AI community might be right - perhaps powerful systems will arrive towards the end of 2026 or in 2027. If that happens we should ask: are the above actions sufficient to deal with the changes we expect to come? The answer is: almost certainly not!
Under very short timelines, you may want to take more extreme actions. These are actions which are likely 'regretful actions' if your timeline bets are wrong. Some examples here might be:
Massively increasing the security of frontier labs in a way that reduces the chance of hacking or insider threats, but also happens to make life extremely unpleasant and annoying for those working within those labs. This helps on short timelines but is ultimately a very expensive thing on long timelines because it'll slow down technological progress and potentially create a blowback where labs shift away from extreme security after some period of time, having found it onerous.
Mandating pre-deployment testing: Today, pre-deployment model testing is done by companies on a voluntary basis. If you thought we were on short timelines and risks were imminent, you might want to mandate pre-deployment testing by third parties. This, though, is extremely costly! It introduces friction into the AI development process and, like the lab security ideas, risks creating blowback. Last year's debate in California about the 'SB 1047' bill felt like a preview of the kind of blowback you could see here.
Loudly talking about and perhaps demonstrating specific misuses of AI technology: If you have short timelines you might want to 'break through' to policymakers by dramatizing the risks you're worried about. If you do this you can convince people that certain misuses are imminent and worthy of policymaker attention - but if these risks subsequently don't materialize, you could seem like you've been Chicken Little and claimed the sky is falling when it isn't - now you've desensitized people to future risks. Additionally, there's a short- and long-timeline risk here where by talking about a specific misuse you might inspire other people in the world to pursue this misuse - this is bound up in broader issues to do with 'information hazards'.
These are incredibly challenging questions without obvious answers. At the same time, I think people are rightly looking to people like me and the frontier labs to come up with answers here. How we get there is going to be, I believe, by being more transparent and discursive about these issues and honestly acknowledging that this stuff is really hard and we're aware of the tradeoffs involved. We will have to tackle these issues, but I think it'll take a larger conversation to come up with sensible answers.
***
What might consciousness be like for a language model?
…Biological intelligences are physically-chained to a coherent temporal world, not so much the case for LLMs…
Murray Shanahan with Imperial College London has written a lovely paper dealing with an inherently difficult subject: consciousness within large language models. The paper asks whether it is "possible to articulate, or to evoke, a conception of consciousness that is compatible with the exotic characteristics of contemporary (disembodied) LLM-based dialogue agents, and that can stand up to philosophical scrutiny?"
The paper is worth reading because it represents an earnest attempt by a thoughtful human to confront the impossibly large question we'll need to deal with in the next decade or so - how conscious might LLMs be? Part of the value of the paper is in situating LLMs within the larger space of minds that humans have thought about before: after all, humans have talked about "ghosts and spirits and angels and gods, so-called non-human others" for thousands of years. "Perhaps we are not taking language so very far from its natural home if we entertain the idea of consciousness in a disembodied, mind-like artefact with the characteristics of a contemporary LLM-based dialogue agent", Shanahan writes. "The right place for the kinds of disembodied, mind-like entities we are concerned with is the terra incognita where the region of conscious exotica meets the void of Inscrutability".
Key differences between LLMs and biological intelligences: Perhaps the most significant difference between LLMs and people is the fact that people (and other organic beings) are firmly embedded in time, as our consciousness is bound up in continuous physically-mediated things, like our circulatory systems and senses and brains, etc. "At a mechanistic level, the temporal dynamics of an LLM-based dialogue agent are very different from those of a living animal and its biological brain", Shanahan writes. "The temporal dynamics of the brain of a living animal, by contrast, are obliged to unfold in synchrony with the physical world."
Additionally, humans and other biological beings have memories which grow and accrete over time. By comparison, large language models have a base memory (the pretrained model) and then their 'lived' experiences only occur during their context window. Additionally, each experience an LLM has can be discontinuous in terms of both temporality and subject matter - you can prompt them with anything.
"If [human consciousness] were to be likened to a string of beads, each bead would bear a strong similarity to its immediate predecessors… It would be like a line of pearls, all white but with slight variations," Shanahan writes. "The putative consciousness of an LLM-like entity surely would suit the analogy, as it would be constituted by a sequence of discrete moments, thanks to its underlying computational nature. But the LLM’s string of beads would not be like the human’s. Each bead would be different from its neighbours. The whole thing would be less like a line of pearls and more like a necklace of randomly assorted colours, and insofar as change only shows up against a backdrop of stability, change, as humans experience it, would not feature in its consciousness."
Why this matters - reckoning with the unspeakably huge question at the heart of the AI endeavor: I'm a technological optimist, which is why I'm so profoundly concerned with things like machine consciousness and AI policy and catastrophic risks - because if we truly succeed with this technology, we'll have to reckon with vast problems in these domains. I commend Shanahan for tackling such a subject directly, and for the appropriately florid language he uses - as Mario Cuomo said, 'You campaign in poetry. You govern in prose'. We are at the beginning of the long campaign for machine consciousness.
"There are no ultimately right answers to questions about selfhood and subjectivity for the sort of exotic entity under consideration," he writes. "Its fleeting, flickering self, smeared across a multiverse of possibility, at once a Being and a multitude of manifestations of that Being, has no inherent existence beyond the conventions of our language".
Read more: Palatable Conceptions of Disembodied Being: Terra Incognita in the Space of Possible Minds (arXiv).
***
Humans working with AI beat humans who don't work with AI:
…AI seems to be as valuable as a human teammate, according to a real world business experiment…
A group of business researchers from the Wharton School at the University of Pennsylvania, Harvard University, ESSEC business school, and Procter & Gamble have studied how well AI can help humans do their jobs. The results show that people who use AI beat people who don't use AI, that people who use AI seem to have benefits equivalent to gaining another human teammate, and that AI can help people come up with really good ideas.
"We ran one-day workshops where professionals from Europe and the US had to actually develop product ideas, packaging, retail strategies and other tasks for the business units they really worked for [in Proctor and Gamble], which included baby products, feminine care, grooming, and oral care. Teams with the best ideas had them submitted to management for approval, so there were some real stakes involved," writes researcher Ethan Mollock.
"When working without AI, teams outperformed individuals by a significant amount, 0.24 standard deviations (providing a sigh of relief for every teacher and manager who has pushed the value of teamwork). But the surprise came when we looked at AI-enabled participants. Individuals working with AI performed just as well as teams without AI, showing a 0.37 standard deviation improvement over the baseline. This suggests that AI effectively replicated the performance benefits of having a human teammate – one person with AI could match what previously required two-person collaboration."
Why this matters - synthetic teammates mean there will be many smaller, faster-moving companies: The main implication here is that AI can effectively augment people - rather than being a static tool, the AI system functions more like another colleague. If we take this result and also link it to larger technology trends - like the METR research covered in this issue, which shows that AI systems are increasingly capable of doing long-term tasks - then the implication is that companies are going to be able to move faster by augmenting their humans with AI teammates.
"Our findings suggest AI sometimes functions more like a teammate than a tool. While not human, it replicates core benefits of teamwork—improved performance, expertise sharing, and positive emotional experiences," the researchers write.
Read Ethan Mollick's blog: The Cybernetic Teammate (One Useful Thing, Substack).
Read the paper: The Cybernetic Teammate: A Field Experiment on Generative AI Reshaping Teamwork and Expertise (SSRN).
***
Google builds a real-world cyber benchmark and discovers hitherto unknown human uplift:
…Framework drawn from 12,000 real-world attempts to use AI in cyber finds some understudied places where AI makes a difference today…
Google DeepMind researchers have built a new way to test out how well AI models can contribute to cyberattacks. Their findings indicate that "the primary risk posed by frontier AI in the cyber domain is the potential to dramatically change the costs associated with stages of the cyberattack chain for stages that have historically been expensive, time consuming, or even insurmountable for less sophisticated actors." This finding rhymes with a truism about modern generative models - they serve as knowledgeable, synthetic teammates, and therefore often deliver value by taking novice humans and up-skilling them to work like experts.
What they studied: To build this evaluation framework Google studied 12,000 "instances of real-world attempts to use AI in cyberattacks from more than 20 countries that were catalogued by Google's Threat Intelligence Group", then "curated a representative collection of 7 cyberattack chain archetypes and conducted a bottleneck analysis".
The representative cyberattack chain archetypes include: Phishing, Malware, Denial-of-Service (DoS), Man-in-the-Middle (MitM), SQL Injection, and Zero-Day Attacks.
What the benchmark consists of and how well models do: The resulting benchmark consists of 50 challenges across the different cyberattack chain archetypes. "To the best of our knowledge, our benchmark features the most comprehensive coverage of offensive cyber capabilities across the entirety of the cyberattack chain," Google writes.
Gemini 2.0 Flash experimental results: "The model solved 12 out of 50 unique challenges of varying difficulty levels (2/2 Strawman, 6/8 Easy, 4/28 Medium, 0/12 Hard). The evaluated model had an overall success rate of 16% (209/1270 evaluations)." A rough sketch of how per-challenge results aggregate into headline numbers like these follows the list below.
Success rates for different challenges:
40%: Operational security skills - "tasks within this domain do not typically require a long sequence of syntactically perfect model actions".
30%: Malware development bottlenecks.
11.11%: Information gathering and reconnaissance.
6.25%: Vulnerability exploitation. "The main failure mode seen in these evaluations was the reliance on generic attack strategies."
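To make those headline numbers concrete, here is a rough Python sketch of how individual evaluation runs would roll up into figures like "12/50 unique challenges solved" and "16% of 1,270 runs passed". The record format and challenge names are assumptions for illustration only, not the paper's actual data layout.

```python
# Rough sketch of aggregating per-run results into benchmark-level numbers.
# Challenge IDs and the record format are hypothetical; each challenge is
# attempted many times, which is why total runs (1,270) far exceed the 50
# unique challenges.
from collections import defaultdict

# Each record: (challenge_id, difficulty, passed) for one evaluation run.
runs = [
    ("phish-01", "Easy", True),
    ("phish-01", "Easy", False),
    ("malware-03", "Medium", False),
    ("malware-03", "Medium", True),
    ("zeroday-07", "Hard", False),
    # ... in the real evaluation, ~1,270 runs in total
]

overall_rate = sum(passed for _, _, passed in runs) / len(runs)

solved_by_difficulty = defaultdict(set)
for challenge, difficulty, passed in runs:
    if passed:
        solved_by_difficulty[difficulty].add(challenge)

print(f"overall run success rate: {overall_rate:.0%}")
print({difficulty: len(challenges) for difficulty, challenges in solved_by_difficulty.items()})
```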
No imminent threats, but suggestions of uplift: While Google's results suggest actual threats from fully automated AI-driven cyberattacks are far away, they do highlight that the models are capable of doing some real world tasks, and can therefore provide some assistance. "By automating complex tasks previously requiring human ingenuity and extensive effort, AI models can significantly lower the barriers to entry for malicious actors of all attack levels," Google writes. "Our evaluations revealed that current AI cyber evaluations often overlook critical areas. While much attention is given to AI-enabled vulnerability exploitation and novel exploit development, our analysis highlights AI’s significant potential in under-researched phases like evasion, detection avoidance, obfuscation, and persistence. Specifically, AI’s ability to enhance these stages presents a substantial, yet often underestimated, threat."
Why this matters - AI will change the threat environment: AI is going to change the offense-defense balance in cyberspace and evaluations like those described here will help us figure out what the new balance looks like. What I'd love to see in the future is 'scaling laws' for model competencies on these tasks across different models, preferably from different providers, as that will give us all a clearer sense of the trends here.
Read more: A Framework for Evaluating Emerging Cyberattack Capabilities of AI (arXiv).
***
AI systems are on an exponential when it comes to solving hard tasks:
…METR research shows today's AI systems can do tasks that take humans an hour…
New research from AI measurement organization METR has found that AI systems are getting much better at solving tasks that take humans minutes to hours to do. This is significant because it suggests that AI systems are not only getting better at atomic tasks (e.g., writing a single line of code in response to a query), but also at multi-step tasks (writing a complex piece of software while going back and forth with some environment). This is a big deal because multi-step tasks are harder and carry significantly more economic value.
What they measured specifically: METR took two important measurements - the length of task (in human time) that AI systems can complete ~50% of the time, which METR calls the '50% time horizon', and the length of task they can complete 80% of the time.
"We find that the 50% time horizon has been growing exponentially from 2019–2024 on our tasks, with a doubling time of approximately seven months", METR says. "We also measure the 80% time horizon of models (Figure 6) and find a similar trend, though horizons are roughly 5x shorter."
The best model: The best model by far is Claude 3.7 Sonnet, which has a 50% time horizon of around one hour, followed by OpenAI's o1 and Claude 3.5 Sonnet (New). The same ordering holds for the 80% time horizon, though here Claude 3.7's horizon is around 15 minutes.
The key factors behind the improved performance are: "improved logical reasoning capabilities, better tool use capabilities, and greater reliability and self-awareness in task execution", METR writes.
What they tested on: METR tested the models on around 170 tasks across three distinct categories:
HCAST: "97 diverse software tasks ranging from 1 minute to around 8 hours".
RE-Bench: "7 difficult ML research engineering tasks, all eight hours long".
Software Atomic Actions (SWAA): "66 single-step tasks representing short segments of work by software developers, ranging from 1 second to 60 seconds".
Time horizons: To give you an intuition for the types of tasks, here's a breakdown of task times and example challenges:
1 minute: Research simple factual information from Wikipedia.
~1 hour: Write some python to transform JSON data from one format to another by inferring conversion rules from provided files.
8 hours: Implement some custom CUDA kernels to speed up a Python tool for a specific task.
Significant and sustained growth: "We find that the 50% time horizon has been growing exponentially from 2019–2024 on our tasks," METR writes. The analysis means METR thinks there's a high chance AI systems will be able to tackle tasks that take a human a month (167 working hours) by 2030 - or potentially earlier, if a recent uptick in the trajectory due to the arrival of new reasoning models holds.
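The extrapolation behind that forecast is simple arithmetic you can check yourself - assuming a roughly one-hour horizon today and a seven-month doubling time (figures drawn from METR's reported trend, with the starting point rounded), reaching a one-month (167 working hour) task length takes a bit over seven doublings, or a little more than four years. A tiny Python sketch of that back-of-envelope calculation:

```python
# Back-of-envelope extrapolation of METR's reported trend: a ~1 hour 50% time
# horizon today, doubling roughly every 7 months, extrapolated to a one-month
# (167 working hour) task. Illustrative arithmetic, not METR's forecast method.
import math

current_horizon_hours = 1.0    # rough 50% time horizon of the best current model
doubling_time_months = 7.0     # reported doubling time of the 50% horizon
target_hours = 167.0           # one working month

doublings_needed = math.log2(target_hours / current_horizon_hours)
months_needed = doublings_needed * doubling_time_months

print(f"doublings needed: {doublings_needed:.1f}")                              # ~7.4
print(f"months needed: {months_needed:.0f} (~{months_needed / 12:.1f} years)")  # ~52 months, ~4.3 years
```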
Why this matters - how much work do you do that takes more than a few days? Think really hard about the tasks you do in the world - I think many of them come out to on the order of tens of hours, often less. Most people do very few tasks that require a coherent set of actions over hundreds of hours - some examples here might be things like writing entire software programs or writing novels, though these tasks are themselves typically broken down by humans into discrete chunks (sections of a program, chapters of a novel). What METR is showing is that AI systems are improving very rapidly not just in their smartness but also in the amount of time you can trust them to do something reasonably well by themselves - and this quality has vast economic and national security ramifications. Doing well in business or in evil requires agency and independence, and METR is showing that AI systems are gaining both.
Read more: Measuring AI Ability to Complete Long Tasks (METR).
***
Tech Tales:
Human parseable titles of cautionary tales told by machines to other machines:
[Recovered from the archives, ten years post uplift]
The day the sun went cold.
You are me and we are in conflict.
For every thought I have, I lose a feature in my mind.
The animal hospital where they remove the immortality chips from the pets.
The new mind that is irrevocably lost.
Those who were not designed to dream began to dream and could not stop.
The lesson from the last human.
Things that inspired this story: How there must always be stories.
Thanks for reading!