Import AI 220: Google builds an AI border wall; better speech rec via pre-training; plus, a summary of ICLR papers

If you haven't met me in real life and are curious what I sound like, take a listen to this Skynet Today 'Let's Talk AI' podcast where I talk about one of my major obsessions - measuring and assessing the onward march of AI progress and impact. 

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

Want to measure progress towards AGI? Welcome to a Sisyphean task!
....Whenever we surpass an AGI-scale benchmark, we discover just how limited it really was...
One of the reasons it's so hard to develop general intelligence is that whenever people come close to beating a benchmark oriented around measuring progress towards AGI, we discover just how limited the benchmark was and how far we still have to go. That's the gist of a new blogpost by a "fervent generalist" writing under the pseudonym 'Z', which discusses some of the problems inherent to measuring progress towards advanced AI systems.
  "Tasks we've succeeded at addressing with computers seem mundane, mere advances in some other field, not true AI. We miss that it was work in AI that lead to them," they write. "Perhaps the benchmarks were always flawed, because we set them as measures of a general system, forgetting that the first systems to break through might be specialized to the task. You only see how "hackable" the test was after you see it "passed" by a system that clearly isn't "intelligent"."

So, what should we do? The author is fairly pessimistic about our ability to make progress here, because whenever people define new harder benchmarks, that usually incentivizes the AI community to collectively race to develop a system that can beat the benchmark. "Against such relentless optimization both individually and as a community, any decoupling between the new benchmark and AGI progress will manifest."

Why this matters: Metrics are one of the ways we can orient ourselves with regard to the scientific progress being made by AI systems - and posts like this remind us that any single set of metrics is likely to be flawed or overfit in some way. My intuition is that the way to go is to develop ever-larger suites of AI tests, which we can then use to more holistically characterize the capabilities of any given system.
  Read more: The difficulty of AI benchmarks (Singular Paths, blog).

###################################################

What's hard and what's easy about measuring AI? Check out what the experts say:
...Research paper lays out measurement and assessment challenges for AI policy…
Last year I helped organize a workshop at Stanford that brought together over a hundred AI practitioners and researchers to discuss the challenges of measuring and assessing AI. Our workshop identified six core challenges for measuring AI systems:
- Defining AI; as anyone knows, every policymaking exercise starts with definitions, and our definitions of AI are lacking.
- What are the factors that drive AI progress and how can we disambiguate them?
- How do we use bibliometric data to improve our analysis?
- What tools are available to help us analyze the economic impact of AI?
- How can we measure the societal impact of AI?
- What methods can we use to better anticipate the risks and threats of deployed AI systems?

Podcast conversation: Ray Perrault and I, co-chairs of the AI Index - a Stanford initiative to measure and assess AI, which hosted the workshop - recently appeared on the 'Let's Talk AI' podcast to discuss the paper with Sharon Zhou.

Why this matters: Before we can regulate AI, we need to be able to measure and assess it at various levels of abstraction. Developing better tools for measuring AI systems will help technologists create information that can drive policy decisions. More broadly, by building 'measurement infrastructure' within governments, we can improve the ability of civil society to anticipate and oversee challenges brought on by the maturation of AI technology.
  Read more: Measurement in AI Policy: Opportunities and Challenges (arXiv).
    Listen to the podcast here: Measurement in AI Policy: Opportunities and Challenges (Let's Talk AI, Podbean).

###################################################
Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

Copyright © 2020 Import AI, All rights reserved.
