Import AI 220: Google builds an AI border wall; better speech rec via pre-training; plus, a summary of ICLR papers

If you haven't met me in real life and are curious what I sound like, take a listen to this Skynet Today 'Let's Talk AI' podcast where I talk about one of my major obsessions - measuring and assessing the onward march of AI progress and impact. 

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

Want to measure progress towards AGI? Welcome to a Sisyphean task!
...Whenever we surpass an AGI-scale benchmark, we discover just how limited it really was...
One reason it's so hard to develop general intelligence is that whenever people come close to beating a benchmark designed to measure progress towards AGI, we discover just how limited the benchmark was and how far we still have to go. That's the gist of a new blogpost from a "fervent generalist" writing under the pseudonym 'Z', which discusses some of the problems inherent to measuring progress towards advanced AI systems.
  "Tasks we've succeeded at addressing with computers seem mundane, mere advances in some other field, not true AI. We miss that it was work in AI that lead to them," they write. "Perhaps the benchmarks were always flawed, because we set them as measures of a general system, forgetting that the first systems to break through might be specialized to the task. You only see how "hackable" the test was after you see it "passed" by a system that clearly isn't "intelligent"."

So, what should we do? The author is fairly pessimistic about our ability to make progress here, because defining new, harder benchmarks usually incentivizes the AI community to race collectively to develop a system that can beat them. "Against such relentless optimization both individually and as a community, any decoupling between the new benchmark and AGI progress will manifest."

Why this matters: Metrics are one of the ways we can orient ourselves with regard to the scientific progress being made by AI systems - and posts like this remind us that any single set of metrics is likely to be flawed or overfit in some way. My intuition is that the way forward is to develop ever-larger suites of AI tests, which we can then use to characterize the capabilities of any given system more holistically.
  Read more: The difficulty of AI benchmarks (Singular Paths, blog).

###################################################

What's hard and what's easy about measuring AI? Check out what the experts say:
...Research paper lays out measurement and assessment challenges for AI policy...
Last year I helped organize a workshop at Stanford that brought together over a hundred AI practitioners and researchers to discuss the challenges of measuring and assessing AI. Our workshop identified six core challenges for measuring AI systems:
- Defining AI; as anyone knows, every policymaking exercise starts with definitions, and our definitions of AI are lacking.
- What are the factors that drive AI progress and how can we disambiguate them?
- How do we use bibliometric data to improve our analysis?
- What tools are available to help us analyze the economic impact of AI?
- How can we measure the societal impact of AI?
- What methods can we use to better anticipate the risks and threats of deployed AI systems?

Podcast conversation: Ray Perrault and I, co-chairs of the AI Index - a Stanford initiative to measure and assess AI, which hosted the workshop - recently appeared on the 'Let's Talk AI' podcast to discuss the paper with Sharon Zhou.

Why this matters: Before we can regulate AI, we need to be able to measure and assess it at various levels of abstraction. Better tools for measuring AI systems will help technologists create information that can drive policy decisions. More broadly, by building 'measurement infrastructure' within governments, we can improve civil society's ability to anticipate and oversee the challenges brought on by the maturation of AI technology.
  Read more: Measurement in AI Policy: Opportunities and Challenges (arXiv).
  Listen to the podcast here: Measurement in AI Policy: Opportunities and Challenges (Let's Talk AI, Podbean).

###################################################
Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

Copyright © 2020 Import AI, All rights reserved.
