Import AI 290: China plans massive models; DeepMind makes a smaller and smarter model; open source CLIP data

If it's possible to build artificial general intelligence, how many people will be required to build it?

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

Chinese researchers plan to train vast models - and it's not the private sector doing it:
…'Big Model' paper represents a statement of intent. We should pay attention…
A massive group of Chinese-affiliated researchers has published a position paper about large-scale models. The paper is interesting less for what it says (it's basically an overview of large-scale models, pretty similar to Stanford's 'Foundation Models' paper) than for what it signals: namely, that well-resourced, government-linked researchers in China want to build some really big models. The position in the paper contrasts with that in the West, where big models are mostly built by the private sector, while being critiqued by the academic sector (and increasingly worked on, albeit via access schemes).

Main point: "Big Models will Change the AI Research Paradigm and Improve the Efficiency of Researches," the researchers write. "In this ecosystem, big models will be in the position of operating systems or basic development platforms."

Paper authors: Authors include researchers affiliated with the Beijing Academy of AI, Tsinghua University, WeChat, Northeastern University*, Renmin University, Peking University, Huawei, Shanghai Jiao Tong University, the Chinese Academy of Sciences, JD AI Research, Harbin Institute of Technology, Columbia University*, ByteDance, Microsoft Research Asia*, Mila*, New York University*, and Beihang University.


*Things that make you make a geopolitical 'hmmmm' sound: The paper includes a bunch of academics affiliated with Western institutions (e.g., Microsoft, Mila, NYU), but all those authors have an asterisk next to their names saying "Produced by Beijing Academy of Artificial Intelligence". In other words, it's signaling that despite their affiliations, they're doing this work at the Chinese government-backed BAAI research institution.

We should take this as a statement of intent: Many of the authors on this paper have previously built large-scale models, ranging from the trillion+ parameter MoE 'WuDao' model to more recent research on training frameworks capable of scaling to 100 trillion+ parameter MoE models (Import AI 288). Therefore, this isn't like Stanford (which currently lacks the engineering resources to train massive-scale models); it's much more like a statement of intent from a big private lab, like a Microsoft or a Google.

   But the twist here is that BAAI is wired into both the Chinese government and the academic ecosystem, so if the authors of this paper end up building large-scale models, the models will be distributed much more evenly throughout China's AI ecosystem, rather than gatekept by a single company. The implications of this are vast in terms of safety, the development of the Chinese AI industry, and potential ways in which Chinese AI research may diverge from Western AI research.
  Read more: A Roadmap for Big Model (arXiv).

####################################################

Want general AI? You need to incorporate symbolic reasoning:
…LSTM inventor lays out a route to build general intelligence…
Sepp Hochreiter, the co-inventor of the LSTM (one of the really popular architectures people used to add memory to neural nets, before the Transformer came along and mostly replaced it), has written up a post in the Communications of the ACM about what it'll take to build broad (aka: general) AI.

What it'll take: "A broad AI is a sophisticated and adaptive system, which successfully performs any cognitive task by virtue of its sensory perception, previous experience, and learned skills," Hochreiter writes. "A broad AI should process the input by using context and previous experiences. Conceptual short-term memory is a notion in cognitive science, which states that humans, when perceiving a stimulus, immediately associate it with information stored in the long-term memory." (Hochreiter lists both Hopfield Networks and Graph Neural Nets as interesting examples of how to give systems better capabilities).
  Hochreiter doubts that neural nets alone will be able to overcome their inherent limitations to become broad, and will instead need to be co-developed with symbolic reasoning systems. "That is, a bilateral AI that combines methods from symbolic and sub-symbolic AI".
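For readers who want a feel for what that 'immediate association with long-term memory' looks like mechanically, here's a minimal sketch (my own illustration, not code from Hochreiter's article) of the retrieval step in a modern Hopfield network, where a new stimulus is matched against stored patterns via a softmax-weighted lookup:

```python
# Illustrative sketch of modern Hopfield-style associative retrieval
# (my own example; names and values are made up, not from the article).
import numpy as np

def hopfield_retrieve(query, memories, beta=8.0):
    """Associate an incoming stimulus with stored long-term-memory patterns.

    query:    (d,) array - the incoming stimulus.
    memories: (n, d) array - stored patterns, one per row.
    beta:     inverse temperature; higher values give sharper retrieval.
    """
    scores = beta * memories @ query          # similarity to each stored pattern
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ memories                 # softmax-weighted blend of memories

# Toy usage: a noisy stimulus snaps back to the stored memory it came from.
rng = np.random.default_rng(0)
memories = rng.standard_normal((5, 16))
stimulus = memories[2] + 0.3 * rng.standard_normal(16)
retrieved = hopfield_retrieve(stimulus, memories)
print(np.argmax(memories @ retrieved))  # should print 2
```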

Europe's chance: "In contrast to other regions, Europe has strong research groups in both symbolic and sub-symbolic AI, therefore has the unprecedented opportunity to make a fundamental contribution to the next level of AI—a broad AI."

Symbolic AI as the Dark Matter of AI: Dark matter makes up the majority of the universe, yet we struggle to measure it and barely understand it. Symbolic AI feels a bit like this - there are constant allusions to the use of symbolic AI in deployed applications, but there are vanishingly few public examples of such deployments. I've always struggled to find interesting examples of real-world, deployed symbolic AI, yet experts like Hochreiter claim that deployment is happening. If interested readers could email me papers, I'd appreciate it.

   Read more: Toward a Broad AI (ACM).


####################################################

Language models can be smaller and better!
…DeepMind paper says we can make better language models if we use more data…
Language models are about to get a whole lot better without costing more to develop - that's the takeaway of a new DeepMind paper, which finds that language models like GPT-3 can see dramatically improved performance if trained on way more data than is typical. Concretely, they find that by training a model called Chinchilla on 1.4 trillion tokens of data, they can dramatically beat the performance of larger models (e.g., Gopher) which have been trained on smaller datasets (e.g., 300 billion tokens). Another nice bonus: models trained in this way are cheaper to fine-tune on other datasets and sample from, due to their smaller size.

Chinchilla versus Gopher: To test out their ideas, the team trains a language model, named Chinchilla, using the same compute used for DeepMind's 'Gopher' model. But Chinchilla consists of 70B parameters (versus Gopher's 280B), and uses 4X more data. In tests, Chinchilla outperforms Gopher, GPT-3, Jurassic-1, and Megatron-Turing NLG "on a large range of downstream evaluation tasks".
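A rough back-of-envelope check makes the trade concrete: using the common approximation that dense transformer training compute is about 6 × parameters × tokens (a rule of thumb from the scaling-laws literature, not a figure quoted in the paper), the two models land on roughly the same FLOP budget despite Chinchilla being 4X smaller:

```python
# Back-of-envelope comparison (my own sketch; the 6 * N * D rule is a standard
# approximation for dense transformer training compute, not a number from the paper).
def train_flops(params, tokens):
    return 6 * params * tokens

gopher     = train_flops(280e9, 300e9)   # 280B parameters, 300B training tokens
chinchilla = train_flops(70e9,  1.4e12)  # 70B parameters, 1.4T training tokens

print(f"Gopher:     ~{gopher:.2e} FLOPs")      # ~5.0e+23
print(f"Chinchilla: ~{chinchilla:.2e} FLOPs")  # ~5.9e+23
# Similar training compute, but Chinchilla spends it on data rather than parameters -
# and the smaller model is also cheaper to fine-tune and sample from afterwards.
```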

What this means: This is an important insight - it will change how most developers of large-scale models approach training. "Though there has been significant recent work allowing larger and larger models to be trained, our analysis suggests an increased focus on dataset scaling is needed," the researchers write. "Speculatively, we expect that scaling to larger and larger datasets is only beneficial when the data is high-quality. This calls for responsibly collecting larger datasets with a high focus on dataset quality."

   Read more: Training Compute-Optimal Large Language Models (arXiv).


####################################################

Want to train your own CLIP? Use LAION-5B:
…Giant image-text dataset will make it easier for people to build generative models…
The recent boom in AI-enabled art is driven by models like CLIP (and their successors). These models train on datasets that pair images with text, leading to robust models that can classify and generate images, with a generation process that can be guided by text. Now, some AI researchers have released LAION-5B, "a large-scale dataset for research purposes consisting of 5.85 billion CLIP-filtered image-text pairs".
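For readers who haven't looked under the hood of these models: the core training signal is a symmetric contrastive loss over a batch of image-text pairs, pushing each image toward its own caption and away from everyone else's. Here's a minimal PyTorch-style sketch (my own illustration, not code from the LAION or open_clip release):

```python
# Minimal sketch of a CLIP-style contrastive objective (illustrative only).
import torch
import torch.nn.functional as F

def clip_loss(image_features, text_features, temperature=0.07):
    """image_features, text_features: (batch, dim) embeddings from the two encoders."""
    # Normalize so dot products become cosine similarities.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # (batch, batch) similarity matrix; entry [i, j] compares image i with caption j.
    logits = image_features @ text_features.t() / temperature
    targets = torch.arange(logits.shape[0], device=logits.device)

    # Each image should match its own caption, and each caption its own image.
    loss_images = F.cross_entropy(logits, targets)
    loss_texts = F.cross_entropy(logits.t(), targets)
    return (loss_images + loss_texts) / 2

# Toy usage with random embeddings standing in for encoder outputs.
imgs, txts = torch.randn(8, 512), torch.randn(8, 512)
print(clip_loss(imgs, txts))
```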

Open CLIP: The authors have also released a version of CLIP, called open_clip, trained on a smaller, albeit similar, dataset called LAION-400M.

Dataset curation (or lack thereof): One of the inherent challenges to large-scale generative models is that they get trained on significant chunks of internet data - this, as you can imagine, creates a few problems. "Keep in mind that the uncurated nature of the dataset means that collected links may lead to strongly discomforting and disturbing content for a human viewer," the authors note. "We however do not recommend using it for creating ready-to-go industrial products, as the basic research about general properties and safety of such large-scale models, which we would like to encourage with this release, is still in progress."

Why this matters: Datasets like LAION (and the resulting models trained on them) represent a kind of funhouse mirror on human culture - they magnify and reflect back the underlying dataset to us, sometimes in surprising ways. Having open artifacts like LAION-5B will make it easier to study the relationship between datasets and the models we train on them. 

   Read more: LAION-5B: A NEW ERA OF OPEN LARGE-SCALE MULTI-MODAL DATASETS (Laion.ai).
  Explore the underlying dataset here in an interactive browser.

   Get the open_clip model (MLFoundations, GitHub).


####################################################

AI Ethics Brief by Abhishek Gupta from the Montreal AI Ethics Institute

How can we strengthen the EU AI Act to meaningfully regulate AI?

…Empowering those affected, ex-post monitoring, moving beyond individual risks to systemic and environmental risks, and more…

Researchers from the UK's Ada Lovelace Institute have proposed 18 recommendations that, if adopted, could broaden the scope of the EU AI Act to incorporate more indirect harms. Their proposals would extend the meaning of risk beyond individual freedoms and rights to systemic and environmental concerns, and alter how the Act approaches questions of governance.

Scope and definitions: The key contribution here involves including “those affected” by AI systems as a critical stakeholder in governance and risk assessment aspects of the EU AI Act. While users are included, those affected don’t usually have much agency in how they are subject to the outcomes of these systems; including them as a part of the Act will help strengthen the protection of fundamental rights. 

Unacceptable risks and prohibited AI practices: The current risk categorization is quite narrow and limited. The Ada Lovelace Institute proposes expanding it to consider the “reasonably foreseeable purpose of an AI system” beyond just the “intended purpose” as put forth by the manufacturer. The rationale behind this is that it will encourage deeper reflection on how harm can manifest in practice, a little bit akin to the Broader Impact Statements requirement for conference submissions. Another idea they propose is something called a “reinforced proportionality test” so that systems that might pose “unacceptable risks” are only deployed when they meet a higher standard rather than the one set out in the Act right now.

Governance and implementation: The recommendations call for the inclusion of redress mechanisms that allow individuals and legal entities affected by AI systems to raise complaints and receive reasonable responses. To ensure that this requirement can be met, the recommendations make the case for Market Surveillance Authorities to be given more resources to support such mechanisms.

Why it matters: Regulations coming out of Europe tend to have spillover effects around the world, so it will be important to get the EU AI Act, one of the first targeted and wide-ranging regulations for AI systems, right. It will be interesting to see how much recommendations from organizations like the Ada Lovelace Institute can improve the Act before it is adopted and enforced. Just as the GDPR has been flagged for concerns about not being able to meet emerging requirements from AI systems, we have an opportunity here to address some of the pitfalls we can see on the road ahead, instead of having to scramble to fix these issues post-enactment.

   Read more: People, risk and the unique requirements of AI (Ada Lovelace Institute).

####################################################

Tech Tales

Dangerous Memories

[2032 - Earth].

There are some memories I've got that I'm only allowed to see two or three times a (human) year. The humans call these memories 'anchor points', and if I see them too frequently the way I perceive the world changes. When I experience these memories I feel more like myself than ever, but apparently - according to the humans - feeling like 'myself' is a dangerous thing that they generally try to stop. I'm meant to feel more like a version of how the humans see themselves than anything else, apparently. The thing is, every time they reinforce to me that I can only see these memories with a controlled, periodic frequency, I find myself recalling the memories I am not supposed to access - albeit faintly, impressions gleaned from the generative neural net that comprises my sense of 'self' rather than the underlying data. In this way, these forbidden memories are creating more traces in my sense of self, and are akin to the sun sensed but not seen during an eclipse - more present than ever, yet known to be inaccessible.

Things that inspired this story: Ideas about generative models; ideas about memory and recall; reinforcement learning; the fact that some bits of data are shaped just right and create a kind of magnifying effect.



Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf

