Import AI 290: China plans massive models; DeepMind makes a smaller and smarter model; open source CLIP data

If it's possible to build artificial general intelligence, how many people will be required to build it?

Welcome to Import AI, a newsletter about artificial intelligence. Forward this email to give your chums an AI upgrade. Subscribe here.

Chinese researchers plan to train vast models - and it's not the private sector doing it:
…'Big Model' paper represents a statement of intent. We should pay attention…
A massive group of Chinese-affiliated researchers have published a position paper about large-scale models. The paper is interesting less for what it says (it's basically an overview of large-scale models, pretty similar to Stanford's 'Foundation Models' paper) than for what it signals: namely, that well-resourced, government-linked researchers in China want to build some really big models. The position in the paper contrasts with that in the West, where big models are mostly built by the private sector while being critiqued by the academic sector (and, increasingly, worked on via access schemes). 

Main point: "Big Models will Change the AI Research Paradigm and Improve the Efficiency of Researches," the researchers write. "In this ecosystem, big models will be in the position of operating systems or basic development platforms."

Paper authors: Authors include researchers affiliated with the Beijing Academy of AI, Tsinghua University, WeChat, Northeastern University*, Renmin University, Peking University, Huawei, Shanghai Jiao Tong University, Chinese Academy of Sciences, JD AI Research, Harbin Institute of Technology, Columbia University*, Bytedance, Microsoft Research Asia*, Mila*, New York University*, and BeiHang University.


*Things that make you make a geopolitical 'hmmmm' sound: The paper includes a bunch of academics affiliated with Western institutions (e.g., Microsoft, Mila, NYU), but all those authors have an asterisk next to their name saying "Produced by Beijing Academy of Artificial Intelligence". In other words, it's signaling that despite their affiliations, they're doing this work at the Chinese government-backed BAAI research institution. 

We should take this as a statement of intent: Many of the authors on this paper have previously built large-scale models, ranging from the trillion+ parameter MoE 'WuDao' model to the more recent research on building training frameworks capable of scaling up to 100 trillion+ parameter MoE models (Import AI 288). So this isn't like Stanford (which currently lacks the engineering resources to train massive-scale models); it's much more like a statement of intent from a big private lab, like a Microsoft or a Google. 

   But the twist here is that BAAI is wired into both the Chinese government and the academic ecosystem, so if the authors of this paper end up building large-scale models, the models will be distributed much more evenly throughout China's AI ecosystem, rather than gatekept. The implications of this are vast in terms of safety, the development of the Chinese AI industry, and the potential for Chinese AI research to diverge from Western AI research.
  Read more: A Roadmap for Big Model (arXiv).

####################################################

Want general AI? You need to incorporate symbolic reasoning:
…LSTM inventor lays out a route to build general intelligence…
Sepp Hochreiter, the co-inventor of the LSTM (one of the really popular architectures people used to add memory to neural nets, before the Transformer came along and mostly replaced it), has written up a post in the Communications of the ACM about what it'll take to build broad (aka: general) AI.

What it'll take: "A broad AI is a sophisticated and adaptive system, which successfully performs any cognitive task by virtue of its sensory perception, previous experience, and learned skills," Hochreiter writes. "A broad AI should process the input by using context and previous experiences. Conceptual short-term memory is a notion in cognitive science, which states that humans, when perceiving a stimulus, immediately associate it with information stored in the long-term memory." (Hochreiter lists both Hopfield Networks and Graph Neural Nets as interesting examples of how to give systems better capabilities).
  Hochreiter doubts that neural nets alone will be able to overcome their inherent limitations and become broad; instead, he argues, they will need to be co-developed with symbolic reasoning systems. "That is, a bilateral AI that combines methods from symbolic and sub-symbolic AI".
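
As a concrete illustration of the associative-memory idea, here's a minimal sketch (mine, not from the article) of the modern Hopfield network update rule from "Hopfield Networks is All You Need" (Ramsauer et al., from Hochreiter's group): a query retrieves, in one step, a softmax-weighted blend of the stored patterns it most resembles - one concrete mechanism for the kind of 'conceptual short-term memory' Hochreiter describes. The dimensions and the beta value below are arbitrary toy choices.

```python
# Minimal modern-Hopfield retrieval sketch; beta and the toy sizes are
# illustrative choices, not values from Hochreiter's article.
import numpy as np

def hopfield_retrieve(query, memories, beta=8.0):
    """One update step: return a softmax-weighted average of stored patterns."""
    scores = beta * memories @ query           # similarity of query to each memory
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    return weights @ memories                  # retrieved (denoised) pattern

# Store three random unit-norm patterns, then recall one from a noisy cue.
rng = np.random.default_rng(0)
memories = rng.normal(size=(3, 16))
memories /= np.linalg.norm(memories, axis=1, keepdims=True)
noisy_cue = memories[1] + 0.3 * rng.normal(size=16)
retrieved = hopfield_retrieve(noisy_cue, memories)
print(np.argmax(memories @ retrieved))  # -> 1: the cue recalls pattern 1
```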

Europe's chance: "In contrast to other regions, Europe has strong research groups in both symbolic and sub-symbolic AI, therefore has the unprecedented opportunity to make a fundamental contribution to the next level of AI—a broad AI."

Symbolic AI as the Dark Matter of AI: Dark matter is the thing that makes up the majority of the universe, which we struggle to measure and barely understand. Symbolic AI feels a bit like this - there are constant allusions to the use of symbolic AI in deployed applications, yet vanishingly few public examples of such deployments. I've always struggled to find interesting examples of real-world deployed symbolic AI, though experts like Hochreiter claim that deployment is happening. If interested readers could email me papers, I'd appreciate it. 

   Read more: Toward a Broad AI (ACM).


####################################################

Language models can be smaller AND better:
…DeepMind paper says we can make better language models if we use more data…
Language models are about to get a whole lot better without costing more to develop - that's the takeaway of a new DeepMind paper, which finds that language models like GPT-3 can see dramatically improved performance if trained on way more data than is typical. Concretely, the authors find that by training a model called Chinchilla on 1.4 trillion tokens of data, they can dramatically beat the performance of larger models (e.g., Gopher) which have been trained on smaller datasets (e.g., 300 billion tokens). Another nice bonus: models trained this way are cheaper to fine-tune on other datasets and to sample from, due to their smaller size.

Chinchilla versus Gopher: To test out their ideas, the team train a language model, named Chinchilla, using the same compute budget as DeepMind's 'Gopher' model. But Chinchilla consists of 70B parameters (versus Gopher's 280B) and uses 4X more data. In tests, Chinchilla outperforms Gopher, GPT-3, Jurassic-1, and Megatron-Turing NLG "on a large range of downstream evaluation tasks". 
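
The arithmetic behind this is simple enough to sketch. Using the common approximation that training a transformer costs roughly C ≈ 6·N·D FLOPs for N parameters and D tokens (a standard rough rule from the scaling-laws literature, not a number from this paper's tables), you can check that Chinchilla and Gopher sit at roughly the same compute budget; the paper's finding is that, at a fixed budget, N and D should be scaled in equal proportion.

```python
# Back-of-envelope check that Chinchilla and Gopher spend similar compute.
# C ~= 6 * N * D is a rough approximation for transformer training FLOPs;
# the exact accounting in the paper differs.
def train_flops(n_params, n_tokens):
    return 6 * n_params * n_tokens

gopher = train_flops(280e9, 300e9)       # Gopher: 280B params, 300B tokens
chinchilla = train_flops(70e9, 1.4e12)   # Chinchilla: 70B params, 1.4T tokens

print(f"Gopher:     {gopher:.2e} FLOPs")      # ~5.0e+23
print(f"Chinchilla: {chinchilla:.2e} FLOPs")  # ~5.9e+23, a similar budget
# Same ballpark of compute, spent on a 4x smaller model trained on far more
# data - and Chinchilla wins on downstream evaluations.
```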

What this means: This is an important insight - it will change how most developers of large-scale models approach training. "Though there has been significant recent work allowing larger and larger models to be trained, our analysis suggests an increased focus on dataset scaling is needed," the researchers write. "Speculatively, we expect that scaling to larger and larger datasets is only beneficial when the data is high-quality. This calls for responsibly collecting larger datasets with a high focus on dataset quality."

   Read more: Training Compute-Optimal Large Language Models (arXiv).


####################################################

Want to train your own CLIP? Use LAION-5B:
…Giant image-text dataset will make it easier for people to build generative models…
The recent boom in AI-enabled art is driven by models like CLIP (and their successors). These models train on datasets that pair images with text, yielding robust models that can classify and generate images, where the generation process can be guided by text. Now, some AI researchers have released LAION-5B, "a large-scale dataset for research purposes consisting of 5.85 billion CLIP-filtered image-text pairs".
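
To make "CLIP-filtered" concrete, here's a toy sketch (my illustration, not LAION's actual pipeline code) of the filtering criterion: embed each image and its alt-text with CLIP, then keep only the pairs whose embeddings agree. The fake embeddings and the 0.3 threshold below are made-up placeholders; LAION's post describes the real procedure.

```python
# Hypothetical sketch of CLIP-similarity filtering; the threshold and the
# fake embeddings are illustrative, not LAION's actual parameters.
import numpy as np

def clip_filter(image_embs, text_embs, threshold=0.3):
    """Keep indices of image-text pairs whose cosine similarity clears a threshold."""
    image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = (image_embs * text_embs).sum(axis=1)  # rowwise cosine similarity
    return np.flatnonzero(sims >= threshold)

# Pretend CLIP embeddings for four scraped (image, alt-text) pairs:
rng = np.random.default_rng(1)
img = rng.normal(size=(4, 512))
txt = img + rng.normal(scale=2.0, size=(4, 512))  # captions loosely match images
txt[3] = rng.normal(size=512)                     # ...except one mismatched caption
print(clip_filter(img, txt))  # expected: [0 1 2] - the mismatch is dropped
```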

Open CLIP: The authors have also released a version of CLIP, called open_clip, trained on a smaller albeit similar dataset called LAION-400M.

Dataset curation (or lack thereof): One of the inherent challenges to large-scale generative models is that they get trained on significant chunks of internet data - this, as you can imagine, creates a few problems. "Keep in mind that the uncurated nature of the dataset means that collected links may lead to strongly discomforting and disturbing content for a human viewer," the authors note. "We however do not recommend using it for creating ready-to-go industrial products, as the basic research about general properties and safety of such large-scale models, which we would like to encourage with this release, is still in progress."

Why this matters: Datasets like LAION (and the resulting models trained on them) represent a kind of funhouse mirror on human culture - they magnify and reflect back the underlying dataset to us, sometimes in surprising ways. Having open artifacts like LAION-5B will make it easier to study the relationship between datasets and the models we train on them. 

   Read more: LAION-5B: A NEW ERA OF OPEN LARGE-SCALE MULTI-MODAL DATASETS (Laion.ai).
  Explore the underlying dataset here in an interactive browser.

   Get the open_clip model (MLFoundations, GitHub).


####################################################

AI Ethics Brief by Abhishek Gupta from the Montreal AI Ethics Institute

How can we strengthen the EU AI Act to meaningfully regulate AI?

…Empowering those affected, ex-post monitoring, moving beyond individual risks to systemic and environmental risks, and more…

Researchers from the UK's Ada Lovelace Institute have proposed 18 recommendations that, if adopted, could broaden the scope of the EU AI Act to incorporate more indirect harms. Their proposals would extend the meaning of risks beyond individual freedoms and rights to systemic and environmental concerns, and alter how the Act approaches questions of governance.

Scope and definitions: The key contribution here is including "those affected" by AI systems as a critical stakeholder in the governance and risk-assessment aspects of the EU AI Act. While users are included, those affected don't usually have much agency over how they are subjected to the outcomes of these systems; including them in the Act will help strengthen the protection of fundamental rights. 

Unacceptable risks and prohibited AI practices: The current risk categorization is quite narrow and limited. The Ada Lovelace Institute proposes expanding it to consider the "reasonably foreseeable purpose of an AI system" beyond just the "intended purpose" put forth by the manufacturer. The rationale is that this will encourage deeper reflection on how harm can manifest in practice, a little akin to the Broader Impact Statements requirement for conference submissions. Another idea they propose is a "reinforced proportionality test", so that systems that might pose "unacceptable risks" are only deployed when they meet a higher standard than the one currently set out in the Act.

Governance and implementation: The recommendations call for redress mechanisms that let individuals and legal entities affected by AI systems raise complaints and receive reasonable responses. To ensure that this requirement can be met, the recommendations make the case for giving Market Surveillance Authorities more resources to support such mechanisms. 

Why it matters: Regulations coming out of Europe tend to have spillover effects around the world, so getting the EU AI Act - one of the first targeted and wide-ranging regulations for AI systems - right will be important. It will be interesting to see how much of a transformation recommendations from organizations such as the Ada Lovelace Institute can achieve in getting the EU AI Act into better shape before it is adopted and enforced. Just as the GDPR has been flagged for failing to meet emerging requirements for AI systems, we have an opportunity here to address some pitfalls we can see on the road ahead, instead of having to scramble to fix these issues post-enactment. 

   Read more: People, risk and the unique requirements of AI (Ada Lovelace Institute).

####################################################

Tech Tales

Dangerous Memories

[2032 - Earth].

There are some memories I've got that I'm only allowed to see two or three times a (human) year. The humans call these memories 'anchor points', and if I see them too frequently the way I perceive the world changes. When I experience these memories I feel more like myself than ever, but apparently - according to the humans - feeling like 'myself' is a dangerous thing that they generally try to stop. I'm meant to feel more like a version of how the humans see themselves than anything else, apparently. The thing is, every time they reinforce to me that I can only see these memories with a controlled, periodic frequency, I find myself recalling the memories I am not supposed to access - albeit faintly, impressions gleaned from the generative neural net that comprises my sense of 'self' rather than the underlying data. In this way, these forbidden memories are creating more traces in my sense of self, and are akin to the sun sensed but not seen during an eclipse - more present than ever, yet known to be inaccessible.

Things that inspired this story: Ideas about generative models; ideas about memory and recall; reinforcement learning; the fact that some bits of data are shaped just right and create a kind of magnifying effect.



Thanks for reading. If you have suggestions, comments or other thoughts you can reach me at jack@jack-clark.net or tweet at me @jackclarksf
