The Sequence Chat: Daniel J. Mankowitz, DeepMind on Building AlphaDev to Discover New Computer Science Algorithms
Was this email forwarded to you? Sign up here The Sequence Chat: Daniel J. Mankowitz, DeepMind on Building AlphaDev to Discover New Computer Science AlgorithmsOne of the researchers behind DeepMind's groundbreaking model that discovered new sorting algorithms shares his insights about the experience.👤 Quick bio
I was born and raised in Johannesburg, South Africa. I always had an interest in building things and solving problems, with a specific interest in computers and electronics. This is what led me to doing my first degree in Electrical and Information Engineering at the University of the Witwatersrand in Johannesburg. How I got started in Machine Learning: In the early 2010’s Artificial Intelligence was not yet mainstream, especially in South Africa. I found out about a Master’s program at the University of Edinburgh in Machine Learning and Computer vision and was determined to attend the program. I managed to get a scholarship and discovered a whole new world of possibility with Artificial Intelligence through this program. One such course was Reinforcement Learning which I took a particular liking to and this led me on a path to doing my PhD in hierarchical reinforcement learning at the Technion Israel Institute of Technology. Current role: My current role is focused on applying core Deepmind technologies to real-world products and applications. This requires me to deeply understand product needs and requirements, have a thorough understanding of the capabilities of the research technologies, and using my experience to determine whether a particular research technology fits with a particular product. 🛠 ML Work
Following our success where we incorporated MuZero in Youtube’s video compression pipeline, we realized that we could apply these core RL technologies to real world applications. We then began to brainstorm where we could take these technologies next. We live in an increasingly digital society with more and more applications, screens etc and this leads to an increase in energy demand and consumption. With Moore’s law coming to an end and chips reaching their fundamental physical limits, we need new and innovative ways to optimize the computing stack. This led us to identifying fundamental algorithms (such as sorting and hashing) that are called trillions of times every day. If we could improve these algorithms, we could help optimize one important aspect of the stack - the software that runs on it.
One aspect is that AlphaDev built the algorithms in the assembly language. This was a critical design decision that gave us added flexibility to find new optimizations. This is because we were operating at a much lower level (i.e., manipulating memory and CPU registers) than typical high level programming languages such as C++. Another aspect is that we optimized directly for actual measured latency (speed). This was no small feat. We needed to build a system that could measure sorting algorithms accurately on the order of nanoseconds. There are many sources of noise and as a result, we designed a latency benchmarking server that would take thousands of measurements across multiple machines for generating a single latency value for a given algorithm. This scale of taking measurements enabled us to accurately measure and optimize for latency.
When we first began the AlphaDev project, we decided to try and learn a sort3 algorithm (i.e., sorting 3 elements) from scratch. This means that AlphaDev began with an empty buffer and had to iteratively build an algorithm that can sort three elements from the ground up. This is incredibly challenging as a huge space of algorithms needed to be explored efficiently by AlphaDev - there are more possible algorithms that can be built from scratch, even for sorting three elements, than the number of atoms in the universe! We visually analyzed the state of the art’s assembly code for sort3 and couldn’t find any way to improve it. We were convinced that AlphaDev could not improve this algorithm further and would probably learn to produce the human benchmark. However, to our surprise, it discovered a sort3 algorithm from scratch that had one less assembly instruction. This, we initially thought, was a mistake. However, upon further analysis, we realized that it discovered what we called the AlphaDev swap move. This is a sequence of instructions, that when applied under specific conditions, saves an instruction. For the swap move, it assumes that given 3 values A,B,C - it can be applied when B<=C - See AlphaDev swap move section and Figures 3a,b,c in our Nature paper. AlphaDev discovered this same move in other sorting algorithms too such as when it generated a sort5 algorithm from scratch. We had a similar experience with the AlphaDev copy move which can be applied under the condition that for 4 inputs <A,B,C,D>, we have that D>=min(A,C). See AlphaDev copy move section and Figures 3d,e,f in our paper. Another very interesting case is that of variable sort 4 (VarSort4). Here the input to the sorting algorithm is a vector of up to 4 elements. In this case, AlphaDev discovered, from scratch, VarSort4 algorithms up to 29 instructions shorter than the human benchmark, with fundamentally different structures! (see the image below from the Nature News and Views article - Figure a is the human benchmark, figure b is the fastest discovered AlphaDev VarSort4 algorithm). You can also read the ‘New variable sort algorithms’ section in our paper. As a result, we realized that there are optimizations to algorithms that we don’t know about, or can’t see, but are still out there. And hopefully these optimizations can also influence how we construct other fundamental algorithms too.
Once we discovered faster sorting algorithms, we wanted to see whether AlphaDev could generalize to other fundamental computer science algorithms. Using the analogy of a librarian, sorting algorithms could be used by a librarian to sort books from A to Z. Hashing on the other hand, provides the librarian with a very efficient way to search for and retrieve a particular book. A book is assigned a unique identifier, which we refer to as a hash, and the librarian is then able to know exactly where to find the book as opposed to searching through the entire library. Hashing algorithms for data structures are called trillions of times everyday and are used in applications for information storage and retrieval. Hashing algorithms define their “correctness” in terms of the number of hashing collisions. So AlphaDev learned an improved hashing algorithm that had a minimal number of collisions and was 30% faster than the current algorithm in Abseil. As such it replaced the benchmark in Abseil and is now used by millions of developers, companies and applications around the world.
AlphaZero leverages the power of search and value functions to efficiently explore incredibly large combinatorial problems. For context, discovering a faster sorting algorithm requires searching through a space of algorithms similar in size to the number of possible games in Chess and Go and the number of atoms in the Universe. The value function gives AlphaDev a prediction as to whether the final program will be fast and correct and this is critical to determining how to efficiently explore this massive space of algorithms – AlphaDev’s value function only needs to be trained on the actual measured latency of approximately 0.002% of the algorithms it generates (using its latency benchmarking server). 💥 Miscellaneous – a set of rapid-fire questions
All areas of AI are fascinating and interesting to me. At the moment I am very interested in Reinforcement Learning from Human Feedback. This technology has shown the ability to better align Large Language Models (LLMs) with human preferences. Model alignment is critical in developing language models that provide value to humans such as saving time and resources, as well as ensuring safe and responsible responses too.
AlphaDev can be generalized to many more algorithms. Some examples include compression algorithms, both lossy (e.g., image compression) and lossless (e.g., file compression), cryptographic hashing, serialization, deserialization etc. Currently, AlphaDev does not support algorithms with loops, although this is possible to support in future work. It may also struggle if you want AlphaDev to optimize a huge algorithm or codebase (e.g., an entire library). In this case, additional break-throughs are necessary.
Yes, they both have AlphaZero as their base algorithm. The power of search and value functions are critical in both cases. However, they both play very different single player games which have different properties and algorithmic requirements. For example, in the case of AlphaDev it was critical for us to measure actual measured latency.
This is indeed a tough question :) I think definitions are very important, and I don’t have a good opinion as to whether this is a valid variation or not. What I can say is that we are taking a step forward to optimizing the computing stack which will hopefully enable us to make further progress with AI as well as the evergrowing computational bottleneck from our increasingly digital society. You’re on the free list for TheSequence Scope and TheSequence Chat. For the full experience, become a paying subscriber to TheSequence Edge. Trusted by thousands of subscribers from the leading AI labs and universities. |
Older messages
Edge 303: The Top Two Types Retrieval-Augmented Language Models
Tuesday, June 27, 2023
What are the main types of techniques to augment LLMs with external information.
📝 Guest Post: Choosing the Right Vector Index For Your Project*
Monday, June 26, 2023
In this post, Frank Liu. ML Architect at Zilliz, discusses vector databases and different indexing strategies for approximate nearest neighbor search. The options mentioned include brute-force search,
The Generative Audio Momentum
Sunday, June 25, 2023
Sundays, The Sequence Scope brings a summary of the most important research papers, technology releases and VC funding deals in the artificial intelligence space.
Edge 302: Inside MPT-7B: MosaicML's Suite of Open Source LLMs that Supports 65k Tokens
Thursday, June 22, 2023
The new suite of models was released by MosaicML and support models optimized for Instructions, Chats, Stories and More.
The Sequence Chat: Vipul Ved Prakash, CEO, Together on Decentralized, Open Source Foundation Models
Wednesday, June 21, 2023
Together has been behind some of the most interesting releases in open source foundation models.
You Might Also Like
Retro Recomendo: Gift Ideas
Sunday, November 24, 2024
Recomendo - issue #438 ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Kotlin Weekly #434
Sunday, November 24, 2024
ISSUE #434 24th of November 2024 Hi Kotliners! Next week is the last one to send a paper proposal for the KotlinConf. We hope to see you there next year. Announcements State of Kotlin Scripting 2024
Weekend Reading — More time to write
Sunday, November 24, 2024
More Time to Write A fully functional clock that ticks backwards, giving you more time to write. Tech Stuff Martijn Faassen (FWIW I don't know how to use any debugger other than console.log) People
🕹️ Retro Consoles Worth Collecting While You Still Can — Is Last Year's Flagship Phone Worth Your Money?
Saturday, November 23, 2024
Also: Best Outdoor Smart Plugs, and More! How-To Geek Logo November 23, 2024 Did You Know After the "flair" that servers wore—buttons and other adornments—was made the butt of a joke in the
JSK Daily for Nov 23, 2024
Saturday, November 23, 2024
JSK Daily for Nov 23, 2024 View this email in your browser A community curated daily e-mail of JavaScript news React E-Commerce App for Digital Products: Part 4 (Creating the Home Page) This component
Not Ready For The Camera 📸
Saturday, November 23, 2024
What (and who) video-based social media leaves out. Here's a version for your browser. Hunting for the end of the long tail • November 23, 2024 Not Ready For The Camera Why hasn't video
Daily Coding Problem: Problem #1617 [Easy]
Saturday, November 23, 2024
Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Microsoft. You are given an string representing the initial conditions of some dominoes.
Ranked | The Tallest and Shortest Countries, by Average Height 📏
Saturday, November 23, 2024
These two maps compare the world's tallest countries, and the world's shortest countries, by average height. View Online | Subscribe | Download Our App TIME IS RUNNING OUT There's just 3
⚙️ Your own Personal AI Agent, for Everything
Saturday, November 23, 2024
November 23, 2024 | Read Online Subscribe | Advertise Good Morning. Welcome to this special edition of The Deep View, brought to you in collaboration with Convergence. Imagine if you had a digital
Educational Byte: Are Privacy Coins Like Monero and Zcash Legal?
Saturday, November 23, 2024
Top Tech Content sent at Noon! How the world collects web data Read this email in your browser How are you, @newsletterest1? 🪐 What's happening in tech today, November 23, 2024? The HackerNoon