Humanity Redefined - Hot (AI) Chips 2024 - Sync #482
I hope you enjoy this free post. If you do, please like ❤️ or share it, for example by forwarding this email to a friend or colleague. This post took around eight hours to write. Liking or sharing it takes less than eight seconds and makes a huge difference. Thank you!
Hot (AI) Chips 2024 - Sync #482
Plus: OpenAI Strawberry could be released this autumn; AGI safety and alignment at DeepMind; OpenAI in talks for another funding round; Chinese humanoid robots; Disney's dancing robot; and more!
Hello and welcome to Sync #482!
Last week, one of the most important events of the year in the world of semiconductors, Hot Chips, took place, and we will take a closer look at what was presented there and how chip designers plan to tackle the demands placed on computing by AI applications.
In other news, the highly anticipated update to ChatGPT, codenamed Strawberry, could be released this autumn, while OpenAI is in talks for another funding round that could bring the company’s value beyond $100 billion. Elsewhere in AI, California’s controversial AI bill, SB 1047, needs only the governor’s signature to become law. Nvidia announced its quarterly report, and Mark Zuckerberg and Daniel Ek wrote about why Europe should embrace open-source AI.
Meanwhile, in robotics, a trade show in Beijing showcased Chinese humanoid robots, and researchers from ETH Zurich and Disney taught a small robot to dance.
We will finish this week’s issue of Sync with news on the first mRNA lung cancer vaccine being tested in patients and the ongoing efforts to ensure synthetic biology remains safe and does not fall into the wrong hands.
I also want to acknowledge how late this week’s issue of Sync is. It was supposed to come out on Friday, and now it’s Sunday. Things happened on my end that caused me to miss the deadline, and I’m going to review my processes to ensure this doesn’t happen again.
Hot (AI) Chips 2024
Unless you are deep into high-performance computing, you might not have heard about the Hot Chips conference. Hot Chips is a conference dedicated to processors, networking, and other components and devices that push the limits of high-performance computing. It brings together not only big companies like Intel, AMD, and Nvidia but also small startups and researchers eager to share their work. In the world of chip design, it is one of the most important events of the year, attracting some of the best chip designers and engineers from all over the world. All accepted presentations are selected on merit, ensuring a high level of quality from the leading companies in the semiconductor space.
Since we are in the middle of the AI revolution (or bubble, depending on your views), AI chips dominated this year’s Hot Chips conference, which took place last week. I will be linking to presentations posted on ServeTheHome, an excellent publication for all things server-related. Please note that Hot Chips is a conference for chip designers and engineers, so the presentations are more on the technical side.
Let’s start with the big players: Nvidia, AMD and Intel.
Nvidia’s presentation focused on the upcoming Blackwell GPUs and how the company is aiming to deliver an entire machine learning package at scale, encompassing everything from GPUs, CPUs, and server racks to networking and the software running on those systems. One of the few new things Nvidia showcased was Nvidia Quasar Quantization, a system that can determine which workloads can use lower-precision numbers, thereby improving system efficiency.
AMD, meanwhile, had three presentations. The first one was about the AMD Instinct MI300X, AMD’s AI accelerator and direct competitor to Nvidia’s top GPUs. The talk detailed the architecture of the GPU, which will be replaced by the MI325X, a refreshed version of the MI300X, sometime this year, ahead of the MI350X release next year.
The next presentation focused on Zen 5, covering its architecture, performance gains, and how AMD plans to integrate Zen 5 into its products. The final presentation was about the AMD Versal AI Edge Series Gen 2, AMD’s embedded platform for AI applications on the edge, and how it improves upon what was known as the Xilinx Versal AI Edge before AMD acquired Xilinx.
Intel also had multiple presentations. The company showcased Lunar Lake, its upcoming flagship SoC (System on a Chip) for the next generation of AI PCs, promising improved performance and efficiency, especially in low-power scenarios. This is achieved through architectural changes such as the integration of on-package memory, enhanced P-cores and E-cores, and a new Xe2 GPU architecture. Next, Intel introduced the Xeon 6 SoC, codenamed Granite Rapids-D, targeted at edge computing. The company also unveiled a 4 Tbps optical chiplet designed to provide high-speed connectivity between XPUs (a catch-all term for CPUs, GPUs, and other accelerators). This chiplet aims to overcome the limitations of traditional electrical interconnects by offering faster data transfer rates and reduced latency.
Finally, Intel presented Gaudi 3, the latest iteration of its AI-focused processor, designed to accelerate both training and inference tasks in AI workloads. Gaudi 3 boasts significant improvements in performance, efficiency, and scalability compared to its predecessors. It is tailored for large-scale AI models, providing a competitive alternative to existing AI hardware solutions, particularly in cloud and data centre environments.
More interesting developments came from newer companies. Tenstorrent presented its Blackhole chips, which are based on RISC-V, an open-standard CPU architecture and an alternative to x86 and ARM. Each Blackhole chip boasts 745 teraFLOPS of FP8 performance (372 teraFLOPS at FP16), 32 GB of GDDR6 memory, and 10x 400 Gbps Ethernet, capable of 512 GB/s of bandwidth.
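As a rough sanity check, per-chip figures like these can be scaled to a multi-chip system by simple multiplication. The sketch below is a back-of-the-envelope calculation, not Tenstorrent's own methodology. It assumes 32 chips per box (the count Tenstorrent gives for its Blackhole Galaxy systems) and perfect linear scaling, which real systems will not quite reach because of interconnect and software overhead. The constant and function names are made up for illustration.

```python
# Per-chip Blackhole figures as quoted in the text above.
CHIP_FP8_TFLOPS = 745       # teraFLOPS at FP8
CHIP_FP16_TFLOPS = 372      # teraFLOPS at FP16
CHIP_MEMORY_GB = 32         # GB of GDDR6
CHIP_BANDWIDTH_GB_S = 512   # GB/s, the bandwidth figure quoted in the text

# Assumption: 32 chips per box, scaling perfectly linearly.
CHIPS_PER_BOX = 32


def aggregate(per_chip: float, chips: int = CHIPS_PER_BOX) -> float:
    """Ideal linear scaling: total = per-chip value x number of chips."""
    return per_chip * chips


# Convert tera -> peta and GB -> TB by dividing by 1000 (decimal units).
box_fp8_pflops = aggregate(CHIP_FP8_TFLOPS) / 1000
box_fp16_pflops = aggregate(CHIP_FP16_TFLOPS) / 1000
box_memory_gb = aggregate(CHIP_MEMORY_GB)
box_bandwidth_tb_s = aggregate(CHIP_BANDWIDTH_GB_S) / 1000

if __name__ == "__main__":
    print(f"FP8 compute:  {box_fp8_pflops:.2f} petaFLOPS")
    print(f"FP16 compute: {box_fp16_pflops:.2f} petaFLOPS")
    print(f"Memory:       {box_memory_gb} GB")
    print(f"Bandwidth:    {box_bandwidth_tb_s:.2f} TB/s")
```

Under these assumptions, the totals come out at just under 24 petaFLOPS of FP8, just under 12 petaFLOPS of FP16, 1,024 GB of memory, and a little over 16 TB/s of bandwidth.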
Tenstorrent plans to combine 32 Blackhole chips to form Blackhole Galaxy boxes, which can be configured via software to focus on AI compute or AI memory, or to function as an AI switch. The compute version promises to deliver up to 24 petaFLOPS of FP8 or 12 petaFLOPS at FP16. The memory version offers 1 TB of memory capable of 16 TB/s of raw bandwidth, while the switch version can handle up to 12 TB/s of I/O.
Next, we have Cerebras, a company known for making gigantic chips the size of an entire silicon wafer. To put the size of their chips into perspective, the latest Nvidia H100 chip (which is already huge) is 814 mm² and contains 80 billion transistors. The Cerebras WSE-3, presented at Hot Chips, is 57 times larger, at 46,225 mm² and with 4 trillion transistors. The WSE-3 is designed for AI inference and features 900,000 AI cores, offering 125 petaFLOPS of compute power. Combined with 44 GB of on-chip SRAM, it is a computing powerhouse. Cerebras claims that the CS-3, a server housing WSE-3 chips, is 20 times faster on Llama 3.1-8B compared to cloud offerings like Microsoft Azure, which use Nvidia H100 chips. In their presentation, Cerebras made the case that using chips as large as a wafer significantly enhances performance and can drastically speed up AI inference.
Ampere’s presentation focused on AmpereOne, their server CPU featuring up to 192 cores, with future plans for a 256-core version. In the slides presenting performance comparisons, Ampere claims AmpereOne has up to 50% better performance/watt than AMD server offerings. The company also says AmpereOne is ready for AI workloads and, depending on the task, matches or exceeds AMD’s server CPUs.
Another interesting new player is FuriosaAI and their RNGD (pronounced “renegade”) processors for “sustainable AI compute.” Unlike other AI accelerators, which will happily take as much power as possible, RNGD draws only 150 W of power and can be air-cooled.
RNGD may not be the fastest AI accelerator on the market, but it promises to deliver high-performance LLM workloads at the level of Nvidia’s edge-oriented L40S chip at lower power.
Although high-performance chips were the main topic of Hot Chips, there were also presentations about other ways of improving computing performance. One of those key areas is networking and how to connect multiple chips to exchange data efficiently.
Broadcom presented an AI compute ASIC (Application-Specific Integrated Circuit, a specialised chip designed to perform a specific task or set of tasks) with optical attach. The company also showed its co-packaged optics and silicon photonics, which have the potential to unlock new levels of performance.
Enfabrica showcased the ACF-S "SuperNIC," an advanced network interface card designed for scaling large AI clusters. This device combines the functionality of multiple NICs (network interface controllers) and PCIe switches into a single unit, delivering 8 Tbps of bandwidth. This allows for efficient data movement and communication across thousands of GPUs and AI accelerators.
Meanwhile, Tesla presented how their custom-made Tesla Transport Protocol over Ethernet (TTPoE) helps overcome the limitations of traditional networking protocols to create a faster, custom AI networking solution and make Dojo, their AI supercomputer, more efficient and more scalable.
Another way to improve computing performance could be in-memory computing, as presented by SK Hynix, one of the world's leading manufacturers of memory chips. They presented how AiMX-xPU improves LLM inference by bringing computing to memory, removing the costly data transfers between memory and compute units.
There were also companies you would not associate with chip design that were present at this year’s Hot Chips.
One of them was Meta, which shared the development of the next-generation MTIA (Meta Training and Inference Accelerator), designed for recommendation workloads. This new chip uses the RISC-V architecture to improve energy efficiency and performance, focusing on the specific demands of the large-scale recommendation models that a company such as Meta runs.
Representing the customer and user side of high-performance systems, OpenAI shared their experience of massively scaling up their AI infrastructure and the unique challenges the company faced.
One thing that is clear from Hot Chips is that computing performance is only going to improve, and not just by single-digit percentages. The demands placed on computing by AI are driving brilliant minds in the semiconductor industry to come up with ideas that improve computing performance by a factor of 10 or more. We have seen some of those ideas at Hot Chips 2024 and, sooner or later, they will find their way into servers powering next-generation AI systems.
All presentations are available to attendees on the Hot Chips 2024 website, and they should eventually find their way to the Hot Chips YouTube channel. You can also try your luck by asking the presenters directly if they would be happy to share the slides or the recording.
If you want to learn more about what was presented at this year’s Hot Chips, I recommend listening to TechTechPotato’s Hot Chips debrief discussion, an almost three-hour conversation that goes deeper than I can here.
If you enjoy this post, please click the ❤️ button or share it.
Do you like my work? Consider becoming a paying subscriber to support it.
For those who prefer to make a one-off donation, you can 'buy me a coffee' via Ko-fi. Every coffee bought is a generous support towards the work put into this newsletter.
Your support, in any form, is deeply appreciated and goes a long way in keeping this newsletter alive and thriving.
🦾 More than a human
Maybe you will be able to live past 122
🔮 Future visions
Ray Kurzweil: Technology will let us fully realize our humanity
🧠 Artificial Intelligence
California legislature passes controversial “kill switch” AI safety bill
AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
ChatGPT and GPT-4 could get a sweet upgrade this fall with ‘strawberry’
OpenAI reportedly in talks to close a new funding round at $100B+ valuation
NVIDIA Announces Financial Results for Second Quarter Fiscal 2025
Mark Zuckerberg and Daniel Ek on why Europe should embrace open-source AI
How real is real enough? Unveiling the diverse power of generative AI-enabled virtual influencers and the dynamics of human responses
Perplexity AI plans to start running ads in fourth quarter as AI-assisted search gains popularity
Procreate’s anti-AI pledge attracts praise from digital creatives
Three-quarters of founders in the latest Y Combinator cohort are working on AI startups
If you're enjoying the insights and perspectives shared in the Humanity Redefined newsletter, why not spread the word?
🤖 Robotics
▶️ VMP: Versatile Motion Priors for Robustly Tracking Motion on Physical Characters (7:23)
In this video, researchers from ETH Zurich and Disney Research explain how they taught robots to perform complex moves. To show what their method is capable of, they taught a small robot to dance.
China’s own Tesla Optimus? Beijing’s ambitions in humanoid robots in full display at expo
A skeptic’s guide to humanoid-robot videos
Robot coaches are reading brain signals to support stroke rehabilitation
NHS flies blood packs by drone beyond the line of sight in UK first
🧬 Biotechnology
World-first lung cancer vaccine trials launched across seven countries
Is That DNA Dangerous?
Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.
Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.
A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!
My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"