DBRX - the new king of open source LLMs - Weekly News Roundup - Issue #460
Plus: are Rabbit R1 and Humane AI Pin scams?; Anthropic gets $2.75 billion from Amazon; Grok-1.5; robotic police dog got shot multiple times; drugs made in space; and more!
Hello and welcome to the Weekly News Roundup, Issue #460. The large language model space has seen some major reshuffling in the last few weeks. First, Anthropic’s Claude 3 Opus outperformed GPT-4, and now DBRX, a language model from Databricks, has claimed the title of the best open-source large language model.

In other news, Claude 3 Opus dethroned GPT-4 on the Chatbot Arena Leaderboard; Apple is not-so-subtly hinting at AI features to be revealed at WWDC 2024; xAI has announced Grok-1.5; and Anthropic received a $2.75 billion investment from Amazon. Apart from that, this week’s issue includes an open-source robotic cat, a story about a robotic police dog that was shot multiple times and was credited with helping to avoid potential bloodshed, and one company's dream of manufacturing drugs in space. Enjoy!

The space of open-source large language models has been dominated by variants of Meta’s Llama 2, Mistral-7B and Mixtral from Mistral, and small models from Microsoft (Phi-2) and Google (Gemma). All these models offer very good performance and, in some cases, come very close to or even surpass proprietary models.

But now we have a new king of open-source large language models, and it doesn’t come from Meta, Mistral, Google, or Microsoft. It comes from Databricks, and its name is DBRX. Seemingly out of nowhere, Databricks released DBRX just a couple of days ago and immediately claimed the crown of the best open-source large language model, with the fine-tuned model even being on par with Google’s Gemini 1.0 Pro.
DBRX is a Mixture of Experts (MoE) model with 132B total parameters and a 32k context window. DBRX has 16 experts and chooses 4, while Mixtral and Grok-1 have 8 experts and choose 2. According to Databricks, choosing 4 out of 16 experts results in 65x more possible combinations of experts, which improves the model’s quality.

In the post announcing DBRX, Databricks boasts some impressive numbers. According to that post, inference is up to 2x faster than LLaMA2-70B, and the model is more efficient to train. Both Base and Instruct versions of DBRX are available on HuggingFace. DBRX is also available on GitHub. If you don’t want to install DBRX, the model is deployed on HuggingFace so you can try it out yourself.

I’m happy to see the open source community keeping up with, and sometimes even surpassing, proprietary models from big tech companies like Microsoft, Google, and OpenAI. Thanks to the companies that contribute their models to the open-source community and the passionate developers behind them, access to state-of-the-art language models is no longer limited to a handful of big companies. Now, everyone can install tools like Ollama, download a language model, and start experimenting. This opens up work with large language models to those who don’t have a warehouse full of Nvidia H100 GPUs, without the need to pay OpenAI, Microsoft, or Google for the privilege of using their LLMs. It’s a positive development for researchers and the overall AI community.

However, despite being called “open source,” DBRX, Mixtral, Llama, and other models are not fully open source. Sure, the code and weights are publicly available, but more is required to truly consider these models open. I’d like to see more details on the training process, including the kind of processing power that was involved, how long it took, and what challenges the developers had to overcome. The biggest omission, however, is the training data.
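As an aside on the architecture described above, the 65x figure is easy to check by hand. The sketch below assumes that "combinations" means unordered subsets of experts, which is one reading of Databricks' phrasing and not something the announcement spells out here. The helper name `expert_combinations` is made up for this illustration.

```python
from math import comb


def expert_combinations(total_experts: int, active_experts: int) -> int:
    """Number of distinct unordered sets of experts a router could pick.

    Hypothetical helper for illustration; assumes order does not matter.
    """
    return comb(total_experts, active_experts)


# Configurations as stated in the text:
# DBRX chooses 4 of 16 experts; Mixtral and Grok-1 choose 2 of 8.
dbrx = expert_combinations(16, 4)
mixtral = expert_combinations(8, 2)
ratio = dbrx // mixtral

print(f"DBRX: {dbrx}, Mixtral/Grok-1: {mixtral}, ratio: {ratio}x")
```

Under that assumption, 16-choose-4 gives 1,820 subsets and 8-choose-2 gives 28, so the ratio is exactly 65, matching the number Databricks quotes.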
We receive a fully-trained model, but we don’t know what it was trained on. There might be a legal reason for not publishing the training data, as it could reveal copyrighted materials used in training these models, potentially exposing the developers to copyright lawsuits. It might be more accurate to use terms like “open weights” or “open code” models instead of “open source,” since the source is not fully open.

If you enjoy this post, please click the ❤️ button or share it.

Do you like my work? Consider becoming a paying subscriber to support it. For those who prefer to make a one-off donation, you can 'buy me a coffee' via Ko-fi. Every coffee bought is generous support for the work put into this newsletter. Your support, in any form, is deeply appreciated and goes a long way in keeping this newsletter alive and thriving.

🦾 More than a human
Exosuit Muscle Control Steps Closer to Reality
Human Artificial Chromosomes with Reduced Multimerization Constructed

🧠 Artificial Intelligence
Apple WWDC 2024 set for June 10-14, promises to be ‘A(bsolutely) I(ncredible)’
Announcing Grok-1.5
Amazon spends $2.75 billion on AI startup Anthropic in its largest venture investment yet
OpenAI is expected to release a 'materially better' GPT-5 for its chatbot mid-year
“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time

▶️ Humane AI Pin and Rabbit R1: What Are These Companies Hiding? (13:54)
A couple of months ago, two new devices, Rabbit R1 and Humane AI Pin, entered the stage and grabbed everyone’s attention. These AI-first devices promise to create a new category of devices and redefine how we interact with computers and AI assistants. But is this hype, which led to over 100,000 pre-orders for Rabbit R1, justified? Or are we getting gimmicks that are simply trying to ride the AI wave? This video from Dave2D perfectly expresses the concerns about these devices and their problems.
▶️ Making AI accessible with Andrej Karpathy and Stephanie Zhan (36:58)
In this interview, Andrej Karpathy, a founding member of OpenAI, former director of AI at Tesla, and a well-respected AI researcher, shares his thoughts on AGI, the state of the AI industry, the importance of building a more open and vibrant AI ecosystem, and how we can make building things with AI more accessible.

If you're enjoying the insights and perspectives shared in the Humanity Redefined newsletter, why not spread the word?

🤖 Robotics
Boston Dynamics Unleashes New Spot Variant for Research
Robotic police dog shot multiple times, credited with avoiding potential bloodshed

If you're interested in building your own four-legged robotic pet, OpenCat could be the perfect project for you. OpenCat is an open-source Arduino and Raspberry Pi-based quadruped robot designed to be an educational and research tool. Since its launch in 2016, OpenCat has come a long way, fostering a community of makers who have expanded upon the original concept and contributed to the project. This includes designing open-source models for 3D printing, among other improvements.

Using drone swarms to fight forest fires

🧬 Biotechnology
The Next Generation of Cancer Drugs Will Be Made in Space

Thanks for reading. If you enjoyed this post, please click the ❤️ button or share it.

Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.

A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!

My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"