DBRX - the new king of open source LLMs - Weekly News Roundup - Issue #460
Plus: are Rabbit R1 and Humane AI Pin scams?; Anthropic gets $2.75 billion from Amazon; Grok-1.5; robotic police dog got shot multiple times; drugs made in space; and more!

Hello and welcome to the Weekly News Roundup, Issue #460.

The large language model space has seen some major reshuffling in the last few weeks. First, Anthropic’s Claude 3 Opus outperformed GPT-4, and now DBRX, a language model from Databricks, has claimed the title of the best open-source large language model.

In other news, Claude 3 Opus dethroned GPT-4 on the Chatbot Arena Leaderboard; Apple is not-so-subtly hinting at AI features to be revealed at WWDC 2024; xAI has announced Grok-1.5; and Anthropic received a $2.75 billion investment from Amazon.

Apart from that, this week’s issue includes an open-source robotic cat, a story about a robotic police dog that was shot multiple times and was credited with helping to avoid potential bloodshed, and one company's dream of manufacturing drugs in space.

Enjoy!

The space of open-source large language models has been dominated by variants of Meta’s Llama 2, Mistral-7B and Mixtral from Mistral, and small models from Microsoft (Phi-2) and Google (Gemma). All these models offer very good performance and, in some cases, come very close to or even surpass proprietary models.

But now, we have a new king of open-source large language models, and it doesn’t come from Meta, Mistral, Google, or Microsoft. It comes from Databricks, and its name is DBRX.

Seemingly out of nowhere, Databricks released DBRX just a couple of days ago and immediately claimed the crown of the best open-source large language model, with the fine-tuned model even being on par with Google’s Gemini 1.0 Pro.

DBRX is a Mixture of Experts (MoE) model with 132B total parameters and a 32k context window. DBRX has 16 experts and chooses 4, while Mixtral and Grok-1 have 8 experts and choose 2.
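
The expert counts above determine how many distinct subsets of experts a token can be routed to, which is where the 65x figure cited by Databricks comes from. A minimal sketch in Python that reproduces the arithmetic with binomial coefficients; the variable names are illustrative and not taken from any DBRX code:

```python
from math import comb

# Number of distinct expert subsets that can be selected for a token.
dbrx_combos = comb(16, 4)     # DBRX: 16 experts, 4 chosen
mixtral_combos = comb(8, 2)   # Mixtral and Grok-1: 8 experts, 2 chosen

# How many times more subsets DBRX can choose from.
ratio = dbrx_combos // mixtral_combos

print(dbrx_combos, mixtral_combos, ratio)  # prints: 1820 28 65
```

This only counts possible expert subsets. It says nothing about how the router actually picks them or how much each additional combination contributes to quality.
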
According to Databricks, choosing 4 out of 16 experts results in 65x more possible combinations of experts, which improves the model’s quality.

In the post announcing DBRX, Databricks boasts some impressive numbers. According to that post, inference is up to 2x faster than LLaMA2-70B, and the model is more efficient to train.

Both Base and Instruct versions of DBRX are available on HuggingFace. DBRX is also available on GitHub. If you don’t want to install DBRX, the model is deployed on HuggingFace, so you can try it out yourself.

I’m happy to see the open source community keeping up with, and sometimes even surpassing, proprietary models from big tech companies like Microsoft, Google, and OpenAI. Thanks to the companies that contribute their models to the open-source community and the passionate developers behind them, access to state-of-the-art language models is no longer limited to a handful of big companies. Now, everyone can install tools like Ollama, download a language model, and start experimenting. People who don’t have a warehouse full of Nvidia H100 GPUs can now work with large language models without paying OpenAI, Microsoft, or Google for the privilege of using their LLMs. It’s a positive development for researchers and the overall AI community.

However, despite being called “open source,” DBRX, Mixtral, Llama, and other models are not fully open source. Sure, the code and weights are publicly available, but more is required to truly consider these models open. I’d like to see more details on the training process, including the kind of processing power that was involved, how long it took, and what challenges the developers had to overcome. The biggest omission, however, is the training data. We receive a fully-trained model, but we don’t know what it was trained on.
There might be a legal reason for not publishing the training data, as it could reveal copyrighted materials used in training these models, potentially exposing the developers to copyright lawsuits. It might be more accurate to use terms like “open weights” or “open code” models instead of “open source,” since the source is not fully open.

If you enjoy this post, please click the ❤️ button or share it.

Do you like my work? Consider becoming a paying subscriber to support it.

For those who prefer to make a one-off donation, you can 'buy me a coffee' via Ko-fi. Every coffee bought is a generous contribution to the work put into this newsletter.

Your support, in any form, is deeply appreciated and goes a long way in keeping this newsletter alive and thriving.

🦾 More than a human

Exosuit Muscle Control Steps Closer to Reality

Human Artificial Chromosomes with Reduced Multimerization Constructed

🧠 Artificial Intelligence

Apple WWDC 2024 set for June 10-14, promises to be ‘A(bsolutely) I(ncredible)’

Announcing Grok-1.5

Amazon spends $2.75 billion on AI startup Anthropic in its largest venture investment yet

OpenAI is expected to release a 'materially better' GPT-5 for its chatbot mid-year

“The king is dead” - Claude 3 surpasses GPT-4 on Chatbot Arena for the first time

▶️ Humane AI Pin and Rabbit R1: What Are These Companies Hiding? (13:54)
A couple of months ago, two new devices, Rabbit R1 and Humane AI Pin, entered the stage and grabbed everyone’s attention. These AI-first devices promise to create a new category of devices and redefine how we interact with computers and AI assistants. But is this hype, which led to over 100,000 pre-orders for Rabbit R1, justified? Or are these devices just gimmicks trying to ride the AI wave? This video from Dave2D lays out the concerns about these devices and their problems.

▶️ Making AI accessible with Andrej Karpathy and Stephanie Zhan (36:58)
In this interview, Andrej Karpathy, a well-respected AI researcher, shares his thoughts on AGI, the state of the AI industry, the importance of building a more open and vibrant AI ecosystem, and how we can make building things with AI more accessible.

If you're enjoying the insights and perspectives shared in the Humanity Redefined newsletter, why not spread the word?

🤖 Robotics

Boston Dynamics Unleashes New Spot Variant for Research

Robotic police dog shot multiple times, credited with avoiding potential bloodshed

If you're interested in building your own four-legged robotic pet, OpenCat could be the perfect project for you. OpenCat is an open-source Arduino and Raspberry Pi-based quadruped robot designed to be an educational and research tool. Since its launch in 2016, OpenCat has come a long way, fostering a community of makers who have expanded upon the original concept and contributed to the project. This includes designing open-source models for 3D printing, among other improvements.

Using drone swarms to fight forest fires

🧬 Biotechnology

The Next Generation of Cancer Drugs Will Be Made in Space

Thanks for reading.

Humanity Redefined sheds light on the bleeding edge of technology and how advancements in AI, robotics, and biotech can usher in abundance, expand humanity's horizons, and redefine what it means to be human.

A big thank you to my paid subscribers, to my Patrons: whmr, Florian, dux, Eric, Preppikoma and Andrew, and to everyone who supports my work on Ko-Fi. Thank you for the support!

My DMs are open to all subscribers. Feel free to drop me a message, share feedback, or just say "hi!"