👄 A New Open Source Massive Language Model
📝 Editorial

Large language models are the norm of the day in deep learning. Every other month, we see news of a new multi-billion-parameter pretrained model reaching new milestones on different language tasks. Despite that progress, only a handful of these models are available to the broader machine learning (ML) research community. The issue is less about AI giants protecting their IP and more about the computational and ethical challenges of making this type of model readily available. Large language models' high computational and energy requirements are a high barrier to entry for most organizations. The ethical concerns related to open-sourcing models that can be used for malicious activities, such as fake news or image generation, are even more critical.

Regardless of the challenges, we have seen notable steps toward responsibly open-sourcing large language models. Last week, Meta AI open-sourced the first version of OPT-175B, a 175-billion-parameter language model that is able to master multiple language tasks. Together with the model source code, Meta AI open-sourced the codebase for training the model using about 1/7th of the computation power required by GPT-3. This matters not only for computation savings but also as a way to take responsibility for the energy consumed when training these models. Additionally, Meta AI opened collaboration with different groups to ensure that OPT-175B is regularly evaluated on different ethics and responsible AI benchmarks. The release of OPT-175B is an important step toward making large language models more accessible to the broader deep learning community.

🔺🔻 TheSequence Scope, our Sunday edition with the industry's development overview, is free.
To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:
Edge#189: we discuss pipeline parallelism; +PipeDream, an important Microsoft Research initiative to scale deep learning architectures; +BigDL, Intel's open-source library for distributed deep learning on Spark.
Edge#190: a deep dive into continuous model observability with Superwise.ai.

Now, let's review the most important developments in the AI industry this week

🔎 ML Research

Automated Model Parallelism
Google Research published a paper detailing Alpa, a framework for seamless model parallelism → read more on Google Research blog

Benchmarking GNNs
Google Research published a paper introducing a methodology for benchmarking graph neural network models → read more on Google Research blog

Rethinking Human-in-the-Loop
Berkeley AI Research (BAIR) lab published a paper exploring new ideas for human evaluation of machine learning models → read more on BAIR blog

AI for Designing Tax Policy
Salesforce Research published a paper discussing the AI Economist, a reinforcement learning model used to design tax policies more effectively → read more on Salesforce Research blog

🤖 Cool AI Tech Releases

Meta OPT-175B
Meta AI Research (FAIR) open-sourced OPT-175B, a massive pretrained language model with 175 billion parameters → read more on FAIR team blog

📌 Follow us on Twitter
We share lots of helpful resources for your data science and ML journey

🛠 Real World ML

Apache Flume at Walmart
Walmart published an insightful blog post about the use of Apache Flume to automate data transfers across their infrastructure → read more on Walmart Global Tech blog

💸 Money in AI
ML&AI
AI-powered