
👄 A New Open Source Massive Language Model

Weekly news digest curated by the industry insiders

May 8

📝 Editorial

Large language models are the norm of the day in deep learning. Every other month, we see news of a new multi-billion parameter pretrained model reaching new milestones on different language tasks. Despite that progress, only a handful of these models are available to the broader machine learning (ML) research community. The issue is less about AI giants protecting their IP and more about the computational and ethical challenges of making this type of model readily available. The high computational and energy requirements of large language models are a steep barrier to entry for most organizations. The ethical concerns related to open-sourcing models that can be used for malicious activities, such as fake news or fake image generation, are even more critical. Despite these challenges, we have seen notable steps toward responsibly open-sourcing large language models.

Last week, Meta AI open-sourced the first version of OPT-175B, an astonishing 175-billion-parameter language model that can handle multiple language tasks. Together with the model, Meta AI open-sourced the codebase used to train it, with training requiring about 1/7th of the computation power required by GPT-3. This matters not only for the computation savings but also as a way to take responsibility for the energy consumed when training these models. Additionally, Meta AI opened collaboration with different groups to ensure that OPT-175B is regularly evaluated on different ethics and responsible AI benchmarks. The release of OPT-175B is an important step toward making large language models more accessible to the broader deep learning community.


🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:

Edge#189: we discuss pipeline parallelism; +PipeDream, an important Microsoft Research initiative to scale deep learning architectures; +BigDL, Intel’s open-source library for distributed deep learning on Spark.

Edge#190: a deep dive into continuous model observability with Superwise.ai.

Now, let’s review the most important developments in the AI industry this week:

🔎 ML Research

Automated Model Parallelism

Google Research published a paper detailing Alpa, a framework for automated model parallelism →read more on Google Research blog

Benchmarking GNNs

Google Research published a paper introducing a methodology for benchmarking graph neural network models →read more on Google Research blog

Rethinking Human-in-the-Loop

Berkeley AI Research (BAIR) lab published a paper exploring new ideas for human evaluation of machine learning models →read more on BAIR blog

AI for Designing Tax Policy

Salesforce Research published a paper discussing the AI Economist, a reinforcement learning model used to design tax policies more effectively →read more on Salesforce Research blog

🤖 Cool AI Tech Releases

Meta OPT-175B

Meta AI Research (FAIR) open-sourced OPT-175B, a massive pretrained language model with 175 billion parameters →read more on FAIR team blog

📌 Follow us on Twitter

We share lots of helpful resources for your data science and ML journey

TheSequence @TheSequenceAI

A free book for you! Learn: 1. NumPy & Pandas 2. Matplotlib: data visualizations 3. Scikit-Learn: efficient & clean ML algorithms Read the open "Python Data Science Handbook: Essential Tools for Working with Data": jakevdp.github.io/PythonDataScie…


🛠 Real World ML

Apache Flume at Walmart

Walmart published an insightful blog post about the use of Apache Flume to automate data transfers across their infrastructure →read more on Walmart Global Tech blog

💸 Money in AI

ML&AI

No-code NLP platform Accern raised a $20 million Series B round co-led by Mighty Capital and Fusion Fund. Hiring in Bangalore/India.
No-code chatbot authoring platform Druid raised a $15 million round led by Karma Ventures and Hoxton Ventures.
Unstructured data intelligence startup Galileo emerged from stealth with $5.1 million in seed funding led by The Factory. Hiring in San Francisco/US and remote.

AI-powered

API security & observability company Traceable AI raised $60 million in Series B funding led by Institutional Venture Partners (IVP). Hiring in San Francisco/US and remote.
Spatial intelligence company Slamcore raised $16 million in a Series A round of funding led by ROBO Global Ventures and Presidio Ventures. Hiring in London/England (hybrid remote).
3D animation platform Kinetix raised $11 million in a funding round led by Adam Ghobarah. Hiring in Paris/France or remote.
Captioning service Ava raised a $10 million Series A fundraising round led by Khosla Ventures. Hiring remote in the US and France.


