Which Machine Learning Classifiers are best for small datasets?
#138 — January 11, 2021
AI Digest
this week's favorite
Which Machine Learning Classifiers are best for small datasets?
Although "big data" and "deep learning" are dominant, my own work at the Gates Foundation involves a lot of small (but expensive) datasets, where the number of rows (subjects, samples) is between 100 and 1000. One example is a set of detailed measurements taken from pregnant women throughout pregnancy, together with the subsequent neonatal outcomes. A lot of my collaborative investigations involve fitting machine learning models to small datasets like these, and it's not clear what the best practices are in this case.
NLP Datasets: 611 text datasets in 467 languages
Datasets is a lightweight Python library providing two main features: one-line data loaders for public datasets and efficient data pre-processing.
Why I’m lukewarm on graph neural networks
GNNs can provide wins over simpler embedding methods, but we’re at a point where other research directions matter more.
DALL·E: Creating Images from Text
We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language.
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile is an 825 GiB diverse, open source language modelling data set that consists of 22 smaller, high-quality datasets combined together.