Data Science Weekly - Issue 533

Curated news, articles and jobs related to Data Science, AI, & Machine Learning

Data Science Weekly

Feb 9

Issue #533
February 08, 2024

Hello!

Once a week, we write this email to share the links we thought were worth sharing in the Data Science, ML, AI, Data Visualization, and ML/Data Engineering worlds.

If you like what you read, consider becoming a paid member here: https://datascienceweekly.substack.com/subscribe :)

And now…let's dive into some interesting links from this week.

Editor's Picks

Mission Critical -- Satellite Data is a Distinct Modality in Machine Learning
This position paper argues that satellite data constitutes a distinct modality for machine learning research and that we must recognize it as such to advance the quality and impact of SatML research across theory, methods, and deployment. We outline critical discussion questions and actionable suggestions to transform SatML from merely an intriguing application area to a dedicated research discipline that helps move the needle on big challenges for machine learning and society…

How I Know Your Data Science/ML Project Will Fail Before You Even Begin
With a high probability, I can tell that your data science or machine learning project will fail—before you even begin! We’ve seen hundreds of data projects over the past 10+ years and distilled the patterns that correlate with success…

Artificial and Biological Intelligence: Humans, Animals, and Machines
I believe a highly promising direction in AI research is to use artificial intelligence to better understand biological intelligence, and conversely, to use our understanding of biological intelligence to better understand how artificial intelligence works…

A Message from this week's Sponsor:

New Infrastructure to Build Knowledgeable AI

Learn how Pinecone's new serverless vector database helps Notion, Gong, and CS DISCO optimize their AI infrastructure from our VP of R&D, Ram Sriharsha:

Up to 50x lower costs because of the separation of reads, writes, and storage
O(s) fresh results with vector clustering over blob storage
Fast search without sacrificing recall powered by industry-first indexing and retrieval algorithms
Powerful performance with a multi-tenant compute layer
Zero configuration or ongoing management

Read the technical deep dive to understand how it was built and the unique considerations that needed to be made.

* Want to sponsor the newsletter? Email us for details --> team@datascienceweekly.org

Data Science Articles & Videos

The Mind-Boggling Reach of Super Bowl Commercials: A Statistical Analysis
This year's Super Bowl will attract an estimated 120 million viewers, making this program the second-most-watched broadcast of all time, after the moon landing…There will be 70 big-budget commercials interspersed throughout the game, with each advertisement costing an absurd $7 million per 30-second timeslot. And yet, unlike all other nights when commercials are a hindrance, viewers will relish these ads, dazzled by the wonders of corporate marketing. So today, we'll explore the Super Bowl's ever-growing cultural dominance, America's odd infatuation with Super Bowl advertisements, and the absurd reach of these commercials…
The Case for Open Source AI
Open source is indisputably one of the biggest drivers of progress in software and by extension AI. The field would be unrecognizable without it. However, it is under existential threat from regulation that will advantage entrenched interests. We believe that open AI is vital for research, innovation, competition, and safety. We must defend it vigorously…
Gradient-based trajectory planning
how much i trust gradient descent. crazy, right? yes. i then succumbed to this temptation and looked for some simple example to test my trust in gradient descent. yes, i know that i should never doubt our lord Gradient Descent, but my belief is simply too weak. so, i decided to use gradient descent for simple trajectory planning given a 2D map…
AMD CTO is here [Reddit]
Hey guys, I introduced Mark Papermaster to this subreddit today. He said he will check it out. He was very kind and nice. We are very much like the new Homebrew Computing club. What questions and requests do you have for Mark?…
Great Tables - Absolutely Delightful Table-making in Python
With Great Tables anyone can make wonderful-looking tables in Python. The philosophy here is that we can construct a wide variety of useful tables by working with a cohesive set of table components. You can mix and match things like a header and footer, attach a stub (which contains row labels), arrange spanner labels over top of the column labels, and much more. Not only that, but you can format the cell values in a variety of awesome ways…
SQL for the Weary in 100 queries
Learning outcomes:
1. Explain the difference between a database and a database manager.
2. Write SQL to select, filter, sort, group, and aggregate data.
3. Define tables and insert, update, and delete records.
4. Describe different types of join and write queries that use them to combine data.
5. Use windowing functions to operate on adjacent rows.
6. Explain what transactions are and write queries that roll back when constraints are violated.
7. Explain what triggers are and write SQL to create them.
8. Manipulate JSON data using SQL.
9. Interact with a database using Python directly, from a Jupyter notebook, and via an ORM…
MLX Community Projects
Let's collect some cool MLX integrations and community-led projects here for visibility! If you have a project you would like to feature, leave a comment, and we will add it…
Estimating Above Ground Biomass using Random Forest Regression in GEE
In this post, we will learn how to build a regression model in Google Earth Engine and use it to estimate total above-ground biomass using openly available Earth Observation datasets…
The Many Ways to Deploy a Model
Over the past years, we have been helping companies deploy a wildly diverse set of ML workloads in production. Last year, we added open-source large language models (LLMs) in the mix. Continuing the line of research we started with NVIDIA, we recently collaborated with Hamel Husain, an LLM expert at Parlance Labs, to explore various popular solutions to model serving in general, LLM inference in particular. In this article, we share our decision rubric for model deployments using LLM inference as an example…
Anyone else’s company executives losing their shit over GenAI? [Reddit]
The company I work for (large company serving millions of end-users), appear to have completely lost their minds over GenAI. It started quite well. They were interested, I was in a good position as being able to advise them…However, now they are just trying to shoehorn gen AI wherever they can for the sake of the investors. They are not making rational decisions anymore. They aren't even asking me about it anymore. Some exec wakes up one day and has a crazy misguided idea about sticking gen AI somewhere and then asking junior (non DS) devs to build it without DS input. All the while, traditional ML is actually making the company money, projects are going well, but getting ignored. Does this sound familiar?…
You Just Said Something Wrong About Logistic Regression
Congratulations, you just said something wrong about logistic regression. That’s OK, logistic regression is hard and we all have to learn/re-learn some things from time to time…This is a living blog post intended to address some common misconceptions or flat out wrong statements I’ve seen people make about logistic regression…
LoRA From Scratch – Implement Low-Rank Adaptation for LLMs in PyTorch
LoRA, which stands for Low-Rank Adaptation, is a popular technique to finetune LLMs more efficiently. Instead of adjusting all the parameters of a deep neural network, LoRA focuses on updating only a small set of low-rank matrices. This Studio explains how LoRA works by coding it from scratch, which is an excellent exercise for looking under the hood of an algorithm…
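For readers who want a feel for the idea before opening the Studio, here is a minimal sketch of a LoRA-style layer. It is not the code from the linked article (which uses PyTorch); it uses plain Python lists so it runs without dependencies, and the names `LoRALinear`, `matvec`, `rank`, and `alpha` are our own illustrative choices. The assumption is the usual formulation: a frozen weight `W` plus a scaled low-rank update `B @ A`, with `B` starting at zero.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a low-rank update scaled by alpha / rank (hypothetical helper)."""

    def __init__(self, d_in, d_out, rank, alpha, seed=0):
        rng = random.Random(seed)
        # Stand-in for a pretrained weight; it would stay frozen during finetuning.
        self.W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
        # A starts small and random, B starts at zero, so the update is zero at first.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path: W x
        update = matvec(self.B, matvec(self.A, x))  # low-rank path: B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        # Only A and B would be updated during finetuning.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


layer = LoRALinear(d_in=64, d_out=64, rank=4, alpha=8)
x = [1.0] * 64
print(layer.trainable_params(), "trainable vs", layer.frozen_params(), "frozen")
```

With these example sizes, the low-rank matrices hold 512 values against 4,096 in the frozen weight, which is where the efficiency comes from. Because `B` starts at zero, the layer initially behaves exactly like the original one.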

Training & Resources

UW’s LING 575: NLP for Cultural Analytics
Surveys tools, frameworks, and skills needed to apply natural language processing methods to applications in the humanities and social sciences, with a focus on the analysis of large digital text corpora, including social media, literature, and historical documents. Topics will include data collection, text processing and machine learning techniques, data visualization, and ethical considerations…
CMU’s Advanced NLP Spring 2024
[Video lectures on YouTube here] CS11-711 Advanced Natural Language Processing is an introductory graduate-level course on natural language processing aimed at students who are interested in doing cutting-edge research in the field. In it, we describe fundamental tasks in natural language processing such as syntactic, semantic, and discourse analysis, as well as methods to solve these tasks. The course focuses on modern methods using neural networks and covers the basic modeling and learning algorithms required. The class culminates in a project in which students attempt to reimplement and improve upon a research paper in a topic of their choosing…
The Math Behind the Adam Optimizer
You’ve likely heard about Adam, a name that has gained notable recognition in many winning Kaggle competitions. It’s common to experiment with a few optimizers like SGD, Adagrad, Adam, or AdamW, but truly understanding their mechanics is a different story. By the end of this post, you’ll be among the select few who not only know about Adam optimization but also understand how to leverage its power effectively…

Last Week's Newsletter's 3 Most Clicked Links

* Based on unique clicks.
** Find last week's issue #532 here.

Whenever you're ready, 2 ways we can help:

Looking to get a job? Check out our “Get A Data Science Job” Course
A comprehensive course that teaches you everything related to getting a data science job based on answers to thousands of emails from readers like you. The course has 3 sections: Section 1 covers how to get started, Section 2 covers how to assemble a portfolio to showcase your experience (even if you don’t have any), and Section 3 covers how to write your resume.
Promote yourself/organization to ~61,000 subscribers by sponsoring this newsletter. 35-45% weekly open rate.

Thank you for joining us this week! :)

Stay Data Science-y!

All our best,
Hannah & Sebastian

P.S. Pay us some money :) The membership program funds the free newsletter: https://datascienceweekly.substack.com/subscribe

P.P.S. “A SQL query walks into a bar, sees two tables, and asks, 'May I join you'?”
