Editor's Picks
- Searching for Unintended Biases With Saliency
Machine learning models are used for high-stakes tasks like steering driverless cars or detecting cancerous tissue from medical scans. If there are spurious correlations in the training data, the model might develop unintended biases that could lead to mistakes. In this post, we explore a technique for detecting these biases by asking for an explanation of how models make decisions...
- Machine Learning in Google Sheets
With Simple ML for Sheets, also referred to as Simple ML, everyone can use Machine Learning (ML) in Google Sheets without knowing ML, without coding, and without sharing data with third parties...This tutorial takes you through the steps of using Simple ML for Sheets to solve three exercises: Predicting missing values (task 1), identifying abnormal values (task 2), and training / evaluating & understanding a model manually (task 3)...
A Message from this week's Sponsor:

12/22 Data Leader Panel: How to Build Data Products that Drive Business Impact
Join this webinar to get practical advice from Data Strategy & Analytics Leaders at Apple, Novartis, and BT on how to build Data Products that drive business impact.
Data Science Articles & Videos
- Selling data science - Data science consulting: Part 1
In one of my last newsletters, I linked to a Reddit thread, “How does data science work in the consulting space?” and said that if there was enough interest, I’d cover some aspects of data science consulting in the newsletter from time to time. This is the first of those pieces...
- Comparison of data wrangling/ETL tools : R, Pandas, Knime, Power Query, Tableau Prep, Alteryx and Easy Data Transform with benchmarks
We struggled to find any benchmarks for a range of data wrangling/ETL software, so we have done our own. This page shows results from performance benchmarking the following on-premise (non-cloud) products using a 1 million row dataset: R, R + dplyr, R + data.table, Python + Pandas, Knime, Power Query, Easy Data Transform...
- Understanding Convolutions in Probability: A Mad-Science Perspective
After watching the recent 3Blue1Brown video on convolutions I realized that there is a surprising lack of articles on convolutions as they apply to probability...So in this post we're going to take a look at how to use convolutions, how to compute them and how they are defined mathematically... and we'll also throw in a bit of mad-science!...
- Is AI smarter than an infant? Not even close.
The field of embodied AI is moving quickly and some believe that this progress, coupled with progress in large language and vision models, suggests that AI may soon approach human-level world understanding. But, despite the incredible successes of AI systems, our community has not answered a fundamental question: do these advanced AI models understand how the physical world works?...Answering this question is critical to building AI systems that we can trust. For instance, if we cannot show that a model reliably understands that objects continue to exist when out of view, how can we ever trust it to drive our car? For now, you might prefer an infant to take the wheel...
- How ELT Schedules Can Improve Root Cause Analysis For Data Engineers
In this article, Ryan Kearns, co-author of O’Reilly’s Data Quality Fundamentals and a data scientist, discusses the limitations of segmentation analysis when it comes to root cause analysis for data teams, and proposes a better approach: ELT schedules as Bayesian Networks...
- Why Realtime ML Is Here To Stay
A very powerful trend is playing out right now — more and more top tech companies are making a larger part of their machine learning as realtime as possible...More specifically, the trend is to make the following parts of the ML stack realtime: a) Model scoring and b) Feature extraction...In this post, we’ll look at the reasons why this trend is playing out...
- Angry AI Birds
It turns out that there are AI competitions for the Angry Birds game over at aibirds.org. The long-term goal of the competition is to build an intelligent Angry Birds playing agent that can play new levels better than the best human players, but there are sub-competitions to help towards this goal...It turns out, it's harder than you might think...
- A Decade of Knowledge Graphs in Natural Language Processing: A Survey
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years...we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work...
- The cloudy layers of modern-day programming
I’ve come to the realization that much of what we do in modern software development is not true software engineering. We spend the majority of our days trying to configure OpenSprocket 2.3.1 to work with NeoGidgetPro5, both of which were developed by two different third-party vendors and available only as proprietary services in FoogleServiceCloud...
Tool*

Do more with data, together.
Bring SQL, Python, no-code, and R together in one UI. From exploratory analyses to beautiful data apps to ML modeling and data science, Hex streamlines the entire analytics workflow so your team can focus on generating insights, driving decisions, and moving things forward. No more jumping between tools, struggling with versions, or sharing via screenshot. Try Hex free with a 14 day trial and join companies like Notion, Fivetran and AngelList who are doing more with data.
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Course*

Learn the Essentials of Data Science in the 21st century!
American University’s Institute for Data Science and Big Data is open to early and mid-career professionals looking to enhance their understanding of data science and apply it to their careers. Through seven days of lectures, guest speakers drawn from government, business and academia, and hands-on assignments on American University’s campus in Washington, D.C., you will learn tools, gain skills, and receive a certificate of completion to enhance your credentials.
To join us from Jan. 4-12, 2023, apply now by December 23, 2022, by clicking here.
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
- Senior Data Analyst - Epic Games - New York
Epic Games spans across 19 countries with 55 studios and 4,500+ employees globally. For over 25 years, we’ve been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before.
Use your expert experience in data & analytics to build powerful stories and visuals that inform the games we make, the technology we develop, and business decisions that drive Epic... Epic Games is looking for a Senior Data Analyst to help us create the models that fuel our creator economy. The successful candidate will have excellent SQL knowledge, and enjoy combining analytic skills with business acumen to provide the data and insights that will drive our continued success...
Want to post a job here? Email us for details --> team@datascienceweekly.org
Training & Resources
- Bayesian Decision Analysis Tutorial
This tutorial is a hands-on introduction to Bayesian Decision Analysis (BDA), which is a framework for using probability to guide decision-making under uncertainty. I start with Bayes’s Theorem, which is the foundation of Bayesian statistics, and work toward the Bayesian bandit strategy, which is used for A/B testing, medical tests, and related applications. For each step, I provide a Jupyter notebook where you can run Python code and work on exercises. In addition to the bandit strategy, I summarize two other applications of BDA, optimal bidding and deriving a decision rule. Finally, I suggest resources you can use to learn more...
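For readers who have not met the Bayesian bandit strategy mentioned above, here is a minimal sketch of one common form of it (Thompson sampling with Beta posteriors over win rates). It is not taken from the tutorial: the function name `thompson_sampling`, the three conversion rates, and the number of rounds are all assumptions made up for illustration.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate a Bayesian bandit over variants with hidden win rates.

    true_rates is a hypothetical list of per-variant success probabilities;
    in a real A/B test these are unknown and only the outcomes are observed.
    Returns how many times each variant was chosen.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    wins = [0] * n_arms
    losses = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible win rate per variant from its Beta posterior
        # (a uniform Beta(1, 1) prior updated with the observed outcomes).
        samples = [
            rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n_arms)
        ]
        # Play the variant whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Observe a simulated success or failure and update the counts.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1

    return [w + l for w, l in zip(wins, losses)]


# Made-up rates for three variants; the last one is the best.
pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
print(pulls)
```

Because each choice is driven by a draw from the posterior, weak variants still get tried occasionally early on, and traffic shifts toward the strongest variant as evidence accumulates.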
P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian