Data Science Weekly - Issue 429

Curated news, articles and jobs related to Data Science.
Keep up with all the latest developments.

Issue #429

February 10, 2022

Editor Picks
 
  • Data Distribution Shifts and Monitoring
    About two years ago, a company hired a consulting firm to develop an ML model to help them predict how many of each grocery item they’d need next week, so they could restock the items accordingly...When the consulting firm handed the model over, the company deployed it and was very happy with its performance...However, a year later, their numbers went down...the company learned the hard way an important lesson that the rest of the industry is also discovering: deploying a model isn’t the end of the process. A model’s performance degrades over time in production. Once a model has been deployed, we still have to continually monitor its performance to detect issues as well as deploy updates to fix these issues...
 
 

A Message from this week's Sponsor:

 



Live Webinar | How to accelerate AI and BI impact with an effective data strategy

Get practical advice from enterprise Data & Analytics Leaders from Bank of America, The Hanover Insurance Group, Rockwell Automation, and SAP on how they establish deep partnerships with business units and align data strategy to business priorities and financial goals to accelerate AI & BI impact.

 

 

Data Science Articles & Videos

 
  • How Platform Recommenders Work
    A recommender system (or simply ‘recommender’) is an algorithm that takes a large set of items and determines which of those to display to a user — think the Facebook News Feed, the Twitter timeline, Google News, or the YouTube homepage. Recommenders are necessary tools to help navigate the sheer volume of content produced each day, but their scale and rapid development can cause unintended consequences...This post is based on public information found in company blog posts, academic papers written by platform employees, journalistic investigations and leaked documents. Each of these sources has limitations, but taken together they show repeating patterns of design and operation...
  • Startup Scrappiness, Venture Matchmaking, and Thinking In Bets with Leigh-Marie Braswell
    Our wide-ranging conversation touches on her early interest in solving math problems competitively; her undergraduate education at MIT and exposure to machine learning; her 4-year journey as the first product manager at Scale AI building 3D annotation products, scaling operational excellence, and instituting a relentless speed of execution; her foray into angel investing, joining Founders Fund, and being helpful as an investor; common threads between engineering, product, and venture; lessons learned from playing poker; and much more...
  • Statistical exponential families: A digest with flash cards
    This document describes concisely the ubiquitous class of exponential family distributions met in statistics. The first part recalls definitions and summarizes main properties and duality with Bregman divergences (all proofs are skipped). The second part lists decompositions and related formula of common exponential family distributions. We recall the Fisher-Rao-Riemannian geometries and the dual affine connection information geometries of statistical manifolds. It is intended to maintain and update this document and catalog by adding new distribution items...
  • The art of solving problems with Monte Carlo simulations
    This article will explore some examples and applications of Monte Carlo simulations using the Go programming language. To keep this article fun and interactive, after each Go code provided, you will find a link to the Go Playground, where you can run it without installing Go on your machine...
  • Causal Machine Learning and Business Decision Making
    Causal knowledge is critical for strategic and organizational decision-making. By contrast, standard machine learning approaches remain purely correlational and prediction-based, rendering them unsuitable for addressing a wide variety of managerial decision problems. Taking a mixed-methods approach, which relies on multiple sources, including semi-structured interviews with data scientists and senior decision-makers, as well as quantitative survey data, this study argues that causality is a critical boundary condition for the application of machine learning in a business analytical context. It highlights the crucial role of theory in causal inference and offers a new perspective on human-machine interaction for data-augmented decision making...
  • A Theoretical Comparison of Graph Neural Network Extensions
    We study and compare different Graph Neural Network extensions that increase the expressive power of GNNs beyond the Weisfeiler-Leman test. We focus on (i) GNNs based on higher order WL methods, (ii) GNNs that preprocess small substructures in the graph, (iii) GNNs that preprocess the graph up to a small radius, and (iv) GNNs that slightly perturb the graph to compute an embedding...as our main result, we compare the expressiveness of these extensions to each other through a series of example constructions that can be distinguished by one of the extensions, but not by another one...
  • Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
    Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's desired action in the absence of a natural 'default' interface. Reinforcement learning from online user feedback on the system's performance presents a natural solution to this problem, and enables the interface to adapt to individual users. However, this approach tends to require a large amount of human-in-the-loop training data, especially when feedback is sparse. We propose a hierarchical solution that learns efficiently from sparse user feedback: we use offline pre-training to acquire a latent embedding space of useful, high-level robot behaviors, which, in turn, enables the system to focus on using online user feedback to learn a mapping from user inputs to desired high-level behaviors...
  • Introduction to the A* Algorithm
    In games we often want to find paths from one location to another. We’re not only trying to find the shortest distance; we also want to take into account travel time. Move the blob (start point) and cross (end point) to see the shortest path. To find this path we can use a graph search algorithm, which works when the map is represented as a graph. A* is a popular choice for graph search. Breadth First Search is the simplest of the graph search algorithms, so let’s start there, and we’ll work our way up to A*...
  • GPT-NeoX-20B: An Open-Source Autoregressive Language Model [PDF]
    GPT-NeoX-20B is a 20 billion parameter autoregressive language model whose weights will be made freely and openly available to the public through a permissive license. It is, to the best of our knowledge, the largest dense autoregressive model that has publicly available weights. In this paper, we describe the model architecture and training, evaluate its performance, and discuss the broader impacts of its release. We are open-sourcing the training and evaluation code, as well as the model weights...
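
For readers curious about the Breadth First Search starting point mentioned in the A* item above, here is a minimal Python sketch. It is not taken from the linked article: the grid encoding ("#" for walls, "." for open cells), the four-direction movement, and the function name bfs_path are all assumptions made for illustration. Plain BFS treats every step as equal cost, so it finds a path with the fewest steps; A* builds on the same frontier idea by adding movement costs and a heuristic.

```python
from collections import deque


def bfs_path(grid, start, goal):
    """Return a fewest-steps path of (row, col) cells from start to goal, or None.

    grid is a list of equal-length strings; "#" marks a wall (an assumed encoding).
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}   # maps each visited cell to the cell we reached it from
    frontier = deque([start])   # FIFO queue: cells are expanded in order of discovery

    while frontier:
        current = frontier.popleft()
        if current == goal:
            break
        r, c = current
        # Explore the four orthogonal neighbours.
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in came_from:
                came_from[(nr, nc)] = current
                frontier.append((nr, nc))

    if goal not in came_from:
        return None  # the goal is unreachable

    # Walk backwards from the goal to the start, then reverse.
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = came_from[node]
    path.reverse()
    return path
```

Calling bfs_path(["..#", "..#", "..."], (0, 0), (2, 2)) returns a list of cells that starts at (0, 0) and ends at (2, 2); if walls cut the goal off, the function returns None.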
 
 

Forum*

 



Check out the new Anaconda Community for all-things data!

Want insights into the newest developments in the world of data, or need help getting “unstuck” on a problem?

Our Community Forums is the place to go! Be the first to engage with other professionals and ask questions to the broader data community. Users can join in conversations around trends, debate new features, post questions to the community, and more. Plus, it’s another avenue for technical help!

Create your free Anaconda Community account now.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • (Senior) Analytics Engineer - Fabulous - Remote

    Fabulous is a mobile app helping thousands of people every day to change their lifestyles by integrating healthy habits into their lives. Fabulous is using a behavioral economics lens to help everyone achieve their fullest potential. We work closely with researchers based at Duke University and our advisor is Dan Ariely, author of NYT bestseller Predictably Irrational. We are looking for an experienced Analytics Engineer to consolidate the Data Science team and lead the development and enrichment of our Data Pipelines. We have a modern Data-Stack based on Fivetran, dbt, BigQuery, Amplitude, Metabase...

        Want to post a job here? Email us for details >> team@datascienceweekly.org

 
 

Training & Resources

 
  • Hiring Data Scientists and Machine Learning Engineers - A practical guide
    Hiring Data Scientists and Machine Learning Engineers is a concise, practical guide to help you hire the right people for your organization. The book will help you navigate the plethora of data science related roles and skills and help you create an effective hiring strategy to suit your organization's needs...
  • Algorithms for Decision Making
    This book provides a broad introduction to algorithms for decision making under uncertainty. We cover a wide variety of topics related to decision making, introducing the underlying mathematical problem formulations and the algorithms for solving them...
  • Machine Learning Simplified: A Gentle Introduction to Supervised Learning
    The underlying goal of "Machine Learning Simplified" is to develop strong intuition into the inner workings of ML. We use simple intuitive examples to explain complex concepts, algorithms or methods, as well as democratize all mathematics "behind the scenes"...After reading this book, you will understand everything that comes into the scope of supervised ML. You will be able to not only understand the nitty-gritty details of mathematics, but also explain to anyone how things work on a high level...
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.