Data Science Weekly - Issue 427

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments.

Issue #427

January 27 2022

Editor Picks
 
  • DeepMind: The Podcast, Season 2 (12 episodes)
    In the highly praised, award-nominated "DeepMind: The Podcast", mathematician and broadcaster Hannah Fry goes behind the scenes of world-leading research lab DeepMind to find out how AI can benefit our lives and the society we live in...
  • Overdebunked! Six Statistical Critiques That Don’t Quite Work
    Statistical results and data analyses are quite often wrong. Sometimes they’re wrong because of carelessness, sometimes they’re wrong even though we cared a lot because it’s just really hard to get them right, and other times they’re wrong on purpose. It shouldn’t shock anyone to hear this...below, I’ve listed six statistical critiques I commonly see on social media, and why they’re not great critiques...These aren’t technical errors - they’re not about misinterpreting a p-value or whatever, but more about common-sense critiques of published statistical results that anyone could make...
 
 

A Message from this week's Sponsor:

 



A Two-Day Virtual Interactive ML Community Event for AI/ML Developers and Data Scientists.

Learn from 35+ AI experts from DeepMind, Spotify, Twitter, Disney, HuggingFace, Instacart, Colgate, LinkedIn, Pinterest, Mobileye, HSBC, AstraZeneca, Verizon, BBC and more in sessions about building real-world AI and machine learning applications, best practices and strategies in AI infrastructure, ML in production, and exciting research that you can apply to your next ML or DL projects.

 

 

Data Science Articles & Videos

 
  • The Non-Engineer’s Guide to Bad Data
    This article is written by a data engineer for a non-technical audience troubleshooting the “broken dashboard” problem and can help data teams educate their stakeholders on the process of tackling broken data pipelines...the reader will learn: a) The role the data engineering team plays in troubleshooting data quality issues and their current responsibilities, b) The impact "bad" data can have on their business, c) A simplified explanation of why data breaks and why it takes time to discover and fix data quality issues, and d) How data teams rely on data observability to reduce the likelihood of "bad" data entering their Tableau or Looker dashboards and reports...
  • ML and NLP Research Highlights of 2021
    2021 saw many exciting advances in machine learning (ML) and natural language processing (NLP). In this post, I will cover the papers and research areas that I found most inspiring...1) Universal Models, 2) Massive Multi-task Learning, 3) Beyond the Transformer, 4) Prompting, 5) Efficient Methods, 6) Benchmarking, 7) Conditional Image Generation, 8) ML for Science, 9) Program Synthesis, 10) Bias, 11) Retrieval Augmentation, 12) Token-free Models, 13) Temporal Adaptation, 14) The Importance of Data, and 15) Meta-learning...
  • How do you document predictive models just in case they are audited?
    [Reddit Discussion]...I work at a bank and am about to start building my first predictive model. I'm curious how you document your models in case auditors ask to see them? I'm also meeting with our internal auditors next week to come up with a plan, but I'd love to know what you do at your organization if you are willing to share...
  • Two reasons Kubernetes is so complex
    While some of those feelings are fairly universal of learning any new system, Kubernetes really does feel a lot bigger, scarier, and more intractable than some other systems I’ve worked with. As I’ve learned it and worked with it, I’ve tried to understand why it looks the way it does, and which design decisions and tradeoffs lead to it looking the way it does. I don’t claim to have the full answer, but this post is an attempt to commit to paper two specific thoughts or paradigms I have that I reach for as I try to understand why working with Kubernetes feels so hairy sometimes...
  • How to navigate ML research literature
    My slides on "How to navigate ML research literature" for Winter ML school...How to read papers?...How to filter out?...Where to get?...What does peer review mean?...
  • How to run effective ML research
    Gave a talk about ML paper reproducibility at winter school on "How to run effective ML research". Discussed some challenges 🥲 during implementation, objectives 📈, and tips ✍️. Here are the slides...
  • Beginner mistakes to avoid in building Data Pipeline
    [Reddit Discussion] I've recently been promoted to a Data Engineering position at work. That being said, my first project is helping migrate data from SAP ECC to SQL Server and solidify our data pipeline so my Analytics team can extract data in a more streamlined way for our dashboards and modeling...I don't have much guidance from technical leadership or access to technical expertise in this undertaking, and I wanted to see if there were any Sr. DE's that had common "rookie" mistakes they've seen in similar initiatives that I should look out for...
  • Topology and Computability
    Readers of this blog are familiar with notions of computability – basically, the question is, what can machines do without human assistance? And you are familiar with machines. Electronic ones of course, but I always like to think of machines as composed of gears, levers and pulleys...Topology? That’s another story. Rubber doughnuts being continuously stretched but always preserving that hole. Or calculus and differential equations...So what’s the connection? You’d be surprised...
 
 

Forum*

 



Check out the new Anaconda Community for all-things data!

Want insights into the newest developments in the world of data, or need help getting “unstuck” on a problem?

Our Community Forums is the place to go! Be the first to engage with other professionals and ask questions to the broader data community. Users can join in conversations around trends, debate new features, post questions to the community, and more. Plus, it’s another avenue for technical help!

Create your free Anaconda Community account now.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • (Senior) Analytics Engineer - Fabulous - Remote

    Fabulous is a mobile app helping thousands of people every day to change their lifestyles by integrating healthy habits into their lives. Fabulous is using a behavioral economics lens to help everyone achieve their fullest potential. We work closely with researchers based at Duke University and our advisor is Dan Ariely, author of NYT bestseller Predictably Irrational. We are looking for an experienced Analytics Engineer to consolidate the Data Science team and lead the development and enrichment of our Data Pipelines. We have a modern Data-Stack based on Fivetran, dbt, BigQuery, Amplitude, Metabase...

        Want to post a job here? Email us for details >> team@datascienceweekly.org

 
 

Training & Resources

 
  • A method for explaining machine learning models: Shapley values (SHAP)
    A prediction can be explained by assuming that each feature value of the instance is a “player” in a game where the prediction is the payout. Shapley values – a method from coalitional game theory – tell us how to fairly distribute the “payout” among the features...Shapley values: a) Model-agnostic: Use with any model, b) Theoretic foundation: Game theory, c) Good software ecosystem, and d) Local and global explanations...
  • Regression and Other Stories [Book PDF, Free]
    Most textbooks on regression focus on theory and the simplest of examples. Real statistical problems, however, are complex and subtle. This is not a book about the theory of regression. It is about using regression to solve real problems of comparison, estimation, prediction, and causal inference. Unlike other books, it focuses on practical issues such as sample size and missing data and a wide range of goals and techniques. It jumps right in to methods and computer code you can...
  • Modern Robotics: Mechanics, Planning, and Control [Book PDF, Free]
    This introduction to robotics offers a distinct and unified perspective of the mechanics, planning and control of robots. Ideal for self-learning, or for courses, as it assumes only freshman-level physics, ordinary differential equations, linear algebra and a little bit of computing background. Modern Robotics presents the state-of-the-art, screw-theoretic techniques capturing the most salient physical features of a robot in an intuitive geometrical way. With numerous exercises at the end of each chapter, accompanying software written to reinforce the concepts in the book and video lectures aimed at changing the classroom experience, this is the go-to textbook for learning about this fascinating subject...
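
    A note on the Shapley values item above: the "players and payout" idea can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the method used by the linked article or by the SHAP library. The names shapley_values and toy_model, the toy model itself, and the choice to replace "absent" features with a fixed baseline value are all assumptions made for this example. It enumerates every coalition, so it is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance by enumerating all coalitions.

    Assumption: a feature that is "absent" from a coalition is replaced
    by its baseline value. Cost grows as 2^n in the number of features.
    """
    n = len(instance)

    def value(coalition):
        # Payout of a coalition: prediction with only those features "present".
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude player i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when joining coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(x):
    # Hypothetical model: two linear terms plus one interaction term.
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[2]


instance = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

    The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features involved in it.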
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
