Data Science Weekly - Issue 449

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #449

June 30 2022

Editor Picks

  • Pen and Paper Exercises in Machine Learning
    This is a collection of (mostly) pen-and-paper exercises in machine learning. The exercises are on the following topics: linear algebra, optimisation, directed graphical models, undirected graphical models, expressive power of graphical models, factor graphs and message passing, inference for hidden Markov models, model-based learning (including ICA and unnormalised models), sampling and Monte-Carlo integration, and variational inference...
  • Seeing Like a Toolkit: How Toolkits Envision the Work of AI Ethics
    Numerous toolkits have been developed to support ethical AI development. However, ethical AI toolkits, like all tools, encode assumptions in their design about what the work of “doing ethics” looks like—what work should be done, how, and by whom. We conduct a qualitative analysis of AI ethics toolkits to examine what their creators imagine to be the work of doing ethics, and the gaps that exist between the types of work that the toolkits imagine and support, and the way that the work of ethical AI actually occurs within technology companies and organizations...
  • The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
    The grokking phenomenon, as reported by Power et al., refers to a regime where a long period of overfitting is followed by a seemingly sudden transition to perfect generalization. In this paper, we attempt to reveal the underpinnings of grokking via a series of empirical studies. Specifically, we uncover an optimization anomaly plaguing adaptive optimizers at extremely late stages of training, referred to as the Slingshot Mechanism...

A Message from this week's Sponsor:


Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.



Data Science Articles & Videos

  • Interpretable Machine Learning in Natural and Social Sciences
    This workshop will convene an interdisciplinary group of scholars to inspire clear foundational formulations of interpretability in a variety of domains where questions of interpretability arise in the application of machine learning, statistics, and data science more broadly...
  • Text Embeddings Visually Explained
    We take a visual approach to gain an intuition behind text embeddings, what use cases they are good for, and how they can be customized using finetuning...
  • Ethical concerns with replacing human relations with humanoid robots
    This paper considers ethical concerns with regard to replacing human relations with humanoid robots. Many have written about the impact that certain types of relations with robots may have on us, and why we should be concerned about robots replacing human relations...This paper first discusses what humanoid robots are, why and how humans tend to anthropomorphise them, and what the literature says about robots crowding out human relations...
  • Minerva: Solving Quantitative Reasoning Problems with Language Models
    Language models have demonstrated remarkable performance on a variety of natural language tasks...Quantitative reasoning is one area in which language models still fall far short of human-level performance...In “Solving Quantitative Reasoning Problems With Language Models”, we present Minerva, a language model capable of solving mathematical and scientific questions using step-by-step reasoning...
  • DALL·E 2 Pre-Training Mitigations
    In order to share the magic of DALL·E 2 with a broad audience, we needed to reduce the risks associated with powerful image generation models. To this end, we put various guardrails in place to prevent generated images from violating our content policy. This post focuses on pre-training mitigations, a subset of these guardrails which directly modify the data that DALL·E 2 learns from. In particular, DALL·E 2 is trained on hundreds of millions of captioned images from the internet, and we remove and reweight some of these images to change what the model learns...
  • Apple Privacy-Preserving Machine Learning Workshop 2022
    Earlier this year, Apple hosted the Workshop on Privacy-Preserving Machine Learning (PPML). This virtual event brought Apple and members of the academic research communities together to discuss the state of the art in the field of privacy-preserving machine learning through a series of talks and discussions over two days...In this post we will introduce a new dataset for community benchmarking in PPML, and share highlights from workshop discussions and recordings of select workshop talks...
  • The Six Conundrums of Building and Deploying Language Technologies for Social Good
    Many researchers, especially those working in core NLP/Speech domains, rely on a combination of individual expertise, experiences or ad hoc surveys for prioritizing between language technologies that provide social good to the end-users. This has been criticized by several scholars who argue that it is critical to include the target community during the LT’s design and development process. However, prioritization of communities, languages, technologies and design approaches presents a very large set of complex challenges to the technologists, for which there are no simple or off-the-shelf solutions. In this position paper, we distill our experiential insights into six fundamental conundrums that technologists face and must resolve while deciding which LT technology to build for which community, and by using what approach. ...
  • Reducing gender-based harms in AI with Sunipa Dev
    Grammar checkers use NLP to come up with grammar suggestions that help people write grammatically correct phrases. But it’s sometimes necessary to have human intervention to identify risks of unfair bias...Sunipa Dev is a research scientist at Google who focuses on Responsible AI. Some of her work focuses specifically on ways to evaluate unfair bias in NLP outcomes, reducing harms for people with queer and non-binary identities. ...
  • Masked World Models for Visual Control
    Masked autoencoders (MAE) have emerged as a scalable and effective self-supervised learning technique. Can MAE also be effective for visual model-based RL? Yes, with a recipe of convolutional feature masking and reward prediction to capture fine-grained and task-relevant information...



Business-Driven Data Analysis

Want to drive more value with your findings? Pragmatic Institute’s Business-Driven Data Analysis course empowers data practitioners to deliver timely analysis with actionable insights.

"This is an amazing course. Its live format provided an efficient environment with instant feedback from both sides. With the instructor's outstanding presenting skills and real-life insights, the course equipped us with a solid framework for tackling every stage of a data analysis project: Define, Prepare, Refine, Analyze, Present," said attendee Viorel Cazacu (Head of Controlling at Inditex).

The next 8-week, part-time session kicks off on July 18.

Register Now

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!



Jobs

  • Senior Data Scientist, Startup Creation at Redesign Health - US

    As our Senior Data Scientist for our Startup Creation team, you will set up and configure the data infrastructure for our startups, and work with the startup founding team to define data driven KPIs, and implement automated statistical analyses of customer behavior. Your goal is to make all of the companies that we launch data-driven from day one.

    In this role, you will function as an in-house implementation team for the companies that Redesign Health launches (internally referred to as OpCos). We provide data strategy, data pipeline, data analytics and forecasting services to newly formed companies in a repeatable and scalable manner...


Want to post a job here? Email us for details.


Training & Resources

  • How to create a dashboard in Python with Jupyter Notebook
    Would you like to build a data dashboard in 9 lines of Python code? I will show you how to create a dashboard in Python with Jupyter Notebook. The dashboard will present stock information for a selected ticker (data table and chart). The notebook will be published as a web application. I will use the open-source Mercury framework to convert the Python notebook to an interactive web application...
  • How to Read a Technical Paper
    Multi-pass reading // Write as you read // When and where to read // Set aside time // Which parts to focus on // What to read...

What you’re up to – notes from DSW readers

  • Working on something cool? Let us know here :) ...

* To share your projects and updates, share the details here.

** Want to chat with one of the above people? Hit reply and let us know :)


Last Week's Newsletter's 3 Most Clicked Links


* Based on unique clicks.

** Find last week's newsletter here.


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022, All rights reserved.
