[in case you missed it] Data Science Weekly - Issue 413

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #413

October 21 2021

Editor Picks
 
  • The philosophical and musical failings of “Beethoven X: The AI Project”
    When VAN asked me to do a review of an artificial-intelligence-created realization of Beethoven’s Tenth Symphony called “Beethoven X: The AI Project,” which is based on the skimpy sketches he left when he died, I more or less groaned in my reply. “Not for me,” I said. “I know pretty much what I’ll think about it, and my review could get snarky.” “If so, that would be all right with us,” VAN said. “Well, OK,” I groaned back. So here I am and here goes...At the end of the symphony I found myself more philosophical than annoyed. I’ll start with that...
  • MIT's "The Missing Semester of Your CS Education" Class
    Classes teach you all about advanced topics within CS, from operating systems to machine learning, but there’s one critical subject that’s rarely covered, and is instead left to students to figure out on their own: proficiency with their tools. We’ll teach you how to master the command-line, use a powerful text editor, use fancy features of version control systems, and much more!...
  • Predicting Spreadsheet Formulas from Semi-structured Contexts
    We describe a new model that learns to automatically generate formulas based on the rich context around a target cell. When a user starts writing a formula with the “=” sign in a target cell, the system generates possible relevant formulas for that cell by learning patterns of formulas in historical spreadsheets....
 
 

A Message from this week's Sponsor:

 



Kickstart Your New Career with a Data Science & Analytics Bootcamp

Don’t miss your chance to join a Data Scientist-led, online Metis bootcamp plus get career support until you’re hired. Bootcamps are starting soon! Ready to take your data science or analytics career to the next level? Learn more about the Metis Online Data Science & Analytics Bootcamps.

 

 

Data Science Articles & Videos

 
  • Explaining in Style: Training a GAN to explain a classifier in StyleSpace
    We propose a training procedure for a StyleGAN, which incorporates the classifier model, in order to learn a classifier-specific StyleSpace. Explanatory attributes are then selected from this space. These can be used to visualize the effect of changing multiple attributes per image, thus providing image-specific explanations. We apply StylEx to multiple domains, including animals, leaves, faces and retinal images. For these, we show how an image can be modified in different ways to change its classifier output. Our results show that the method finds attributes that align well with semantic ones, generate meaningful image-specific explanations, and are human-interpretable as measured in user-studies....
  • Challenges in Detoxifying Language Models
    Large language models (LM) generate remarkably fluent text and can be efficiently adapted across NLP tasks. Measuring and guaranteeing the safety of generated text is imperative for deploying LMs in the real world; to this end, prior work often relies on automatic evaluation of LM toxicity...We demonstrate that while basic intervention strategies can effectively optimize previously established automatic metrics on the RealToxicityPrompts dataset, this comes at the cost of reduced LM coverage for both texts about, and dialects of, marginalized groups. Additionally, we find that human raters often disagree with high automatic toxicity scores for texts generated by models with strong toxicity reduction interventions...
  • Composability in Julia: Implementing Deep Equilibrium Models via Neural ODEs
    In this blog post we will show how to easily, efficiently, and robustly use steady state nonlinear solvers with neural networks in Julia. We will showcase the relationship between steady states and ODEs, thus making a connection between the methods for Deep Equilibrium Models (DEQs) and Neural ODEs...
  • ETL Pipelines with Airflow: the Good, the Bad and the Ugly
    In this article, we review how to use Airflow ETL operators to transfer data from Postgres to BigQuery with the ETL and ELT paradigms. Then, we share some challenges you may encounter when attempting to load data incrementally with Airflow DAGs. Finally, we argue why Airflow ETL operators won’t be able to cover the long tail of integrations for your business data...
  • Considerations Before Pushing Machine Learning Models to Production
    As a data scientist, I see every day the challenges that come with putting AI-based solutions into production. These challenges are numerous and cover a variety of aspects: modeling and system design, data engineering, resource management, SLA, etc...I don’t pretend mastery in any of those fields. I do however know that implementing some software engineering principles and using the right tools helped me a lot in making my work reproducible and ready for production...In this article, I’ll share with you 7 of the considerations I have in mind before productionizing my models....
  • Generative art resources in R
    An extremely incomplete (and probably biased) list of resources to help an aspiring generative artist get started making pretty pictures in R...
  • Who is a Data Scientist in 2021?
    Every year we publish a study on 1,001 data scientist profiles. The information is collected from public LinkedIn profiles, assuming that the information posted on the social media platform is an unbiased estimator of their resume...This research allows us to gain insights, with a reasonable degree of certainty, about who is a data scientist in 2021. We present only aggregate data to highlight important trends that can be useful to anyone who wants to break into the field, as well as to organizations looking to hire data scientists....
 
 

Tools*

 



Create AI-powered search and recommendation apps with Pinecone

Pinecone is a fully managed vector database that makes it easy to add vector search to production applications. It combines state-of-the-art vector search libraries, advanced features such as filtering, and distributed infrastructure to provide high performance and reliability at any scale. Get started now — it's free!

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • Entry Level Data Scientist: 2022 - IBM - Multiple Locations

    As a Data Scientist at IBM, you will help transform our clients’ data into tangible business value by analyzing information, communicating outcomes and collaborating on product development. Work with Best in Class open source and visual tools, along with the most flexible and scalable deployment options. Whether it’s investigating patient trends or weather patterns, you will work to solve real world problems for the industries transforming how we live.

        Want to post a job here? Email us for details >> team@datascienceweekly.org

 
 

Training & Resources

 
  • Carnegie Mellon University 10721: Philosophical Foundations of Machine Intelligence
    What is this field? What are its normative aims? What are its modes of inquiry? What are (and have been) its intellectual and ideological commitments? What foundational questions is it in dialogue with, and what foundational obstacles obstruct its progress? Finally: What are our responsibilities as researchers & practitioners deploying this technology?...
  • SHAP: Explain Any Machine Learning Model in Python
    Imagine you are trying to train a machine learning model to predict whether an ad is clicked by a particular person. After receiving some information about a person, the model predicts that a person will not click on an ad...But why does the model predict that? How much does each feature contribute to the prediction? Wouldn’t it be nice if you can see a plot indicating how much each feature contributes to the prediction?...That is when Shapley value comes in handy...
  • Random Forests Algorithm explained with a real-life example and some Python code
    Random Forests is a Machine Learning algorithm that tackles one of the biggest problems with Decision Trees: variance...Even though Decision Trees is simple and flexible, it is a greedy algorithm. It focuses on optimizing for the node split at hand, rather than taking into account how that split impacts the entire tree. A greedy approach makes Decision Trees run faster, but makes it prone to overfitting...An overfit tree is highly optimized for predicting the values in the training dataset, resulting in a learning model with high variance. How you calculate variance in a Decision Tree depends on the problem you’re solving...
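
    To make the "node split at hand" idea in the Random Forests item concrete, here is a minimal sketch of a single greedy split on one numeric feature. It is an illustration only, not code from the linked article: the function name best_split, the toy data, and the choice of weighted child variance as the split score (a common choice for regression trees) are all assumptions.

```python
from statistics import pvariance


def best_split(xs, ys):
    """Greedy search over one feature (hypothetical helper).

    Returns the threshold that minimises the size-weighted variance of the
    two child nodes, together with that score. Only this one split is
    optimised; its effect on later splits is never considered.
    """
    pairs = sorted(zip(xs, ys))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * pvariance(left)
                 + len(right) * pvariance(right)) / len(pairs)
        if score < best_score:
            # Place the threshold midway between the two neighbouring values.
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_threshold, best_score


# Toy data (made up for illustration): two clearly separated groups.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 5.5, 5.2, 20.0, 21.0, 19.5]
threshold, score = best_split(xs, ys)
print(threshold, round(score, 3))
```

    A full tree repeats this search recursively inside each child node. Because every step looks only at the current node, a deep tree can end up fitting noise in the training data, which is the overfitting the item describes.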
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
