Data Science Weekly - Issue 408

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #408

September 16 2021

Editor Picks
 
  • ICLR 2022 Call for Blog Posts
    This year, the ICLR 2022 main conference will host a blog post track. We invite both academic and industrial researchers to submit their posts on a previously published paper at ICLR. We particularly welcome submissions on papers that appeared last year at ICLR...
  • Our Journey towards Data-Centric AI: A Retrospective
    Starting in about 2016, researchers from our lab — the Hazy Research lab — circled through academia and industry giving talks about an intentionally provocative idea: machine learning (ML) models—long the darlings of researchers and practitioners—were no longer the center of AI. In fact, models were becoming commodities. Instead, we claimed that it was the training data that would drive progress towards more performant ML models and systems...
 
 

A Message from this week's Sponsor:

 

 
TransformX Conference: Driving AI from Experimentation to Reality

Join Scale AI for our two-day, virtual conference featuring 100+ speakers and 60+ sessions. We’re bringing together a community of leaders, visionaries, practitioners, and researchers across industries as we explore the shift from research to reality within AI and Machine Learning. Register now to secure your free ticket...
 

 

Data Science Articles & Videos

 
  • The mathematics of adversarial attacks in AI
    It is well established that the current DL methodology produces universally unstable neural networks (NNs). The instability problem has caused an enormous research effort -- with a vast literature on so-called adversarial attacks -- yet there has been no solution to the problem. Our paper addresses why there has been no solution to the problem, as we prove the following mathematical paradox: any training procedure based on training neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate) -- despite the provable existence of both accurate and stable neural networks for the same classification problems...
  • Parallelizing Python Code
    Python is great for tasks like training machine learning models...When performing these tasks, you also want to use your underlying hardware as much as possible for quick results. Parallelizing Python code enables this. However, using the standard CPython implementation means you cannot fully use the underlying hardware because of the global interpreter lock (GIL) that prevents running the bytecode from multiple threads simultaneously...This article reviews some common options for parallelizing Python code...
  • Using learning-to-rank to precisely locate where to deliver packages
    For delivery drivers, finding the doorstep where a package should be dropped off can be surprisingly hard. House numbers can be obscured by foliage, or they might be missing entirely; some neighborhoods use haphazard numbering systems that make house numbers hard to guess; and complexes of multiple buildings sometimes share a single street address...I adapt an idea from information retrieval — learning-to-rank — to the problem of predicting the coordinates of a delivery location from past GPS data...
  • Building a smart Robot AI using Hugging Face 🤗 and Unity
    Today we’re going to build this adorable smart robot that will perform actions based on player text input...It uses a deep language model to understand any text input and find the most appropriate action from its list...What’s interesting about that system, contrary to classical game development, is that you don’t need to hard-code every interaction. Instead, you use a language model that selects which of the robot’s possible actions is the most appropriate given user input...
  • Bayesian Media Mix Modeling for Marketing Optimization
    A problem faced by many companies is how to allocate marketing budgets across different media channels. For example, how should funds be allocated across TV, radio, social media, direct mail, or daily deals?...So-called Media Mix Modelling (MMM) can estimate how effective each advertising channel is in gaining new customers. Once we have estimated each channel’s effectiveness we can optimize our budget allocation to maximize customer acquisition and sales...In this blog post, we outline what you can do with MMMs, introduce how they work, summarise some of the benefits they can provide, and cover some of the modeling challenges...
  • Bad Labels: GridSearch is Not Enough
    I write a lot of blog posts on why you need more than grid-search to properly judge a machine learning model. In this blog post I want to demonstrate yet another reason; labels often seem to be wrong...The issue here isn’t just that we might have bad labels in our training set, the issue is that it appears in the validation set. If a machine learning model can become state of the art by squeezing another 0.5% out of a validation set one has to wonder. Are we really making a better model? Or are we creating a model that is better able to overfit on the bad labels?...
  • bad labels: introduction
    Even famous datasets have bad labels in them...Because it's such a big problem we wanted to spend a few videos on this topic. It'd be a shame if our machine learning models are merely optimal because they overfit on the bad labels. That's why we're going to explore heuristics to find bad labels in our training data so that we may try to improve the quality of our training data...
  • Embedding Values in Artificial Intelligence (AI) Systems
    Though there are numerous high-level normative frameworks, it is still quite unclear how or whether values can be implemented in AI systems. Van de Poel and Kroes (2014) have recently provided an account of how to embed values in technology. The current article proposes to expand that view to complex AI systems and explain how values can be embedded in technological systems that are “autonomous, interactive, and adaptive”...
  • How To Lead In Data Science
    The Data Exchange Podcast: Jike Chong and Yue Cathy Chang on helping data scientists increase their impact in business and in society...
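
A quick illustration for the "Parallelizing Python Code" item above: because the GIL stops threads from running Python bytecode simultaneously, one common workaround for CPU-bound work is to use separate processes. The sketch below is not taken from the linked article; it uses the standard library's `concurrent.futures.ProcessPoolExecutor`, and the helper names `count_primes` and `count_primes_parallel` are hypothetical, chosen only for this example.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it evenly.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, max_workers=None):
    """Run count_primes for each limit in its own worker process.

    Each process has its own interpreter and its own GIL, so the
    calls can run on different CPU cores at the same time.
    """
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The guard matters: on platforms that spawn workers, each worker
    # re-imports this module and must not start a new pool itself.
    limits = [20_000, 30_000, 40_000, 50_000]
    print(count_primes_parallel(limits))
```

Processes add start-up and pickling overhead, so this approach tends to pay off only when each task does a meaningful amount of computation.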
 
 

Training*

 

 
Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.

The course is broken down into three guides:
  1. Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

  2. Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

  3. Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!
Click here to learn more...

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
 

 

Jobs

 
  • Senior Data Scientist - TikTok - LA

    TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy by offering a home for creative expression and an experience that is genuine, joyful, and positive.
    • Generate useful features from large amounts of data
    • Apply supervised and unsupervised machine learning techniques, such as linear and logistic regression, decision trees, and k-means clustering
    • Develop segmentation models, classification models, propensity models, LTV models, experimental design, optimization models
    • Perform statistical analysis such as KPI deep dives, performance marketing efficiency, behavioral clustering, and user journey analytics
    • Curate audiences and inform engagement tactics to enable differentiated, relevant marketing touches across channels (social, email, in app, push)
    • Synthesize analytics and statistical approaches into easy-to-consume storylines, both visually and verbally, and provide indicated actions for executive audiences
    • Capture business requirements for data and analytic solutions and collaborate XFN to ensure business requirements align with business needs
    • Analyze creatives and surface insights that will help drive engagement and retention
    • Support day-to-day collaboration with performance marketing to communicate insights and recommend data informed strategies

        Want to post a job here? Email us for details >> team@datascienceweekly.org
 

 

Training & Resources

 
  • How percentile approximation works (and why it's more useful than averages)
    As I was researching this piece, I found a number of good blog posts (see examples from the folks at Dynatrace, Elastic, AppSignal, and Optimizely) about how averages aren’t great for understanding application performance, or other similar things, and why it’s better to use percentiles...I won’t spend too long on this, but I think it’s important to provide a bit of background on why and how percentiles can help us better understand our data...First off, let’s consider how percentiles and averages are defined. To understand this, let’s start by looking at a normal distribution...
  • State of PyTorch core: September 2021 edition
    There are a lot of projects currently going on in PyTorch core and it can be difficult to keep track of all of them or how they relate with each other. Here is my personal understanding of all the things that are going on, organized around the people who are working on these projects, and how I think about how they relate to each other...
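
To make the point of the percentile item above concrete, here is a small sketch of why an average can hide what a percentile reveals. It computes exact percentiles with the nearest-rank definition, not the approximation algorithms the linked post discusses; the `percentile` helper and the latency numbers are made up for illustration.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it.

    `p` is an integer from 1 to 100.
    """
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical response times: 90 fast requests and 10 very slow ones.
latencies_ms = [100] * 90 + [2000] * 10

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p95_ms = percentile(latencies_ms, 95)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p95={p95_ms} ms")
```

With this data the mean (290 ms) matches no request that was actually served, while the median shows the typical experience (100 ms) and the 95th percentile exposes the slow tail (2000 ms).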
 
 

Books

 

  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
