Data Science Weekly - Issue 421

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #421

December 16 2021

Editor Picks
 
  • Lee Wilkinson’s contribution to interactive visualization
    Upon learning this morning that Lee Wilkinson passed away I also felt compelled to write something on the extent to which his work has influenced interactive visualization research...The Grammar of Graphics was an incredibly ambitious undertaking – Wilkinson set out to create a system that could produce any statistical graphic he’d ever seen, and that could deepen understanding of the meaning of graphics...
  • Announcing the Transactions on Machine Learning Research
    With this post, we’re happy to announce that we (Raia Hadsell, Kyunghyun Cho, Hugo Larochelle) are founding a new journal...the review process will be hosted by OpenReview, and therefore will be open and transparent to the community. Another differentiation from JMLR will be the use of double blind reviewing, the consequence being that the submission of previously published research, even with extension, will not be allowed. Finally, we intend to work hard on establishing a fast-turnaround review process, focusing in particular on shorter-form submissions that are common at machine learning conferences...
  • The Death of Feature Engineering is Greatly Exaggerated
    One of the most exciting aspects of deep learning’s emergence in computer vision a few years ago was that it didn’t appear to require any feature engineering, unlike previous techniques like histograms-of-gradients or Haar cascades. As neural networks ate up other fields like NLP and speech, the hope was that feature engineering would become unnecessary for those domains too. At first I fully bought into this idea, and saw any remaining manually-engineered feature pipelines as legacy code that would soon be subsumed by more advanced models...Over the last few years of working with product teams to deploy models in production I’ve realized I was wrong...
 
 

A Message from this week's Sponsor:

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.

 

 

Data Science Articles & Videos

 
  • PyTorch vs TensorFlow in 2022
    Should you use PyTorch or TensorFlow in 2022? This guide walks through the major pros and cons of PyTorch vs TensorFlow, and how you can pick the right framework...
  • Improving the factual accuracy of language models through web browsing
    We’ve fine-tuned GPT-3 to more accurately answer open-ended questions using a text-based web browser. Our prototype copies how humans research answers to questions online – it submits search queries, follows links, and scrolls up and down web pages. It is trained to cite its sources, which makes it easier to give feedback to improve factual accuracy. We’re excited about developing more truthful AI, but challenges remain, such as coping with unfamiliar types of questions...
  • Optimization Nuggets: Exponential Convergence of SGD
    This is the first of a series of blog posts on short and beautiful proofs in optimization (let me know what you think in the comments!). For this first post in the series I'll show that stochastic gradient descent (SGD) converges exponentially fast to a neighborhood of the solution...
  • Data with a Purpose with Moritz Stefaner
    Meet Moritz Stefaner, a data designer who uses data for storytelling and who helped design the official German Covid-19 vaccine data dashboard. Moritz tells The Data Wranglers — Jeffrey Heer and Adam Wilson — how he creates a character from a dataset to give it emotional meaning and talks about the Covid vaccine clock he created. And, he dives into his data visualizations for train traffic on a German railroad network, the promises and pitfalls of using machine learning for data design, and what it took to visualize 175 years of text from Scientific American...
  • Using AI to bring children’s drawings to life
    We’re excited to announce a first-of-its-kind method for automatically animating children’s hand-drawn figures of people and humanlike characters (i.e., a character with two arms, two legs, a head, etc.) that brings these drawings to life in a matter of minutes using AI. By uploading them to our prototype system, parents and children can experience the excitement of watching their drawings become moving characters that dance, skip, and jump...
  • Best of the visualisation web - August 2021
    Since 2010 I have compiled and published monthly collections of links to some of the best, most interesting, or thought-provoking data visualisation-related content I come across. These collections are not always published immediately after the month in question has ended, but I try to do so as soon as my workload permits! Here's a collection of some of the best content I encountered during August 2021...
  • How AI Happens Podcast
    How AI Happens is a podcast featuring experts and practitioners explaining their work at the cutting edge of Artificial Intelligence. Tune in to hear AI Researchers, Data Scientists, ML Engineers, and the leaders of today’s most exciting AI companies explain the newest and most challenging facets of their field...
  • The Science of Visual Data Communication: What Works
    Effectively designed data visualizations allow viewers to use their powerful visual systems to understand patterns in data across science, education, health, and public policy. But ineffectively designed visualizations can cause confusion, misunderstanding, or even distrust—especially among viewers with low graphical literacy. We review research-backed guidelines for creating effective and intuitive visualizations oriented toward communicating data to students, coworkers, and the general public...
  • Modern Experimentation Platforms
    Che Sharma is the founder and CEO of Eppo, an experimentation framework that integrates with modern data platforms (cloud lakehouses and cloud data warehouses). We discuss the importance of investing in experimentation tools and the power of having a well-oiled experimentation culture within an organization. Che also explains how modern data platforms enable a variety of applications, including experimentation frameworks like Eppo...
  • Training Machine Learning Models More Efficiently with Dataset Distillation
    For a machine learning (ML) algorithm to be effective, useful features must be extracted from (often) large amounts of training data. However, this process can be made challenging due to the costs associated with training on such large datasets, both in terms of compute requirements and wall clock time. The idea of distillation plays an important role in these situations by reducing the resources required for the model to be effective. The most widely known form of distillation is model distillation (a.k.a. knowledge distillation), where the predictions of large, complex teacher models are distilled into smaller models...An alternative option to this model-space approach is dataset distillation, in which a large dataset is distilled into a synthetic, smaller dataset...
 
 

Tools*

 


Free Course: Natural Language Processing (NLP) for Semantic Search

Learn how to build semantic search applications by making machines understand language as people do. This free course covers everything you need to build state-of-the-art language models, from machine translation to question-answering, and more. Brought to you by Pinecone. Start reading now.

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • Data Scientist, Decisions - Lyft - New York, NY

    Data Science is at the heart of Lyft’s products and decision-making. As a member of the Science team, you will work in a dynamic environment, where we embrace moving quickly to build the world’s best transportation. Data Scientists take on a variety of problems ranging from shaping critical business decisions to building algorithms that power our internal and external products. We’re looking for passionate, driven Data Scientists to take on some of the most interesting and impactful problems in ridesharing...

        Want to post a job here? Email us for details >> team@datascienceweekly.org

 
 

Training & Resources

 
  • Python Practice Problems for Beginner Coders
    From sifting through Twitter data to making your own Minecraft modifications, Python is one of the most versatile programming languages at a coder’s disposal. The open-source, object-oriented language is also quickly becoming one of the most-used languages in data science...To help readers practice the Python fundamentals, datascience@berkeley gathered six coding problems, including some from the W200: Introduction to Data Science Programming course. The questions below cover concepts ranging from basic data types to object-oriented programming using classes...
  • Book Draft: Distributional Reinforcement Learning
    By considering the return distribution, rather than just the expected return, we gain a fresh perspective on the fundamental problems of reinforcement learning. This includes understanding of how optimal decisions should be made, methods for creating effective representations of an agent’s state, and the consequences of interacting with other learning agents. In fact, many of the tools we develop here are useful beyond reinforcement learning and decision making. We call the process of computing return distributions distributional dynamic programming; it can be applied in any situation where probability distributions should be propagated within some dependency structure...
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.