Data Science Weekly - Issue 420

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #420

December 09 2021

Editor Picks
 
  • D3 and Data Visualization Insights with Mike Bostock
    What’s the secret to D3’s long-running success? Mike Bostock, the creator of D3, shares the reasons for his data visualization tool’s longevity and why it won the 10-year Test-of-Time award from the IEEE. Mike goes deep on D3 and Observable, which he also founded, and talks about all things visualization with The Data Wranglers, Joe Hellerstein and Jeffrey Heer, including when it’s OK to use a bar chart for getting quick data insights and the applications of time zone wrangling...
  • A Call to Build Models Like We Build Open-Source Software
    This post argues that we should develop tools that will allow us to build pre-trained models in the same way that we build open-source software. Specifically, models should be developed by a large community of stakeholders who continually update and improve them. Realizing this goal will require porting many ideas from open-source software development to building and training models, which motivates many threads of interesting research....
  • AI-DR Program Automated Decision-Making and the Law Clearinghouse Project
    One public perception is that automated decision-making is fairer, or could even be more lawful. This perception stems from the belief that human bias may be eliminated in automated decisions. However, as emerging research has shown, unlawful discrimination can flow from the bias that remains encoded in automated decision-making systems...The aim of this clearinghouse project thus is to highlight seminal and impactful articles focused on issues of AI Decision-Making and the law. The AI-DR Program is pleased to share a searchable database of legal scholarly articles related to AI, automated decision-making and the law...
 
 

A Message from this week's Sponsor:

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.

 

 

Data Science Articles & Videos

 
  • Learning with not Enough Data Part 1: Semi-Supervised Learning
    The performance of supervised learning tasks improves as more high-quality labels become available. However, it is expensive to collect a large number of labeled samples. There are several paradigms in machine learning for dealing with scenarios where labels are scarce. Semi-supervised learning is one candidate, utilizing a large amount of unlabeled data in conjunction with a small amount of labeled data...
  • Automated Story Generation as Question-Answering
    We propose a novel approach to automated story generation that treats the problem as one of generative question-answering. Our proposed story generation system starts with sentences encapsulating the final event of the story. The system then iteratively (1) analyzes the text describing the most recent event, (2) generates a question about "why" a character is doing the thing they are doing in the event, and then (3) attempts to generate another, preceding event that answers this question...
  • Cloud Wars: The Attack of Snowflakes
    Erik Bern wrote a post last week, combining the counterintuitive ideas that (a) the lowest cloud infrastructure layers are not commodity services, and (b) this means that the cloud providers could be happy ceding ground to others for higher level services, turning into pure play infrastructure platforms....I’m in violent agreement with the first premise that the lowest cloud infra layers are not commodity services. But I think it’s unlikely that cloud providers would be happy ceding ground to others on higher level services...
  • Visualize Data on Spirals
    In this vignette, I describe the package spiralize, which visualizes data along an Archimedean spiral. It has two major advantages for visualization: a) it can visualize data with a very long axis at high resolution, and b) it is efficient at revealing periodic patterns in time series data...
  • Language Modelling at Scale: Gopher, Ethical considerations, and Retrieval
    Today we [DeepMind] are releasing three papers on language models that reflect this interdisciplinary approach. They include a detailed study of a 280 billion parameter transformer language model called Gopher, a study of ethical and social risks associated with large language models, and a paper investigating a new architecture with better training efficiency...
  • Updated spaCy NLP Course
    We've updated our interactive NLP course for spaCy v3!...💬 The updated course is available in English, Spanish, German and Japanese...📚 4 interactive chapters: from the first steps to your own spaCy model...🍰 New exercises about the training CLI & config...
  • A Cartel of Influential Datasets Is Dominating Machine Learning Research, New Study Suggests
    A new paper from the University of California and Google Research has found that a small number of ‘benchmark’ machine learning datasets, largely from influential western institutions, and frequently from government organizations, are increasingly dominating the AI research sector...the authors contend that ‘widely-used datasets are introduced by only a handful of elite institutions’, and that this ‘consolidation’ has increased to 80% in recent years...
  • PyTorch: Where we are headed and why it looks a lot like Julia (but not exactly like Julia)
    When trying to predict how PyTorch would itself get disrupted, we used to joke a bit about the next version of PyTorch being written in Julia. This was not very serious: a huge factor in moving PyTorch from Lua to Python was to tap into Python’s immense ecosystem (an ecosystem that shows no signs of going away) and even today it is still hard to imagine how a new language can overcome the network effects of Python...However, recently, I have been thinking about various projects we have going on in PyTorch...
  • minitorch
    MiniTorch is a DIY teaching library for machine learning engineers who wish to learn about the internal concepts underlying deep learning systems. It is a pure Python re-implementation of the Torch API designed to be simple, easy-to-read, tested, and incremental. The final library can run Torch code. The project was developed for the course 'Machine Learning Engineering' at Cornell Tech...
  • Building a recommendation engine inside Postgres with Python and Pandas
    Earlier today I was starting to wonder why I couldn't do more machine learning directly inside the Postgres database. Yeah, there is madlib, but what if I wanted to write my own recommendation engine? So I set out on a total detour of a few hours and lo and behold, I can probably do a lot more of this in Postgres than I realized before. What follows is a quick walkthrough of getting a recommendation engine set up directly inside Postgres on top of Crunchy Bridge, our database as a service...
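
For readers curious about the spiral visualization item above, here is a minimal sketch of the underlying idea: placing a time series on an Archimedean spiral so that one full turn corresponds to one period. This is not the spiralize package's implementation (spiralize is an R package); the function name `spiral_point` and the parameters `period` and `growth` are assumptions made up for illustration, and the mapping here is uniform in angle rather than in arc length.

```python
import math


def spiral_point(t, period, growth=1.0):
    """Map a time value t to (x, y) on the Archimedean spiral r = growth * theta.

    Hypothetical helper: one full turn of the spiral covers one `period`,
    so values exactly one period apart share the same angle.
    """
    theta = 2.0 * math.pi * t / period  # angle grows linearly with time
    r = growth * theta                  # radius grows linearly with angle
    return r * math.cos(theta), r * math.sin(theta)


# Illustrative input only: four weeks of daily time stamps with a weekly cycle.
points = [spiral_point(day, period=7) for day in range(1, 29)]
```

Because points one period apart sit at the same angle on successive turns, a repeating pattern shows up as a radial stripe, which is the kind of periodicity the item describes.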
 
 

Tools*

 


What's a vector database, and how can you use it for AI/ML applications?

Vector databases help data scientists and ML engineers implement NLP into search, personalization, security, analytics, and monitoring applications. Learn all about them, their use cases, their core components, and how to get started. (It's easy.) Start here: What is a vector database?

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • R&D Data Scientist - Danaher - Port Washington, NY

    As a Data Scientist at IBM, you will help transform our clients’ data into tangible business value by analyzing information, communicating outcomes and collaborating on product development. Work with Best in Class open source and visual tools, along with the most flexible and scalable deployment options. Whether it’s investigating patient trends or weather patterns, you will work to solve real world problems for the industries transforming how we live.

        Want to post a job here? Email us for details >> team@datascienceweekly.org

 
 

Training & Resources

 
  • Intuitive Bayes Introductory Course
    Have you found most statistics books overly theoretical? Math-heavy? Or lacking a clear focus on application?...Want to keep your skills sharp to improve your career prospects?...Have you heard about these newfangled Probabilistic Programming Languages and want to know what they're all about?...Then this course is for you...
  • How a Kalman filter works, in pictures
    You can use a Kalman filter in any place where you have uncertain information about some dynamic system, and you can make an educated guess about what the system is going to do next. Even if messy reality comes along and interferes with the clean motion you guessed about, the Kalman filter will often do a very good job of figuring out what actually happened. And it can take advantage of correlations between crazy phenomena that you maybe wouldn’t have thought to exploit!...I’ll start with a loose example of the kind of thing a Kalman filter can solve, but if you want to get right to the shiny pictures and math, feel free to jump ahead...
  • Reddit Discussion: Why are Einstein Sum Notations not popular in ML? They changed my life.
    I recently discovered `torch.einsum` and now I am mad at every friend, mentor, acquaintance for not telling me about it...They are just way more intuitive and can handle most operations that I would want to do with tensors so elegantly...It takes only 30 mins or so to learn the notation and become somewhat proficient but then you are sorted for life...What are the arguments for and against using einstein notations for everything?...
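
As a companion to the Kalman filter item above, here is a minimal one-dimensional sketch of the predict/update cycle it describes. It assumes a scalar state that is expected to stay constant between steps, with Gaussian process and measurement noise; the function name `kalman_1d` and all parameter names are hypothetical and are not taken from the linked article.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Run a scalar Kalman filter and return a list of (estimate, variance).

    Assumption: the state is modeled as constant between steps, so the
    predict step only inflates the uncertainty by `process_var`.
    """
    x, p = x0, p0  # current estimate and its variance
    history = []
    for z in measurements:
        # Predict: same state, but we are a little less sure about it.
        p = p + process_var
        # Update: blend prediction and measurement by their uncertainties.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        history.append((x, p))
    return history


# Illustrative input only: repeated readings of a constant quantity.
estimates = kalman_1d([5.0] * 50, process_var=0.0, measurement_var=1.0)
```

The gain is the key design choice: when the prediction is uncertain relative to the sensor, the filter leans on the measurement, and as the variance shrinks it trusts its own estimate more.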
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
