Data Science Weekly - Issue 418

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #418

November 25 2021

Editor Picks
 
  • The Fry Universe
    You probably like some types of fries more than others...3D modeling of various fry shapes reveals why you like some more than others...
  • Transformers from Scratch
    I procrastinated a deep dive into transformers for a few years. Finally the discomfort of not knowing what makes them tick grew too great for me. Here is that dive...Transformers were introduced in...2017 paper as a tool for sequence transduction: converting one sequence of symbols to another. The most popular example of this is translation, as in English to German. It has also been modified to perform sequence completion: given a starting prompt, carry on in the same vein and style. They have quickly become an indispensable tool for research and product development in natural language processing...
 
 

A Message from this week's Sponsor:

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.

 

 

Data Science Articles & Videos

 
  • Data Advantage Matrix: A New Way to Think About Data Strategy
    As the co-founder of two data start-ups, one question I get all the time is, “How do I get started with my data strategy? Where do we start? What do we prioritize?”... In NewVantage Partners’ annual survey, the percentage of companies that invest in data initiatives was near-universal (literally 99% in 2021) for the third year in a row...But while investing in data is a given, actually using data can feel like a crapshoot. In that same survey, only 24% of companies said that they had created a data-driven culture....In this article, I’ll break down how to think about your data strategy...and give examples of how two hypothetical companies would use it...
  • BookNLP
    BookNLP is a natural language processing pipeline that scales to books and other long documents (in English), including: Part-of-speech tagging, Dependency parsing, Entity recognition, Character name clustering and coreference resolution, Quotation speaker identification, Supersense tagging, Event tagging, Referential gender inference, and more...
  • The missing analytics executive
    Despite the apparent discrepancies in title (CTO sounds higher than VP) and responsibilities (leading a department sounds more important than tinkering), the two roles are peers. Both are senior executives, and both often report to the CEO. The division of labor is a recognition not of hierarchy, but that there’s enough important labor in engineering that it needs to be divided: One role to manage, and one to advise...Data departments should follow the same pattern. Rather than being led by a single ambiguously defined and overburdened CDO, data teams should have two representatives in senior management: A VP of data responsible for managing the team’s daily operations, and a chief analytics officer...
  • Getting into the subspace; or what happens when you approximate a Gaussian process
    The problem with Gaussian processes, at least from a computational point of view, is that they’re just too damn complicated. Because they are supported on some infinite dimensional Banach space B, the more we need to see of them (for instance because we have a lot of unique s_i's) the more computational power they require. So the obvious solution is to somehow make Gaussian processes less complex...This somehow has occupied a lot of people’s time over the last 20 years and there are many many many many possible options. But for the moment, I just want to focus on one of the generic classes of solutions: You can make Gaussian processes less computationally taxing by making them less expressive...
  • Bernoulli's Fallacy & the Crisis of Modern Science, with Aubrey Clayton
    I love epistemology, the study of how we know what we know...So...it was high time I dedicated a whole episode to this topic. And what better guest than Aubrey Clayton, the author of the book Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science...Aubrey is a mathematician in Boston who teaches the philosophy of probability and statistics at the Harvard Extension School, and he holds a PhD in mathematics from Berkeley...We talked about what he deems “a catastrophic error in the logic of the standard statistical methods in almost all the sciences” and why this error manifests even outside of science, like in medicine, law, public policy, etc...But don’t worry, we’re not doomed: we’ll also see where we go from there. As a big fan of E.T. Jaynes, Aubrey will also tell us how this US scientist influenced his own thinking as well as the field of Bayesian inference in general...
  • A Survey of Generalisation in Deep Reinforcement Learning
    The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We provide a unifying formalism and terminology for discussing different generalisation problems, building upon previous works...
  • Machine Learning Street Talk Podcast Episode #53: Quantum Natural Language Processing - Prof. Bob Coecke (Oxford)
    Bob Coecke is a celebrated physicist who has been a Physics and Quantum professor at Oxford University for the last 20 years. He is particularly interested in Structure, which is to say Logic, Order, and Category Theory. He is well known for work involving compositional distributional models of natural language meaning, and he is also fascinated with understanding how our brains work...Bob thinks that interactions between systems in Quantum Mechanics carry naturally over to how word meanings interact in natural language. Bob argues that this interaction embodies the phenomenon of quantum teleportation...
  • Get Ready For Confidential Computing
    In this post we describe the ecosystem of tools focused on protecting data while in use. Our primary focus is on Confidential Computing tools for the development of data, analytic, and AI applications. We believe that companies that are able to use data securely will be well-positioned to build data and AI applications in the future...
 
 

Tools*

 


High quality data labeling, consistently

Edge cases are the most common challenges that ML teams face when training their AI models, making it difficult to reach 95+% accuracy. This can be more complex once you need to scale and start working with 3rd party data labeling solutions.

The evaluation metrics that we use to measure the quality of labeled data, Intersection over Union (IOU) and F1 score, have allowed us to make swift adjustments on the go and continuously improve the quality of our labeling standards. To find out more and start exploring our end-to-end data labeling service, speak to the team at Supahands today.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • R&D Data Scientist - Danaher - Port Washington, NY

    As a Data Scientist at IBM, you will help transform our clients’ data into tangible business value by analyzing information, communicating outcomes and collaborating on product development. Work with Best in Class open source and visual tools, along with the most flexible and scalable deployment options. Whether it’s investigating patient trends or weather patterns, you will work to solve real world problems for the industries transforming how we live.

        Want to post a job here? Email us for details >> team@datascienceweekly.org

 
 

Training & Resources

 
  • Stanford CS223A - Introduction to Robotics
    The purpose of this course is to introduce you to basics of modeling, design, planning, and control of robot systems. In essence, the material treated in this course is a brief survey of relevant results from geometry, kinematics, statics, dynamics, and control...
  • JAX Global Meetup
    JAX Global is an online meetup group that hosts live events from researchers and engineers on topics related to the JAX library, machine learning and scientific computing. Join us and learn more about this wonderful framework and interact with awesome people 😀!...
  • ApplyingML - Papers, Guides, and Interviews with ML practitioners
    ApplyingML collects tacit/tribal/ghost knowledge on applying ML via curated papers/blogs, guides, and interviews with ML practitioners. In a nutshell, it's 1/3 applied-ml, 1/3 ghost knowledge, and 1/3 Tim Ferriss Show. The intent is to make it easier to apply—and benefit from—ML at work...
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
