Data Science Weekly - Issue 418

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #418

November 25 2021

Editor Picks
 
  • The Fry Universe
    You probably like some types of fries more than others...3D modeling of various fry shapes reveals why you like some more than others...
  • Transformers from Scratch
    I procrastinated a deep dive into transformers for a few years. Finally the discomfort of not knowing what makes them tick grew too great for me. Here is that dive...Transformers were introduced in...2017 paper as a tool for sequence transduction: converting one sequence of symbols to another. The most popular example of this is translation, as in English to German. They have also been modified to perform sequence completion: given a starting prompt, carry on in the same vein and style. They have quickly become an indispensable tool for research and product development in natural language processing...
 
 

A Message from this week's Sponsor:

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.

 

 

Data Science Articles & Videos

 
  • Data Advantage Matrix: A New Way to Think About Data Strategy
    As the co-founder of two data start-ups, one question I get all the time is, “How do I get started with my data strategy? Where do we start? What do we prioritize?”... In NewVantage Partners’ annual survey, the percentage of companies that invest in data initiatives was near-universal (literally 99% in 2021) for the third year in a row...But while investing in data is a given, actually using data can feel like a crapshoot. In that same survey, only 24% of companies said that they had created a data-driven culture....In this article, I’ll break down how to think about your data strategy...and give examples of how two hypothetical companies would use it...
  • BookNLP
    BookNLP is a natural language processing pipeline that scales to books and other long documents (in English), including: Part-of-speech tagging, Dependency parsing, Entity recognition, Character name clustering and coreference resolution, Quotation speaker identification, Supersense tagging, Event tagging, Referential gender inference, and more...
  • The missing analytics executive
    Despite the apparent discrepancies in title (CTO sounds higher than VP) and responsibilities (leading a department sounds more important than tinkering), the two roles are peers. Both are senior executives, and both often report to the CEO. The division of labor is a recognition not of hierarchy, but that there’s enough important labor in engineering that it needs to be divided: One role to manage, and one to advise...Data departments should follow the same pattern. Rather than being led by a single ambiguously defined and overburdened CDO, data teams should have two representatives in senior management: A VP of data responsible for managing the team’s daily operations, and a chief analytics officer...
  • Getting into the subspace; or what happens when you approximate a Gaussian process
    The problem with Gaussian processes, at least from a computational point of view, is that they’re just too damn complicated. Because they are supported on some infinite dimensional Banach space B, the more we need to see of them (for instance because we have a lot of unique s_i's) the more computational power they require. So the obvious solution is to somehow make Gaussian processes less complex...This somehow has occupied a lot of people’s time over the last 20 years and there are many many many many possible options. But for the moment, I just want to focus on one of the generic classes of solutions: You can make Gaussian processes less computationally taxing by making them less expressive...
  • Bernoulli's Fallacy & the Crisis of Modern Science, with Aubrey Clayton
    I love epistemology, the study of how we know what we know...So...it was high time I dedicated a whole episode to this topic. And what better guest than Aubrey Clayton, the author of the book Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science...Aubrey is a mathematician in Boston who teaches the philosophy of probability and statistics at the Harvard Extension School, and he holds a PhD in mathematics from Berkeley...We talked about what he deems “a catastrophic error in the logic of the standard statistical methods in almost all the sciences” and why this error manifests even outside of science, like in medicine, law, public policy, etc...But don’t worry, we’re not doomed: we’ll also see where we go from there. As a big fan of E.T. Jaynes, Aubrey will also tell us how this US scientist influenced his own thinking as well as the field of Bayesian inference in general...
  • A Survey of Generalisation in Deep Reinforcement Learning
    The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We provide a unifying formalism and terminology for discussing different generalisation problems, building upon previous works...
  • Machine Learning Street Talk Podcast Episode #53: Quantum Natural Language Processing - Prof. Bob Coecke (Oxford)
    Bob Coecke is a celebrated physicist who has been a Physics and Quantum professor at Oxford University for the last 20 years. He is particularly interested in Structure, which is to say, Logic, Order, and Category Theory. He is well known for work involving compositional distributional models of natural language meaning, and he is also fascinated with understanding how our brains work...Bob thinks that interactions between systems in Quantum Mechanics carry naturally over to how word meanings interact in natural language. Bob argues that this interaction embodies the phenomenon of quantum teleportation...
  • Get Ready For Confidential Computing
    In this post we describe the ecosystem of tools focused on protecting data while in use. Our primary focus is on Confidential Computing tools for the development of data, analytic, and AI applications. We believe that companies that are able to use data securely will be well-positioned to build data and AI applications in the future...
 
 

Tools*

 


High quality data labeling, consistently

Edge cases are the most common challenges that ML teams face when training their AI models, making it difficult to reach 95+% accuracy. This can be more complex once you need to scale and start working with 3rd party data labeling solutions.

The evaluation metrics that we use to measure the quality of labeled data, Intersection over Union (IoU) and F1 score, have allowed us to make swift adjustments on the go and continuously improve the quality of our labeling standards. To find out more and start exploring our end-to-end data labeling service, speak to the team at Supahands today.
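
For readers unfamiliar with the two metrics named above, here is a minimal Python sketch of how they are commonly computed. It is illustrative only and is not the sponsor's implementation: the function names, the (x_min, y_min, x_max, y_max) box format, and the sample numbers are all assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so non-overlapping boxes give zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """F1 from counts of true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: a labeler's box compared with a reference box,
# and label counts from a made-up review batch.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
batch_f1 = f1_score(tp=8, fp=2, fn=2)
```

A higher IoU means the drawn box matches the reference box more closely, while F1 balances missed labels against spurious ones.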


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • R&D Data Scientist - Danaher - Port Washington, NY

    As a Data Scientist at IBM, you will help transform our clients’ data into tangible business value by analyzing information, communicating outcomes and collaborating on product development. Work with Best in Class open source and visual tools, along with the most flexible and scalable deployment options. Whether it’s investigating patient trends or weather patterns, you will work to solve real world problems for the industries transforming how we live.

        Want to post a job here? Email us for details >> team@datascienceweekly.org

 
 

Training & Resources

 
  • Stanford CS223A - Introduction to Robotics
    The purpose of this course is to introduce you to basics of modeling, design, planning, and control of robot systems. In essence, the material treated in this course is a brief survey of relevant results from geometry, kinematics, statics, dynamics, and control...
  • JAX Global Meetup
    JAX Global is an online meetup group that hosts live events from researchers and engineers on topics related to the JAX library, machine learning and scientific computing. Join us and learn more about this wonderful framework and interact with awesome people 😀!...
  • ApplyingML - Papers, Guides, and Interviews with ML practitioners
    ApplyingML collects tacit/tribal/ghost knowledge on applying ML via curated papers/blogs, guides, and interviews with ML practitioners. In a nutshell, it's 1/3 applied-ml, 1/3 ghost knowledge, and 1/3 Tim Ferriss Show. The intent is to make it easier to apply—and benefit from—ML at work...
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
