Data Science Weekly - Issue 451

Curated news, articles and jobs related to Data Science.
Keep up with all the latest developments.

Issue #451

July 14, 2022

Editor's Picks

 

 
  • The Data Science Trap [Reddit Discussion]
    It is no longer open to question that data scientists in the industry are merely glorified data analysts. Businesses are pouring money into STEM graduates to create colorful charts and BS reporting. Aside from hypothesis testing and linear or logistic regressions, nothing they do comes close to statistics or modeling. There have been several threads about how research scientists are the new data scientists - and these threads are full of scorn for the state of the data scientist job market...
  • The Data Science Trap: A Rebuttal [Reddit Discussion]
    More often than not, I see comments on this subreddit suggesting the dilution of the Data Science discipline into a glorified Data Analyst position. Maybe my 10 years in the Data Science field have left me with a level of naivety, but I’ve concluded that Data Science in its academic interpretation is far from its practicality in application...
  • Prof. Noam Chomsky Machine Learning Street Talk Interview [Video]
    Prof. Noam Chomsky is the father of modern linguistics and the most important intellectual of the 20th century...We explore some of the profound misunderstandings of linguistics in general and Chomsky’s own work specifically which have persisted at the highest levels of academia for over sixty years...We have produced a significant introduction section where we discuss in detail Yann LeCun’s recent position paper on AGI, a recent paper on emergence in LLMs, empiricism related to cognitive science, cognitive templates, “the ghost in the machine” and language...
 
 

A Message from this week's Sponsor:

 



Pinecone vector database

The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.

Use Pinecone to build semantic search, object recognition, recommendations, anomaly detection, and other vector-based functionality into your applications.

 

 

Data Science Articles & Videos

 
  • Job Hunt as a PhD in AI / ML / RL: How it Actually Happens
    Combine a supposedly good tech hiring market with a Ph.D. from Berkeley AI Research and let's see what we get. In the vein of transparency, I wanted to share my experience in the job search and some notes specific to reinforcement learning as an area. Is there something you want to know about and I didn't share? Please let me know...
  • Critical Dataset Studies Reading List
    How should we study datasets in machine learning? As machine learning increasingly becomes a site of sociotechnical inquiry, invoking numerous social, political, legal, and ethical issues, datasets are a crucial component as they are core material used to train models. Inspired by Tarleton Gillespie and Nick Seaver’s Critical Algorithm Studies reading list, this collection is meant to serve as an entry point to the growing literature on ML datasets across the fields of computer science, human-computer interaction, science and technology studies, media studies, and histories of technology...
  • On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence
    Ten years into the revival of deep networks and artificial intelligence, we propose a theoretical framework that sheds light on understanding deep networks within a bigger picture of Intelligence in general. We introduce two fundamental principles, Parsimony and Self-consistency, that we believe to be cornerstones for the emergence of Intelligence, artificial or natural...
  • Why every statistician should know about cross-validation
    Surprisingly, many statisticians see cross-validation as something data miners do, but not a core statistical technique. I thought it might be helpful to summarize the role of cross-validation in statistics, especially as it is proposed that the Q&A site at stats.stackexchange.com should be renamed CrossValidated.com...
  • Deep Learning with a Small Training Batch (or Lack Thereof)
    This article covers two self-supervised approaches to tackling the image classification issue. Google and Facebook offer these methods. These methods are excellent when you only have a small training batch, and they will significantly simplify manual markup or even allow you to ditch it...
  • How Spotify Uses Semantic Search for Podcasts
    Spotify’s natural language search for podcasts is fascinating. In the past, users had to rely on keyword/term matching to find the podcast episodes they wanted. Now, they can search in natural language, in much the same way we might ask a real person where to find something...This technology relies on what we like to call semantic search. It enables a more intuitive search experience because we tend to have an idea of what we’re looking for, but rarely do we know precisely which terms appear in what we want...
  • BERTopic: Leverage BERT and c-TF-IDF to create easily interpretable topics
    BERTopic is a topic modeling technique that leverages 🤗 transformers and c-TF-IDF to create dense clusters allowing for easily interpretable topics whilst keeping important words in the topic descriptions...BERTopic supports guided, (semi-) supervised, and dynamic topic modeling. It even supports visualizations similar to LDAvis!...
  • An Introduction to Lifelong Supervised Learning
    This primer is an attempt to provide a detailed summary of the different facets of lifelong learning...Chapter 2 provides a high-level overview of lifelong learning systems...Chapter 3 focuses on regularization-based approaches that do not assume access to any data from previous tasks. Chapter 4 discusses memory-based approaches that typically use a replay buffer or an episodic memory to save a subset of data across different tasks. Chapter 5 focuses on different architecture families (and their instantiations) that have been proposed for training lifelong learning systems. Following these different classes of learning algorithms, we discuss the commonly used evaluation benchmarks and metrics for lifelong learning (Chapter 6) and wrap up with a discussion of future challenges and important research directions in Chapter 7...
  • 4 Pandas Anti-Patterns to Avoid and How to Fix Them
    Pandas is a powerful data analysis library with a rich API that offers multiple ways to perform any given data manipulation task. Some of these approaches are better than others, and pandas users often learn suboptimal coding practices that become their default workflows. This post highlights four common pandas anti-patterns and outlines a complementary set of techniques that you should use instead...
  • awesome-data-leadership
    A curated list of awesome and useful posts, videos, and articles on leading a data team. This includes leadership at the middle-management, Director/VP, or C-suite level, for organizations both big and small. A few relevant engineering management articles are sprinkled in...
  • Intuitive physics learning in a deep-learning model inspired by developmental psychology
    ‘Intuitive physics’ enables our pragmatic engagement with the physical world and forms a key component of ‘common sense’ aspects of thought. Current artificial intelligence systems pale in their understanding of intuitive physics, in comparison to even very young children. Here we address this gap between humans and machines by drawing on the field of developmental psychology...
 
 

Webinar Panel*

 


Wednesday, July 27 @ 2PM ET / 11AM PT

Get practical advice from Data & Analytics Leaders from PayPal, Penguin Random House, & PartnerRe to learn about fostering an analytics-driven culture to drive better insights. Register Now!


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 

 

Jobs

 
  • Senior Data Scientist, Startup Creation at Redesign Health - US

    As our Senior Data Scientist for our Startup Creation team, you will set up and configure the data infrastructure for our startups, work with the startup founding team to define data-driven KPIs, and implement automated statistical analyses of customer behavior. Your goal is to make all of the companies that we launch data-driven from day one.

    In this role, you will function as an in-house implementation team for the companies that Redesign Health launches (internally referred to as OpCos). We provide data strategy, data pipeline, data analytics and forecasting services to newly formed companies in a repeatable and scalable manner...

     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • Transformers United (Stanford CS 25)
    Since their introduction in 2017, transformers have revolutionized Natural Language Processing (NLP). Now, transformers are finding applications all over Deep Learning, be it computer vision (CV), reinforcement learning (RL), Generative Adversarial Networks (GANs), Speech or even Biology...In this seminar, we examine the details of how transformers work, and dive deep into the different kinds of transformers and how they're applied in different fields. We do this through a combination of instructor lectures, guest lectures, and classroom discussions...
  • Introduction to K-Means Clustering
    While this article will focus most closely on K-means, there are other powerful types of clustering that can be used as well. Let’s take a look at the main ones like hierarchical, density-based, and partitional clustering...
 
 

What you’re up to – notes from DSW readers

 
  • Working on something cool? Let us know here :) ...
 

* To share your projects and updates, share the details here.

** Want to chat with one of the above people? Hit reply and let us know :)

 

Last Week's Newsletter's 3 Most Clicked Links

 

* Based on unique clicks.

** Find last week's newsletter here.

 

P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)
All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
