Data Science Weekly - Issue 410

Curated news, articles and jobs related to Data Science.
Keep up with all the latest developments

Issue #410

September 30 2021

Editor Picks

Top Places to Work for Data Scientists
What would constitute a good place to work for a data scientist? How do you think about it at different stages of your career?...These are important questions to ponder as data science (DS) practitioners witness the field going through a phase of high growth of 37% per year...Depending on your career stage, different types of companies can help you evolve a career in DS. Let’s look at how to quantify this assessment and tailor the opportunities to data scientists of various career stages...

Nowcasting the Next Hour of Rain
At any moment in the UK, according to one study, one third of the country has talked about the weather in the past hour, reflecting the importance of weather in daily life... Our latest research and state-of-the-art model advances the science of Precipitation Nowcasting, which is the prediction of rain (and other precipitation phenomena) within the next 1-2 hours. In a paper written in collaboration with the Met Office and published in Nature, we directly tackle this important grand challenge in weather prediction...

Statistics as algorithmic summarization
Statistics gives us reasonable procedures to estimate properties of a general population by examining only a few individuals from the population. In this regard, statistics is algorithmic: it provides randomized algorithms for extrapolation. In this blog, I’ll review some elementary stats (with as little mathematical formalism as possible), and try to crystalize why this algorithmic view is illuminating...
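As a rough illustration of the "randomized algorithm for extrapolation" idea, the sketch below estimates a population mean from a small random sample and reports a standard error. The function name `estimate_mean` and the synthetic population are assumptions made for this example; they are not taken from the blog post.

```python
import random
import statistics


def estimate_mean(population, sample_size, seed=0):
    """Estimate the population mean from a simple random sample.

    Returns (sample mean, estimated standard error of that mean).
    """
    rng = random.Random(seed)                     # seeded so the run is repeatable
    sample = rng.sample(population, sample_size)  # sampling without replacement
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / sample_size ** 0.5
    return mean, std_err


# Synthetic population (hypothetical): values 0..99 repeated, true mean 49.5.
population = [i % 100 for i in range(100_000)]
estimate, std_err = estimate_mean(population, sample_size=400)
print(f"estimate = {estimate:.2f} +/- {std_err:.2f} (true mean 49.5)")
```

Only 400 of the 100,000 individuals are examined, yet the estimate typically lands within a couple of standard errors of the true mean.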

A Message from this week's Sponsor:

Join Impact 2021 on November 3, 2021: The First-Ever Data Observability Summit. Join Today's Leading Data Pioneers.

Hear from data leaders pioneering the technologies & processes shaping data engineering. Featuring the first Chief Data Scientist of the U.S., the founder of Data Mesh, and many more! Get Your Free Ticket ...

Data Science Articles & Videos

Chat with William Shatner about the future of AI
There’s no doubt that technology has enriched our lives. Our phones can easily answer questions for us. Our cars can drive themselves. Robots can perform complicated surgeries on our fragile, fleshy bodies. Artificial intelligence has come a long way in 50 years, but how far will it go in another 50?...

A Comprehensive Survey and Performance Analysis of Activation Functions in Deep Learning
In this paper, a comprehensive overview and survey is presented for Activation Functions (AFs) in neural networks for deep learning. Different classes of AFs such as Logistic Sigmoid and Tanh based, ReLU based, ELU based, and Learning based are covered. Several characteristics of AFs such as output range, monotonicity, and smoothness are also pointed out. A performance comparison is also performed among 18 state-of-the-art AFs with different networks on different types of data....

Deep Learning over the Internet: Training Language Models Collaboratively
In this blog post, we describe DeDLOC — a new method for collaborative distributed training that can adapt itself to the network and hardware constraints of participants. We show that it can be successfully applied in real-world scenarios by pretraining sahajBERT, a model for the Bengali language, with 40 volunteers. On downstream tasks in Bengali, this model achieves nearly state-of-the-art quality with results comparable to much larger models that used hundreds of high-tier accelerators...

Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment
In this paper we revisit the 2014 NeurIPS experiment that examined inconsistency in conference peer review. We determine that 50% of the variation in reviewer quality scores was subjective in origin. Further, with seven years passing since the experiment we find that for accepted papers, there is no correlation between quality scores and impact of the paper as measured as a function of citation count. We trace the fate of rejected papers, recovering where these papers were eventually published. For these papers we find a correlation between quality scores and impact. We conclude that the reviewing process for the 2014 conference was good for identifying poor papers, but poor for identifying good papers. ...

Interview with Nihit Desai - Staff Engineer at Facebook
Hi! I’m Nihit. I am a Staff Engineer at Facebook, where I currently work on business integrity...Along with a friend of mine, I also write a biweekly newsletter focused on challenges and opportunities associated with real-world applications of ML...

Deep Learning's Diminishing Returns: The cost of improvement is becoming unsustainable
While deep learning's rise may have been meteoric, its future may be bumpy. Like Frank Rosenblatt before them, today's deep-learning researchers are nearing the frontier of what their tools can achieve. To understand why this will reshape machine learning, you must first understand why deep learning has been so successful and what it costs to keep it that way...

A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research
This paper presents a systematic literature review of research at the intersection of SE & DL. The review canvasses work appearing in the most prominent SE and DL conferences and journals and spans 128 papers across 23 unique SE tasks. We center our analysis around the components of learning, a set of principles that govern the application of machine learning (ML) techniques to a given problem domain, discussing several aspects of the surveyed work at a granular level. The end result of our analysis is a research roadmap that both delineates the foundations of DL techniques applied to SE research, and highlights likely areas of fertile exploration for the future...

A Plug-and-Play Method for Controlled Text Generation
In this work, we present a plug-and-play decoding method for controlled language generation that is so simple and intuitive, it can be described in a single sentence: given a topic or keyword, we add a shift to the probability distribution over our vocabulary towards semantically similar words. We show how annealing this distribution can be used to impose hard constraints on language generation, something no other plug-and-play method is currently able to do with SOTA language generators...
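The one-sentence description above can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the vocabulary, the logits, the similarity scores, and the `strength` parameter are all hypothetical, and the exact form of the shift used by the authors is not stated in this summary.

```python
import math


def softmax(logits):
    """Turn a {word: logit} mapping into a probability distribution."""
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {w: math.exp(v - top) for w, v in logits.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def shift_towards_topic(logits, similarity, strength):
    """Add a shift to each logit proportional to its similarity to the topic."""
    return {w: v + strength * similarity.get(w, 0.0) for w, v in logits.items()}


# Hypothetical next-word logits from a language model.
logits = {"the": 2.0, "rain": 0.5, "storm": 0.3, "bank": 0.4}
# Hypothetical similarity of each word to the keyword "weather".
similarity = {"the": 0.0, "rain": 0.9, "storm": 0.8, "bank": 0.1}

base = softmax(logits)
shifted = softmax(shift_towards_topic(logits, similarity, strength=3.0))
# A very large strength concentrates almost all mass on the most similar word,
# loosely mirroring the idea of annealing towards a hard constraint.
hard = softmax(shift_towards_topic(logits, similarity, strength=50.0))
```

With a moderate `strength` the topical words become more likely without being forced; raising it pushes the distribution towards always choosing the most similar word.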

Machine Learning in Astronomy and Physics
This week’s guest is Dr. Viviana Acquaviva, Professor in the Physics Department at the CUNY NYC College of Technology and at the CUNY Graduate Center...Viviana is currently writing a book for Princeton University Press entitled “Machine Learning techniques for Physics and Astronomy”...This conversation is focused on applications of machine learning and data science to physics, their impact on her research, and how the rise of ML has impacted her teaching and research...

Understanding AWK
I would hear people mention Awk and how often they used it, and I was pretty certain I was missing out on some minor superpower...Like this little offhand comment by Bryan Cantrill: "I write three or four Awk programs a day. And these are one-liners. These super quick programs"...It turns out Awk is pretty simple. It has only a couple of conventions and only a small amount of syntax. As a result, it’s straightforward to learn, and once you understand it, it will come in handy more often than you’d think...So in this article, I will teach myself, and you, the basics of Awk...

Training*

How to streamline data science and feature creation workflows in Snowflake

Thurs, Oct 14th, 2:00 PM ET (11:00 AM PT)

Learn how AtScale + Snowflake eliminates data marts like SSAS for native dimensional analysis in the Data Cloud.

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

Jobs

Senior Data Scientist - TikTok - LA

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy by offering a home for creative expression and an experience that is genuine, joyful, and positive.
- Generate useful features from large amounts of data
- Apply supervised and unsupervised machine learning techniques, such as linear and logistic regression, decision trees, and k-means clustering
- Develop segmentation models, classification models, propensity models, LTV models, experimental design, optimization models
- Perform statistical analysis such as KPI deep dives, performance marketing efficiency, behavioral clustering, and user journey analytics
- Curate audiences and inform engagement tactics to enable differentiated, relevant marketing touches across channels (social, email, in app, push)
- Synthesize analytics and statistical approaches into easy-to-consume storylines, both visually and verbally, and provide indicated actions for executive audiences
- Capture business requirements for data and analytic solutions and collaborate cross-functionally (XFN) to ensure business requirements align with business needs
- Analyze creatives and surface insights that will help drive engagement and retention
- Support day-to-day collaboration with performance marketing to communicate insights and recommend data informed strategies

Want to post a job here? Email us for details >> team@datascienceweekly.org

Training & Resources

Decision Trees: A Guide with Examples
A tutorial covering Decision Trees, complete with code and interactive visualizations...

Word2vec with PyTorch: Implementing the Original Paper
Covering all the implementation details, skipping high-level overview. Code attached....

Self Attention Tutorial Video
This is the first video on attention mechanisms. We'll start with self attention and try to explain why it's just a re-weighting tactic...
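To make the "re-weighting" framing concrete, here is a minimal sketch of self-attention with no learned parameters: each output vector is a weighted average of all input vectors, with weights given by a softmax over scaled dot products. Omitting the learned query, key, and value projections is a simplifying assumption for this example and may differ from what the video covers.

```python
import math


def self_attention(vectors):
    """Parameter-free self-attention over a list of equal-length vectors."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Similarity of this vector to every vector in the sequence.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        # Softmax turns the scores into weights that sum to 1.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # The output is a re-weighted average of the input vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs


# Toy input: three 2-dimensional "token" vectors (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens)
```

Because the weights are non-negative and sum to 1, every output stays inside the range spanned by the inputs, which is why the operation can be read as re-weighting rather than as producing new information.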

Books

Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits

Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian



