Data Science Weekly - Issue 476

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #476

January 05 2023

Editor's Picks

 
  • On Analogy-Making in Large Language Models
    I read with great interest a recent paper by cognitive scientists Taylor Webb, Keith Holyoak, and Hongjing Lu, entitled “Emergent Analogical Reasoning in Large Language Models.” This paper investigates zero-shot analogical reasoning abilities in GPT-3...In this article I give some of my own perspectives on the Webb et al. paper’s results and claims. I discuss the analogy problems that Webb et al. gave to GPT-3 (in this paper, “GPT-3” will refer to text-davinci-003), do some of my own experiments on letter-string analogies (one of their problem types), and draw some conclusions about the robustness and generality of GPT-3’s analogy-making abilities...
  • Prompt Engineering 101: Introduction and resources
    Generative AI models interface with the user through mostly textual input. You tell the model what to do through a textual interface, and the model tries to accomplish the task. What you tell the model to do in a broad sense is the prompt...In this article we'll cover: a) What is a prompt?, b) Elements of a prompt, c) Basic prompt examples, d) So, what is prompt engineering anyways?, e) Some more advanced prompt examples, and f) Resources...


 

A Message from this week's Sponsor:

 



Pinecone vector database

The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.

Use Pinecone to build semantic search, object recognition, recommendations, anomaly detection, and other vector-based functionality into your applications.



 

Data Science Articles & Videos

 
  • What Do You Median?
    Most empirical studies, even to this day, use Ordinary Least Squares (OLS) to estimate regression models. Many of us even have a modicum of understanding as to why: OLS is "great." Some may even know in what sense OLS is "great" ... it is BLUE, where BLUE stands for the Best Linear Unbiased Estimator. Unfortunately, that may be where most understanding stops...But, this begs two questions that empirical researchers and consumers of empirical research ought to understand: 1) What does it mean to be BLUE and should we care? and 2) What assumptions are required for OLS to be BLUE and what happens if they fail?...
  • Data Pipeline Design Patterns: #1 - Data flow patterns
    Data pipelines can become flaky over time if the data pipeline design foundations are not solid...This post will cover the typical data flow design patterns. We will learn about the pros and cons of each design pattern, when to use them, and, more importantly, when not to use them...
  • Towards Deployable RL - What’s Broken with RL Research and a Potential Fix
    Reinforcement learning (RL) has demonstrated great potential, but is currently full of overhyping and pipe dreams. We point to some difficulties with current research which we feel are endemic to the direction taken by the community. To us, the current direction is not likely to lead to “deployable” RL: RL that works in practice and can work in practical situations yet still is economically viable. We also propose a potential fix to some of the difficulties of the field...
  • ShinyConf 2023 Call For Speakers
    We invite members of the R community to submit talks for this year’s all-virtual ShinyConf on March 15-17, 2023!...Any talks relating to R Shiny are acceptable for consideration – whether the talk is about a Shiny app you’ve created, an introduction to a package you have developed, or an explanation of how you are using Shiny in your research or business...
  • How Shapley Values Work
    Shapley values - and their popular extension, SHAP - are machine learning explainability techniques that are easy to use and interpret. However, trying to make sense of their theory can be intimidating. In this article, we will explore how Shapley values work - not using cryptic formulae, but by way of code and simplified explanations...
  • Writing a Python SQL engine from scratch
    This post will cover why I went through the effort of creating a Python SQL engine and how a simple query goes from a string to actually transforming data. The following steps are briefly summarized: a) Tokenizing, b) Parsing, c) Optimizing, d) Planning, and e) Executing...
  • Datacast Episode 106: Advancing AI Adoption with Dania Meira
    Dania Meira is the founding member/director of AI Guild - the go-to community for data and business professionals advancing AI adoption...Our wide-ranging conversation touches on her upbringing and education in Brazil, her early career in marketing intelligence, her move to Berlin to work as a data scientist in different startups, her current journey with AI Guild building the go-to community for data professionals advancing AI adoption, the evolution of the data field over the past decade, and much more...
  • A Tale of Two Means
    This article is dedicated to learning how to compare two populations, building on our knowledge of a sample mean compared to the population...We may want to investigate if there is a difference on some facet between two populations (or two samples). For example, is there a difference in the age of those attending medical school at Northwestern University or University of Chicago? Here we seek to compare the two means, rather than, as in the previous bootcamp, testing whether the mean of a sample differs from the mean of the population...
  • Explaining Reinforcement Learning with Human Feedback
    Reinforcement learning with human feedback is a new technique for training next-gen language models like ChatGPT. Instead of training LLMs merely to predict the next word, we train them to understand instructions and generate helpful responses...Want to learn more about RLHF and how it works? Read on!...
  • 2022 Top Papers in AI — A Year of Generative Models
    This year, we see significant progress in the field of generative models. Stable Diffusion 🎨 creates hyperrealistic art. ChatGPT 💬 answers questions to the meaning of life. Galactica 🧬 learns humanity’s scientific knowledge but also reveals the limitations of large language models...This article is my take on the 20 most impactful AI papers of 2022...
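
For readers curious about the "How Shapley Values Work" item above, here is a minimal sketch of the underlying idea. It is not code from the linked article. It assumes a toy two-feature "game" whose payoffs are made up for illustration, and it computes exact Shapley values by averaging each player's marginal contribution over every possible ordering. The names `shapley_values` and `payoffs` are hypothetical.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of players to that coalition's payoff.
    The cost grows factorially with the number of players, so this is
    only practical for very small games.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical payoffs for every coalition of two features "a" and "b"
payoffs = {
    frozenset(): 0,
    frozenset({"a"}): 10,
    frozenset({"b"}): 20,
    frozenset({"a", "b"}): 50,
}

phi = shapley_values(["a", "b"], payoffs.__getitem__)
print(phi)  # {'a': 20.0, 'b': 30.0}
```

The two values sum to the payoff of the full coalition (50), which is the "efficiency" property of Shapley values. Libraries such as SHAP approximate this average rather than enumerating every ordering.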


 

Tool*

 



Build powerful ML visualizations with Comet

With just 2 lines of code, Comet automatically logs metrics, hyperparameters, libraries, and more. This means automatic chart generation so you can easily manage training runs in real time. When you combine that with:
  • built-in visualizations (like the image panel),
  • custom project views, and
  • your own Python panels,
Comet is a powerful tool for optimizing your ML workflow. All for free! Less friction, more ML.

Create your free account.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!



 

Jobs

 
  • Data Scientist / Machine Learning Engineer - Epsilon - NYC

    Epsilon Strategy and Insights, Data Sciences team is looking for a talented team player in a Data Scientist/Machine Learning Engineer role. You are an expert, mentor and advocate. You have a strong machine learning and deep learning background and are passionate about transforming data into ML models. You welcome the challenge of data science and are proficient in Python, Spark MLlib, TensorFlow, Keras, ML algorithms and Deep Neural Networks, Big Data. You must be self-driven, take initiative and want to work in a dynamic, busy and innovative group...
     
Want to post a job here? Email us for details --> team@datascienceweekly.org



 

Training & Resources

 
  • Probabilistic Machine Learning: Advanced Topics
    I am delighted to announce that the "real" camera-ready version of my new book, "Probabilistic Machine Learning: Advanced Topics", is now available. It will appear in print this summer, but it is already freely available online at...
  • The Illustrated Machine Learning Website
    Welcome to our website, where we strive to make the complex world of Machine Learning more approachable through clear and concise illustrations. Our goal is to provide a visual aid for students, professionals, and anyone preparing for a technical interview to better understand the underlying concepts of Machine Learning...
  • Diffusion Models - Live Coding Tutorial [YouTube]
    This is my live (for the most part) coding video, where I implement from scratch a diffusion model that generates 32 x 32 RGB images. The tutorial assumes a basic knowledge of deep learning and Python...
 


 



 


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.