Data Science Weekly - Issue 441

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #441

May 5 2022

Editor Picks

 
  • How Gaussian Is It?
    This article is an excerpt from the current draft of my book Probably Overthinking It, to be published by the University of Chicago Press in early 2023...How tall are you? How long are your arms? How far is it from the radiale landmark on your right elbow to the stylion landmark on your right wrist?...
  • Democratizing access to large-scale language models with OPT-175B
    In line with Meta AI’s commitment to open science, we are sharing Open Pretrained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available data sets, to allow for more community engagement in understanding this foundational new technology...
  • OPT-175 Logbook [PDF]
    [Editor's note: click on the download button]...Goal: Get a 175B dense model up and running by any means necessary...Purpose of this document: To provide a source of truth of what we did, when, and why, and any context that was important to those decisions. To provide each other with a clear place to find information about what is happening without having to ping....
 
 

A Message from this week's Sponsor:

 



Free Course: Natural Language Processing (NLP) for Semantic Search

Learn how to build semantic search applications by making machines understand language as people do. This free course covers everything you need to build state-of-the-art language models, from machine translation to question-answering, and more. Brought to you by Pinecone. Start reading now.

 

 

Data Science Articles & Videos

 
  • JAX vs Julia (vs PyTorch)
    A while ago there was an interesting thread on the Julia Discourse about the “state of machine learning in Julia”. I posted a response discussing the differences between Julia and Python (both JAX and PyTorch), and it seemed to be really well received!...Since then this topic seems to keep coming up, so I thought I’d tidy up that post and put it somewhere I could link to easily...To my mind JAX and Julia are unquestionably the current state-of-the-art frameworks for autodifferentiation, scientific computing, and ML computing. So let’s dig into the differences....
  • Working on build systems full-time at Meta
    Summary: I joined Meta 2.5 years ago to work on build systems. I’m enjoying it...I'll cover What I’ve learnt about build systems as well as What's different moving from finance to tech...
  • Advances in Neural Compression with Auke Wiggers
    Today we’re joined by Auke Wiggers, an AI research scientist at Qualcomm...we discuss his team’s recent research on data compression using generative models. We discuss the relationship between historical compression research and the current trend of neural compression, and the benefit of neural codecs, which learn to compress data from examples. We also explore the performance evaluation process and the recent developments that show that these models can operate in real-time on a mobile device. Finally, we discuss another ICLR paper, “Transformer-based transform coding”, that proposes a vision transformer-based architecture for image and video coding...
  • Training Language Models with Natural Language Feedback
    Pretrained language models often do not perform tasks in ways that are in line with our preferences, e.g., generating offensive text or factually incorrect summaries. Recent work approaches the above issue by learning from a simple form of human evaluation: comparisons between pairs of model-generated task outputs. Comparison feedback conveys limited information about human preferences per human evaluation. Here, we propose to learn from natural language feedback, which conveys more information per human evaluation. We learn from language feedback on model outputs using a three-step learning algorithm...
  • What Data Visualization Reveals: Elizabeth Palmer Peabody and the Work of Knowledge Production
    This essay offers the chronological charts of Elizabeth Palmer Peabody (1804–1894), the 19th-century educator and intellectual, as early examples of how data visualization can reveal a range of forms of knowledge. It challenges the universality of the goals of clarity and efficiency when designing data visualizations, and argues for the value of visualizations that encourage sustained reflection and imaginative response...
  • Hiring Data Scientists With Intention
    I met Tara Robertson in 2019 when I joined Mozilla, where she was the Global Diversity and Inclusion Lead at the time. When I needed to grow my team, Tara and I worked together to develop an inclusive hiring process. Since then, Tara and I have kept the conversation going and wanted to share some of our thoughts here!...
  • Handling and Presenting Harmful Text
    Textual data can pose a risk of serious harm. These harms can be categorised along three axes: (1) the harm type, (2) whether it is elicited as a feature of the research design from directly studying harmful content, and (3) who it affects...It is an unsolved problem in NLP as to how textual harms should be handled, presented, and discussed; but, stopping work on content which poses a risk of harm is untenable. Accordingly, we provide practical advice and introduce HARMCHECK, a resource for reflecting on research into textual harms...
  • Datacast Episode 90: Operational Analytics, Reverse ETL, And Finding Product-Market Fit With Kashish Gupta
    Our wide-ranging conversation touches on his education at the University of Pennsylvania studying Computer Science; his learning about venture capital at Bessemer Venture Partners; his first startup Carry that went through Y Combinator; his current journey with Hightouch building a data activation platform; lessons learned creating the Operational Analytics category, pivoting through various startup ideas, identifying design partners, hiring talent, fundraising; and much more...
  • New from Anaconda: Python in the Browser
    Say Hello to PyScript...PyScript is a framework that allows users to create rich Python applications in the browser using a mix of Python with standard HTML. PyScript aims to give users a first-class programming language that has consistent styling rules, is more expressive, and is easier to learn...What is PyScript? Well, here are some of the core components...
 
 

Conference*

 



Join us at apply(), the ML data engineering conference - it’s free.

Speakers include practitioners from the Wikimedia Foundation, Facebook, Gojek, Snapchat, Instacart, Walmart, Stripe, Uber, Volvo, Snowflake, Databricks, and more. We’d love for you to join us.

Agenda highlights:
  • Smitha Shyam, Director of Engineering at Uber: Uber's Michelangelo: Then and Now
  • Chris Albon, Director of Machine Learning at Wikimedia Foundation: More Ethical Machine Learning Using Model Cards at Wikimedia
  • Matei Zaharia, Co-Founder and Chief Technologist at Databricks: The Future of Data for Machine Learning
  • Chip Huyen, Co-Founder at Claypot AI: Machine Learning Platform for Online Prediction and Continual Learning
  • Clem Delangue, CEO at Hugging Face: Is Open-Source Machine Learning Becoming the Most Impactful Technology of the Decade?

See the full agenda and register for free.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • Data Scientist - Hungryroot - Remote

    Hungryroot is looking for a Data Scientist to join our growing Data Team. As a Data Scientist, you will work closely with other Data Scientists and Data Engineers to develop various Machine Learning models that power Hungryroot and its AI functions. These models include traditional forecasting models, as well as more industry-specific optimization challenges.

    As a Data Scientist at Hungryroot, you will work on answering questions like: How do you tell what food someone would like to eat this week? How do you determine whether they enjoyed it or not? Maybe they liked their meals last week, but are now looking for different options. Maybe they like the same food on Tuesdays, but variety on Fridays. What about spicy food: is Green Chili as spicy as Green Curry?

     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 
 

Training & Resources

 
  • Scientific Visualization: Python + Matplotlib
    This book is organized into four parts. The first part considers the fundamental principles of the Matplotlib library...The second part is dedicated to the actual design of a figure...The third part is dedicated to more advanced concepts, namely 3D figures, optimization & animation. The fourth and final part is a collection of showcases...
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
