Data Science Weekly - Issue 441

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #441

May 5 2022

Editor Picks

 
  • How Gaussian Is It?
    This article is an excerpt from the current draft of my book Probably Overthinking It, to be published by the University of Chicago Press in early 2023...How tall are you? How long are your arms? How far is it from the radiale landmark on your right elbow to the stylion landmark on your right wrist?...
  • Democratizing access to large-scale language models with OPT-175B
    In line with Meta AI’s commitment to open science, we are sharing Open Pretrained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available data sets, to allow for more community engagement in understanding this foundational new technology...
  • OPT-175B Logbook [PDF]
    [Editor's note: click on the download button]...Goal: Get a 175B dense model up and running by any means necessary...Purpose of this document: To provide a source of truth of what we did, when, and why, and any context that was important to those decisions. To provide each other with a clear place to find information about what is happening without having to ping...
 
 

A Message from this week's Sponsor:

 



Free Course: Natural Language Processing (NLP) for Semantic Search

Learn how to build semantic search applications by making machines understand language as people do. This free course covers everything you need to build state-of-the-art language models, from machine translation to question-answering, and more. Brought to you by Pinecone. Start reading now.

 

 

Data Science Articles & Videos

 
  • JAX vs Julia (vs PyTorch)
    A while ago there was an interesting thread on the Julia Discourse about the “state of machine learning in Julia”. I posted a response discussing the differences between Julia and Python (both JAX and PyTorch), and it seemed to be really well received!...Since then this topic seems to keep coming up, so I thought I’d tidy up that post and put it somewhere I could link to easily...To my mind JAX and Julia are unquestionably the current state-of-the-art frameworks for autodifferentiation, scientific computing, and ML computing. So let’s dig into the differences....
  • Working on build systems full-time at Meta
    Summary: I joined Meta 2.5 years ago to work on build systems. I’m enjoying it...I'll cover What I’ve learnt about build systems as well as What's different moving from finance to tech...
  • Advances in Neural Compression with Auke Wiggers
    Today we’re joined by Auke Wiggers, an AI research scientist at Qualcomm...we discuss his team’s recent research on data compression using generative models. We discuss the relationship between historical compression research and the current trend of neural compression, and the benefit of neural codecs, which learn to compress data from examples. We also explore the performance evaluation process and the recent developments that show that these models can operate in real-time on a mobile device. Finally, we discuss another ICLR paper, “Transformer-based transform coding”, that proposes a vision transformer-based architecture for image and video coding...
  • Training Language Models with Natural Language Feedback
    Pretrained language models often do not perform tasks in ways that are in line with our preferences, e.g., generating offensive text or factually incorrect summaries. Recent work approaches the above issue by learning from a simple form of human evaluation: comparisons between pairs of model-generated task outputs. Comparison feedback conveys limited information about human preferences per human evaluation. Here, we propose to learn from natural language feedback, which conveys more information per human evaluation. We learn from language feedback on model outputs using a three-step learning algorithm...
  • What Data Visualization Reveals: Elizabeth Palmer Peabody and the Work of Knowledge Production
    This essay offers the chronological charts of Elizabeth Palmer Peabody (1804–1894), the 19th-century educator and intellectual, as early examples of how data visualization can reveal a range of forms of knowledge. It challenges the universality of the goals of clarity and efficiency when designing data visualizations, and argues for the value of visualizations that encourage sustained reflection and imaginative response...
  • Hiring Data Scientists With Intention
    I met Tara Robertson in 2019 when I joined Mozilla, where she was the Global Diversity and Inclusion Lead at the time. When I needed to grow my team, Tara and I worked together to develop an inclusive hiring process. Since then, Tara and I have kept the conversation going and wanted to share some of our thoughts here!...
  • Handling and Presenting Harmful Text
    Textual data can pose a risk of serious harm. These harms can be categorised along three axes: (1) the harm type, (2) whether it is elicited as a feature of the research design from directly studying harmful content, and (3) who it affects...It is an unsolved problem in NLP as to how textual harms should be handled, presented, and discussed; but, stopping work on content which poses a risk of harm is untenable. Accordingly, we provide practical advice and introduce HARMCHECK, a resource for reflecting on research into textual harms...
  • Datacast Episode 90: Operational Analytics, Reverse ETL, And Finding Product-Market Fit With Kashish Gupta
    Our wide-ranging conversation touches on his education at the University of Pennsylvania studying Computer Science; his learning about venture capital at Bessemer Venture Partners; his first startup Carry that went through Y Combinator; his current journey with Hightouch building a data activation platform; lessons learned creating the Operational Analytics category, pivoting through various startup ideas, identifying design partners, hiring talent, fundraising; and much more...
  • New from Anaconda: Python in the Browser
    Say Hello to PyScript...PyScript is a framework that allows users to create rich Python applications in the browser using a mix of Python with standard HTML. PyScript aims to give users a first-class programming language that has consistent styling rules, is more expressive, and is easier to learn...What is PyScript? Well, here are some of the core components...
 
 

Conference*

 



Join us at apply(), the ML data engineering conference - it’s free.

Speakers include practitioners from the Wikimedia Foundation, Facebook, Gojek, Snapchat, Instacart, Walmart, Stripe, Uber, Volvo, Snowflake, Databricks, and more. We’d love for you to join us.

Agenda highlights:
  • Smitha Shyam, Director of Engineering at Uber: Uber's Michelangelo: Then and Now
  • Chris Albon, Director of Machine Learning at Wikimedia Foundation: More Ethical Machine Learning Using Model Card at Wikimedia
  • Matei Zaharia, Co-Founder and Chief Technologist at Databricks: The Future of Data for Machine Learning
  • Chip Huyen, Co-Founder at Claypot AI: Machine Learning Platform for Online Prediction and Continual Learning
  • Clem Delangue, CEO at Hugging Face: Is Open-Source Machine Learning Becoming the Most Impactful Technology of the Decade?

See the full agenda and register for free.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • Data Scientist - Hungryroot - Remote

    Hungryroot is looking for a Data Scientist to join our growing Data Team. As a Data Scientist, you will work closely with other Data Scientists and Data Engineers to develop various Machine Learning models that power Hungryroot and its AI functions. These models include traditional forecasting models, as well as more industry-specific optimization challenges.

    As a Data Scientist at Hungryroot, you will work on answering questions like: How do you tell what food someone would like to eat this week? How do you determine whether they enjoyed it or not? Maybe they liked their meals last week but are now looking for different options. Maybe they like the same food on Tuesdays but variety on Fridays. What about spicy food: is Green Chili as spicy as Green Curry?

     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 
 

Training & Resources

 
  • Scientific Visualization: Python + Matplotlib
    This book is organized into four parts. The first part considers the fundamental principles of the Matplotlib library...The second part is dedicated to the actual design of a figure...The third part is dedicated to more advanced concepts, namely 3D figures, optimization & animation. The fourth and final part is a collection of showcases...
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
