Data Science Weekly - Issue 433

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #433

March 10 2022

Editor Picks

 
  • MLOps Is a Mess But That's to be Expected
    Today machine learning continues to be one of the most talked about and touted technology waves, promising to revolutionize every corner of society...And yet the ecosystem is in a frenzied state...New fundamental science advances come out every week. Startups and enterprises spray new developer tools into the market trying to capture a chunk of what many speculate to be a market worth between $40-120 billion by 2025...And yet if you’re just entering the discourse, how do you make sense of it all?...In this post, I want to focus the discussion on the state of machine learning operations (MLOps) today, where we are, where we are going...
  • Anatomy of an AI System
    The Amazon Echo as an anatomical map of human labor, data and planetary resources...This is an interaction with Amazon’s Echo device. A brief command and a response is the most common form of engagement with this consumer voice-enabled AI device...How can we begin to see it, to grasp its immensity and complexity as a connected form? We start with an outline: an exploded view of a planetary system across three stages of birth, life and death, accompanied by an essay in 21 parts. Together, this becomes an anatomical map of a single AI system...
 
 

A Message from this week's Sponsor:

 



Live Webinar | How to Create & Foster a Data-Driven Culture

Wednesday, Mar 23 at 2PM ET (11AM PT)

Join Data & Analytics Leaders from PayPal, Direct Supply, BetMGM, and Kolibri Games & learn about their initiatives in Business Intelligence, Advanced Analytics, and Data Science to make faster, smarter data-driven decisions at scale.

 

 

Data Science Articles & Videos

 
  • µTransfer: A technique for hyperparameter tuning of enormous neural networks
    In this post, we relay how our fundamental research enabled us, for the first time, to tune enormous neural networks that are too expensive to train more than once. We achieved this by showing that a particular parameterization preserves optimal hyperparameters across different model sizes...By greatly reducing the need to guess which training hyperparameters to use, this technique can accelerate research on enormous neural networks, such as GPT-3 and potentially larger successors in the future...
  • Restoring and attributing ancient texts using deep neural networks
    Ancient history relies on disciplines such as epigraphy, the study of inscribed texts known as inscriptions, for evidence of the thought, language, society and history of past civilizations. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow...
  • The 2030 Self-Driving Car Bet
    It's my honor to announce that John Carmack and I (Jeff Atwood) have initiated a friendly bet of $10,000* to the 501(c)(3) charity of the winner’s choice: By January 1st, 2030, completely autonomous self-driving cars meeting SAE J3016 level 5 will be commercially available for passenger use in major cities...I am betting against, and John is betting for...
  • Data Visualization Standards
    The Data Visualization Standards (DVS) are a collection of guidance and resources to help create better data visualizations with less effort from the US Census Bureau. These standards contain both requirements and recommendations for both novices and experts to follow when creating new data visualizations...
  • NN Template - Generic template to bootstrap your PyTorch project
    Generic cookiecutter template to bootstrap PyTorch projects and to avoid writing boilerplate code to integrate: a) PyTorch Lightning, lightweight PyTorch wrapper for high-performance AI research, b) Hydra, a framework for elegantly configuring complex applications, c) Weights and Biases, organize and analyze machine learning experiments (educational account available), d) Streamlit, turns data scripts into shareable web apps in minutes, e) MkDocs and Material for MkDocs, a fast, simple and downright gorgeous static site generator, f) DVC, track large files, directories, or ML models. Think "Git for data", g) GitHub Actions, to run the tests, publish the documentation and publish to PyPI automatically, and h) Python best practices for developing and publishing research projects...
  • A visual introduction to machine learning
    In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions...Keep scrolling. Using a data set about homes, we will create a machine learning model to distinguish homes in New York from homes in San Francisco...
  • A Concrete Introduction to Probability (using Python)
    This notebook will explore these concepts in a concrete way using Python code. The code is meant to be succinct and explicit, and fast enough to handle sample spaces with millions of outcomes. If you need to handle trillions, you'll want a more efficient implementation. I also have another notebook that covers paradoxes in Probability Theory...
  • CS 329S: Machine Learning Systems Design [Stanford, Winter 2022]
    This course aims to provide an iterative framework for developing real-world machine learning systems that are deployable, reliable, and scalable...It starts by considering all stakeholders of each machine learning project and their objectives. Different objectives require different design choices, and this course will discuss the tradeoffs of those choices...Students will learn about data management, data engineering, feature engineering, approaches to model selection, training, scaling, how to continually monitor and deploy changes to ML systems, as well as the human side of ML projects such as team structure and business metrics...
  • Explainable Machine Learning in NLP: Methods and Evaluation [PDF]
    This talk is based on four recent papers: a) The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations (2021), b) FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging (2021), c) Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? (2020), d) Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? (2020)...as well as reflection on these papers and notes from “Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers” (2021)...
  • Rotational AI Science & Engineering (RAISE) Employment Opportunity
    RAISE is an 18-month rotational program offering participants full-time employment with the Meta AI team. It is designed to bring a diverse cohort of software engineers with little to no AI/ML experience together from a wide range of industry experiences and backgrounds -- ranging from individual contributors a few years out of school to experienced engineers. After completing the 18-month program, the goal is for RAISE alumni to seamlessly continue their employment at Meta by selecting a permanent Meta AI team to join that aligns with their skills and interests. Throughout the program, all RAISErs are full-time employees, providing each individual with all the benefits of being a software engineer at Meta...
  • Implicit Kinematic Policies: Unifying Joint and Cartesian Action Spaces in End-to-End Robot Learning
    Action representation is an important yet often overlooked aspect in end-to-end robot learning with deep networks. Choosing one action space over another (e.g. target joint positions, or Cartesian end-effector poses) can result in surprisingly stark performance differences between various downstream tasks -- and as a result, considerable research has been devoted to finding the right action space for a given application...in this work, we instead investigate how our models can discover and learn for themselves which action space to use. Leveraging recent work on implicit behavioral cloning, which takes both observations and actions as input, we demonstrate that it is possible to present the same action in multiple different spaces to the same policy -- allowing it to learn inductive patterns from each space...
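
For readers curious about the "concrete" approach mentioned in the probability item above, here is a minimal sketch of treating probability as counting over an explicit sample space. It is our own illustration, not code from the linked notebook: the names `probability`, `two_dice` and `sums_to_seven` are hypothetical, and it assumes every outcome is equally likely.

```python
from fractions import Fraction
from itertools import product


def probability(event, space):
    """Probability of an event, assuming every outcome in `space` is equally likely.

    `event` is a predicate on a single outcome; `space` is a finite set of outcomes.
    """
    favorable = {outcome for outcome in space if event(outcome)}
    return Fraction(len(favorable), len(space))


# Hypothetical sample space: all ordered rolls of two six-sided dice (36 outcomes).
two_dice = set(product(range(1, 7), repeat=2))


def sums_to_seven(roll):
    """True when the two dice in `roll` add up to seven."""
    return sum(roll) == 7


p_seven = probability(sums_to_seven, two_dice)
print(p_seven)  # 1/6
```

Using `Fraction` keeps results exact, which makes small examples easy to check by hand. Enumerating the whole space only works while it fits in memory.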
 
 

Summit*

 



You're invited to the first-ever Metrics Store Summit

Transform is hosting the first-ever industry summit on the metrics layer. The first-ever Metrics Store Summit on April 26, 2022 will bring discussions around the semantic layer into one event, providing context with use cases for metrics stores, highlighting applications for metrics, and sharing ideas from leaders across the modern data stack. You can expect to hear from Airbnb, Slack, Spotify, Atlan, Hex, Mode, Hightouch, AtScale and many more in this action-packed 1-day event. We would love to see you there! Register today for free.



*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • Lead Data Engineer - electricityMap - Copenhagen

    The electricityMap team is hiring a data engineer to help us build and maintain a scalable data pipeline and database that forms the foundation of our mission to accelerate the energy system to a zero-carbon future.

    Our mission is to organise the world’s electricity data to drive tangible reductions in carbon emissions. electricityMap started as a popular open-source project 5 years ago and is now used every day by citizens, companies, universities, NGOs, and policy makers around the world to understand and reduce the climate impact of electricity.

    You will be joining a fun, international and inclusive team in our mission to tackle climate change – while simultaneously building your professional experience in the rapidly growing industry of climate tech...

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 
 

Training & Resources

 
  • Hands-On Reinforcement Learning Course (Part 1 of 6)
    Let’s walk this beautiful path from the fundamentals to cutting edge reinforcement learning (RL), step-by-step, with coding examples and tutorials in Python, together!...This first part covers the bare minimum concept and theory you need to embark on this journey. Then, in each following chapter, we will solve a different problem, with increasing difficulty...
  • Introduction to variational autoencoders
    Say we want to fit a model to some data. In mathematical terms, we want to find a distribution p that maximizes the probability of observed data x ∈ X. In this case, we will make the assumption that there is some latent (unobserved) variable z that affects the production of x behind the scenes...This document explains methods and challenges for training latent variable models. Our end goal is to derive the variational autoencoder (VAE) framework with justifications for each step along the way...
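
As a pointer to where the variational autoencoder item above is heading, the standard objective for such latent variable models can be sketched as follows. The notation is our assumption, not taken from the linked document: θ denotes the model parameters, p(z) a prior over the latent variable, and q_φ(z | x) an approximate posterior with parameters φ.

```latex
\log p_\theta(x)
  = \log \int p_\theta(x \mid z)\, p(z)\, dz
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

The right-hand side is the usual evidence lower bound (ELBO). The first term rewards reconstructing x from z, and the KL term keeps the approximate posterior close to the prior.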
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
