Data Science Weekly - Issue 433

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #433

March 10 2022

Editor Picks

 
  • MLOps Is a Mess But That's to be Expected
    Today machine learning continues to be one of the most talked about and touted technology waves, promising to revolutionize every corner of society...And yet the ecosystem is in a frenzied state...New fundamental science advances come out every week. Startups and enterprises spray new developer tools into the market trying to capture a chunk of what many speculate to be a market worth between $40-120 billion by 2025...And yet if you’re just entering the discourse, how do you make sense of it all?...In this post, I want to focus the discussion about the state of machine learning operations (MLOps) today, where we are, where we are going...
  • Anatomy of an AI System
    The Amazon Echo as an anatomical map of human labor, data and planetary resources...This is an interaction with Amazon’s Echo device. A brief command and a response is the most common form of engagement with this consumer voice-enabled AI device...How can we begin to see it, to grasp its immensity and complexity as a connected form? We start with an outline: an exploded view of a planetary system across three stages of birth, life and death, accompanied by an essay in 21 parts. Together, this becomes an anatomical map of a single AI system...
 
 

A Message from this week's Sponsor:

 



Live Webinar | How to Create & Foster a Data-Driven Culture

Wednesday, Mar 23 at 2PM ET (11AM PT)

Join Data & Analytics Leaders from PayPal, Direct Supply, BetMGM, and Kolibri Games & learn about their initiatives in Business Intelligence, Advanced Analytics, and Data Science to make faster, smarter data-driven decisions at scale.

 

 

Data Science Articles & Videos

 
  • µTransfer: A technique for hyperparameter tuning of enormous neural networks
    In this post, we relay how our fundamental research enabled us, for the first time, to tune enormous neural networks that are too expensive to train more than once. We achieved this by showing that a particular parameterization preserves optimal hyperparameters across different model sizes...By greatly reducing the need to guess which training hyperparameters to use, this technique can accelerate research on enormous neural networks, such as GPT-3 and potentially larger successors in the future...
  • Restoring and attributing ancient texts using deep neural networks
    Ancient history relies on disciplines such as epigraphy—the study of inscribed texts known as inscriptions—for evidence of the thought, language, society and history of past civilizations. However, over the centuries, many inscriptions have been damaged to the point of illegibility, transported far from their original location and their date of writing is steeped in uncertainty. Here we present Ithaca, a deep neural network for the textual restoration, geographical attribution and chronological attribution of ancient Greek inscriptions. Ithaca is designed to assist and expand the historian’s workflow...
  • The 2030 Self-Driving Car Bet
    It's my honor to announce that John Carmack and I (Jeff Atwood) have initiated a friendly bet of $10,000* to the 501(c)(3) charity of the winner’s choice: By January 1st, 2030, completely autonomous self-driving cars meeting SAE J3016 level 5 will be commercially available for passenger use in major cities...I am betting against, and John is betting for...
  • Data Visualization Standards
    The Data Visualization Standards (DVS) are a collection of guidance and resources to help create better data visualizations with less effort from the US Census Bureau. These standards contain both requirements and recommendations for both novices and experts to follow when creating new data visualizations...
  • NN Template - Generic template to bootstrap your PyTorch project
    Generic cookiecutter template to bootstrap PyTorch projects and to avoid writing boilerplate code to integrate: a) PyTorch Lightning, lightweight PyTorch wrapper for high-performance AI research, b) Hydra, a framework for elegantly configuring complex applications, c) Weights and Biases, organize and analyze machine learning experiments (educational account available), d) Streamlit, turns data scripts into shareable web apps in minutes, e) MkDocs and Material for MkDocs, a fast, simple and downright gorgeous static site generator, f) DVC, track large files, directories, or ML models. Think "Git for data", g) GitHub Actions, to run the tests, publish the documentation, and publish to PyPI automatically, and h) Python best practices for developing and publishing research projects...
  • A visual introduction to machine learning
    In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions...Keep scrolling. Using a data set about homes, we will create a machine learning model to distinguish homes in New York from homes in San Francisco...
  • A Concrete Introduction to Probability (using Python)
    This notebook will explore these concepts in a concrete way using Python code. The code is meant to be succinct and explicit, and fast enough to handle sample spaces with millions of outcomes. If you need to handle trillions, you'll want a more efficient implementation. I also have another notebook that covers paradoxes in Probability Theory...
  • CS 329S: Machine Learning Systems Design [Stanford, Winter 2022]
    This course aims to provide an iterative framework for developing real-world machine learning systems that are deployable, reliable, and scalable...It starts by considering all stakeholders of each machine learning project and their objectives. Different objectives require different design choices, and this course will discuss the tradeoffs of those choices...Students will learn about data management, data engineering, feature engineering, approaches to model selection, training, scaling, how to continually monitor and deploy changes to ML systems, as well as the human side of ML projects such as team structure and business metrics...
  • Explainable Machine Learning in NLP: Methods and Evaluation [PDF]
    This talk is based on four recent papers: a) The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations (2021), b) FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging (2021), c) Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? (2020), d) Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? (2020)...as well as reflection on these papers and notes from “Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers” (2021)...
  • Rotational AI Science & Engineering (RAISE) Employment Opportunity
    RAISE is an 18-month rotational program offering participants full-time employment with the Meta AI team. It is designed to bring a diverse cohort of software engineers with little to no AI/ML experience together from a wide range of industry experiences and backgrounds -- ranging from individual contributors a few years out of school to experienced engineers. After completing the 18-month program, the goal is for RAISE alumni to seamlessly continue their employment at Meta by selecting a permanent Meta AI team to join that aligns with their skills and interests. Throughout the program, all RAISErs are full-time employees, providing each individual with all the benefits of being a software engineer at Meta...
  • Implicit Kinematic Policies: Unifying Joint and Cartesian Action Spaces in End-to-End Robot Learning
    Action representation is an important yet often overlooked aspect in end-to-end robot learning with deep networks. Choosing one action space over another (e.g. target joint positions, or Cartesian end-effector poses) can result in surprisingly stark performance differences between various downstream tasks -- and as a result, considerable research has been devoted to finding the right action space for a given application...in this work, we instead investigate how our models can discover and learn for themselves which action space to use. Leveraging recent work on implicit behavioral cloning, which takes both observations and actions as input, we demonstrate that it is possible to present the same action in multiple different spaces to the same policy -- allowing it to learn inductive patterns from each space...
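To give a flavor of the "concrete" approach to probability mentioned in the list above, here is a minimal sketch that treats probability as counting over a finite sample space of equally likely outcomes. It is an illustration written for this digest, not code taken from the linked notebook; the function name `prob` and the dice example are assumptions made here.

```python
from fractions import Fraction
from itertools import product


def prob(event, space):
    """Probability of `event` within a finite `space` of equally likely outcomes.

    `prob` is a hypothetical helper for this sketch: it counts the outcomes
    of `space` that are also in `event` and returns an exact fraction.
    """
    return Fraction(len(event & space), len(space))


# Sample space: every ordered roll of two six-sided dice (36 outcomes).
two_dice = set(product(range(1, 7), repeat=2))

# Event: the two dice sum to 7.
sum_is_seven = {roll for roll in two_dice if sum(roll) == 7}

print(prob(sum_is_seven, two_dice))  # 1/6
```

Using sets and `Fraction` keeps the result exact, at the cost of enumerating the whole sample space, which is why this style only scales to spaces that fit in memory.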
 
 

Summit*

 



You're invited to the first-ever Metrics Store Summit

Transform is hosting the first-ever industry summit on the metrics layer. The first-ever Metrics Store Summit on April 26, 2022 will bring discussions around the semantic layer into one event—providing context with use cases for metrics stores, highlighting applications for metrics, and sharing ideas from leaders across the modern data stack. You can expect to hear from Airbnb, Slack, Spotify, Atlan, Hex, Mode, Hightouch, AtScale and many more in this action-packed 1-day event. We would love to see you there! Register today for free.



*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Jobs

 
  • Lead Data Engineer - electricityMap - Copenhagen

    The electricityMap team is hiring a data engineer to help us build and maintain a scalable data pipeline and database that forms the foundation of our mission to accelerate the energy system to a zero-carbon future.

    Our mission is to organise the world’s electricity data to drive tangible reductions in carbon emissions. electricityMap started as a popular open-source project 5 years ago and is now used every day by citizens, companies, universities, NGOs, and policy makers around the world to understand and reduce the climate impact of electricity.

    You will be joining a fun, international and inclusive team in our mission to tackle climate change – while simultaneously building your professional experience in the rapidly growing industry of climate tech...

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 
 

Training & Resources

 
  • Hands-On Reinforcement Learning Course (Part 1 of 6)
    Let’s walk this beautiful path from the fundamentals to cutting edge reinforcement learning (RL), step-by-step, with coding examples and tutorials in Python, together!...This first part covers the bare minimum concept and theory you need to embark on this journey. Then, in each following chapter, we will solve a different problem, with increasing difficulty...
  • Introduction to variational autoencoders
    Say we want to fit a model to some data. In mathematical terms, we want to find a distribution p that maximizes the probability of observed data x ∈ X. In this case, we will make the assumption that there is some latent (unobserved) variable z that affects the production of x behind the scenes...This document explains methods and challenges for training latent variable models. Our end goal is to derive the variational autoencoder (VAE) framework with justifications for each step along the way...
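For readers new to latent variable models, the sketch below illustrates the quantity that variational methods such as the VAE optimize: a lower bound on log p(x), often called the evidence lower bound (ELBO). This is a toy with a two-valued latent z and made-up numbers, not the linked article's derivation; a real VAE uses continuous latents and neural networks. The names `prior`, `likelihood`, `elbo` and the probabilities are all assumptions for illustration.

```python
import math

# Toy model (all numbers are illustrative assumptions):
prior = {0: 0.5, 1: 0.5}        # p(z) for a latent z in {0, 1}
likelihood = {0: 0.9, 1: 0.2}   # p(x | z) for one fixed observed x


def log_evidence():
    """log p(x), obtained by summing the joint p(x, z) over z."""
    return math.log(sum(prior[z] * likelihood[z] for z in prior))


def elbo(q):
    """Lower bound on log p(x): sum over z of q(z) * (log p(x, z) - log q(z))."""
    return sum(
        q[z] * (math.log(prior[z] * likelihood[z]) - math.log(q[z]))
        for z in q
        if q[z] > 0
    )


def posterior():
    """Exact posterior p(z | x), available here only because z is tiny."""
    evidence = sum(prior[z] * likelihood[z] for z in prior)
    return {z: prior[z] * likelihood[z] / evidence for z in prior}


uniform_q = {0: 0.5, 1: 0.5}
print(elbo(uniform_q) <= log_evidence())                # bound holds for any q
print(math.isclose(elbo(posterior()), log_evidence()))  # tight at the posterior
```

The bound holds for any choice of q and becomes tight when q equals the true posterior, which is why training pushes an approximate posterior toward the exact one.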
 
 

Books

 

 
  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits


    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2021 DataScienceWeekly.org, All rights reserved.
