Data Science Weekly - Issue 454

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments.

Issue #454

August 04 2022

Editor's Picks

 

 
  • Spotting Topographic Changes from 250 Miles Above
    The Earth Science and Remote Sensing Unit at NASA’s Johnson Space Center in Houston, Texas, is using machine learning to sort through and identify photos taken from the orbital laboratory, making them more searchable and useful to scientists. The Gateway to Astronaut Photography of Earth service contains nearly 4 million astronaut-captured images. Using AI, researchers have categorized over 2 million photos of Earth’s geographic features, 250,000 images of auroras, 37,000 lightning photos, and 18,000 images of 64 different cities around the world...
  • Do data-driven companies actually win?
    Imagine, if you can, that you're a venture capitalist...Amid the various pitches that land in your inbox, an odd coincidence arrives: Five nearly identical companies reach out to you at once. They're all launching a new clothing line for working from home...They aren’t exactly the same though: different types of experts run each company...the first company has been working in fashion for decades...the second one believes in moving fast and making things...the third company is run by a thirty-year-old wonder kid...the fourth company emphasizes operational excellence...and the fifth company believes data will be their competitive edge...Who do you invest in?...
 
 

A Message from this week's Sponsor:

 



Free Access to the Semantic Layer Summit with Bill Inmon, Kirk Borne, and 30+ Enterprise Data Leaders

You're invited to a free one-day virtual event. Explore the importance and impact of using a semantic layer for analytics with an all-star lineup of data leaders from Cigna, Starbucks, Bank of America, and more. Lots to look forward to!

 

 

Data Science Articles & Videos

 
  • Data Center Heatmap
    At Automattic, our systems team manages over 10,000 physical servers located across 30 data centers on 6 continents...Normal data center operating temperatures tend to be between 20C-25C, but cooling failures are somewhat common (they even affect Google), so we have to monitor temperatures carefully...We are big fans of Prometheus and Grafana and for a few years we have had temperature graphs that look like this...
  • What is in your Data Stack? [Reddit Discussion]
    It would be really useful to get a sense of what data tools companies use to get an idea of what are the best options...To contribute, post info in the following format: 1) ETL, 2) Data Warehouse, 3) Data Transformation, 4) BI, 5) Exploratory Data Analysis, 6) Company Size (approx # employees) [optional], 7) Company Industry [optional], 8) Company HQ (city, country) [optional]...
  • Graph Inverse Reinforcement Learning from Diverse Videos
    To learn a reward function from diverse videos, we propose to perform graph abstraction on the videos followed by temporal matching in the graph space to measure the task progress. Our insight is that a task can be described by entity interactions that form a graph, and this graph abstraction can help remove irrelevant information such as textures, resulting in more robust reward functions. We evaluate our approach, GraphIRL, by learning from human demonstrations for real-robot manipulation and via cross-embodiment learning in X-MAGICAL. We show significant improvements in robustness to diverse video demonstrations over previous approaches, and even achieve better results than manual reward design on real robot tasks...
  • An astronomer's introduction to NumPyro
    In this post I’ll focus primarily on providing an introduction to NumPyro, which is a probabilistic programming library that provides an interface for defining probabilistic models and running inference algorithms. At this point, NumPyro is probably the most mature JAX-based probabilistic programming library, and its documentation page has a lot of examples, but I’ve found that these docs are not that user-friendly for my collaborators, so I wanted to provide a different perspective. In the following sections, I’ll present two examples...
  • Escaping Poverty, Benchmarking ML Systems, and Advancing Data-Centric AI with Cody Coleman
    The 97th episode of Datacast is my conversation with Cody Coleman, the Founder and CEO of Coactive AI...Our wide-ranging conversation touches on his remarkable childhood growing up in poverty and finding a few people who have made big differences in his story; his academic experience at MIT studying EE & CS; his industry experience interning at Google and working at JUMP Trading; his Ph.D. work on data selection for deep learning at Stanford; his current journey with Coactive AI; key developments for the Data-Centric AI community; similarities between being a researcher and a founder; and much more...
  • Professional ML engineers: How much of your day to day job involves math and proofs? [Reddit Discussion]
    If you are a professional ML engineer (not data engineer) how much of your day to day work involves doing math and proofs? I can 'do' linear algebra and statistics but I am not sure if doing math and writing proofs on a daily basis would be my cup of tea...EDIT: The reason I asked is because the MS program I am considering requires proofs to pass the ML related classes. I can do that for a couple of classes but not every day...
  • Data Viz Today Podcast - Episode 76: Creativity Mini-Series with Andy Kirk
    Welcome to episode 76 of Data Viz Today. We’re exploring creativity in information design from the perspective of amazingly creative people in the field! If it’s not a magical process, then what is it? Let’s hear how Andy Kirk approaches creativity. We dive into how he defines creativity, what his routines are, what kills his creativity, how he presents creative ideas to clients, where he finds inspiration…and more!...
  • PETs Prize Challenge: Advancing Privacy-Preserving Federated Learning
    Privacy-enhancing technologies (PETs) have the potential to unlock more trustworthy innovation in data analysis and machine learning. Federated learning is one such technology that enables organizations to analyze sensitive data while providing improved privacy protections...That’s why the U.S. and U.K. governments are partnering to deliver a set of prize challenges to unleash the potential of these democracy-affirming technologies to make a positive impact. In particular, this challenge will tackle two critical problems via separate data tracks: Data Track A will help with the identification of financial crime, while Data Track B will bolster pandemic responses...
  • Explaining Complex Models in Production: SHAP Walkthrough
    Understanding and explaining more complex models that we want to use in production is not only critical for legal and ethical reasons but also makes solid business sense before we hand off important decisions to automated systems. For these more complex models, we have to try some indirect approaches. Two of the more common and useful approaches are Shapley Values, which are a way of estimating a particular feature's effect on a specific prediction, and Partial Dependence and Individual Conditional Expectation plots, which are used to visualize the interaction between the features and the prediction values...
 
 

Tool*

 


Data Maturity Assessment

You already know that data is one of an organization’s most valuable assets. But is your organization harnessing the full power of its data? Take Pragmatic Institute’s complimentary Data Maturity Assessment to discover where your organization falls on the data maturity continuum and start building a data-driven culture.

The Data Maturity Assessment is a powerful tool for organizations that want to boost data literacy, democratize data, and leverage data in everyday decision making.



*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 

 

Jobs

 
  • Data Scientist - Success Academy Charter Schools, Inc - NYC

    This new Data Scientist role will be a key contributor to our mission of driving innovation across the organization. Reporting to the Leader of Enterprise Analytics, this role will be responsible for working with stakeholders in various functions to understand areas of opportunity, developing analytical solutions ranging from dashboards to sophisticated mathematical models, and helping functional teams adopt those solutions. This role will be part of a highly collaborative team of professionals with a wide range of skills including data science, data engineering, business analysis, and project management...
     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • Maths for Machine Learning Map
    This map presents a mathematically heavy approach, building from the ground up to give a deep understanding of the field. This deep understanding is required to go into Machine Learning research and is also valuable background knowledge if deploying models in production as a Machine Learning Engineer or improving their runtime efficiency as a Software Engineer...
  • New Book: Understanding Deep Learning
    I've been writing a new textbook. It's entitled "Understanding Deep Learning" and will be published by MIT Press... A partial draft is now available...
 
 

What you’re up to – notes from DSW readers

 
  • Working on something cool? Let us know here :) ...
 

* To share your projects and updates, share the details here.

** Want to chat with one of the above people? Hit reply and let us know :)

 


 

P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)

All the best,
Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
