Data Science Weekly - Issue 465

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #465

October 20 2022

Editor's Picks

 

  • An AI Might Have Written This
    Every has been building Lex, a word processor with AI baked in. I started working on this piece before we launched Lex, but testing out this tool (among others) has shaped my perspective on the role of AI writing assistants for creatives...Indie fiction writers are using AI assistants to write their novels faster, and a New York Times best-selling author, April Henry, is using AI to help generate story ideas...
  • How Transformers Seem to Mimic Parts of the Brain
    For years, neuroscientists have harnessed many types of neural networks to model the firing of neurons in the brain. In recent work, researchers have shown that the hippocampus, a structure of the brain critical to memory, is basically a special kind of neural net, known as a transformer, in disguise. Their new model tracks spatial information in a way that parallels the inner workings of the brain. They’ve seen remarkable success...
  • State of AI Report 2022
    Now in its fifth year, the State of AI Report 2022 is reviewed by leading AI practitioners in industry and research. It considers the following key dimensions, including a new Safety section: a) Research: Technology breakthroughs and their capabilities, b) Industry: Areas of commercial application for AI and its business impact, c) Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI, d) Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us, e) Predictions: What we believe will happen and a performance review to keep us honest...Key themes in the 2022 Report include...
 
 

A Message from this week's Sponsor:

 



Out now: new semantic layer whitepapers

Check out this bundle of Semantic Layer whitepapers by best selling authors - download here.

You'll learn the key value propositions to implement a semantic layer and best practices for analytics success with one.

 

 

Data Science Articles & Videos

 
  • Building Transformers from Neurons and Astrocytes
    Glial cells account for roughly 90% of all human brain cells, and serve a variety of important developmental, structural, and metabolic functions. Recent experimental efforts suggest that astrocytes, a type of glial cell, are also directly involved in core cognitive processes such as learning and memory. While it is well-established that astrocytes and neurons are connected to one another in feedback loops across many time scales and spatial scales, there is a gap in understanding the computational role of neuron-astrocyte interactions. To help bridge this gap, we draw on recent advances in artificial intelligence (AI) and astrocyte imaging technology. In particular, we show that neuron-astrocyte networks can naturally perform the core computation of a Transformer...
  • How We Enabled Dev and Data Science Independence With Clear API Boundaries Using Airflow and Databricks
    Your dev team needs to use a data science algorithm to solve a real business problem, but how can you use this algorithm? Usually, data scientists write in R, Python, or Scala (Spark), and these do not expose a microservice you can consume using a clear API like any other service. So, you will often need someone (Dev/ML Platform) to wrap a data science artifact and expose it for consumption...In this post, I will show you how we enabled our data science team to expose their artifacts with a clear API, allowing them to take full ownership of the process from deployment to production...
  • Are you Data Scientists or Software Developers?!
    In my recent talk ‘Really Useful Engines’ I rabbited on about how effective data science teams must necessarily engineer a domain specific capability layer of software functions or packages that become a force multiplier. The simple becomes trivial, and the hard becomes tractable. This makes headroom for the development of more capabilities still. A virtuous cycle. It’s either that or get snared in a quagmire of copy-pasta code tech debt...This post is about what that looks like, and how it can be made better with good data science tooling (or not)...
  • Minimax Estimation and Identity Testing of Markov Chains
    We briefly review the two classical problems of distribution estimation and identity testing (in the context of property testing), then propose to extend them to a Markovian setting. We will see that the sample complexity depends not only on the number of states, but also on the stationary and mixing properties of the chains...
  • Exploring the Frontiers in Earth System Modeling with Machine Learning
    Over the last decade, the volume of data from satellite sensors and Earth system models has increased by at least an order of magnitude...This workshop brings a small but varied group of geoscientists and climate modelers together with machine learners, statisticians, and representatives of other fields where ML has already had a big impact. Discussions will center on how innovative and efficient ML methods will provide new, innovative and transformative ways of modeling and projecting the Earth system and extracting information from massive data volumes...[Videos and PDFs from presentations available]...
  • John Schulman on TalkRL: The Reinforcement Learning Podcast
    John Schulman, OpenAI cofounder and researcher and inventor of PPO/TRPO, talks RL from human feedback, tuning GPT-3 to follow instructions (InstructGPT) and answer long-form questions using the internet (WebGPT), AI alignment, AGI timelines, and more!...
  • Memorizing facts about systems I work with [Twitter Thread]
    I've found it unexpectedly useful to memorize facts about systems I work with...Knowing these numbers allows one to 1. sanity check performance, 2. sketch out feasibility of technical solutions, and 3. reason about performance characteristics...Some examples below...
  • Bayesian Structural Timeseries - Forecasting
    We want to show how Bayesian structural time series with autoregressive processes can be modeled in PyMC and used to predict future unobserved data. How these kinds of models can flexibly incorporate structural assumptions and project future outcomes is only sparsely covered in the PyMC documentation. Hopefully recording the full modeling and prediction loop here is useful for you...
  • How undesired goals can arise with correct rewards
    Exploring examples of goal misgeneralisation – where an AI system's capabilities generalise but its goal doesn't...we explore a more subtle mechanism by which AI systems may unintentionally learn to pursue undesired goals: goal misgeneralisation (GMG)...
  • Obtaining genetics insights from deep learning via explainable artificial intelligence
    AI models based on deep learning now represent the state of the art for making functional predictions in genomics research. However, the underlying basis on which predictive models make such predictions is often unknown. For genomics researchers, this missing explanatory information would frequently be of greater value than the predictions themselves, as it can enable new insights into genetic processes. We review progress in the emerging area of explainable AI (xAI), a field with the potential to empower life science researchers to gain mechanistic insights into complex deep learning models...
  • General-Purpose Pre-Trained Models in Robotics
    The impressive generalization capabilities of large neural network models hinge on the ability to integrate enormous quantities of training data. This presents a major challenge for most downstream tasks where data is scarce...A central benefit of robotic learning should be in enabling rapid and autonomous acquisition of new tasks on command, but if each task requires either a large human-provided demonstration dataset or a long reinforcement learning training run, this benefit will be hard to realize. So how can we develop models and datasets that make it possible to pre-train for a broad range of downstream robotic skills?...
 
 

Tool*

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.



*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Conference*

 



Global AI Developer Days – 26 October 2022

Join the Global AI Community for a day of inspiring keynotes from industry leaders with a strong focus on AI developers.

Highlights of this 3-hour conference include a session on responsible AI by Ruth Yakubu, Principal Cloud Advocate at Microsoft, who will talk about how to improve the fairness and reliability of AI solutions. Eric Boyd, Corporate Vice President at Microsoft, will show all the latest inventions in Azure AI. Manuvir Das, Head of Enterprise Computing at NVIDIA, takes you on a journey through the new era of AI for developers, and many more leaders from the AI community will share their vision.

Don’t miss out on this free day of learning from top leaders in the AI space!

https://devdays.globalai.community


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 

 

Jobs

 
  • Data Scientist - Mount Sinai Data Commons - NYC

    A position is available for an individual with skills in data science, bioinformatics and software engineering to play the key role in running and managing the Mount Sinai Data Commons – known as the Data Ark. The Data Ark team brings together all the most important data sets used by Sinai researchers (e.g. 1000G, GTEx, UK Biobank) in a single location on our HPC server (minvera.org), performs QA/QC processing of the data, conducts initial demographics analyses to showcase the different data sets, and will be tasked with expanding the data commons to host a large range of different data sets of different types (genotype, WES, WGS, RNA-seq, EHR-linked, imaging etc.), which will come with their own computational and platform challenges...
     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • Napkin Math - Techniques and numbers for estimating system's performance
    The goal of this project is to collect software, numbers, and techniques to quickly estimate the expected performance of systems from first-principles. For example, how quickly can you read 1 GB of memory? By composing these resources you should be able to answer interesting questions like: how much storage cost should you expect to pay for logging for an application with 100,000 RPS?...
  • Tutorial on Uncertainty Estimation for Natural Language Processing
    This tutorial is intended for both academic researchers and industry practitioners alike, and provides a comprehensive introduction to uncertainty estimation for NLP problems---from fundamentals in probability calibration, Bayesian inference, and confidence set (or interval) construction, to applied topics in modern out-of-distribution detection and selective inference...
  • Setting up R in Visual Studio Code
    This post will show you how to set up Visual Studio Code as an integrated development environment for the statistical language R. This will include some useful features such as: a) plots that appear within a VS Code panel, b) a language server with autocomplete, c) syntax highlighting of R code in console and scripts, d) interactive window development...Of course, RStudio has all of these features for R too. However, Visual Studio Code does a lot more than just R, and has tons of cutting edge integrated development environment features that we’d like to make use of...
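
    As a rough illustration of the kind of estimate the Napkin Math item above describes, the logging question can be sketched in a few lines of Python. Only the 100,000 RPS figure comes from the blurb. The log line size (200 bytes), the retention period (30 days), the storage price ($0.02 per GB-month), and both function names are assumptions made up for this sketch, not numbers or code from the project.

```python
SECONDS_PER_DAY = 86_400


def daily_log_bytes(rps, bytes_per_line, lines_per_request=1):
    """Bytes of log data written per day at a steady request rate."""
    return rps * lines_per_request * bytes_per_line * SECONDS_PER_DAY


def monthly_storage_cost(rps, bytes_per_line, retention_days, usd_per_gb_month):
    """Steady-state monthly cost of keeping `retention_days` of logs."""
    stored_gb = daily_log_bytes(rps, bytes_per_line) * retention_days / 1e9
    return stored_gb * usd_per_gb_month


if __name__ == "__main__":
    # Assumed inputs: 200-byte lines, 30-day retention, $0.02 per GB-month.
    per_day = daily_log_bytes(100_000, 200)
    cost = monthly_storage_cost(100_000, 200, 30, 0.02)
    print(f"{per_day / 1e12:.3f} TB of logs per day")
    print(f"about ${cost:,.0f} per month to keep 30 days")
```

    Under those assumed inputs the estimate comes to roughly 1.7 TB of logs per day and on the order of $1,000 per month; changing any assumption scales the result linearly.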
 
 

What you’re up to – notes from DSW readers

 
  • Fill out the form below to appear here :) ...
 

* To share your projects and updates, share the details here.

** Want to chat with one of the above people? Hit reply and let us know :)

 


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
