Data Science Weekly - Issue 465

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #465

October 20 2022

Editor's Picks

 

  • An AI Might Have Written This
    Every has been building Lex, a word processor with AI baked in. I started working on this piece before we launched Lex, but testing out this tool (among others) has shaped my perspective on the role of AI writing assistants for creatives...Indie fiction writers are using AI assistants to write their novels faster, and a New York Times best-selling author, April Henry, is using AI to help generate story ideas...
  • How Transformers Seem to Mimic Parts of the Brain
    For years, neuroscientists have harnessed many types of neural networks to model the firing of neurons in the brain. In recent work, researchers have shown that the hippocampus, a structure of the brain critical to memory, is basically a special kind of neural net, known as a transformer, in disguise. Their new model tracks spatial information in a way that parallels the inner workings of the brain. They’ve seen remarkable success...
  • State of AI Report 2022
    Now in its fifth year, the State of AI Report 2022 is reviewed by leading AI practitioners in industry and research. It considers the following key dimensions, including a new Safety section: a) Research: Technology breakthroughs and their capabilities, b) Industry: Areas of commercial application for AI and its business impact, c) Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI, d) Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us, e) Predictions: What we believe will happen and a performance review to keep us honest...Key themes in the 2022 Report include...
 
 

A Message from this week's Sponsor:

 



Out now: new semantic layer whitepapers

Check out this bundle of Semantic Layer whitepapers by best-selling authors - download here.

You'll learn the key value propositions for implementing a semantic layer and the best practices for analytics success with one.

 

 

Data Science Articles & Videos

 
  • Building Transformers from Neurons and Astrocytes
    Glial cells account for roughly 90% of all human brain cells, and serve a variety of important developmental, structural, and metabolic functions. Recent experimental efforts suggest that astrocytes, a type of glial cell, are also directly involved in core cognitive processes such as learning and memory. While it is well-established that astrocytes and neurons are connected to one another in feedback loops across many time scales and spatial scales, there is a gap in understanding the computational role of neuron-astrocyte interactions. To help bridge this gap, we draw on recent advances in artificial intelligence (AI) and astrocyte imaging technology. In particular, we show that neuron-astrocyte networks can naturally perform the core computation of a Transformer...
  • How We Enabled Dev and Data Science Independence With Clear API Boundaries Using Airflow and Databricks
    Your dev team needs to use a data science algorithm to solve a real business problem, but how can you use this algorithm? Usually, data scientists write in R, Python, or Scala (Spark), and these do not expose a microservice you can consume using a clear API like any other service. So, you will often need someone (Dev/ML Platform) to wrap a data science artifact and expose it for consumption...In this post, I will show you how we enabled our data science team to expose their artifacts with a clear API, allowing them to take full ownership of the process from deployment to production...
  • Are you Data Scientists or Software Developers?!
    In my recent talk ‘Really Useful Engines’ I rabbited on about how effective data science teams must necessarily engineer a domain specific capability layer of software functions or packages that become a force multiplier. The simple becomes trivial, and the hard becomes tractable. This makes headroom for the development of more capabilities still. A virtuous cycle. It’s either that or get snared in a quagmire of copy-pasta code tech debt...This post is about what that looks like, and how it can be made better with good data science tooling (or not)...
  • Minimax Estimation and Identity Testing of Markov Chains
    We briefly review the two classical problems of distribution estimation and identity testing (in the context of property testing), then propose to extend them to a Markovian setting. We will see that the sample complexity depends not only on the number of states, but also on the stationary and mixing properties of the chains...
  • Exploring the Frontiers in Earth System Modeling with Machine Learning
    Over the last decade, the volume of data from satellite sensors and Earth system models has increased by at least an order of magnitude...This workshop brings a small but varied group of geoscientists and climate modelers together with machine learners, statisticians, and representatives of other fields where ML has already had a big impact. Discussions will center on how innovative and efficient ML methods will provide new and transformative ways of modeling and projecting the Earth system and extracting information from massive data volumes...[Videos and PDFs from presentations available]...
  • John Schulman on TalkRL: The Reinforcement Learning Podcast
    John Schulman, OpenAI cofounder and researcher, inventor of PPO/TRPO, talks RL from human feedback, tuning GPT-3 to follow instructions (InstructGPT) and answer long-form questions using the internet (WebGPT), AI alignment, AGI timelines, and more!...
  • Memorizing facts about systems I work with [Twitter Thread]
    I've found it unexpectedly useful to memorize facts about systems I work with...Knowing these numbers allows one to 1. sanity check performance, 2. sketch out feasibility of technical solutions, and 3. reason about performance characteristics...Some examples below...
  • Bayesian Structural Timeseries - Forecasting
    We want to show how Bayesian structural time series with autoregressive processes can be modeled in PyMC and used to predict future unobserved data. How these kinds of models can flexibly incorporate structural assumptions and project future outcomes is only sparsely covered in the PyMC documentation. Hopefully recording the full modeling and prediction loop here is useful for you...
  • How undesired goals can arise with correct rewards
    Exploring examples of goal misgeneralisation – where an AI system's capabilities generalise but its goal doesn't...we explore a more subtle mechanism by which AI systems may unintentionally learn to pursue undesired goals: goal misgeneralisation (GMG)...
  • Obtaining genetics insights from deep learning via explainable artificial intelligence
    AI models based on deep learning now represent the state of the art for making functional predictions in genomics research. However, the underlying basis on which predictive models make such predictions is often unknown. For genomics researchers, this missing explanatory information would frequently be of greater value than the predictions themselves, as it can enable new insights into genetic processes. We review progress in the emerging area of explainable AI (xAI), a field with the potential to empower life science researchers to gain mechanistic insights into complex deep learning models...
  • General-Purpose Pre-Trained Models in Robotics
    The impressive generalization capabilities of large neural network models hinge on the ability to integrate enormous quantities of training data. This presents a major challenge for most downstream tasks where data is scarce...A central benefit of robotic learning should be in enabling rapid and autonomous acquisition of new tasks on command, but if each task requires either a large human-provided demonstration dataset or a long reinforcement learning training run, this benefit will be hard to realize. So how can we develop models and datasets that make it possible to pre-train for a broad range of downstream robotic skills?...
 
 

Tool*

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.



*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 
 

Conference*

 



Global AI Developer Days – 26 October 2022

Join the Global AI Community for a day of inspiring keynotes from industry leaders with a strong focus on AI developers.

Highlights of this 3-hour conference include a session on responsible AI by Ruth Yakubu, Principal Cloud Advocate at Microsoft, who will talk about how to improve the fairness and reliability of AI solutions. Eric Boyd, Corporate Vice President at Microsoft, will show all the latest inventions in Azure AI. Manuvir Das, Head of Enterprise Computing at NVIDIA, takes you on a journey through the new era of AI for developers, and many more leaders from the AI community will share their vision.

Don’t miss out on this free day of learning from top leaders in the AI space!

https://devdays.globalai.community


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 

 

Jobs

 
  • Data Scientist - Mount Sinai Data Commons - NYC

    A position is available for an individual with skills in data science, bioinformatics and software engineering to play the key role in running and managing the Mount Sinai Data Commons – known as the Data Ark. The Data Ark team brings together all the most important data sets used by Sinai researchers (e.g. 1000G, GTEx, UK Biobank) in a single location on our HPC server (minvera.org), performs QA/QC processing of the data, conducts initial demographics analyses to showcase the different data sets, and will be tasked with expanding the data commons to host a large range of different data sets of different types (genotype, WES, WGS, RNA-seq, EHR-linked, imaging etc.), which will come with their own computational and platform challenges...
     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • Napkin Math - Techniques and numbers for estimating system's performance
    The goal of this project is to collect software, numbers, and techniques to quickly estimate the expected performance of systems from first-principles. For example, how quickly can you read 1 GB of memory? By composing these resources you should be able to answer interesting questions like: how much storage cost should you expect to pay for logging for an application with 100,000 RPS?...
  • Tutorial on Uncertainty Estimation for Natural Language Processing
    This tutorial is intended for academic researchers and industry practitioners alike, and provides a comprehensive introduction to uncertainty estimation for NLP problems, from fundamentals in probability calibration, Bayesian inference, and confidence set (or interval) construction, to applied topics in modern out-of-distribution detection and selective inference...
  • Setting up R in Visual Studio Code
    This post will show you how to set up Visual Studio Code as an integrated development environment for the statistical language R. This will include some useful features such as: a) plots that appear within a VS Code panel, b) a language server with autocomplete, c) syntax highlighting of R code in console and scripts, d) interactive window development...Of course, RStudio has all of these features for R too. However, Visual Studio Code does a lot more than just R, and has tons of cutting edge integrated development environment features that we’d like to make use of...
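
As a rough illustration of the kind of estimate the Napkin Math item above describes, here is a minimal sketch of the logging-cost question. Only the 100,000 RPS figure comes from the blurb; the log size per request, the retention window, and the storage price are hypothetical placeholders, so treat the result as an order-of-magnitude guide rather than a real quote.

```python
# Napkin-math sketch: storage cost of request logs.
# Every constant below except the 100,000 RPS figure is a hypothetical assumption.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_log_volume_gb(requests_per_second: float, bytes_per_request: float) -> float:
    """Log volume written per day, in GB (1 GB = 10**9 bytes)."""
    return requests_per_second * bytes_per_request * SECONDS_PER_DAY / 1e9


def monthly_storage_cost(daily_gb: float, retention_days: int, price_per_gb_month: float) -> float:
    """Steady-state monthly cost of retaining `retention_days` of logs."""
    return daily_gb * retention_days * price_per_gb_month


RPS = 100_000               # from the Napkin Math question
BYTES_PER_REQUEST = 1_000   # assumed: about 1 KB of log text per request
RETENTION_DAYS = 30         # assumed retention window
PRICE_PER_GB_MONTH = 0.02   # assumed storage price in USD, not a quoted rate

daily_gb = daily_log_volume_gb(RPS, BYTES_PER_REQUEST)
cost = monthly_storage_cost(daily_gb, RETENTION_DAYS, PRICE_PER_GB_MONTH)
print(f"{daily_gb:,.0f} GB/day, about ${cost:,.0f}/month")
```

Under these assumptions the logs come to about 8,640 GB per day and roughly $5,184 per month. Because the estimate is a plain product, halving the log size or the retention window halves the cost.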
 
 

What you’re up to – notes from DSW readers

 
  • Fill out the form below to appear here :) ...
 

* To share your projects and updates, share the details here.

** Want to chat with one of the above people? Hit reply and let us know :)

 


P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)
All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
