Data Science Weekly - Data Science Weekly - Issue 461

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments
Email not displaying correctly?
View it in your browser.

Issue #461

September 22 2022

Editor's Picks

 

  • Growing a Career in NLP with Primer’s Amy Heineike
    How does Amy apply her curious spirit to her work in NLP? She has been working in NLP for 7 years, on the team building Primer’s core code and applications. That means she has seen the models completely evolve. And one of the core challenges is still figuring out how to use data to draw interesting new conclusions. At this point, a lot of people realize NLP is out there, she says, “but it’s still quite hard to figure out how to make it useful.”...
  • Curating R-Ladies' Twitter Account - A Fun Ride!
    I had an incredible pleasure (and honor) to curate R-Ladies' Twitter account this week. To make it short: It’s been a blast and a fantastic experience that I can only recommend!...If you are interested, there are multiple posts that I all read beforehand about what it is like to be a curator...But let’s start from the beginning...
  • Productizing Large Language Models
    At Replit we have deployed transformer-based language models of all sizes: ~100m parameter models for search and spam, 1-10B models for a code autocomplete product we call GhostWriter, and 100B+ models for features that require a higher reasoning ability. In this post we'll talk about what we've learned about building and hosting large language models...
 
 

A Message from this week's Sponsor:

 



Data Maturity Assessment

You might be data fluent, but what about the rest of your organization? Partner with team members and business stakeholders to complete Pragmatic Institute’s complimentary Data Maturity Assessment so you can measure your organization’s overall data maturity.

By discovering where your organization falls in the data maturity continuum, you can start taking steps to leverage data more strategically.

Take Assessment.

 

 

Data Science Articles & Videos

 
  • Introducing Whisper
    We’ve trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition...
  • Brain Imaging Generation with Latent Diffusion Models
    Diffusion models recently have caught the attention of the computer vision community by producing photorealistic synthetic images. In this study, we explore using Latent Diffusion Models to generate synthetic images from high-resolution 3D brain images. We used T1w MRI images from the UK Biobank dataset (N=31,740) to train our models to learn about the probabilistic distribution of brain images, conditioned on covariables, such as age, sex, and brain structure volumes. We found that our models created realistic data, and we could use the conditioning variables to control the data generation effectively...
  • PyMC Labs: The Bayesian Newsletter, Sep 2022
    Previously this mailing list has been used for the Bayes course. Moving forward, we will continue to provide updates to the course on this mailing list. In addition, we will share announcements on Bayesian news, upcoming PyMC events, blogs, releases and more...
  • Generative AI: A Creative New World
    The fields that generative AI addresses—knowledge work and creative work—comprise billions of workers. Generative AI can make these workers at least 10% more efficient and/or creative: they become not only faster and more efficient, but more capable than before. Therefore, Generative AI has the potential to generate trillions of dollars of economic value...
  • Perspectives on knowledge acquisition & mobilization with neural net - Hugo Larochelle - CoLLAs 2022 [Video]
    In this talk, I’ll share my thoughts on the state of progress in designing AI systems with neural networks. I’ll frame a perspective that views our success as relying on two separate and equally critical steps, that I refer to as neural knowledge acquisition and neural knowledge mobilization. Then I’ll describe my own research journey from that point of view using various examples, discuss lessons learned and highlight what I think are the opportunities and challenges ahead...
  • How to build TRUST in Machine Learning, the sane way
    Building trust in machine learning is tough. Loss of trust is possibly the biggest risk that a business can ever face ☠. Unfortunately, people tend to discuss this topic in a very superficial and buzzwordy manner...In this post, I will present why it is difficult to build trust in machine learning projects. To gain the most business value from the model, we want stakeholders to trust it. We want to provide defensive mechanisms to avoid problems impacting stakeholders and to build developers’ trust in the product...
  • How to incorporate biological insights into network models and why it matters
    Here, we argue that building biologically realistic network models is crucial to establishing causal relationships between neurons, synapses, circuits, and behavior. More specifically, we advocate for network models that consider the connectivity structure and the recorded activity dynamics while evaluating task performance...
  • SQLite: Past, Present, and Future
    SQLite is the most widely deployed database engine (or likely even software of any type) in existence. It is found in nearly every smartphone (iOS and Android), computer, web browser, television, and automobile. There are likely over one trillion SQLite databases in active use...
 
 

Tool*

 


DataQA is a no-code tool for model error and quality analysis

Assessing the quality of a model is more than just looking at a few metrics; problems can often be hidden in biases or underperforming segments that are important to the business.

DataQA enables data science teams to accelerate their model QA with an intuitive no-code platform. With it, teams can quickly inspect model performance visually across different segments of the data. DataQA keeps non-technical domain experts involved in the process, replacing the need to send emails and spreadsheets.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

 

 

Jobs

 
  • Data Scientist - Success Academy Charter Schools, Inc - NYC

    This new Data Scientist role will be a key contributor to our mission of driving innovation across the organization. Reporting to the Leader of Enterprise Analytics, this role will be responsible for working with stakeholders in various functions to understand areas of opportunity, developing analytical solutions ranging from dashboards to sophisticated mathematical models, and helping functional teams adopt those solutions. This role will be part of a highly collaborative team of professionals with a wide range of skills including data science, data engineering, business analysis, and project management....
     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • Understanding the Snowflake Query Optimizer
    The job of a query optimizer is to reduce the cost of queries without changing what they do. Optimizers cleverly manipulate the underlying data pipelines of a query to eliminate work, pare down expensive operations, and optimally re-arrange tasks...there are three types of optimizations you need to know about: scan reduction, limiting the volume of data read, query rewriting, reorganizing a query to reduce cost, and join optimization, the NP-hard problem of optimally executing a join...In this post, I'll share a reference of most common optimizations you might expect to see when working with Snowflake...
  • The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning [Video]
    The concept of word embeddings is a central one in language processing (NLP). It's a method of representing words as numerically -- as lists of numbers that capture their meaning. Word2vec is an algorithm (a couple of algorithms, actually) of creating word vectors which helped popularize this concept. In this video, Jay take you in a guided tour of The Illustrated Word2Vec, an article explaining the method and how it came to be developed...
 
 

What you’re up to – notes from DSW readers

 
  • Fill out the form below to appear here :) ...
 

* To share your projects and updates, share the details here.

** Want to chat with one of the above people? Hit reply and let us know :)

 

Last Week's Newsletter's 3 Most Clicked Links

 

* Based on unique clicks.

** Find last week's newsletter here.



 

Cutting Room Floor

 


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Follow on Twitter
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
unsubscribe from this list    update subscription preferences 

Older messages

Data Science Weekly - Issue 460

Thursday, September 15, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #460 September 15 2022 Editor's

Data Science Weekly - Issue 459

Thursday, September 8, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #459 September 08 2022 Editor's

Data Science Weekly - Issue 458

Friday, September 2, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #458 September 01 2022 Editor's

Data Science Weekly - Issue 457

Friday, August 26, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #457 August 25 2022 Editor's Picks

Data Science Weekly - Issue 456

Friday, August 19, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #456 August 18 2022 Editor's Picks

You Might Also Like

Import AI 399: 1,000 samples to make a reasoning model; DeepSeek proliferation; Apple's self-driving car simulator

Friday, February 14, 2025

What came before the golem? ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏

Defining Your Paranoia Level: Navigating Change Without the Overkill

Friday, February 14, 2025

We've all been there: trying to learn something new, only to find our old habits holding us back. We discussed today how our gut feelings about solving problems can sometimes be our own worst enemy

5 ways AI can help with taxes 🪄

Friday, February 14, 2025

Remotely control an iPhone; 💸 50+ early Presidents' Day deals -- ZDNET ZDNET Tech Today - US February 10, 2025 5 ways AI can help you with your taxes (and what not to use it for) 5 ways AI can help

Recurring Automations + Secret Updates

Friday, February 14, 2025

Smarter automations, better templates, and hidden updates to explore 👀 ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏

The First Provable AI-Proof Game: Introducing Butterfly Wings 4

Friday, February 14, 2025

Top Tech Content sent at Noon! Boost Your Article on HackerNoon for $159.99! Read this email in your browser How are you, @newsletterest1? undefined The Market Today #01 Instagram (Meta) 714.52 -0.32%

GCP Newsletter #437

Friday, February 14, 2025

Welcome to issue #437 February 10th, 2025 News BigQuery Cloud Marketplace Official Blog Partners BigQuery datasets now available on Google Cloud Marketplace - Google Cloud Marketplace now offers

Charted | The 1%'s Share of U.S. Wealth Over Time (1989-2024) 💰

Friday, February 14, 2025

Discover how the share of US wealth held by the top 1% has evolved from 1989 to 2024 in this infographic. View Online | Subscribe | Download Our App Download our app to see thousands of new charts from

The Great Social Media Diaspora & Tapestry is here

Friday, February 14, 2025

Apple introduces new app called 'Apple Invites', The Iconfactory launches Tapestry, beyond the traditional portfolio, and more in this week's issue of Creativerly. Creativerly The Great

Daily Coding Problem: Problem #1689 [Medium]

Friday, February 14, 2025

Daily Coding Problem Good morning! Here's your coding interview problem for today. This problem was asked by Google. Given a linked list, sort it in O(n log n) time and constant space. For example,

📧 Stop Conflating CQRS and MediatR

Friday, February 14, 2025

​ Stop Conflating CQRS and MediatR Read on: m​y website / Read time: 4 minutes The .NET Weekly is brought to you by: Step right up to the Generative AI Use Cases Repository! See how MongoDB powers your