Editor's Picks
- New (1h57m) video lecture from Andrej Karpathy
"The spelled-out intro to language modeling: building makemore"...We build a neural net bigram language model (working up to transformers). Micrograd was fun, now things complexify: tensors, broadcasting, training, sampling...
- Data Activation In The Modern Data Stack
The activation layer of the modern data stack is my favorite since it allows you to take action on the data — in the tools you depend on — to build personalized, data-powered experiences...You finally get to go beyond looking at dashboards and utilize data in a meaningful manner, and in the process, do more impactful work...With so many companies innovating and building products to activate data, it’s not straightforward to ascertain which of the processes, tools, and technologies should fall under data activation...After talking to many founders and giving it a lot of thought, here’s what I recommend the activation layer should comprise......
A Message from this week's Sponsor:
Pinecone vector database
The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.
Use Pinecone to build semantic search, object recognition, recommendations, anomaly detection, and other vector-based functionality into your applications.
Data Science Articles & Videos
- Applied NLP Research at Primer
John Bohannon is a Senior Director of Data Science and Head of Research at Primer AI, an end-to-end machine intelligence solution for textual data. We discussed their process of translating ML research into ML products, through the lens of the following examples: a) Zero shot entity recognition, b) Inference triage, c) Tools for detecting synthetic text, d) Text Summarization, e) End-to-end platforms for NLP applications...
- Using Web Server Logs to Answer Product and Business Questions
This tutorial demonstrates how to set up a relatively lightweight data stack that will serve as a platform to answer questions from web server access logs about who is using your product and how they are using it. This data stack can run on any cloud, can scale with your business, and could potentially provide all the capabilities you require; this ain’t no toy data stack...
- MuJoCo Menagerie
Menagerie is a collection of high-quality models for the MuJoCo physics engine, curated by DeepMind...A physics simulator is only as good as the model it is simulating, and in a powerful simulator like MuJoCo with many modeling options, it is easy to create "bad" models which do not behave as expected. The goal of this collection is to provide the community with a curated library of well-designed models that work well right out of the gate...
- AudioLM: a Language Modeling Approach to Audio Generation
We introduce AudioLM, a framework for high-quality audio generation with long-term consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts audio generation as a language modeling task in this representation space. We show how existing audio tokenizers provide different trade-offs between reconstruction quality and long-term structure, and we propose a hybrid tokenization scheme to achieve both objectives. Namely, we leverage the discretized activations of a masked language model pre-trained on audio to capture long-term structure and the discrete codes produced by a neural audio codec to achieve high-quality synthesis...
- What songs were popular when I was in high school?
Why do algorithmic recommendations leave so much to be desired?...you start scrolling through your algorithmically created playlists. The ones Spotify makes just for you. But they don't feel like they are for you, but instead for someone who looks like you, the stereotypical you...like any data-oriented person, I decided to do entirely too much work to learn a few things. In the end, I did come up with the insight I was looking for...
- Why Momentum Really Works
We often think of Momentum as a means of dampening oscillations and speeding up the iterations, leading to faster convergence. But it has other interesting behavior. It allows a larger range of step-sizes to be used, and creates its own oscillations. What is going on?...
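The claim that momentum "allows a larger range of step-sizes" can be checked on a toy problem. The following is a minimal sketch, not code from the article: the quadratic, the step size, and the names `descend`, `CURVATURES`, and `STEP` are all assumptions chosen for illustration. On a quadratic, plain gradient descent needs step * curvature < 2, while momentum with coefficient beta tolerates step * curvature < 2 + 2 * beta.

```python
def descend(grad, w0, step, beta, iters):
    """Gradient descent with momentum; beta=0 gives plain gradient descent."""
    w = list(w0)
    z = [0.0] * len(w)  # running "velocity" that accumulates past gradients
    for _ in range(iters):
        g = grad(w)
        z = [beta * zi + gi for zi, gi in zip(z, g)]
        w = [wi - step * zi for wi, zi in zip(w, z)]
    return w

# Hypothetical test problem: f(w) = 0.5 * (1 * w[0]**2 + 100 * w[1]**2)
CURVATURES = [1.0, 100.0]

def grad(w):
    return [c * wi for c, wi in zip(CURVATURES, w)]

# Assumed step size: too large for plain descent on the steep axis,
# which would need step < 2 / 100 = 0.02.
STEP = 0.03
plain = descend(grad, [1.0, 1.0], STEP, beta=0.0, iters=300)
heavy = descend(grad, [1.0, 1.0], STEP, beta=0.9, iters=300)
print("plain:   ", plain)
print("momentum:", heavy)
```

With these assumed settings, the plain run blows up along the steep axis while the momentum run shrinks toward the minimum at the origin, oscillating on the way there.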
- A Review of Sparse Expert Models in Deep Learning
Sparse expert models are a thirty-year-old concept re-emerging as a popular architecture in deep learning. This class of architecture encompasses Mixture-of-Experts, Switch Transformers, Routing Networks, BASE layers, and others, all with the unifying idea that each example is acted on by a subset of the parameters...We review the concept of sparse expert models, provide a basic description of the common algorithms, contextualize the advances in the deep learning era, and conclude by highlighting areas for future work...
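The unifying idea, that each example is acted on by only a subset of the parameters, can be sketched with top-k routing. This is an illustrative toy, not an implementation from the paper: the experts are stand-in scalar functions, the router scores are made up rather than computed by a learned router, and renormalizing the gate weights over the selected experts is only one common variant.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, gate_weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    gates = softmax([router_logits[i] for i in top])
    return list(zip(top, gates))

def sparse_layer(x, experts, router_logits, k=2):
    """Run only the selected experts on x and mix their outputs by gate weight."""
    return sum(g * experts[i](x) for i, g in route_top_k(router_logits, k))

# Hypothetical experts: scalar functions standing in for sub-networks.
EXPERTS = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one input
chosen = route_top_k(logits, k=2)
y = sparse_layer(3.0, EXPERTS, logits, k=2)
print(chosen, y)
```

Only two of the four experts are evaluated for this input, which is where the compute savings of a sparse layer come from.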
- Stop Pickling your ML Models. Use ONNX instead!
When you pickle a model you are serializing a Python object so it can be stored in a file...In contrast, when you export a model to ONNX you are converting it to a set of operations that can be executed directly by the framework...What this means is that your model is no longer strongly coupled to your specific Python environment. In fact it’s no longer coupled with Python at all, because ONNX models are portable to many different languages...let’s now get into an example on how you can convert your models to both pickle and ONNX...
- A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification
This hands-on introduction aims to provide the reader with a working understanding of conformal prediction and related distribution-free uncertainty quantification techniques in one self-contained document. We lead the reader through practical theory for and examples of conformal prediction and describe its extensions to complex machine learning tasks involving structured outputs, distribution shift, time-series, outliers, models that abstain, and more...
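For a flavor of the technique, here is a minimal sketch of split conformal prediction for regression, under stated assumptions: the model `predict`, the calibration data, and the helper names are hypothetical and not taken from the paper. The idea is to score held-out calibration points by absolute error, take a finite-sample-corrected quantile of those scores, and use it as the half-width of every prediction interval.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def predict(x):
    """Hypothetical pre-trained model; any point predictor could be used here."""
    return 2.0 * x

# Made-up calibration set, held out from whatever trained the model.
cal_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
noise = [0.5, -0.3, 0.1, -0.8, 0.2, 0.9, -0.4, 0.6, -1.0, 0.7]
cal_y = [predict(x) + e for x, e in zip(cal_x, noise)]

# Conformity score: absolute residual on each calibration point.
scores = [abs(y - predict(x)) for x, y in zip(cal_x, cal_y)]
q_hat = conformal_quantile(scores, alpha=0.2)

def interval(x):
    """Prediction interval with target coverage of at least 1 - alpha."""
    center = predict(x)
    return (center - q_hat, center + q_hat)

print(q_hat, interval(4))
```

The coverage guarantee relies on the calibration and test points being exchangeable; it does not require the model itself to be accurate, only that the scores are computed the same way for both.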
- Analyzing Employee Attrition in Healthcare Data and Predicting Outcomes
Healthcare employers can use their proprietary data, much of which contains insightful signals on the causes of attrition and burnout. This is where data analytics and predictive modeling can be useful. For example, data analytics can help employers identify employees and departments at high risk of attrition. Further, it can help employers determine the factors that contribute to high attrition rates...
- Finding a picture in an image without marking it up?
We often see pictures in images: comics, for example, combine several pictures into one. And if you have an entertainment app where people post memes, like in our iFunny, you’re going to run into that all the time. Neural networks are already capable of finding animals, people, or other objects, but what if we need to find another image within the image? Let’s take a closer look at our algorithm so that you can test it with a notebook in Google Colaboratory and even implement it in your project...
Summit*
Register for IMPACT 2022: The Data Observability Summit
Join thousands of professionals for a virtual event October 25-26 to learn how to drive real-world impact with your data at scale.
Get inspired with virtual keynotes from Nate Silver, the FiveThirtyEight founder and editor-in-chief, and Daniel Kahneman, the Nobel Prize-winning psychologist, economist, and author of Thinking, Fast and Slow. Hear from the founders and chief executives of Databricks, Looker, Confluent, dbt Labs, and Fivetran about the industry's hottest technologies. Leverage best practices from leaders heading the industry’s top data organizations including The New York Times, Roche, and GitLab.
RSVP at impactdatasummit.com/2022
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
- Data Scientist - Success Academy Charter Schools, Inc - NYC
This new Data Scientist role will be a key contributor to our mission of driving innovation across the organization. Reporting to the Leader of Enterprise Analytics, this role will be responsible for working with stakeholders in various functions to understand areas of opportunity, developing analytical solutions ranging from dashboards to sophisticated mathematical models, and helping functional teams adopt those solutions. This role will be part of a highly collaborative team of professionals with a wide range of skills including data science, data engineering, business analysis, and project management....
Want to post a job here? Email us for details --> team@datascienceweekly.org
Training & Resources
- Deep Learning Paper Implementations
59 Implementations/tutorials of deep learning papers with side-by-side notes; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans (cyclegan, stylegan2, ...), reinforcement learning (ppo, dqn), capsnet, distillation, ...
- Cycle-GAN implemented in PyTorch
This repository contains an implementation of the Cycle-GAN architecture for style transfer along with instructions to train on your own dataset...
- 6.S965 • Fall 2022 • MIT: TinyML and Efficient Deep Learning
This course is a deep dive into efficient machine learning techniques that enable powerful deep learning applications on resource-constrained devices. Topics cover efficient inference techniques, including model compression, pruning, quantization, neural architecture search, and distillation; and efficient training techniques, including gradient compression and on-device transfer learning; followed by application-specific model optimization techniques for videos, point cloud, and NLP; and efficient quantum machine learning...
What you’re up to – notes from DSW readers
* To share your projects and updates, share the details here.
** Want to chat with one of the above people? Hit reply and let us know :)
Last Week's Newsletter's 3 Most Clicked Links
* Based on unique clicks.
** Find last week's newsletter here.
P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian