Editor's Picks
- AI and The Limits Of Language
The underlying problem isn’t the AI. The problem is the limited nature of language. Once we abandon old assumptions about the connection between thought and language, it is clear that these systems are doomed to a shallow understanding that will never approximate the full-bodied thinking we see in humans. In short, despite being among the most impressive AI systems on the planet, these AI systems will never be much like us...
- Hello Free Teams
The community is the heart of Observable: it’s how people learn and get inspired; it’s how people find examples, templates, and components to accelerate their work...Today, in support of a vibrant, open, and collaborative future of data, we are launching Free Teams. Now, everyone can collaborate with their data people, openly and easily in public, for free!...Create a Free Team in one click, give it a name, and invite as many people to the team as you want. Editors and viewers are free...Your team will have live, public notebooks. Guests can see your updates without needing to republish or refresh...You can organize your team’s notebooks in collections, use team templates, and transfer notebooks between your team members...
- Backpropagation ≠ Chain Rule
The backpropagation algorithm is usually taught as an application of the chain rule in machine learning classes. This leads to a common belief that “backpropagation is just applying the chain rule repeatedly”. While this is in a sense true, we wish to point out in this blog post that this belief is over-simplifying and can lead to incorrect implementations of the backpropagation algorithm...we want to focus on a simple and basic difference here...
A Message from this week's Sponsor:
Retool is the fast way to build an interface for any database
With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.
Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.
Data Science Articles & Videos
- Bayesian Age/Period/Cohort Models in Python with PyMC
For my day job, I spend a lot of time thinking about e-commerce analytics and cohort analysis in particular. Statistical age-period-cohort (APC) models are important in many fields such as epidemiology, demography, marketing, and many more. These models also pose some interesting inferential challenges for the unwary. This post shows how to use pymc to build Bayesian APC models in Python and presents a series of increasingly sophisticated systems of priors to resolve the inferential challenges these models pose...
- The alignment problem from a deep learning perspective [PDF]
As the field of machine learning advances, the alignment problem is becoming increasingly concrete. In a new report, I present the key high-level arguments for expecting AGI to be misaligned, grounded in technical details of deep learning...
- Reproducible Research and the Common Task Method
The ‘Reproducible Research’ idea posits that publishing data and code, not just statistical summaries, makes for better and faster science. In particular, shared datasets and shared evaluation metrics lower barriers to entry, and allow meaningful comparison of scientific hypotheses with engineering algorithms...In this lecture, Mark Liberman will describe the origins and development of the ‘Common Task’ method in DARPA’s human language technology program, its broader influence on recent research and development practices, and its lessons for the future...
- Stable Diffusion Public Release
Stable Diffusion is a text-to-image model that will empower billions of people to create stunning art within seconds. It is a breakthrough in speed and quality meaning that it can run on consumer GPUs. You can see some of the amazing output that has been created by this model without pre or post-processing on this page...
- Machine Learning for Time Series Intelligence
Aadyot Bhatnagar on building an end-to-end machine learning framework for time series data...Aadyot Bhatnagar is a Senior Research Engineer at Salesforce and co-creator of Merlion, an open source framework for applying machine learning on time series data. Merlion supports a wide range of time series learning tasks including forecasting, anomaly detection, and change point detection. Equally important, Merlion is an end-to-end framework that covers loading and transforming data, building and training models, post-processing model outputs, and evaluating model performance...
- Matching Patient Data with Machine Learning (Part 1: The Problem with Rules)
Tracking patient journeys is difficult in practice. The very same real-world person has different representations across different third-party systems. In order to build a picture of their journey over time, we need the ability to link all of them together. To be both useful and HIPAA-compliant, there should be no false negatives (missing links) and no false positives (spurious links)...In this first installment of a two-part series, we’ll examine this problem in more detail. We’ll also see why one solution, based on rules devised by experts, leaves much to be desired....
- The problem with data industry is hiring roles instead of people [Reddit Discussion]
Data Engineer, Database Architect, Data Scientist, Solution Architect, Data Specialist...Each one of these categories contains a wide variety of skillsets with a lot of overlap. Some companies call anyone who knows SQL a Data Engineer, and some companies call anyone who knows XGBoost a Data Scientist...I've done tons of staff aug for companies and I have noticed a similar pattern: they can't find talent that has a holistic view on data. The data engineers only know and care about data engineering, data scientists only care about their algorithms, etc. There's no collaboration, communication or understanding of the other sides of the shop and no one there to form the bridge...
- ipyvizzu-story - Animated Chart Presentation in Jupyter Notebook
ipyvizzu-story enables users to create interactive presentations within the data science notebook of their choice. The extension provides a widget that contains the presentation and adds controls for navigating between slides - predefined stages within the story being presented. Navigation also works with keyboard shortcuts - arrow keys, PgUp, PgDn, Home, End - and you can also use a clicker to switch between the slides...
- ML for Good [Reddit Discussion]
Does anyone have any experience using their ML skills in a non-profit manner on societal or environmental issues? How did you get started?...Recently I’ve been feeling that most advances in tech don’t really solve the problems we really need solved. Feeling a bit disillusioned...
Data Collaboration Tool*
Explore, analyze, and explain data. As a team.
Collaborate to uncover new insights and make better decisions. Visualize data to communicate clearly. Share findings with transparency and context. Get support and inspiration from the community.
Uncover new insights, answer more questions, and make better decisions.
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
- Data Scientist - Success Academy Charter Schools, Inc - NYC
This new Data Scientist role will be a key contributor to our mission of driving innovation across the organization. Reporting to the Leader of Enterprise Analytics, this role will be responsible for working with stakeholders in various functions to understand areas of opportunity, developing analytical solutions ranging from dashboards to sophisticated mathematical models, and helping functional teams adopt those solutions. This role will be part of a highly collaborative team of professionals with a wide range of skills including data science, data engineering, business analysis, and project management....
Want to post a job here? Email us for details --> team@datascienceweekly.org
Training & Resources
- Open Curriculai
Welcome to Open Curriculai, an opinionated, constantly evolving, organized curation of top resources in the form of a curriculum and a resource hub, for people whose goal is to become a data scientist. It is intended to be a complete education in data science using online materials and a holistic approach to learning...
- Tips on how to prepare for real world SQL [Reddit Discussion]
I can write basic queries and I do leetcode problems daily, however, since I've never used SQL in the wild I feel somewhat unsure about if I should advertise myself as if I know it or not...So my question is: is there anything I can do that would simulate what I would be using SQL for so I could accurately assess myself? I understand that I'm being vague but my point is there's a difference between doing housing price predictions and a real life ml problem, and I'm good there but I assume it's the same with SQL and I don't want to promise something that I may not be able to deliver on...
- Comprehensive Guide to Zero-Shot and K-Shot Learning
Deep neural networks have achieved state-of-the-art for many computer vision tasks. However, much of this performance improvement can be attributed to their utilisation and reliance on large amounts of supervised information for learning. There are many practical cases in which such training data is not available. Few-shot learning as an approach is tasked with dealing with such issues. Few-shot learning is a type of supervised learning that is intended to rapidly generalise to new tasks containing only a few samples of supervised information based on prior knowledge. At the extremes of this is one-shot learning, where a model is only given one reference per class at the inference stage before it has to find other instances in new images. The most extreme approach is Zero-Shot Learning...
What you’re up to – notes from DSW readers
- You and your work could be featured here :)
* To share your projects and updates, share the details here.
** Want to chat with one of the above people? Hit reply and let us know :)
Last Week's Newsletter's 3 Most Clicked Links
* Based on unique clicks.
** Find last week's newsletter here.
P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian