Editor Picks
- Literary AI
The Literary History of Artificial Intelligence...explores the long, shared history of literature and computation through the Columbia Library’s holdings...Following a timeline from circa 1890–1970, this exhibition explores professional manuals, devices, and techniques that promised to make writing easier—and even to automate it. The Literary History of AI showcases examples of algorithmic composition, such as prose and poetry written by machines, alongside literature written with the aid of algorithmic and combinatorial devices...
- 8 surprising ways to use Jupyter Notebooks
The Jupyter Notebook is a great tool for experimenting with code. It provides a REPL (read-eval-print loop) with a visual interface for plots, tables, and much more. You can mix Markdown with a programming language of your choice (usually Python). It is a default development and experimentation environment for data scientists and machine learning practitioners. Have you heard about other ways to use the Jupyter Notebook? Let’s explore 8 alternative ways to use Jupyter Notebook that might surprise you!...
- Watch The NBA Finals Players Data Jam
NBA data fans can view animated data cards of players who made or missed shots over the course of the game, along with their sentiment analysis and fouls by team. A bar chart race, "The flux of gameplay", shows each team's score, play by play, over the course of the game. There is also a look at successful shots on the court, shot by shot and by team, over the course of the game...
A Message from this week's Sponsor:
Retool is the fast way to build an interface for any database
With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.
Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.
Data Science Articles & Videos
- How to play with the GPT-3 language model
I ran a Twitter poll the other day asking if people had tried GPT-3 and why or why not. The winning option, by quite a long way, was “No, I don’t know how to”. So here’s how to try it out, for free, without needing to write any code...
- Putting a two-layered recommendation system into production
Recommendation systems will always stay relevant — users want to see personalized content, the best of the catalog (in the case of our iFunny app — trending memes and jokes). Our team is testing dozens of hypotheses on how a smart feed can improve user experience. This article will tell you how we implemented the second-ranking level of the model above the collaborative one: what difficulties we encountered, and how they affected the metrics...
- Resolving the Human Subjects Status of Machine Learning's Crowdworkers
In recent years, machine learning (ML) has come to rely more heavily on crowdworkers, both for building bigger datasets and for addressing research questions requiring human interaction or judgment...Additionally, few ML papers involving crowdwork mention IRB oversight, raising the prospect that many might not be in compliance with ethical and regulatory requirements. In this paper, we focus on research in natural language processing to investigate the appropriate designation of crowdsourcing studies and the unique challenges that ML research poses for research oversight...
- Some thoughts on machine learning with small data
I feel techniques for making machine learning work with small data are not talked about enough. This makes sense because many ML applications are only possible by collecting a huge amount of data...But there are many legitimate reasons for only being able to work on small data sets...I think there are basically two approaches/philosophies to dealing with this problem...
- Diary of a spaCy project: Predicting GitHub Tags
One could learn how an oven works, but that doesn’t mean that you’ve learned how to cook. Similarly, one could understand the syntax of a machine learning tool, and still not be able to apply the technology in a meaningful way. That’s why in this blog post I’d like to describe some topics that surround the creation of a spaCy project that aren’t directly related to syntax and instead relate more to “the act” of doing an NLP project in general...
- The Annotated Transformer
The Transformer has been on a lot of people’s minds over the last five years. This post presents an annotated version of the paper in the form of a line-by-line implementation. It reorders and deletes some sections from the original paper and adds comments throughout. This document itself is a working notebook, and should be a completely usable implementation...
- What does it mean when an AI fails? A Reply to SlateStarCodex’s riff on Gary Marcus
I'm flattered no end that yesterday @slatestarcodex, aka Scott Alexander, devoted 3600 words to dissecting yours truly. And he called me "a legend" no less :)...Of course that was moments before digging the knife in and suggesting that maybe I was wrong about more or less everything I've written for the last few years (whilst carefully noting that he has in no way proven that he had). It's an entertaining read, and like his previous essay on DALL-E, worth reading...
- From data to functa: Your data point is a function and you can treat it like one
It is common practice in deep learning to represent a measurement of the world on a discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these measurements is often continuous, e.g. the scene depicted in an image. A powerful continuous alternative is then to represent these measurements using an implicit neural representation, a neural function trained to output the appropriate measurement value for any input spatial location. In this paper, we take this idea to its next level: what would it take to perform deep learning on these functions instead, treating them as data? In this context we refer to the data as functa, and propose a framework for deep learning on functa...
- The Machine Learning Scientist as Toolsmith
One of the most influential and formative essays of my undergraduate days is The Computer Scientist as Toolsmith II by Fred Brooks of The Mythical Man-Month fame. Written in 1996 as a speech of someone looking back to a long and successful career, it contains many nuggets of wisdom, showing the vision one man has for his chosen research field. The essay, which I shall henceforth abbreviate as CST-II, can also be read as a passionate cri de cœur for computer science—specifically, computer graphics—to not lose track of the problems that matter. Having just celebrated its 25th anniversary, I want to honour Dr. Brooks with an essay of my own, dealing with machine learning...
- Finding Patterns in Seizure Data
Around one third of people with epilepsy aren’t able to control seizures with medication or treatment. For these people, not knowing when a seizure will occur makes even mundane, day-to-day tasks profoundly difficult...if seizures can’t be controlled, is there a way they could be managed? In short, yes — there is...Being able to forecast seizures, hours to days in advance, would give someone the opportunity to manage, plan, and respond appropriately...
- AGI Ruin: A List of Lethalities
I have several times failed to write up a well-organized list of reasons why AGI will kill you. People come in with different ideas about why AGI would be survivable, and want to hear different obviously key points addressed first. Some fraction of those people are loudly upset with me if the obviously most important points aren't addressed immediately, and I address different points first instead...Having failed to solve this problem in any good way, I now give up and solve it poorly with a poorly organized list of individual rants...
Bootcamp*
Jumpstart Your Data Career Today
The Data Incubator Is Accepting Applications for Their Data Bootcamps!
Data jobs have some of the most competitive salaries in the world right now. And when you master the skills from our bootcamps, you’ll practically be fighting off job offers with a stick.
The only thing standing between you and this next step in your career is training.
Apply early to increase your chances of earning full tuition for free. Early applications are due July 1, 2022.
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
Jobs
- Senior Data Scientist, Startup Creation at Redesign Health - US
As our Senior Data Scientist for our Startup Creation team, you will set up and configure the data infrastructure for our startups, work with the startup founding team to define data-driven KPIs, and implement automated statistical analyses of customer behavior. Your goal is to make all of the companies that we launch data-driven from day one.
In this role, you will function as an in-house implementation team for the companies that Redesign Health launches (internally referred to as OpCos). We provide data strategy, data pipeline, data analytics and forecasting services to newly formed companies in a repeatable and scalable manner...
Want to post a job here? Email us for details --> team@datascienceweekly.org
Training & Resources
- Diffusion models
A minimal standalone example of a diffusion model. The objective is to understand the forward as well as the reverse mapping process of diffusion models. The notebook contains the equations alongside the code and some visualizations...
- JAX and TensorFlow interoperation
This package provides experimental support for interoperation between JAX and TensorFlow...The jax2tf.convert mechanism can wrap a function written in JAX, possibly including JAX transformations, and turn it into a function that uses only TensorFlow operations. The converted function can be called or traced from TensorFlow and will behave as if it was written in TensorFlow...
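For readers who want a feel for the diffusion models item above before opening the notebook, here is a minimal sketch of the forward (noising) process and its algebraic inversion. It assumes the common DDPM-style formulation with a linear noise schedule, which may differ from what the linked notebook uses. The function names, the schedule endpoints (1e-4 to 0.02), and the 1000-step count are illustrative assumptions, and the true noise stands in here for the trained network that would normally predict it during the reverse process.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (endpoints are illustrative)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_sample(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; returns (x_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return x_t, eps


def recover_x0(x_t, eps, t, alpha_bars):
    """Invert the forward step given the noise.

    Assumption: the exact eps is passed in. A real diffusion model would
    use a network's prediction of eps instead.
    """
    a = alpha_bars[t]
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a) for x, e in zip(x_t, eps)]


rng = random.Random(0)            # fixed seed so runs are repeatable
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [1.0, -0.5, 0.25]            # toy "data point"
x_t, eps = forward_sample(x0, 500, alpha_bars, rng)
x0_hat = recover_x0(x_t, eps, 500, alpha_bars)
```

At small t the sample stays close to x0; by the final step alpha_bar is near zero, so x_t is almost pure Gaussian noise. That gap is what the learned reverse process has to bridge.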
What you’re up to – notes from DSW readers
- Have something to share? Fill out the form below :) ...
* To share your projects and updates, share the details here.
** Want to chat with one of the above people? Hit reply and let us know :)
P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian