[in case you missed it] Data Science Weekly - Issue 469

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #469

November 17 2022

Editor's Picks

 
  • How I learn machine learning
    The context for the advice I’m about to share is: I started without an engineering background and through hard work and a lot of luck became a machine learning engineer...My overarching goal as an MLE is to continuously work towards designing and deploying well-designed and transparent machine learning systems and to learn the best software engineering practices to do so...So, as always, take this blog post mostly as advice to my past self that may or may not work for you based on your goals...
  • Robotics and AI in Fulfillment at Amazon
    I’ve been leading the Robotics AI Organization at Amazon since 2018, inventing and deploying Robotics and AI in Fulfillment. You are receiving packages at home today that have been manipulated or transported by my robots. I’ll talk about some of our challenges in perception, continual learning, motion planning, and control, our solutions to date, and the technical gaps I’d like us all to address. I’ll also talk about my lessons learned scaling Robotics and AI worldwide to thousands of systems...
  • The Near Future of AI is Action-Driven...and it will look a lot like AGI
    In 2022, large language models (LLMs) finally got good...But the best is yet to come. The really exciting applications will be action-driven, where the model acts like an agent choosing actions. And although academics can argue all day about the true definition of AGI, an action-driven LLM is going to look a lot like AGI...


 

A Message from this week's Sponsor:

 



Pinecone vector database

The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.

Use Pinecone to build semantic search, object recognition, recommendations, anomaly detection, and other vector-based functionality into your applications.



 

Data Science Articles & Videos

  • Ten Years of Image Synthesis
    It’s the end of 2022. Deep learning models have become so good at generating images that, at this point, it is more than clear that they are here to stay. How did we end up here? The timeline below traces some milestones – papers, architectures, models, datasets, experiments – starting from the beginning of the current “AI summer” ten years ago...
  • Dealing with Career Stagnation: my Machine Learning Story [Video]
    A recent article Mark Saroufim wrote went viral and it discusses the many ways in which academic ML has stagnated and has started rewarding incremental work. Some of the most common questions he's gotten since have been "Should I still pursue a PhD in Machine Learning?", "How can I get that FAANG job?", "How can I set myself up to start my own innovative company?". Mark has the dubious honor of having tried every single career path in tech and will share his learnings and stagnations...
  • Machine Learning with Unix Pipes
    I contribute to an open source programmer-focused machine learning library called Shumai. Recently, we added some basic /dev/stdin handling, which makes it possible to compose standard Unix utilities with machine learning on the command-line...
  • MuLan: A Joint Embedding of Music Audio and Natural Language
    Music tagging and content-based retrieval systems have traditionally been constructed using pre-defined ontologies covering a rigid set of music attributes or text queries. This paper presents MuLan: a first attempt at a new generation of acoustic models that link music audio directly to unconstrained natural language music descriptions...We demonstrate the versatility of the MuLan embeddings with a range of experiments including transfer learning, zero-shot music tagging, language understanding in the music domain, and cross-modal retrieval applications...
  • Towards Geometric Deep Learning I: On the Shoulders of Giants
    Geometric Deep Learning approaches a broad class of ML problems from the perspectives of symmetry and invariance, providing a common blueprint for neural network architectures as diverse as CNNs, GNNs, and Transformers. In a new series of posts, we study how these ideas have emerged through history from ancient Greek geometry to Graph Neural Networks...
  • No Designer Needed: How to Create Beautiful Reports Using Only R [Video]
    If you need to make a PDF report from R, you've got two choices: 1) work with a designer to do the layout, or 2) try to do the layout yourself with LaTeX...The pagedown package lets you use HTML and CSS to design PDF layouts. These two languages are much more common than LaTeX, much easier to learn than LaTeX, and much easier to customize than LaTeX...I'll show how you too can make beautiful PDFs – no designer needed...
  • Exercise: Find Problems in an Evaluation
    One of my favourite books on doing good scientific research is Greenhalgh’s "How to Read a Paper". This is a wonderful book whose goal is to help doctors critically read medical research papers, so that they can assess for themselves whether the paper is scientifically solid or not; especially with regard to whether experimental results are trustworthy...being able to read a scientific paper and assess whether its experimental results are trustworthy is an important skill for all scientists, including CS/AI/NLP researchers. I’m not aware of an equivalent book for AI researchers, sadly, but I do encourage my students to critically read papers and look for flaws in their experiments or evaluation...
  • Too much efficiency makes everything worse: overfitting and the strong version of Goodhart's law
    Increased efficiency can sometimes, counterintuitively, lead to worse outcomes. This is true almost everywhere. We will name this phenomenon the strong version of Goodhart's law...This same counterintuitive relationship between efficiency and outcome occurs in machine learning, where it is called overfitting. Overfitting is heavily studied, somewhat theoretically understood, and has well known mitigations. This connection between the strong version of Goodhart's law in general, and overfitting in machine learning, provides a new lens for understanding bad outcomes, and new ideas for fixing them...
  • Using functional analysis to model air pollution data in R
    Let's say you need to understand how your data changes within a day, and between different days. Functional analysis is one approach to doing just that, so here's how I applied functional analysis to some air pollution data using R!...
  • Emergent abilities of large language models
    In "Emergent abilities of large language models", we defined an emergent ability as an ability that is “not present in small models but is present in large models.” Is emergence a rare phenomenon, or are many tasks actually emergent?...It turns out that there are more than 100 examples of emergent abilities that have already been empirically discovered by scaling language models such as GPT-3, Chinchilla, and PaLM. To facilitate further research on emergence, I have compiled a list of emergent abilities in this post...
  • Acquisition of chess knowledge in AlphaZero
    We analyze the knowledge acquired by AlphaZero, a neural network engine that learns chess solely by playing against itself yet becomes capable of outperforming human chess players. Although the system trains without access to human games or guidance, it appears to learn concepts analogous to those used by human chess players. We provide two lines of evidence...


 

Tool*

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

   
 

Webinar*

 



SuperAnnotate Webinar

In December last year, SuperAnnotate hosted a webinar “2021 CV’s year retrospective and opportunities for 2022” to wrap up the passing year in AI and share their predictions for 2022.

We are excited to share that this year SuperAnnotate is again hosting an end-of-the-year webinar, reviewing the developments in the AI space in 2022 and sharing what we can expect from the year ahead. This webinar will cover everything from generative models like Stable Diffusion, NLP with Large Language Models, DataOps and Data-Centricity, Transformers expanding into CV, new models like YOLOv7, large partnerships in the A(G)I space, and more! Following that, SuperAnnotate's CTO and co-founder Vahan Petrosyan will share his predictions for 2023.

Join us to see which of their predictions from the previous webinar came true, sum up developments in AI this year, and see what to expect from 2023. Register Now.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

   

 

Jobs

 
  • Senior Data Analyst - Epic Games - New York

    Epic Games spans across 19 countries with 55 studios and 4,500+ employees globally. For over 25 years, we’ve been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before.

    Use your expert experience in data & analytics to build powerful stories and visuals that inform the games we make, the technology we develop, and business decisions that drive Epic... Epic Games is looking for a Senior Data Analyst to help us create the models that fuel our creator economy. The successful candidate will have excellent SQL knowledge, and enjoy combining analytic skills with business acumen to provide the data and insights that will drive our continued success...

     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • R for Data Analysis [Free Book]
    The purpose of this book is to inspire and enable anyone who reads it to reconsider the methods they currently employ to analyse data. This is not to suggest that the methodologies outlined will be useful or sufficient for everyone who reads it. Some analyses can be performed quickly without the need for additional computation while others will require advanced analytics techniques not outlined in this book; however, the aspiration is that all will be equipped with novel tools and ideas for approaching data analysis...
  • The State of Multilingual AI
    This post takes a closer look at the state of multilingual AI. How multilingual are current models in NLP, computer vision, and speech? What are the main recent contributions in this area? What challenges remain and how can we address them?...




P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
