Data Science Weekly - Issue 470

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #470

November 24 2022

Editor's Picks

 
  • What Does it Mean to Give Someone What They Want? The Nature of Preferences in Recommender Systems
    In practice, most recommenders optimize for engagement. This has been justified by the assumption that people always choose what they want, an idea from 20th-century economics called revealed preference. However, this approach to preferences can lead to a variety of unwanted outcomes including clickbait, addiction, or algorithmic manipulation...Doing better requires both a change in thinking and a change in approach. We’ll propose a more realistic definition of preferences, taking into account a century of interdisciplinary study, and two concrete ways to build better recommender systems...
  • The GitLab Data Team Handbook
    GitLab has two primary distinct groups within the Data Program who use data to drive insights and business decisions...The two teams are the (central) Data Team and, separately, Function Analytics Teams located in Sales, Marketing, Product, Engineering or Finance...The Data Team Handbook contains a large amount of information! To help you navigate the handbook we've organized it into the following major sections: a) Dashboards & Data you can use, b) How data works at GitLab, c) How the data team works, d) How the data platform works, and e) What the data team is working on...
  • Playtesting Candy Crush: Human-Like Playtesting with Deep Learning
    Today I learned that there's actually research in playtesting video games using deep learning...What's interesting is that the paper is actually written by actual employees from actual video game companies. But also that it decided to explore these techniques for Candy Crush Saga...


 

A Message from this week's Sponsor:

 



Pinecone vector database

The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.

Use Pinecone to build semantic search, object recognition, recommendations, anomaly detection, and other vector-based functionality into your applications.



 

Data Science Articles & Videos


  • World's Simplest Data Pipeline?
    How much you invest in your data engineering capability depends on your own ambition and needs, and the risks of over- or under-investing are well documented. I believe there are some rules and guidelines that are universally applicable, regardless of your team size or tech stack, and that following these rules can save huge headaches in both teams of one and teams of one hundred...In order to demonstrate this, about a year ago I built the simplest data pipeline I could build while still adhering to my rules...
  • Explaining the Effects of Clouds on Remote Sensing Scene Classification
    Most of Earth is covered by haze or clouds, impeding the constant monitoring of our planet...little effort has been spent on understanding how exactly atmospheric disturbances impede the application of modern machine learning methods to Earth observation data...We provide a thorough investigation of how classifiers trained on cloud-free data fail once they encounter noisy imagery – a common scenario encountered when deploying pretrained models for remote sensing to real use cases...
  • A Short Guide for Feature Engineering and Feature Selection
    Feature engineering and selection is the art/science of converting data into the best possible form, which involves an elegant blend of domain expertise, intuition, and mathematics. This guide is a concise reference for beginners, covering the simplest yet most widely used techniques for feature engineering and selection...
  • Demystifying ML PhD Admissions to US Universities [Video]
    This video is a recording of the panel discussion and Q & A on ML PhD admissions to US universities...Several faculty from various universities in the US took part in it including Tatsu Hashimoto (Stanford), Rada Mihalcea (UMichigan), Devi Parikh (Georgia Tech), Sameer Singh (UC Irvine), and James Zou (Stanford)...
  • Writing a scientific article: A step-by-step guide for beginners
    Many young researchers find it extremely difficult to write scientific articles, and few receive specific training in the art of presenting their research work in written format. Yet, publication is often vital for career advancement, to obtain funding, to obtain academic qualifications, or for all these reasons. We describe here the basic steps to follow in writing a scientific article. We outline the main sections that an average article should contain; the elements that should appear in these sections, and some pointers for making the overall result attractive and acceptable for publication...
  • Tools to Improve Training Data - Talking Language AI Episode #2 [Video]
    Vincent Warmerdam builds a lot of NLP tools. Many of these tools target the scikit-learn ecosystem and there's a theme of labeling across many of them. A recent focus of his stack of tools is to improve training data. In this video, Vincent and Jay discuss a few of these tools and show how they work together...These tools are discussed in the video: a) Human-learn: a toolkit to build human-based scikit-learn components, b) Doubtlab: a toolkit to help find doubtful labels in data, c) Embetter: a library that makes it very easy to use embeddings in scikit-learn, and d) Bulk: a library that uses embeddings to leverage bulk labeling...The talk includes live demos of each tool to show how some simple tricks can go a long way...
  • Planes are still decades away from displacing most bird jobs
    Here’s the thing: all human-built artificial flight (AF) machines are incredibly specialized and are far away from being able to perform most of the tasks birds – the only general flight (GF) machines we are aware of – can perform...
  • Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment
    To harness the capabilities of state-of-the-art robot learning models while embracing their imperfections, we present Sirius, a principled framework for humans and robots to collaborate through a division of work. In this framework, partially autonomous robots are tasked with handling a major portion of decision-making where they work reliably; meanwhile, human operators monitor the process and intervene in challenging situations...
  • General purpose visual recognition across modalities with limited supervision [Video]
    Ishan Misra, FAIR (Meta AI), presents on how modern computer vision models are good at specialized tasks...However, specialist models also have severe limitations: they can only do what they are trained for and require copious amounts of pristine supervision for it. In this talk, he focuses on two limitations: specialist models cannot work on tasks beyond what they saw training labels for, or on new types of visual data. He’ll present recent efforts that design better architectures, training paradigms and loss functions to address these issues...


 

Tool*

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

   
 

Webinar*

 



SuperAnnotate Webinar

In December last year SuperAnnotate hosted a webinar, “2021 CV’s year retrospective and opportunities for 2022”, to wrap up the passing year in AI and share their predictions for 2022.

We are excited to share that this year SuperAnnotate is again hosting an end-of-the-year webinar, reviewing the developments in the AI space in 2022 and sharing what we can expect from the year ahead. This webinar will cover everything from generative models like Stable Diffusion, NLP with Large Language Models, DataOps and Data-Centricity, and Transformers expanding into CV, to new models like YOLOv7, large partnerships in the A(G)I space, and more! Following that, SuperAnnotate's CTO and co-founder Vahan Petrosyan will share his predictions for 2023.

Join us to see which of their predictions from the previous webinar came true, sum up developments in AI this year, and see what to expect from 2023. Register Now.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

   

 

Jobs

 
  • Senior Data Analyst - Epic Games - New York

    Epic Games spans 19 countries with 55 studios and 4,500+ employees globally. For over 25 years, we’ve been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before.

    Use your expert experience in data & analytics to build powerful stories and visuals that inform the games we make, the technology we develop, and business decisions that drive Epic... Epic Games is looking for a Senior Data Analyst to help us create the models that fuel our creator economy. The successful candidate will have excellent SQL knowledge, and enjoy combining analytic skills with business acumen to provide the data and insights that will drive our continued success...

     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • Introduction to Robotics @ Princeton
    Lectures from "Introduction to Robotics" at Princeton University by instructor Anirudha Majumdar...This course will provide an introduction to the fundamental theoretical and algorithmic principles behind robotic systems. The course will also allow students to get hands-on experience through project-based assignments with the Crazyflie quadrotor...




P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
