Data Science Weekly - Issue 468

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #468

November 03 2022

Editor's Picks

 
  • Planning to leave Twitter?
    With all of the uncertainty around Twitter's future, many are considering leaving the platform. But before blindly jumping into the unknown, users should seriously consider downloading and saving their Twitter data to analyze it for important trends, insights and information that they can take with them...We created the free dataviz tool below to illustrate how data visualization can help better inform users before they decide to delete their Twitter accounts and abandon years of useful data. Without dataviz, these insights are nearly impossible for anyone to decipher from a data file alone...
  • Command-line data analytics made easy
    The command line is incredibly powerful when it comes to data processing. Still, many of us working with data do not take advantage of it...These motivated me to write a command-line tool that focuses on readability, ease of learning and modern data formats, while leveraging the command-line ecosystem. On top of that, it also leverages the Python ecosystem! Meet SPyQL - SQL with Python in the middle...
  • How Federated Learning Protects Privacy
    With federated learning, it’s possible to collaboratively train a model with data from multiple users without any raw data leaving their devices. If we can learn from data across many sources without needing to own or collect it, imagine what opportunities that opens!...Let’s explore how this technology works with a simple example we can all relate to: blocking spam messages...


 

A Message from this week's Sponsor:

 



Out now: new semantic layer whitepapers

Check out this bundle of Semantic Layer whitepapers by best-selling authors - download here.

You'll learn the key value propositions for implementing a semantic layer and best practices for analytics success with one.




 

Data Science Articles & Videos

  • Management Seat Time: Reflections on Management and Returning to Engineering
    In case you haven’t heard, there’s a (r)evolution going on in the modern data stack. There’s a new way of working in data, where software engineering best practices are the way the data team gets work done. I recently left my role as the manager of a large team to join Aula Education as a Sr Analytics Engineer — primarily to get in on the technical fun...This post is a reflection on my years in management and the lessons I learned from them...
  • Tutorial #17: Transformers III Training
    In part I of this tutorial we introduced the self-attention mechanism and the transformer architecture. In part II, we discussed position encoding and how to extend the transformer to longer sequence lengths. We also discussed connections between the transformer and other machine learning models...In this final part, we discuss challenges with transformer training dynamics and introduce some of the tricks that practitioners use to get transformers to converge...
  • Online internal speech decoding from single neurons in a human participant
    Speech brain-machine interfaces (BMIs) translate brain signals into words or audio outputs, enabling communication for people who have lost their speech abilities due to disease or injury...In this work, a tetraplegic participant with implanted microelectrode arrays located in the supramarginal gyrus (SMG) and primary somatosensory cortex (S1) performed internal and vocalized speech of six words and two pseudowords. We found robust internal speech decoding from SMG single neuron activity, achieving up to 91% classification accuracy during an online task (chance level 12.5%)...
  • Dashboard Design Patterns
    There are many high-level guidelines on dashboard design, including advice about visual perception, reducing information load, the use of interaction, and visualization literacy. Despite this, we know little about effective and applicable dashboard design, and about how to support rapid dashboard design...Our design patterns for dashboard design on this website aim to support creativity and to streamline dashboard design...
  • Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development E-Book
    In this paper, we report on our experiences generating multiple code explanation types using LLMs and integrating them into an interactive e-book on web software development. We modified the e-book to make LLM-generated code explanations accessible through buttons next to code snippets in the materials, which allowed us to track the use of the explanations as well as to ask for feedback on their utility...Our preliminary results show that all varieties of explanations were viewed by students and that the majority of students perceived the code explanations as helpful to them. However, student engagement appeared to vary by code snippet complexity, explanation type, and code snippet length...
  • The Use Case for Relative Position Embeddings
    We’re in 2022 but many of our most popular causal language models (LMs), including GPT-3, still use absolute positional embeddings. I believe we should stop using those and move to relative positional embeddings such as ALiBi. Deepmind’s Gopher and BigScience’s BLOOM already use relative positioning methods, and I’ve heard that multiple upcoming models also will, and so hopefully this post will help in encouraging the remaining holdouts to follow suit...
  • Datacast Episode 101: Scaling Data Engineering, Building Data Teams, and Managed Data Stack With Tarush Aggarwal
    This is my conversation with Tarush Aggarwal — the Founder and CEO of 5x, the modern data stack as a managed data service. He is one of the leading experts in leveraging data for exponential growth, with over ten years of experience in the field...Our wide-ranging conversation touches on his college experience at Carnegie Mellon University, his time at Salesforce as the first data engineer, lessons learned from building and managing a data team as a Data Manager at Wyng, his leadership role at WeWork scaling the data team and establishing the operations in the Chinese market, his current journey with 5x building the app store for the modern data stack, and much more....
  • Learning to Imitate
    Systems often require over 100 million interactions with an environment to train (the equivalent of more than 100 years of human experience) to reach human-level performance. In contrast, a human can acquire new skills in relatively short amounts of time by observing an expert. How can we enable our artificial agents to similarly acquire such fast learning ability?...In this post, I’ll discuss several techniques being developed in a field called “Imitation Learning” (IL) to solve these sorts of problems and present a recent method from our lab, called Inverse Q-Learning, which was used to create the best AI agent for playing Minecraft using few expert demos...
  • Chelsea Finn, Stanford: On the biggest bottlenecks in robotics and reinforcement learning
    Chelsea Finn is an Assistant Professor at Stanford and part of the Google Brain team. She's interested in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction at scale. In this episode, we chat about some of the biggest bottlenecks in RL and robotics—including distribution shifts, Sim2Real transferability, and the inherent tradeoff of sample efficiency—as well as what makes a great researcher, why she aspires to build a robot that can make cereal, and much more...
  • Find All the Pangolins
    Videos from trail cameras are a useful tool for noninvasive observation of wildlife, but if you are studying a rare species, you might have to look at a lot of videos before you find it...In a previous article, we used probabilistic classification to remove blank videos, that is, videos that don't contain animals...An alternative is a targeted search. For each video in a dataset, we use probabilistic classification to compute the probability that it contains each category of animal (species or group of species). If we are looking for a particular species, we can sort the videos in descending order by the probability they contain the category that contains the target species...
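
The targeted search described in the last item can be sketched in a few lines of Python. This is only an illustration, not the article's code: the file names, category labels, probabilities and the `rank_videos` helper are all made up, and a real classifier would supply the per-video probabilities.

```python
from typing import Dict, List, Tuple


def rank_videos(
    probs: Dict[str, Dict[str, float]], category: str
) -> List[Tuple[str, float]]:
    """Sort videos by the predicted probability of one category, highest first.

    `probs` maps a video name to its per-category probabilities. A video
    with no score for `category` is treated as probability 0.0.
    """
    scored = [(video, p.get(category, 0.0)) for video, p in probs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical classifier output for three trail-camera clips.
predictions = {
    "clip_001.mp4": {"blank": 0.90, "pangolin": 0.02, "duiker": 0.08},
    "clip_002.mp4": {"blank": 0.10, "pangolin": 0.75, "duiker": 0.15},
    "clip_003.mp4": {"blank": 0.30, "pangolin": 0.40, "duiker": 0.30},
}

ranked = rank_videos(predictions, "pangolin")
for video, p in ranked:
    print(f"{video}\t{p:.2f}")
```

Reviewing clips from the top of this list means the most likely pangolin videos are seen first, so a rare species may turn up after far fewer viewings than with a random order.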


 

Tool*

 



Co:here

Unlock the power of language models – No ML experience required. Cohere’s ready-to-use NLP toolkits can help you build and deploy your language AI projects at scale. Our pre-trained models enable developers to build AI-driven apps faster and more easily, from creating marketing copy and product descriptions to summarizing articles, categorizing text, and much more! Whether you’re a beginner or an expert, Cohere is making NLP accessible to everyone.

Get started for free


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

   

 

Tool*

 



Bridge the Analytics to Action Gap with Hightouch + Looker

Want to get the most from your data in Looker, but don’t want to rewrite your Looks or :shudder: export a CSV? We have great news! With Hightouch, you can automatically sync your Looks directly into any business tool in minutes, so you and your stakeholders can easily take action with all the right context.

Ready to bridge the analytics to action gap? Check out Hightouch.com to learn more or get started for free.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

   

 

Jobs

 
  • Senior Data Analyst - Epic Games - New York

    Epic Games spans across 19 countries with 55 studios and 4,500+ employees globally. For over 25 years, we’ve been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before.

    Use your expert experience in data & analytics to build powerful stories and visuals that inform the games we make, the technology we develop, and business decisions that drive Epic... Epic Games is looking for a Senior Data Analyst to help us create the models that fuel our creator economy. The successful candidate will have excellent SQL knowledge, and enjoy combining analytic skills with business acumen to provide the data and insights that will drive our continued success...

     

        Want to post a job here? Email us for details --> team@datascienceweekly.org

 

 

Training & Resources

 
  • Orchestrating Single-Cell Analysis with Bioconductor
    This is the landing page for the “Orchestrating Single-Cell Analysis with Bioconductor” book, which teaches users some common workflows for the analysis of single-cell RNA-seq data (scRNA-seq). This book will show you how to make use of cutting-edge Bioconductor tools to process, analyze, visualize, and explore scRNA-seq data. Additionally, it serves as an online companion for the paper of the same name...
  • Lovely Tensors - for PyTorch
    How often do you find yourself debugging PyTorch code? You dump a tensor to the cell output, and see this...Was it really useful for you, as a human, to see all these numbers?...What is the shape?...The size?...What are the statistics?...Are any of the values nan or inf?...Is it an image of a man holding a tench?...



P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
