Data Science Weekly - Data Science Weekly - Issue 478

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments
Email not displaying correctly?
View it in your browser.

Issue #478

January 18 2023

Editor's Picks

 
  • Data is Rectangular and other Limiting Misconceptions
    In software we express our ideas through tools. In data, those tools think in rectangles. From spreadsheets to the data warehouses, to do any analytical calculation, you must first go through a rectangle. Forcing data through a rectangle shapes the way we solve problems (for example, dimensional fact tables, OLAP Cubes)...But really, most data isn’t rectangular. Most data exists in hierarchies (orders, items, products, users). Most query results are better returned as a hierarchy (category, brand, product). Can we escape the rectangle?...


 

A Message from this week's Sponsor:

 



Pinecone vector database

The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.

Use Pinecone to build semantic search, object recognition, recommendations, anomaly detection, and other vector-based functionality into your applications.




 

Data Science Articles & Videos

 
  • My journey from R to Julia
    For 15 years I taught “Applied Epidemiology using R” at the UC Berkeley School of Public Health...I started dabbling in Python...but eventually I discovered Julia—a programming language designed for scientific computing with the intuition of Python or R, but with the speed of C++. I fell in love with Julia and I gave up on learning Python...As I learned more Julia, I became convinced that for me learning Julia was a better long term investment than sticking with R...
  • Finding Modes and Antimodes
    How can I find the least frequent value (antimode) between 2 modes in a bimodal distribution?...One option is to use kernel density estimation KDE...
  • Let's build GPT: from scratch, in code, spelled out. [YouTube Vide]
    Andrej Karpathy...builds a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3. We talk about connections to ChatGPT, which has taken the world by storm. We watch GitHub Copilot, itself a GPT, help us write a GPT (meta :D!)...
  • Pretraining quadrupeds: a case study in RL as an engineering tool
    How an unlikely corner of robotics research, locomotion, defined RL's new notion of success...This article’s goal is to try and shift the narrative around what success is in RL: how we don’t need to be talking about AGI, how RL can be used to solve specific problems like robotic locomotion, and how we should react to future examples of RL’s success...
  • The State of Data Testing
    You would be hard-pressed to find anyone who doesn’t believe that it’s important to test your data, especially given the increased reliance on data to drive decision-making at companies. Data must be well-tested, accurate, and reliable, but all too often, data is broken, incorrect, and untested...This guide to data testing hopes to change that. We’ll cover the current state of data testing and bring the Datafold view on what data testing should look like with Data -Diff, to ensure your pipelines run smoothly through automated and proactive data testing...
  • Large Transformer Model Inference Optimization
    In this post, we will look into several approaches for making transformer inference more efficient. Some are general network compression methods, while others are specific to transformer architecture...
  • Google Research, 2022 & Beyond: Language, Vision and Generative Models
    With this post, I am kicking off a series in which researchers across Google will highlight some exciting progress we've made in 2022 and present our vision for 2023 and beyond. I will begin with a discussion of language, computer vision, multi-modal models, and generative machine learning models. Over the next several weeks, we will discuss novel developments in research topics ranging from responsible AI to algorithms and computer systems to science, health and robotics...
  • Welcome to the jungle, we got fun and frames
    I’m still in the early stages of a project and doing data analysis on the input data, a dump of 10GB from Goodreads in JSON...When I tried to use read_json in Pandas, the kernel took over 2 minutes just to read the data in. Actually, initially I wasn’t sure if it would even be able to read the entire file, and I’d need to read it in again and again during the exploration process, so this wasn’t an optimal way to go...But to understand the performance constraints, I had to go pretty deep into the Pandas ecosystem...


 

Tool*

 



Full Transparency for ML Experiment Tracking

With just 2 lines of code, Comet automatically logs metrics, hyperparameters, libraries, and more. Comet works with Keras, Tensorflow, Pytorch, Hugging Face and 15+ more popular tools and frameworks. Check out our GitHub repo for Comet examples. With Comet, you can:
  • Diff up to 4 experiments
  • Review model predictions across experiments with built-in visualizations
  • Save models to a model registry
Get started with your free account today


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!



 

Jobs

 
  • Data Scientist / Machine Learning Engineer - Epsilon - NYC

    Epsilon Strategy and Insights, Data Sciences team is looking for a talented team player in a Data Scientist/Machine Learning Engineer role. You are an expert, mentor and advocate. You have strong machine learning and deep learning background and are passionate about transforming data into ml models. You welcome the challenge of data science and are proficient in Python, Spark MLLib, Tensorflow, Keras, ML algorithms and Deep Neural Networks, Big Data. You must be self-driven, take initiative and want to work in a dynamic, busy and innovative group...
     
Want to post a job here? Email us for details --> team@datascienceweekly.org



 

Training & Resources

 
  • Resources for Learning Computational Complexity Theory
    Computational complexity theory studies the feasibility of solving and resources required to solve computational problems and is useful to any field that thinks about the analysis and design of algorithms (which is much more broad than one may first think). While there are a good bit of notes and lectures available online, these are scattered across university course pages, YouTube, etc. This guide aims to bring this material together for learning computational complexity theory at the introductory graduate level, especially for those without a formal CS background...
 

Last Week's Newsletter's 3 Most Clicked Links

 
* Based on unique clicks.
** Find last week's newsletter here.

 


Cutting Room Floor

 


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Follow on Twitter
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
unsubscribe from this list    update subscription preferences 

Key phrases

Older messages

Data Science Weekly - Issue 476

Friday, January 6, 2023

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #476 January 05 2023 Editor's Picks

Data Science Weekly - Issue 475

Thursday, December 29, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #475 December 29 2022 Editor's Picks

Data Science Weekly - Issue 474

Friday, December 23, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #474 December 22 2022 Editor's Picks

Data Science Weekly - Issue 473

Friday, December 16, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #473 December 15 2022 Editor's Picks

Data Science Weekly - Issue 472

Friday, December 9, 2022

Curated news, articles and jobs related to Data Science. Keep up with all the latest developments Email not displaying correctly? View it in your browser. Issue #472 December 08 2022 Editor's Picks

You Might Also Like

A deal made in cloud security heaven

Thursday, April 18, 2024

Meta's Llama 3 goes public and hackers hold World-Check data for ransom View this email online in your browser By Christine Hall Thursday, April 18, 2024 Welcome to TechCrunch PM! I'm glad you

💎 Issue 413 - RubyJS-Vite

Thursday, April 18, 2024

This week's Awesome Ruby Newsletter Read this email on the Web The Awesome Ruby Newsletter Issue » 413 Release Date Apr 18, 2024 Your weekly report of the most popular Ruby news, articles and

💻 Issue 406 - Swift for C++ Practitioners, Part 1

Thursday, April 18, 2024

This week's Awesome .NET Weekly Read this email on the Web The Awesome .NET Weekly Issue » 406 Release Date Apr 18, 2024 Your weekly report of the most popular .NET news, articles and projects

💻 Issue 413 - How to implement HLS Video Streaming in a React App

Thursday, April 18, 2024

This week's Awesome Node.js Weekly Read this email on the Web The Awesome Node.js Weekly Issue » 413 Release Date Apr 18, 2024 Your weekly report of the most popular Node.js news, articles and

📱 Issue 407 - Textual Healing: iOS Text Editing Minutiae

Thursday, April 18, 2024

This week's Awesome iOS Weekly Read this email on the Web The Awesome iOS Weekly Issue » 407 Release Date Apr 18, 2024 Your weekly report of the most popular iOS news, articles and projects Popular

💻 Issue 413 - Interview with Senior JavaScript Developer 2024 [video]

Thursday, April 18, 2024

This week's Awesome JavaScript Weekly Read this email on the Web The Awesome JavaScript Weekly Issue » 413 Release Date Apr 18, 2024 Your weekly report of the most popular JavaScript news, articles

💻 Issue 331 - 30+ app ideas with complete source code

Thursday, April 18, 2024

This week's Awesome React Weekly Read this email on the Web The Awesome React Weekly Issue » 331 Release Date Apr 18, 2024 Your weekly report of the most popular React news, articles and projects

💻 Issue 408 - Curl: Hyper, is it worth it?

Thursday, April 18, 2024

This week's Awesome Rust Weekly Read this email on the Web The Awesome Rust Weekly Issue » 408 Release Date Apr 18, 2024 Your weekly report of the most popular Rust news, articles and projects

📱 Issue 410 - Swift for C++ Practitioners, Part 1

Thursday, April 18, 2024

This week's Awesome Swift Weekly Read this email on the Web The Awesome Swift Weekly Issue » 410 Release Date Apr 18, 2024 Your weekly report of the most popular Swift news, articles and projects

🤷🏻‍♂️ What to Do When Windows Won't Boot — How to Try the Android 15 Beta

Thursday, April 18, 2024

Also: We Tried a Small AI Voice Recorder, and More! How-To Geek Logo April 18, 2024 📩 Get expert reviews, the hottest deals, how-to's, breaking news, and more delivered directly to your inbox by