Data Science Weekly - Issue 473

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #473

December 15 2022

Editor's Picks

 
  • Failed ML Project - How bad is the real estate market getting?
    There aren't enough failed data science projects out there. Usually, projects only show up in public if they work. I think that's a shame. If we learn more from our failures than our successes, it makes sense to share more failures to help those around us...TL;DR I made several mistakes in this machine learning project that led to its failure. I pick these apart in this article...
  • Data lineage is a cartography problem
    Data lineage can become overwhelming if you don’t know what you are looking at — there are too many layers affecting the different abstractions of data, and it can be difficult to understand when data (and the organisation) scales...In this post we’ll discuss how we can learn from the field of cartography and Google Maps to extract the untapped potential of data lineage, and build this ideal interface to improve data literacy and observability...
  • Building a synth with ChatGPT
    I know everyone is talking about ChatGPT right now. One aspect I haven’t seen much of is what it actually looks like to build software using it. There’s a bunch of articles about cool things you can do with it, but what are you going to run into when you build real-world software?..My idea was to build a synth using the Web Audio API. You can watch the recording below. Early results were impressive; we were able to get noise without knowing anything about the Web Audio API...


 

A Message from this week's Sponsor:

 



Pinecone vector database

The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.

Use Pinecone to build semantic search, object recognition, recommendations, anomaly detection, and other vector-based functionality into your applications.



 

Data Science Articles & Videos

 
  • PubMed GPT: a Domain-Specific Large Language Model for Biomedical Text
    We evaluated PubMed GPT on several question and answer (QA) benchmarks, and manually assessed its generations for a question summarization task. One key benchmark was MedQA-USMLE, which consists of question and answer pairs taken from previous Medical Licensing Exams given to doctors in the United States...
  • What do Vision Transformers Learn? A Visual Exploration
    Vision transformers (ViTs) are quickly becoming the de-facto architecture for computer vision, yet we understand very little about why they work and what they learn. While existing studies visually analyze the mechanisms of convolutional neural networks, an analogous exploration of ViTs remains challenging. In this paper, we first address the obstacles to performing visualizations on ViTs. Assisted by these solutions, we observe that neurons in ViTs trained with language model supervision (e.g., CLIP) are activated by semantic concepts rather than visual features...
  • A/B Testing with Holm's Procedure
    Holm's procedure is helpful for multiple tests with any degree of correlation. It's less conservative than Bonferroni because once you achieve a success comparing with the smallest p-value from your trial, you'll be able to benchmark your other tests at a larger test-level alpha...Here's how it works...
  • Streamlining Machine Learning In Production with Ran Romano
    Our wide-ranging conversation touches on his time in the Israeli army, his engineering experience at VMware, his time at Wix building their internal ML platform, his current journey with Qwak building an end-to-end ML engineering platform to automate the MLOps processes, the evolution of feature stores, the MLOps community in Israel, and much more...
  • Web scraping and text analysis in R and ggplot2
    I recently needed to learn text mining for a project at work. I generally learn more quickly with a real-world project. So, I turned to a topic I love: Wilderness, to see how I could apply the skills of text scrubbing and natural language processing...The first portion of this post will cover web scraping, then text mining, and finally analysis and visualization...
  • Don’t Start Your SQL Queries with the ‘Select’ Statement
    The majority of developers start writing their SQL queries with the ‘SELECT’ clause, then write ‘FROM’, ‘WHERE’, ‘HAVING’….and so on. But this is not the ‘right’ way of writing your SQL queries as this is very prone to syntactic errors, especially if you are a beginner in SQL...
  • What's unsolved in generative AI?
    Much has already been written about the market opportunities for the Cambrian explosion of startups now emerging, which aim to remake everything from creative tools to software development...But equally important is what’s missing today. Other computing paradigm shifts necessitated new infrastructure, from cloud tools at the advent of SaaS to MLOps in the early deep learning era. What problems are unsolved in the generative AI stack? And what tools and infrastructure does the ecosystem need to truly take off?...
  • The Importance of Data Analytics for Marketers
    Data analytics is the not-so-secret weapon that businesses are leveraging to derive key insights into their audiences and expand in saturated markets...Marketers now have highly specialized and segmented insights into market trends, consumer behavior, demographics, and purchase preferences. This allows them to craft personalized messages to target the right consumer at the right time and on the platforms of their choice...
  • Awesome ChatGPT Prompts
    Welcome to the "Awesome ChatGPT Prompts" repository! This is a collection of prompt examples to be used with the ChatGPT model...The ChatGPT model is a large language model trained by OpenAI that is capable of generating human-like text. By providing it with a prompt, it can generate responses that continue the conversation or expand on the given prompt...In this repository, you will find a variety of prompts that can be used with ChatGPT...
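
A quick illustration of the Holm's procedure item above: sort the p-values, compare the k-th smallest of m against alpha / (m - k + 1), and stop at the first one that fails. This is a minimal sketch of the standard step-down rule, not code from the linked article; the function name and the sample p-values are made up for illustration.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's procedure rejects the null.

    Hypothetical helper for illustration; p_values can be in any order.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value faces alpha / m, the next alpha / (m - 1), and so on.
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            # Step-down rule: once one test fails, all larger p-values fail too.
            break
    return reject


# Made-up p-values from four test variants.
print(holm_reject([0.005, 0.015, 0.03, 0.04]))  # [True, True, False, False]
```

With these numbers Bonferroni would hold every test to 0.05 / 4 = 0.0125 and reject only the first one. Holm also rejects the second, because after the first success the next threshold relaxes to 0.05 / 3.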


 

Tool*

 



Now where can I find that query... 🔍🔍

Did you put it in a doc? Slack? Teams? Notes?? Make searching for a query a thing of the past with Sherloq. Sherloq helps data analysts save, organize, and share their metrics, most used or complex queries for seamless collaboration within their organization. It’s a secure add-on (no integrations or permissions necessary) that works on the popular query editors. Start organizing your query repository in a shared workspace with Sherloq beta.

Try Sherloq For Free


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!



 

Tool*

 



Retool is the fast way to build an interface for any database

With Retool, you don't need to be a developer to quickly build an app or dashboard on top of any data set. Data teams at companies like NBC use Retool to build any interface on top of their data—whether it's a simple read-write visualization or a full-fledged ML workflow.

Drag and drop UI components—like tables and charts—to create apps. At every step, you can jump into the code to define the SQL queries and JavaScript that power how your app acts and connects to data. The result—less time on repetitive work and more time to discover insights.


*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!




 

Jobs

 
  • Senior Data Analyst - Epic Games - New York

    Epic Games spans across 19 countries with 55 studios and 4,500+ employees globally. For over 25 years, we’ve been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before.

    Use your expert experience in data & analytics to build powerful stories and visuals that inform the games we make, the technology we develop, and business decisions that drive Epic... Epic Games is looking for a Senior Data Analyst to help us create the models that fuel our creator economy. The successful candidate will have excellent SQL knowledge, and enjoy combining analytic skills with business acumen to provide the data and insights that will drive our continued success...

     
Want to post a job here? Email us for details --> team@datascienceweekly.org



 

Training & Resources

 
  • Stanford CS229M: Machine Learning Theory [YouTube Playlist]
    This course focuses on developing a theoretical understanding of the statistical properties of learning algorithms. Topics Include: a) Generalization bounds via uniform convergence, b) Theory for deep learning, c) Non-convex optimization, d) Neural tangent kernel, e) Implicit/algorithmic regularization, and f) Unsupervised learning and domain adaptation...
  • High-Dimensional Probability and Applications in Data Science
    This course builds probabilistic foundations for theoretical research in modern data science. You will learn some methods that form an essential toolbox for anyone looking to do mathematical work in machine learning, theoretical computer science, theoretical statistics, signal processing, etc...
  • Survival Analysis: Optimize the Partial Likelihood of the Cox Model
    Survival analysis encompasses a collection of statistical methods for describing time to event data...In this post, we introduce a popular survival analysis algorithm, the Cox proportional hazards model¹. Then, we define its log-partial likelihood and the gradient, and optimize it to find the best set of model parameters through a practical Python example...
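
For readers curious what the Cox item above involves, here is a minimal sketch of the log-partial likelihood and its gradient. It assumes a single covariate and no tied event times, and it is not the code from the linked post; the function name and toy data are invented for illustration. Each observed event contributes beta * x_i minus the log of the summed exp(beta * x_j) over everyone still at risk at that time.

```python
import math


def cox_log_partial_likelihood(beta, times, events, x):
    """Log-partial likelihood and its derivative for a one-covariate Cox model.

    Hypothetical helper: assumes no tied event times.
    events[i] is 1 if the event was observed and 0 if censored.
    """
    log_lik = 0.0
    grad = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects only appear in risk sets
        # Risk set: everyone whose time is at least as large as subject i's.
        risk = [j for j in range(n) if times[j] >= times[i]]
        weights = [math.exp(beta * x[j]) for j in risk]
        denom = sum(weights)
        weighted_x = sum(w * x[j] for w, j in zip(weights, risk))
        log_lik += beta * x[i] - math.log(denom)
        # Derivative: covariate minus its weighted average over the risk set.
        grad += x[i] - weighted_x / denom
    return log_lik, grad


# Toy data: three subjects, the last one censored.
value, slope = cox_log_partial_likelihood(
    0.0, times=[1, 2, 3], events=[1, 1, 0], x=[1.0, 0.0, 1.0]
)
print(value, slope)
```

Stepping beta in the direction of the gradient (or handing both values to a generic optimizer) would then maximize the partial likelihood.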
 


 



 


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
Copyright © 2013-2022 DataScienceWeekly.org, All rights reserved.
