Data Science Weekly - Data Science Weekly - Issue 587
Data Science Weekly - Issue 587Curated news, articles and jobs related to Data Science, AI, & Machine LearningIssue #587 |
|
Last Week’s Poll:
Data Science Articles & Videos
Towards composable data platforms
I’ve written extensively about the open table formats (OTFs - Apache Iceberg, Delta Lake, and Apache Hudi). In my original Tableflow post, I wrote that shared tables were one of the major trends, enabled by the OTFs…But we can also view OTFs as enabling a kind of virtualization. In this post, I will start by explaining my take on OTFs and virtualization…DuckDB Newsletter: Learn DuckDB by example
Get your dose of DuckDB tips, SQL tricks, and links to learn more about the swiss knife of analytical databases…What is your daily/weekly routine if you have a WFH position? [Reddit]
I'm asking this here since data science/analytics is a very remote industry. I'm honestly trying to figure out a good cadence of when to make breakfast and get coffee, when to meal prep, when to get a 15 minute walk in, when to work out, do my hobbies etc., without driving myself insane…What is your routine for keeping up with life while you're working remotely?…Incremental Archival from Postgres to Parquet for Analytics
Ideally, data would get automatically archived in cheap storage, in a format optimized for large analytical queries.We developed two open source Postgres extensions that help you do that:
pg_parquet can export (and import) query results to the Parquet file format in object storage using regular COPY commands
pg_incremental can run a command for a never-ending series of time intervals or files, built on top of pg_cron
With some simple commands, you can set up a reliable, fully automated pipeline to export time ranges to the columnar Parquet format in S3…
Step-by-Step Diffusion: An Elementary Tutorial
We present an accessible first course on diffusion models and flow matching for machine learning, aimed at a technical audience with no diffusion experience. We try to simplify the mathematical details as much as possible (sometimes heuristically), while retaining enough precision to derive correct algorithms…Semantic Layers: The Missing Link Between AI and Data with David Jayatillake from Cube
In this episode, we chat with David Jayatillake, VP of AI at Cube, about semantic layers and their crucial role in making AI work reliably with data. We explore how semantic layers act as a bridge between raw data and business meaning, and why they're more practical than pure knowledge graphs. David shares insights from his experience at Delphi Labs, where they achieved 100% accuracy in natural language data queries by combining semantic layers with AI, compared to just 16% accuracy with direct text-to-SQL approaches…Bottom up Architecture
Although architects get a bad rap, I do think architecture is an important aspect of software engineering and the better question is to figure out how to get to an architecture that enables different teams to operate more independently…In this post, I’m discussing what software architecture even is and offer some thoughts on a bottom up approach…Data Science is losing its soul [Reddit]
DS teams are starting to lose the essence that made them truly groundbreaking. their mixed scientific and business core. What we’re seeing now is a shift from deep statistical analysis and business oriented modeling to quick and dirty engineering solutions. Sure, this approach might give us a few immediate wins but it leads to low ROI projects and pulls the field further away from its true potential. One size-fits-all programming just doesn’t work. it’s not the whole game…We often start coding to avoid "by hand" work. Here's why labeling your own data is worth the time…
How to use a histogram as a legend in {ggplot2}
Land isn’t unemployed—people are. Here’s how to use R, {ggplot2}, {sf}, and {patchwork} to create a histogram legend in a choropleth map to better see the distribution of values…When calibration beats metrics
Having a classifier with great metrics is good, but it is not enough for it to be useful in production. One reason why it might still fail is because it could be that you are dealing with a badly calibrated model. The predictions might be fine, but the probability estimates can be way off…In this video we talk about how to think about calibration and what it means…If spreadsheets are eternal, are BI tools transitory?
Consider the following scenario in the Microsoft stack: ingest data with ADF, transform with Fabric (& maybe dbt?), build a semantic model in Power BI, and delicately craft an artisanal dashboard (with your mouse). Then your stakeholder takes a look at your dashboard, navigates to the top left corner and clicks “Analyze In Excel”. How did we get here?…Building a SNAP LLM eval: part 1
This is the first write-up in a series about our process of building an "eval" — evaluation — to assess how well AI models perform on prompts related to the USA’s SNAP (food stamp) program…By sharing how we are approaching this for SNAP in some detail — including publishing a SNAP eval — we hope it will make it easier for others to do the same in similar problem spaces that matter for lower income Americans: healthcare (e.g. Medicaid), disability benefits, housing, legal help, etc…While evaluation can be a fairly technical topic, we hope these posts reduce barriers to more domain-specific evaluations being created by experts in these high-impact — but complex — areas…
.
Last Week's Newsletter's 3 Most Clicked Links
.
* Based on unique clicks.
** Find last week's issue #586 here.
Cutting Room Floor
.
Whenever you're ready, 2 ways we can help:
Looking to get a job? Check out our “Get A Data Science Job” Course
It is a comprehensive course that teaches you everything related to getting a data science job based on answers to thousands of emails from readers like you. The course has 3 sections: Section 1 covers how to get started, Section 2 covers how to assemble a portfolio to showcase your experience (even if you don’t have any), and Section 3 covers how to write your resume.Promote yourself/organization to ~66,600 subscribers by sponsoring this newsletter. 35-45% weekly open rate.
Thank you for joining us this week! :)
Stay Data Science-y!
All our best,
Hannah & Sebastian
Invite your friends and earn rewards
Older messages
Data Science Weekly - Issue 586
Friday, February 14, 2025
Curated news, articles and jobs related to Data Science, AI, & Machine Learning ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Data Science Weekly - Issue 581
Thursday, January 9, 2025
Curated news, articles and jobs related to Data Science, AI, & Machine Learning ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Data Science Weekly - Issue 580
Friday, January 3, 2025
Curated news, articles and jobs related to Data Science, AI, & Machine Learning ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Data Science Weekly - Issue 579
Thursday, December 26, 2024
Curated news, articles and jobs related to Data Science, AI, & Machine Learning ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
Data Science Weekly - Issue 578
Thursday, December 19, 2024
Curated news, articles and jobs related to Data Science, AI, & Machine Learning ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
You Might Also Like
📧 Did you watch the FREE chapter of Pragmatic REST APIs?
Friday, February 28, 2025
Hey, it's Milan. 👋 The weekend is almost upon us. So, if you're up for some quality learning, consider watching the free chapter of Pragmatic REST APIs. Scroll down to the curriculum or click
Data Science Weekly - Issue 588
Thursday, February 27, 2025
Curated news, articles and jobs related to Data Science, AI, & Machine Learning ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏ ͏
💎 Issue 458 - Why Ruby on Rails still matters
Thursday, February 27, 2025
This week's Awesome Ruby Newsletter Read this email on the Web The Awesome Ruby Newsletter Issue » 458 Release Date Feb 27, 2025 Your weekly report of the most popular Ruby news, articles and
📱 Issue 452 - Three questions about Apple, encryption, and the U.K
Thursday, February 27, 2025
This week's Awesome iOS Weekly Read this email on the Web The Awesome iOS Weekly Issue » 452 Release Date Feb 27, 2025 Your weekly report of the most popular iOS news, articles and projects Popular
💻 Issue 451 - .NET 10 Preview 1 is now available!
Thursday, February 27, 2025
This week's Awesome .NET Weekly Read this email on the Web The Awesome .NET Weekly Issue » 451 Release Date Feb 27, 2025 Your weekly report of the most popular .NET news, articles and projects
💻 Issue 458 - Full Stack Security Essentials: Preventing CSRF, Clickjacking, and Ensuring Content Integrity in JavaScript
Thursday, February 27, 2025
This week's Awesome Node.js Weekly Read this email on the Web The Awesome Node.js Weekly Issue » 458 Release Date Feb 27, 2025 Your weekly report of the most popular Node.js news, articles and
💻 Issue 458 - TypeScript types can run DOOM
Thursday, February 27, 2025
This week's Awesome JavaScript Weekly Read this email on the Web The Awesome JavaScript Weekly Issue » 458 Release Date Feb 27, 2025 Your weekly report of the most popular JavaScript news, articles
💻 Issue 453 - Linus Torvalds Clearly Lays Out Linux Maintainer Roles Around Rust Code
Thursday, February 27, 2025
This week's Awesome Rust Weekly Read this email on the Web The Awesome Rust Weekly Issue » 453 Release Date Feb 27, 2025 Your weekly report of the most popular Rust news, articles and projects
💻 Issue 376 - Top 10 React Libraries/Frameworks for 2025 🚀
Thursday, February 27, 2025
This week's Awesome React Weekly Read this email on the Web The Awesome React Weekly Issue » 376 Release Date Feb 27, 2025 Your weekly report of the most popular React news, articles and projects
February 27th 2025
Thursday, February 27, 2025
Curated news all about PHP. Here's the latest edition Is this email not displaying correctly? View it in your browser. PHP Weekly 27th February 2025 Hi everyone, Laravel 12 is finally released, and