Data Elixir - Issue 441
ISSUE 441 · June 20, 2023

In the News

Do Foundation Models Comply with the EU AI Act?
In this post from Stanford, researchers evaluate foundation model providers like OpenAI and Google for their compliance with the recently proposed EU regulations on AI. The post scores each of the models on the key issues and offers recommendations. Ultimately, the entire ecosystem will benefit from working towards compliance but it's not clear how, or if, that will happen.

Sponsored Link

Webinar: How to generate business intelligence leveraging Yelp's rich first-party data on AWS
Discover how to create actionable insights using Yelp's robust data sets to analyze your marketplace, your customers, and grow your business. Explore use cases on how businesses leverage this rich data with AWS Data Exchange to make strategic business decisions.
Date: July 19, 2023

Posts & Tutorials

Data Falsificada (Part 1): "Clusterfake"
In this first post in a series on academic fraud, researchers explore a case where two different people independently faked data for two different studies in a paper about dishonesty! Besides the backstory, what's interesting are the techniques the researchers used to dissect Excel files. There's a lot more to those files than most people realize.

Artifact corrections for effect sizes
An effect size is a way to quantify the difference between two groups. While p-values can tell you whether an effect exists, effect sizes can tell you how large that effect is. But to be useful, effect sizes need to be corrected for a variety of statistical artifacts, such as measurement error. This post walks through nearly all artifact corrections and includes equations, code snippets and an interactive learning app.

What Makes Raincloud Plots Tick?
A raincloud plot combines visualizations of the overall shape of a distribution, the raw data values, and relevant statistics. This is a nice explainer that explores how raincloud plots are useful and things to think about for their design. The post introduces a larger project on raincloud plots that includes a paper and a notebook with examples.

5 methods to detect drift in ML embeddings
This post explores the problem of drift in ML embeddings and a variety of techniques to monitor it. For each technique, there's a description of how it works, pros/cons, and experimental results.

Tools & Code

DB-GPT - Database Interactions with Private LLMs
DB-GPT is an experimental open-source project that uses local LLMs to enable you to interact with your data in natural language. Use it to generate SQL, diagnose SQL issues, provide natural language Q/A with knowledge bases, chat with documents, etc. Privacy and security are core objectives and all of your data stays in your own environment.

GPT Engineer
GPT Engineer is an AI agent that can write an entire codebase with a prompt. Specify what you want it to build, the AI asks for clarification, and then builds it. It's made to be easy to adapt and it even learns how you want your code to look. This has been out for less than a week and already has more than 22K stars on GitHub.

Resources

Spatial Statistics for Data Science
This new book introduces the theory and practice of spatial statistics using R. Covers packages for working with spatial data, the various types of spatial data and how to access it, making maps, spatial autocorrelation, Bayesian spatial models, and more. Free to read online.

Julia programming for ML
This notebook-based course introduces Julia's machine learning ecosystem and will teach you how to write reproducible, unit-tested Julia code along the way. Prior experience with Julia is not required. Covers Julia fundamentals (e.g. plotting, data frames, classical ML), deep learning, a personal project, and finishes with debugging and profiling.