Data Elixir - Issue 388
ISSUE 388 · May 24, 2022

In the News

Using ML to Help Protect the Great Barrier Reef
In spite of the costs, machine learning has been successfully used in a variety of conservation projects around the world. Here's an inside look at how the Great Barrier Reef Foundation leveraged the latest technologies to survey, monitor and map reefs at scale.

Organizations

Don’t just run your data team like a product team, run it like a company that needs to scale
Data teams are always under-resourced, but simultaneously can be seen as an already expensive investment. Here are some ideas for getting the support your data team needs.

Sponsored Link

How to Capture Advantages by Investing in High-Quality Training Data
At the enterprise level, machine learning requires either large amounts of training data or a smaller set of extremely high quality data, as well as the infrastructure to support high data volumes. Consequently, labeling data through robust software or in partnership with an annotation service provider is critical to project success. Read more.

Tutorials, Projects & Opinions

How random forests really work
In this notebook tutorial, Jeremy Howard from fast.ai shows how Random Forests work, by building one from scratch, and then using it to submit to a Kaggle competition.

Visualizing multicollinearity in Python
Multicollinearity is when two or more features are correlated with each other in a dataset and it's important to identify and understand it prior to training predictive models. This post explores three ways to visualize multicollinearity, including pros/cons of each.

Marginalia
In the world of statistics, “marginal” means “additional,” or what happens to outcome variable y when explanatory variable x changes a little. This isn't short but it's a gentle introduction to all things marginal and how they work: marginal effects, marginal slopes, average marginal effects, marginal effects at the mean, and more.

Unlock Secret Knowledge from Data Experts for $10
Packt's Spring Sale is on and for a limited period, all eBooks and Videos are only $10. Our products are available as PDF, ePub, and MP4 files for you to download and keep forever. All the practical content you need - by developers for developers.

Resources

Software Development Resources for Data Scientists
Great collection of resources that will help data teams create reproducible and production-ready code and tools. This is a crowd-sourced collection covering project structure, automated testing, reproducible environments, and version control.

Mathematics for Machine Learning
This is a tightly curated collection of free books, videos, and papers for learning mathematics for machine learning. Covers all levels.

Code & Tools

LineaPy
LineaPy is a Python package for data scientists that makes it easy to go from prototype to production. Just add two lines of code and LineaPy will automatically capture, analyze, and transform messy data science code to production data pipelines. No refactoring or new tools needed.

NannyML
NannyML is an open-source Python library that estimates real-world model performance (without access to targets), detects data drift, and links data drift alerts to changes in model performance. It's easy to use, model-agnostic and supports all tabular binary classification use cases.

Obsidian Dataview
Dataview is a data index and query language over Markdown files. It's designed as an Obsidian plugin and will give you superpowers with your Obsidian Vaults. If you're not familiar with it, Obsidian is a free graph knowledge base that works on top of a local folder of Markdown files and is great for things like note taking, book development, ideation, etc.