SRE Weekly Issue #457
In this post, we’ll explore the reasons that OOM kills can occur and provide tactics to combat and prevent them.
Will Searle — Causely
The high plateau of basic resilience is the third interim stop companies tend to reach on their journey towards resilience.
I especially enjoyed the bit about how trying to add robustness can paradoxically diminish overall reliability, reminiscent of Lorin Hochstein and others.
Uwe Friedrichsen
What happens when you move your DB and network latency goes from 0.5ms to 10ms? Time to find out by experimenting (carefully).
Lawrence Jones
I've only used Kubernetes under Amazon EKS, which handles running etcd, so this guide helped fill in some gaps in my knowledge. Of course, under EKS, you still need to pay attention to etcd.
David M. Lentz — Datadog
Google folks share how they've applied System-Theoretic Accident Model and Processes (STAMP) to SRE at Google. This really stood out to me:
A design might implement its requirements flawlessly. But what if requirements necessary for the system to be safe were incorrect or, even worse, missing altogether?
Tim Falzone and Ben Treynor Sloss — USENIX ;login:
Search and rescue (SAR) operations and incident response have striking similarities. In this series, Claire dives into lessons SREs can learn from the Incident Command Systems (ICSs) used in wildfire management.
I really love learning about ICS from the veterans who use it for actual emergencies!
Claire Leverne — Rootly
Runbooks are programs for an imperfect execution engine of highly variable quality.
What happens when the runbook meets reality?
Jos Visser
This is a really great one! Several factors combined to cause the outage, and they're all laid out in juicy detail.
Brendan Humphreys — Canva
Here's Lorin Hochstein's take on Canva's outage report.
Lorin Hochstein