SRE Weekly Issue #457
View on sreweekly.com
In this post, we’ll explore the reasons that OOM kills can occur and provide tactics to combat and prevent them.
Will Searle — Causely
The high plateau of basic resilience is the third interim stop companies tend to reach on their journey towards resilience.
I especially enjoyed the bit about how trying to add robustness can paradoxically diminish overall reliability, reminiscent of Lorin Hochstein and others.
Uwe Friedrichsen
What happens when you move your DB and network latency goes from 0.5ms to 10ms? Time to find out by experimenting (carefully).
Lawrence Jones
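To get a feel for why that jump matters, here is a rough back-of-the-envelope sketch. It is not taken from the article: the function name and the query counts are hypothetical, and it assumes each query in a request waits for one full network round trip before the next one starts.

```python
def added_latency_ms(sequential_queries: int,
                     old_rtt_ms: float = 0.5,
                     new_rtt_ms: float = 10.0) -> float:
    """Extra time per request when every query pays one network round trip.

    Assumption: queries run strictly one after another, so the extra
    round-trip cost adds up linearly with the number of queries.
    """
    return sequential_queries * (new_rtt_ms - old_rtt_ms)


if __name__ == "__main__":
    # Hypothetical query counts, chosen only for illustration.
    for n in (1, 10, 50):
        print(f"{n} sequential queries: +{added_latency_ms(n):.1f} ms per request")
```

Under these assumptions, an endpoint that issues many queries in sequence slows down far more than one that issues a single query, which is one reason to experiment carefully before moving the database.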
I've only used Kubernetes under Amazon EKS, which handles running etcd, so this guide helped fill in some gaps in my knowledge. Of course, under EKS, you still need to pay attention to etcd.
David M. Lentz — Datadog
Google folks share how they've applied System-Theoretic Accident Model and Processes (STAMP) to SRE at Google. This really stood out to me:
A design might implement its requirements flawlessly. But what if requirements necessary for the system to be safe were incorrect or, even worse, missing altogether?
Tim Falzone and Ben Treynor Sloss — USENIX ;login:
Search and rescue (SAR) operations and incident response have striking similarities. In this series, Claire dives into lessons SREs can learn from the Incident Command Systems (ICSs) used in wildfire management.
I really love learning about ICS from the veterans who use it for actual emergencies!
Claire Leverne — Rootly
Runbooks are programs for an imperfect execution engine of highly variable quality.
What happens when the runbook meets reality?
Jos Visser
This is a really great one! Several factors combined to cause the outage, and they're all laid out in juicy detail.
Brendan Humphreys — Canva
Here's Lorin Hochstein's take on Canva's outage report.
Lorin Hochstein