SRE Weekly Issue #388
Articles
This article makes a cool analogy between designing systems to operate well under unexpected load and designing socio-technical systems that operate well when the people are surprised by what the system is doing.
Lorin Hochstein
If you need to create SLAs, this article has some solid advice on how to go about it — and what to avoid.
incident.io
If Prometheus can’t scrape your service, an alert can get resolved incorrectly — and that can happen exactly when your service is failing!
Chris Siebenmann
A really nifty three-part exploration of action items in the aftermath of an incident. Rather than consider cost/benefit, this article series proposes that we think about the likelihood of an action item being completed.
J. Paul Reed
Yes, as it turns out — and these folks have the receipts (along with some theories as to why).
Colin Bartlett
The “wow” moment in this article is under the heading, “What can we learn from creative desperation?”
Eric Dobbs — Learning From Incidents
Before explaining how they set up their on-call, these folks share why they avoided it in the early stages of their startup, and what made them finally take the plunge.
Dustin Brown — DoltHub
For the good of the profession, the SRE community still needs to coalesce around more consistent job ladders, expectations, and competencies.
Code Reliant
Honeycomb had their worst incident ever at the end of July, and in their characteristic style, they’ve posted an incredibly detailed analysis of what happened — and that’s just the blog post. Then you can click through for a 17-page PDF with lots more detail.
Fred Hebert — Honeycomb
Full disclosure: Honeycomb is my employer.