SRE Weekly Issue #399
This research paper summary goes into Mode Error and the dangers of adding more features to a system in the form of modes, especially if the system can change modes on its own.
Fred Hebert (summary)
Dr. Nadine B. Sarter (original paper)
Cloudflare suffered a power outage in one of the datacenters housing their control and data planes. The outage itself is intriguing, and in its aftermath, Cloudflare learned that their system wasn’t as HA as they thought.
Lots of great lessons here, and if you want more, they posted another incident writeup recently.
Matthew Prince — Cloudflare
Separating write from read workloads can increase complexity but also open the door to greater scalability, as this article explains.
Pier-Jean Malandrino
Covers four strategies for load shedding, with code examples:
- Random Shedding
- Priority-Based Shedding
- Resource-Based Shedding
- Node Isolation
Code Reliant
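As a rough illustration of the first two strategies in the list above, here is a minimal Python sketch. It is not taken from the linked article; the function names, the 0.8 load threshold, and the integer priority scheme are all assumptions made for this example.

```python
import random


def should_shed_random(load, threshold=0.8, rng=random.random):
    """Random shedding (hypothetical helper).

    `load` is a utilization figure in [0, 1]; `threshold` is assumed to be
    strictly below 1. At or below the threshold nothing is dropped. Above it,
    the drop probability grows linearly and reaches 1.0 at full load.
    """
    if load <= threshold:
        return False
    drop_probability = min((load - threshold) / (1.0 - threshold), 1.0)
    return rng() < drop_probability


def should_shed_priority(load, priority, threshold=0.8, min_priority=2):
    """Priority-based shedding (hypothetical helper).

    Higher `priority` numbers mean more important requests. Under overload,
    only requests with priority >= `min_priority` are kept.
    """
    return load > threshold and priority < min_priority
```

A request handler would call one of these before doing any expensive work and return an early "try again later" response when the answer is True. Passing `rng` explicitly keeps the random variant deterministic in tests.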
Lots of juicy details about the three outages, including a link to AWS’s write-up of their Lambda outage in June.
Gergely Orosz
The diagrams in this article are especially useful for understanding how the circuit-breaker pattern works.
Pier-Jean Malandrino
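For readers who prefer code to diagrams, the circuit-breaker pattern can be sketched in a few lines of Python. This is an illustrative sketch rather than the article's implementation; the class name, the default threshold and timeout, and the injectable `clock` parameter are assumptions made for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests can control time
        self.failures = 0           # consecutive failures seen so far
        self.opened_at = None       # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"      # timeout elapsed: allow a trial call
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("circuit is open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call while half-open, or too many consecutive
            # failures while closed, opens the circuit and restarts the timer.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping a flaky dependency then looks like `breaker.call(fetch_user, user_id)`, where `fetch_user` stands in for whatever remote call is being protected. While the circuit is open, callers fail fast with `CircuitOpenError` instead of waiting on a dependency that is already struggling.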
This one’s about how on-call can go bad, and how to structure your team’s on-call so that it is livable and sustainable.
Michael Hart
Execs cast a big shadow in an incident, so it’s important to have a plan for how to communicate with them, as this article explains.
Ashley Sawatsky — Rootly