SRE Weekly Issue #311
I’m dedicating this issue to the people of Ukraine, and also to those in Russia who are protesting the invasion.
Articles
In this episode of the podcast Page it to the Limit, they discuss learning how to be an incident commander.
There was major AWS outage and the second day I was incident command.
Kat Gaines, with guest Iris Carrera — Page it to the Limit
This article discusses three aspects of fully owning your systems: mandate, knowledge, and responsibility. After defining those terms, it goes on to discuss what happens if one of the three is missing.
Alex Ewerlöf
I really like the “Managing High RPS” section, especially the part about ignoring events if they’re too old to be relevant any longer.
Ankush Gulati and David Gevorkyan — Netflix
Cool idea! When a process is overloaded, the system drops requests based on heuristics until the overload condition has passed.
Bryan Barkley — LinkedIn
Here’s another take on incident severity and priority levels. The two terms are different and mean specific things.
Robert Ross — FireHydrant
Can we please agree to stop calling them “postmortems”?
Ash P — Cruform Newsletter
The term “service level” goes back to the US highway system maintenance procedures, among others.
Akshay Chugh and Piyush Verma — Last9
Charity Majors has railed against metrics for years. Now, her company Honeycomb has a metrics product offering. How does she square it?
Charity Majors — Honeycomb
Despite the December AWS outage, folks aren’t fleeing AWS, and multi-cloud designs for reliability still don’t make sense, according to this cloud consultant. The media angle is fascinating.
Lydia Leong — Cloud Pundit
This article has a great list of ideas of who to talk to, plus a section on how to prioritize when you’re short on time.
Daniela Hurtado — Jeli
Outages
- Slack
  They posted a followup with details on what happened.
  A configuration change inadvertently led to a sudden increase in activity on our database infrastructure.
- crates.io (Rust package repository)
- British Airways
- Truth Social
- Peloton
- Truth Social
  Due to the overwhelming demand at launch, we are currently rate-limited on onboarding new users to the platform.