SRE Weekly Issue #304
Articles
Ably processes a lot of messages, so when they have to redesign a core part of their architecture, it gets pretty interesting.
Simon Woolf — Ably
If you ask any Site Reliability or DevOps engineer how they feel about a deployment plan with over 300 single points of failure, you’d see a lot of nauseous faces and an outbreak of nervous tics!
Nevertheless, that was the best design. Read on to find out why.
Robert Barron
Slack had three separate incidents while trying to deploy DNSSEC for slack.com. This article goes into deep detail on what went wrong each time and what they learned.
Yes, it was an oversight that we did not test a domain with a wildcard record before attempting slack.com — learn from our mistakes!
Rafael Elvira and Laura Nolan — Slack
The specializations outlined in this article include:
- The Educator
- The SLO Guard
- The Infrastructure Architect
- The Incident Response Leader
Emily Arnott — Blameless
If you had to design a WhatsApp today to support its current load, how would you go about it? Here’s one possible design.
Ankit Sirmorya — High Scalability
Yesterday I asked on Twitter why you might want to run your own DNS servers, and I got a lot of great answers that I wanted to summarize here.
Julia Evans
In this podcast interview, find out more about why Courtney Nash created the VOID and how posting an incident report can benefit your company. Transcript available.
Mandy Walls (with guest Courtney Nash) — Page it to the Limit
Drawing on Cynefin, this article explains why debugging by feel and guesswork won’t suffice anymore; we need to be methodical.
Pete Hodgson — Honeycomb