SRE Weekly Issue #272
Articles
Salesforce has posted a ton of information about their major outage two weeks ago.
It involved a change to their DNS system that, combined with an issue in the BIND daemon's shutdown, prevented the daemon from starting back up.
The analysis goes into great detail on the fact that an engineer used the Emergency Break-Fix (EBF) process to rush out the DNS configuration change.
In this case, the engineer subverted the known policy and the appropriate disciplinary action has been taken to ensure this does not happen in the future.
Thanks to an anonymous reader for pointing this out to me.
Salesforce
This article calls out the heavily blame-ridden language in the above incident analysis and in the briefing given by Salesforce’s Chief Availability Officer.
I’m dismayed to see such language from someone who is at the C-level for reliability.
“For whatever reason that we don’t understand, the employee decided to do a global deployment,” Dieken went on.
Richard Speed — The Register
…and the Twittersphere agrees with me.
If you want to blame someone, maybe try blaming the “chief availability officer” who oversees a system so fragile that one action by one engineer can cause this much damage. But it’s never that simple, is it.
@ReinH on Twitter
Another really great take on the Salesforce outage followup.
Lorin Hochstein
I like how this article covers the different roles that SREs play.
Emily Arnott — Blameless
The principles covered in this article are:
- Build a hypothesis around steady-state behavior
- Vary real-world events
- Run experiments in production
- Automate experiments to run continuously
- Minimize blast radius
Casey Rosenthal — Verica
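As a rough illustration of how the principles listed above might fit together, here is a minimal Python sketch of a single experiment. It is not taken from the linked article: `run_experiment`, `pick_blast_radius`, and the callbacks `steady_state`, `inject`, and `rollback` are hypothetical names, and the "fleet" is whatever the caller simulates. It checks the steady-state hypothesis, injects a fault into a small fraction of hosts, checks again, and always rolls back. Running it continuously or against production is left to a scheduler and is not shown.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExperimentResult:
    steady_before: bool        # did the hypothesis hold before injecting anything?
    steady_during: bool        # did it still hold while the fault was active?
    affected_hosts: List[str]  # hosts that received the fault


def pick_blast_radius(hosts: List[str], fraction: float, rng: random.Random) -> List[str]:
    """Pick a small random subset of hosts so the impact stays limited."""
    count = max(1, int(len(hosts) * fraction))
    return rng.sample(hosts, count)


def run_experiment(
    hosts: List[str],
    steady_state: Callable[[], bool],   # hypothetical: True if the system looks healthy
    inject: Callable[[str], None],      # hypothetical: introduce a fault on one host
    rollback: Callable[[str], None],    # hypothetical: undo the fault on one host
    fraction: float = 0.1,
    seed: Optional[int] = None,
) -> ExperimentResult:
    rng = random.Random(seed)

    # Do not experiment on a system that is already unhealthy.
    if not steady_state():
        return ExperimentResult(False, False, [])

    targets = pick_blast_radius(hosts, fraction, rng)
    try:
        for host in targets:
            inject(host)
        steady_during = steady_state()
    finally:
        # Always undo the fault, even if the check itself raised.
        for host in targets:
            rollback(host)

    return ExperimentResult(True, steady_during, targets)
```

With a simulated fleet of ten hosts, a steady-state check of "at least 80% healthy", and `fraction=0.2`, the experiment takes down two hosts, finds the hypothesis still holds, and restores them.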
This post is full of thought-provoking questions on the nature of configuration changes and incidents.
Lorin Hochstein
Outages
- IBM Cloud
- Klarna
  - Klarna showed users information related to other users, as detailed in this followup post.