SRE Weekly Issue #222
Articles
This article in a nutshell:
- Nines don’t matter if users aren’t happy (h/t Charity Majors)
- Chaos engineering
Kolton Andrus — Gremlin
I hadn’t heard of this distinction before. If you haven’t either, click through to find out more.
Ayende Rahien — RavenDB
In our experience, the three big sources of production stress are:
- Toil
- Bad monitoring
- Immature incident handling procedures
Cheryl Kang — Google
ProPublica picks apart the incident in exhaustive detail, showing how multiple interwoven problems in the organization contributed to this tragedy.
Robert Faturechi, Megan Rose and T. Christian Miller — ProPublica
There’s a great review of Rasmussen’s safety boundary model, which I wasn’t previously familiar with. A system moves between three boundaries:
- the boundary to economic failure
- the boundary of unacceptable work load
- the boundary of functionally acceptable performance
Lorin Hochstein
This one includes a really nifty graph showing how reliable your N backend microservices need to be in order to hit a given reliability target R.
Bill Duncan
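One way to get a feel for the numbers behind a graph like that is the sketch below. It rests on a simplifying assumption that the blurb does not state: every request touches all N backends in series and their failures are independent, so overall reliability is the product of the per-service reliabilities, R = r^N. The function name is hypothetical, and the linked article may model the problem differently.

```python
def required_backend_reliability(target: float, n_services: int) -> float:
    """Return the per-service reliability r such that r ** n_services == target.

    Assumes (for illustration only) that a request depends on all
    n_services backends in series and that failures are independent.
    """
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    if n_services < 1:
        raise ValueError("n_services must be at least 1")
    # R = r ** N  =>  r = R ** (1 / N)
    return target ** (1.0 / n_services)


if __name__ == "__main__":
    # Show how the per-service requirement tightens as N grows
    # for a "three nines" overall target.
    for n in (1, 5, 10, 50):
        r = required_backend_reliability(0.999, n)
        print(f"N={n:3d}  per-service reliability >= {r:.6f}")
```

Under these assumptions, each backend has to be more reliable than the overall target, and the gap widens as N grows.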
Here are the results of the survey I linked a couple of weeks ago. Some of the findings are interesting and surprising, and well worth a read.
Rich Burroughs — FireHydrant
A commonly used CA’s root certificate expired, causing some havoc. Even though Sectigo did everything right, some software didn’t handle the transition to the new root well.
Paul Ducklin — Naked Security
Outages
- PagerDuty
- Coinbase
  - Coinbase had an outage on June 1. Click for their post-incident analysis.
- Robinhood
  - Robinhood’s status page doesn’t show history, so I can’t verify this one.
- iCloud
- eBay
  - eBay’s status page also doesn’t show history, so I can’t verify this one either.
- Lloyds and Halifax (bank)
- Adobe Cloud
- Squarespace
  - Their followup post discusses the large-scale DDoS that contributed to the outage.
- HostedGraphite
- Telegram