SRE Weekly Issue #276
Articles
HBO accidentally sent an email to a bunch of people, then tweeted (jokingly?) that their intern was to blame. This is a link to a short, thoughtful response thread.
Gergely Orosz
This is the story of the Bunny CDN outage linked below. Great read, thanks folks!
Dejan Grofelnik Pelzel — Bunny
There’s never a bad time to review the fallacies of distributed computing. This article introduces them with examples and discussion of each.
Alex Diaconu — Ably
These aren’t specific tools, but rather 7 classes of tools (with examples). They are:
- Chaos engineering
- Monitoring and alerting
- Observability
- Paging tools
- SLO management
- Infrastructure-as-Code (and everything-as-code)
- Automated incident response
Quentin Rousseau — Rootly
Design is interpretive. We have to find common ground before we can even start to create a design, but finding that common ground is part of the design.
For example, we think of building codes as being precise, but when applied to new situations, they are ambiguous, and the engineers must make a judgment about how to apply them.
Lorin Hochstein
This starts with a really neat moment in which the interviewer asks Yiu to talk about lessons from her jewelry-making hobby that she applies to SRE.
Kurt Andersen
When GameStop’s stock shot through the roof earlier this year, Reddit’s traffic did too. This is the first article in a short series by Reddit’s SRE team on how they handled the influx.
This article is about the ways that user actions affected their systems in unexpected ways, and how they responded.
Courtney Wang — Reddit
Recently in our Site Reliability Engineering organization in Azure, we established a set of cultural values that we hold ourselves and each other accountable to.
Bill Johnson — Microsoft
Outages
- Western Digital “My Book Live” hard drives
- Amazon Prime Video and Alexa
- PharmOutcomes
  - PharmOutcomes is a SaaS used by pharmacies.
- Commonwealth Bank
- Medium
  - I’ve gotten a few 500s from Medium while trying to review articles last week and this week. Maybe it’s this incident on their status page?
- Bunny (CDN)
- Reddit
  - This post on their status site says “API errors”, but I saw rumblings that suggested that Reddit itself was down.