SRE Weekly Issue #332
Articles
Their notification service had complex load characteristics that made scaling up a tricky proposition.
Anand Prakash — Razorpay
Coalescing alerts and adding dependencies in AlertManager were the key to reducing this team’s excessive pager load.
steveazz — GitLab
Lorin Hochstein has started a series of blog posts on what we can learn about incident response from the Uvalde school shooting tragedy in the US. This article looks at how an organization’s perspective can affect its retrospective incident analysis.
Lorin Hochstein
My claim here is that we should assume the officer is telling the truth and was acting reasonably if we want to understand how these types of failure modes can happen.
Every retrospective ever:
We must assume that a person can act reasonably and still come to the wrong conclusion in order to make progress.
Lorin Hochstein
How do you synchronize state between multiple browsers and a backend, and ensure that everyone’s state will eventually converge? These folks explain how they did it, and a bug they found through testing.
Jakub Mikians — Airspace Intelligence
MTTR is a mean, so it tells you nothing about how many incidents you had, and that is only one of its potential pitfalls.
Dan Slimmon
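To make the point about means concrete, here is a minimal sketch with made-up numbers. The incident durations, variable names, and helper functions are all hypothetical and do not come from the linked article. It shows two periods with identical MTTR but very different total downtime, and one long incident pulling the mean far from the typical case.

```python
from statistics import mean, median

# Hypothetical incident durations, in minutes (illustrative data only).
quiet_quarter = [60]          # a single one-hour incident
busy_quarter = [60] * 12      # twelve one-hour incidents
skewed_quarter = [10, 10, 10, 210]  # three short incidents and one long one


def mttr(durations):
    """Mean time to recovery: the arithmetic mean of incident durations."""
    return mean(durations)


def total_downtime(durations):
    """Sum of all incident durations, which MTTR alone cannot reveal."""
    return sum(durations)


for name, data in [
    ("quiet", quiet_quarter),
    ("busy", busy_quarter),
    ("skewed", skewed_quarter),
]:
    print(
        f"{name}: incidents={len(data)} "
        f"mttr={mttr(data)} median={median(data)} "
        f"total={total_downtime(data)}"
    )
```

All three periods report the same MTTR, so the number on its own cannot distinguish a calm quarter from a rough one. Reporting incident count, total downtime, or a median alongside it is one way to recover what the mean hides.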
Last week, I included a GCP outage in europe-west2. This week, Google posted this report about what went wrong, and it’s got layers.
Bonus: another GCP outage report
Meta wants to do away with leap seconds, because they make it especially difficult to create reliable systems.
Oleg Obleukhov and Ahmad Byagowi — Meta
If you’re anywhere near incident analysis in your organization, you need to read this list.
Milly Leadley — incident.io
Outages