SRE Weekly Issue #295

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo:
https://rootly.com/?utm_source=sreweekly

Articles

I love this crystal clear argument based on statistics and research. MTTR as a metric is simply meaningless.

Courtney Nash — Verica

Their steps for better communication during an outage:

  • Provide context to minimise speculation
  • Explain what you’re doing to demonstrate you’re ‘on it’
  • Set some expectations for when things will return to normal
  • Tell people what they should do
  • Let folks know when you’ll be updating them next

Chris Evans — incident.io

Despite checking in advance to be sure their systems would support the new Let’s Encrypt certificate chain, they ran into trouble.

[…] we discovered that several HTTP client libraries our systems use were using their own vendored root certificates.

Heroku

This is the best case I’ve seen yet against multi-cloud infrastructure. I really like the airline analogy.

Lydia Leong

Roblox had a major, several-day outage starting on October 28. I don’t usually include game outages in the Outages section since they’re so common and there’s not usually much information to learn from, but I sure do like a good post-incident report. Thanks, folks!

David Baszucki — Roblox

When you’re sending small TCP packets, two optimizations can conspire to introduce an artificial 40 millisecond (not megasecond…) delay.

Vorner
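
The blurb above doesn't name the two optimizations. The usual pair behind a delay like this is Nagle's algorithm on the sender and delayed ACKs on the receiver, so treat that as an assumption here rather than a summary of the article. Under that assumption, a minimal sketch of one common mitigation is to disable Nagle's algorithm on the socket with TCP_NODELAY (the helper names are made up for illustration):

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY asks the kernel to send small segments right away
    # instead of holding them until earlier data has been acknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

Whether this is the right trade-off depends on the workload: it reduces latency for small writes but can put more small packets on the wire.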

Here’s Google’s follow-up report for their October 25-26 Meet outage.

Should you count failed requests toward your SLI if the client retries and succeeds? A good argument can be made on either side.

u/Sufficient_Tree4275 and other Reddit users
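
To make the two sides of that question concrete, here is a small sketch that computes an availability SLI both ways from the same attempts. The `Attempt` record and its `request_id` field are assumptions for illustration (it presumes retries of one logical request share an ID), not something taken from the Reddit thread.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    request_id: str  # hypothetical ID shared by all retries of one logical request
    ok: bool         # whether this individual attempt succeeded


def sli_per_attempt(attempts: list[Attempt]) -> float:
    """Count every attempt: a failed try lowers the SLI even if a retry succeeds."""
    if not attempts:
        raise ValueError("no attempts recorded")
    return sum(a.ok for a in attempts) / len(attempts)


def sli_per_request(attempts: list[Attempt]) -> float:
    """Count logical requests: a request is good if any of its attempts succeeded."""
    if not attempts:
        raise ValueError("no attempts recorded")
    outcome: dict[str, bool] = {}
    for a in attempts:
        outcome[a.request_id] = outcome.get(a.request_id, False) or a.ok
    return sum(outcome.values()) / len(outcome)


attempts = [
    Attempt("r1", False),  # first try fails...
    Attempt("r1", True),   # ...and the retry succeeds
    Attempt("r2", True),
    Attempt("r3", False),  # never retried
]
print(sli_per_attempt(attempts))  # 2 good attempts out of 4
print(sli_per_request(attempts))  # 2 good requests out of 3
```

The per-attempt figure reflects what the service actually did, while the per-request figure is closer to what the user experienced after retries. Which one to use is exactly the judgment call being debated.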

Mercari restructured its SRE team, moving toward an embedded model to adapt to their growing microservice architecture.

ShibuyaMitsuhiro — Mercari

There’s a really great discussion in this episode about leaving slack in the system in the form of bits of capacity and inefficiency that can be drawn upon to buy time during an outage.

Courtney Nash, with guests Liz Fong-Jones and Fred Hebert — Verica

Here’s how non-SREs can use SRE principles to improve their systems.

Laurel Frazier — Transposit

Outages








