SRE Weekly Issue #337

Thanks for all the vacation well-wishes! It was really great and relaxing. Take vacations; they’re important for reliability!

While I was out, I shipped the past two issues with content prepared in advance, and without the Outages section. This gave me a chance to really think hard about the value of the Outages section versus the time and effort I put into it.

I’ve decided to put the Outages section on hiatus for the time being. For notable outages, I’ll include them in the main section, on a case-by-case basis. Read on if you’re interested in what went into this decision.

The Outages section has always been of lower quality than the rest of the newsletter. I have no scientific process for choosing which Outages make the cut — mostly it’s just whatever shows up in my Google search alerts and seems “important”, minus a few arbitrary categories that don’t seem particularly interesting like telecoms and games. I do only a cursory review of the outage-related news articles I link to, and often they’re on poor-quality sites with a ton of intrusive ads. Gathering the list of Outages has begun taking more and more of my time, and I’d much rather spend that effort on curating quality content, so that’s what I’m going to do going forward.

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒.

Rootly automates manual tasks like creating an incident channel, Jira ticket, and Zoom room, inviting responders, and creating statuspage updates, postmortem timelines, and more. Want to see why companies like Canva and Grammarly love us?

https://rootly.com/demo/

Every one of these 10 items is enough reason to read this article! This makes me want to go investigate some incidents right now.

  Fischer Jemison — Jeli

Slack shares with us in great detail why they use circuit breakers and how they rolled them out.

  Frank Chen — Slack

My favorite part of this one is the section on expectations. We need to socialize this to help reduce the pressure on folks going on call for the first time.

  Prakya Vasudevan — Squadcast

Status pages are marketing material. Prove me wrong.

  Ellen Steinke — Metrist

Incidents have unusually high information density compared with day-to-day work, and they enable you to piggy-back on the experience of others.

  Lisa Karlin Curtis — incident.io

These folks realized that they had two different use cases for the same data: real-time transactions and batch processing. Rather than try to find one DB that could support both, they forked the data into two copies.

  Xi Chen and Siliang Cao — Grab

It’s all about gathering enough information that you can ask new questions when something goes wrong, rather than being stuck with only answers to the questions you thought to ask in advance.

  Charity Majors

They needed the speed of local ephemeral SSDs but the reliability of network-based persistent disks. The solution: a Linux MD option that mirrors the data across both but prefers to read from the local disks. Neat!

  Glen Oakley — Discord

OS upgrades can be risky. LinkedIn developed a system to unify OS upgrade procedures and make them much less risky.

  Hengyang Hu, Dinesh Dhakal, and Kalyanasundaram Somasundaram — LinkedIn






