SRE Weekly Issue #341

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒.

Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?

https://rootly.com/demo/

Articles

My coworkers referred to a system “going metastable”, and when I asked what that was, they pointed me to this awesome paper.

Metastable failures occur in open systems with an uncontrolled source of load where a trigger causes the system to enter a bad state that persists even when the trigger is removed.

  Nathan Bronson, Aleksey Charapko, Abutalib Aghayev, and Timothy Zhu

Honeycomb posted this incident report involving a service hitting the open file descriptors limit.

  Honeycomb
  Full disclosure: Honeycomb is my employer.

Lots of interesting answers to this one, especially when someone uttered the phrase:

engineers should not be on call

  u/infomaniac89 and others — reddit

A misbehaving internal Google service overloaded Cloud Filestore, exceeding its global request limit and effectively DoSing customers.

  Google

An in-depth look at how Adobe improved its on-call experience. They used a deliberate plan to change their team’s on-call habits for the better.

  Bianca Costache — Adobe

This one contains an interesting observation: they found that outages caused by a cloud provider take longer to resolve.

  Jeff Martens — Metrist

Even if you don’t agree with all of their reasons, it’s definitely worth thinking about.

  Danny Martinez — incident.io

This one covers common reliability risks in APIs and techniques for mitigating them.

  Utsav Shah

The evolution beyond separate Dev and Ops teams continues. This article traces the path through DevOps and into platform-focused teams.

  Charity Majors — Honeycomb
  Full disclosure: Honeycomb is my employer.






