SRE Weekly Issue #341
Articles
My coworkers referred to a system “going metastable”, and when I asked what that was, they pointed me to this awesome paper.
Metastable failures occur in open systems with an uncontrolled source of load where a trigger causes the system to enter a bad state that persists even when the trigger is removed.
  Nathan Bronson, Aleksey Charapko, Abutalib Aghayev, and Timothy Zhu
Honeycomb posted this incident report involving a service hitting the open file descriptors limit.
  Honeycomb
  Full disclosure: Honeycomb is my employer.
Lots of interesting answers to this one, especially when someone uttered the phrase:
engineers should not be on call
u/infomaniac89 and others — reddit
A misbehaving internal Google service overloaded Cloud Filestore, exceeding its global request limit and effectively DoSing customers.
An in-depth look at how Adobe improved its on-call experience. They used a deliberate plan to change their team’s on-call habits for the better.
Bianca Costache — Adobe
This one contains an interesting observation: they found that outages caused by cloud providers take longer to resolve.
Jeff Martens — Metrist
Even if you don’t agree with all of their reasons, it’s definitely worth thinking about.
Danny Martinez — incident.io
This one covers common reliability risks in APIs and techniques for mitigating them.
Utsav Shah
The evolution beyond separate Dev and Ops teams continues. This article traces the path through DevOps and into platform-focused teams.
  Charity Majors — Honeycomb
  Full disclosure: Honeycomb is my employer.