SRE Weekly - SRE Weekly Issue #373

View on sreweekly.com

A message from our sponsor, Rootly:

Rootly is hiring for a Sr. Developer Relations Advocate to continue helping more world-class companies like Figma, NVIDIA, Squarespace, accelerate their incident management journey. Looking for previous on-call engineers with a passion for making the world a more reliable place. Learn more:

https://rootly.com/careers?gh_jid=4015888007

Articles

Datadog posted a report on their major outage in March, and it’s a doozy. An unattended updates system that they didn’t even want, need, or know about triggered across all hosts in multiple clouds nearly simultaneously, causing a regression.

  Alexis Lê-Quôc — Datadog

GitHub has had a string of apparently unrelated outages recently, and they’ve posted this description.

  Mike Hanley — GitHub

Oh look, another awesome-* repo relevant to our interests!

A repo of links to articles, papers, conference talks, and tooling related to load management in software services: loadshedding, circuitbreaking, quota management and throttling. PRs welcome.

  Laura Nolan and Niall Murphy — Stanza Systems

This interview covers a lot of ground including looking beyond just “up or down” when considering reliability.

  Prathamesh Sonpatki — SRE Stories

If you’re in the mood for a deep systems debugging story, you’re in for a treat. The author takes you along for the ride with a wealth of detailed code snippets.

  Tycho Andersen — Netflix

Regardless of the replication mechanism you must fsync() your data to prevent global data loss in non-Byzantine protocols.

  Denis Rystsov and Alexander Gallego — Redpanda

Emotional intelligence is a critical skill for SREs, especially when we interact with other teams in fraught situations.

  Amin Astaneh — Certo Modo

Wow! Spotify created a set of tools to perform automated refactoring of thousands of repositories at once. This includes the ability to run tests, automatically merge pull requests without human review, and roll refactorings out gradually.

  Matt Brown — Spotify

Jeli has published a one-page cheat-sheet for their highly-detailed Howie guide for running incident retrospectives.

  Jeli







This email was sent to you
why did I get this?    unsubscribe from this list    update subscription preferences
SRE Weekly · PO Box 253 · South Lancaster, MA 01561-0253 · USA

Older messages

SRE Weekly Issue #372

Monday, May 15, 2023

View on sreweekly.com A message from our sponsor, Rootly: Rootly is hiring for a Sr. Developer Relations Advocate to continue helping more world-class companies like Figma, NVIDIA, Squarespace,

SRE Weekly Issue #371

Monday, May 8, 2023

View on sreweekly.com A message from our sponsor, Rootly: Rootly is hiring for a Sr. Developer Relations Advocate to continue helping more world-class companies like Figma, NVIDIA, Squarespace,

SRE Weekly Issue #370

Monday, May 1, 2023

View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms

SRE Weekly Issue #369

Monday, April 24, 2023

View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms

SRE Weekly Issue #368

Wednesday, April 19, 2023

View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms

You Might Also Like

What Investors Want From AI Startups in 2025

Monday, November 25, 2024

Top Tech Content sent at Noon! How the world collects web data Read this email in your browser How are you, @newsletterest1? 🪐 What's happening in tech today, November 25, 2024? The HackerNoon

GCP Newsletter #426

Monday, November 25, 2024

Welcome to issue #426 November 25th, 2024 News LLM Official Blog Vertex AI Announcing Mistral AI's Large-Instruct-2411 on Vertex AI - Google Cloud has announced the availability of Mistral AI's

⏳ 36 Hours Left: Help Get "The Art of Data" Across the Finish Line 🏁

Monday, November 25, 2024

Visual Capitalist plans to unveal its secrets behind data storytelling, but only if the book hits its minimum funding goal. View Online | Subscribe | Download Our App We Need Your Help Only 36 Hours

DeveloPassion's Newsletter #180 - Black Friday Week

Monday, November 25, 2024

Edition 180 of my newsletter, discussing Knowledge Management, Knowledge Work, Zen Productivity, Personal Organization, and more! Sébastien Dubois DeveloPassion's Newsletter DeveloPassion's

Meet HackerNoon's Latest Features: Boost Stories with Translations, Speech-to-Text & More

Monday, November 25, 2024

Hey, Hacker! HackerNoon's monthly product update is here! Get ready for a new version of the mobile app, more translation developments, a new AI Gallery, backend moves, and more! 🚀 This product

The ultimate holiday gadget gift

Monday, November 25, 2024

AI isn't hitting a wall; $70 off Apple Watch; 60+ Amazon deals -- ZDNET ZDNET Tech Today - US November 25, 2024 Meta Quest 3S Why the Meta Quest 3S is the ultimate 2024 holiday present This $299

Deduplication in Distributed Systems: Myths, Realities, and Practical Solutions

Monday, November 25, 2024

This week, we'll discuss the deduplication strategies. We'll see whether they're useful and consider scenarios where you may need them. We'll also do a reality check with the promises

How to know if your data has been exposed

Monday, November 25, 2024

How do you know if your personal data has been leaked? Imagine getting an instant notification if your SSN, credit card, or password has been exposed on the dark web — so you can take action

⚙️ Amazon and Anthropic

Monday, November 25, 2024

Plus: The hidden market of body-centric data ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌

⚡ THN Recap: Top Cybersecurity Threats, Tools & Tips (Nov 18-24)

Monday, November 25, 2024

Don't miss the vital updates you need to stay secure. Read the full recap now. The Hacker News THN Recap: Top Cybersecurity Threats, Tools, and Practices (Nov 18 - Nov 24) We hear terms like “state