SRE Weekly - SRE Weekly Issue #340
Articles
This one’s from a couple years ago and covers 3 main themes the author saw at SRECon Americas 2020. Fascinating topics include providing context for newbies, learning from incidents, and rethinking the incident command system.
Taylor Barnett — Transposit
On September 8, Honeycomb had a major outage in data ingestion, and they’ve posted this preliminary report, “pending an in-depth incident review in the upcoming weeks”.
BONUS CONTENT: Another outage report from a different outage the next day.
Honeycomb
Full disclosure: Honeycomb is my employer.
This is neat! Someone posted a day in their life as an actual SRE, and a bunch of commenters followed suit.
Various commenters — Reddit
Some big names in SRE got together to talk about how to know when your system is broken. Listen to the recording or read this excellent summary that goes in depth on grey failures and more.
Emily Arnott — Blameless
To better scale our systems, our infrastructure and product teams got together and decided to make these optimizations: reduce database loads, conduct load tests and size the demand and prioritize critical flows.
…and sharding.
Robinhood
A major incident went poorly, and that catalyzed investment in developing a new incident response system. They worked to transition from swarming to Incident Command.
Vikrant Saini — Razorpay
I love this part:
[…] if you have to deploy your microservices in a certain order, they’re not really microservices.
Cortex
This one had an interesting interplay of contributing factors.
Heroku
|
Older messages
SRE Weekly Issue #339
Monday, September 19, 2022
View on sreweekly.com It's with great sadness that I note the passing of a giant in our field, Dr. Richard Cook. His memory will live on through his huge body of work and the countless ways
SRE Weekly Issue #338
Monday, September 12, 2022
View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms
SRE Weekly Issue #337
Monday, September 5, 2022
View on sreweekly.com Thanks for all the vacation well-wishes! It was really great and relaxing. Take vacations, it's important for reliability! While I was out, I shipped the past two issues with
SRE Weekly Issue #336
Monday, August 29, 2022
View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and
SRE Weekly Issue #335
Monday, August 22, 2022
View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and
You Might Also Like
What Investors Want From AI Startups in 2025
Monday, November 25, 2024
Top Tech Content sent at Noon! How the world collects web data Read this email in your browser How are you, @newsletterest1? 🪐 What's happening in tech today, November 25, 2024? The HackerNoon
GCP Newsletter #426
Monday, November 25, 2024
Welcome to issue #426 November 25th, 2024 News LLM Official Blog Vertex AI Announcing Mistral AI's Large-Instruct-2411 on Vertex AI - Google Cloud has announced the availability of Mistral AI's
⏳ 36 Hours Left: Help Get "The Art of Data" Across the Finish Line 🏁
Monday, November 25, 2024
Visual Capitalist plans to unveal its secrets behind data storytelling, but only if the book hits its minimum funding goal. View Online | Subscribe | Download Our App We Need Your Help Only 36 Hours
DeveloPassion's Newsletter #180 - Black Friday Week
Monday, November 25, 2024
Edition 180 of my newsletter, discussing Knowledge Management, Knowledge Work, Zen Productivity, Personal Organization, and more! Sébastien Dubois DeveloPassion's Newsletter DeveloPassion's
Meet HackerNoon's Latest Features: Boost Stories with Translations, Speech-to-Text & More
Monday, November 25, 2024
Hey, Hacker! HackerNoon's monthly product update is here! Get ready for a new version of the mobile app, more translation developments, a new AI Gallery, backend moves, and more! 🚀 This product
The ultimate holiday gadget gift
Monday, November 25, 2024
AI isn't hitting a wall; $70 off Apple Watch; 60+ Amazon deals -- ZDNET ZDNET Tech Today - US November 25, 2024 Meta Quest 3S Why the Meta Quest 3S is the ultimate 2024 holiday present This $299
Deduplication in Distributed Systems: Myths, Realities, and Practical Solutions
Monday, November 25, 2024
This week, we'll discuss the deduplication strategies. We'll see whether they're useful and consider scenarios where you may need them. We'll also do a reality check with the promises
How to know if your data has been exposed
Monday, November 25, 2024
How do you know if your personal data has been leaked? Imagine getting an instant notification if your SSN, credit card, or password has been exposed on the dark web — so you can take action
⚙️ Amazon and Anthropic
Monday, November 25, 2024
Plus: The hidden market of body-centric data
⚡ THN Recap: Top Cybersecurity Threats, Tools & Tips (Nov 18-24)
Monday, November 25, 2024
Don't miss the vital updates you need to stay secure. Read the full recap now. The Hacker News THN Recap: Top Cybersecurity Threats, Tools, and Practices (Nov 18 - Nov 24) We hear terms like “state