SRE Weekly Issue #238
My daughters asked earlier today what I do at work, and I explained all about SRE, reliability, and the importance of work-life balance. They said to tell you they say hi!
Articles
Lots of really great advice in here. And really, with a title like that, I couldn’t resist reading it!
Charity Majors
Last week, I mentioned a Google Cloud Platform outage that affected multiple services. Here’s the detailed post-analysis by Google.
This one is along the lines of the classic Ironies of Automation paper by Bainbridge. How can automation be a team player, and what happens when it isn’t?
Nadine Sarter and David Woods (original paper)
Thai Wood — Resilience Roundup (summary)
How can you use chaos engineering when failures in the system can be critical and even life-threatening?
Carl Chesser — InfoQ
In this blog post, we’ll look at how SRE can improve NOC functions such as system monitoring, triage and escalation, incident response procedure, and ticketing.
Emily Arnot — Blameless
This article suggests using chaos engineering to tell if your microservice-based architecture is secretly a monolith in disguise.
Andre Newman — Gremlin
Outages
- Slack
- Radware
  - An accidental BGP hijack by Telstra took down Radware.
- Tokyo Stock Exchange
  - The Tokyo Stock Exchange was down for an entire day, the first time that’s ever happened.
- Fastly
- Squarespace
- Google Search Indexing
- Microsoft Azure outage #SM79-F88
  - A problem with Azure Active Directory caused trouble for Office365 and other Microsoft services. Click through for their detailed follow-up.