Special: When the Cloud Has a Bad Day: Cloudflare, AWS us-east-1 & GitHub Outages

Ship It Weekly · 2025-11-20 · 13 min

Episode notes

In this special kickoff episode of Ship It Weekly, Brian walks through three major outages from the last few weeks and what they actually mean for DevOps, SRE, and platform teams. Instead of just reading status pages, we look at how each incident exposes assumptions in our own architectures and runbooks.

Topics in this episode:
• Cloudflare’s global outage and what happens when your CDN/WAF becomes a single point of failure
• The AWS us-east-1 incident and why “multi-AZ in one region” isn’t a full disaster recovery strategy
• GitHub’s Git operations / Codespaces outage and how fragile our CI/CD and GitOps flows can be
• Practical questions to ask about your own setup: CDN bypass, cross-region readiness, backups for Git and CI

This episode is more of a themed “special” to kick things off. Going forward, most episodes will follow a lighter news format: a couple of main stories from the week in DevOps/SRE/platform engineering, a quick tools and releases segment, and one culture/on-call or burnout topic. Specials like this will pop up when there’s a big incident or theme worth unpacking.
