How Datadog Monitors Its Own Monolith at Scale
The CTO Podcast with Fexingo · 2026-05-30 · 8 min
Episode notes
Episode 20 of The CTO Podcast dives into a paradox: how does Datadog, the company that sells observability software, actually monitor its own massive monolith? Lucas and Luna walk through the architecture behind Datadog's internal dogfooding strategy: a single codebase that handles millions of metrics per second. They explore the tradeoffs of keeping a monolith versus moving to microservices, how the engineering team built an internal tool called 'Watchtower' to catch regressions before they hit customers, and why Datadog's CTO decided against splitting the core observability pipeline into separate services. Along the way, they cite a specific threshold, 1.2 million events per second per host, and explain how the team tracks it. A concrete look at how one company eats its own dog food at planetary scale.