How Datadog Monitors Its Own 100-Terabyte Infrastructure

The CTO Podcast with Fexingo · 2026-06-16 · 10 min

Episode notes

Episode 54 of The CTO Podcast: Lucas and Luna explore how Datadog, the monitoring giant, uses its own tools to manage a sprawling infrastructure that ingests over 100 terabytes of data daily. They dive into the dogfooding strategy, the architectural choices that keep observability scalable, and the surprising insight that Datadog runs its entire backend on a single PostgreSQL fork with custom sharding. Lucas explains the engineering org structure behind the monitoring team, and Luna questions whether dogfooding can blind teams to customer pain. Specific examples include how Datadog handles metric cardinality explosion and why they built a separate time-series database internally before launching it as a product.

#Datadog #Observability #Dogfooding #TechLeadership #Infrastructure #PostgreSQL #Scalability #TimeSeriesDatabase #EngineeringCulture #Monitoring #CTOPodcast #FexingoBusiness #BusinessPodcast #Architecture #Sharding #MetricCardinality #SRE #CloudNative

Keep every episode free: buymeacoffee.com/fexingo
