← DevOps Daily with Fexingo: CI/CD, Kubernetes, and Modern Software Operations

How Kubernetes Horizontal Pod Autoscaler Creates Thundering Herds

DevOps Daily with Fexingo: CI/CD, Kubernetes, and Modern Software Operations · 2026-06-21 · 7 min

Episode notes

Lucas and Luna drill into a specific Kubernetes scaling failure: the 'thundering herd' problem caused by the Horizontal Pod Autoscaler. They walk through a real scenario where a retail app's HPA misconfiguration caused 120 pod replicas to start simultaneously, overwhelming the API server and database connection pool. Lucas explains the mechanism behind HPA's default behavior — how it calculates target replicas based on average utilization — and why a sudden traffic spike can trigger a cascade. Luna brings data from a case study at a major e-commerce company that saw latency spike 8x during a flash sale due to this exact pattern. They discuss mitigation strategies: using the '--horizontal-pod-autoscaler-sync-period' flag to slow down scaling, setting 'scaleDown' stabilization windows, and leveraging VPA alongside HPA for smoother response. The episode gives listeners a concrete fix they can test in staging on Monday.

All DevOps Daily with Fexingo: CI/CD, Kubernetes, and Modern Software Operations episodes →