How Stripe Rebuilt Payment Routing for 99.999% Uptime

The CTO Podcast with Fexingo · 2026-06-15 · 9 min

Episode notes

Stripe's payment infrastructure processes billions of dollars annually, and its routing engine (the system that decides which bank or processor gets each transaction) is a marvel of distributed systems engineering. In this episode, Lucas and Luna explore how Stripe rebuilt its payment routing layer to achieve five-nines uptime, handling bank-level failures in milliseconds without user impact. They break down the architecture: the state machine that tracks each transaction through six phases, the circuit-breaker pattern that isolates failing processors, and the decision-tree optimization that cut latency by 40 percent. Lucas explains why he considers routing the hardest problem in payments, more complex than fraud detection or compliance, and how Stripe's design influenced the broader fintech industry. Luna draws parallels to how other critical infrastructure systems, from DNS to CDNs, solve similar reliability problems. A concrete look at what it takes to move money reliably at internet scale.
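
For listeners new to the circuit-breaker pattern mentioned above, here is a minimal, generic sketch in Python. It is not Stripe's implementation. The class name, the thresholds, and the `pick_processor` helper are all illustrative assumptions, and the episode notes do not describe how Stripe's breakers are actually configured.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Generic circuit breaker (illustrative, not Stripe's code).

    Closed: traffic flows. Open: traffic is blocked after repeated failures.
    Half-open: after a cool-down, a trial request is allowed through.
    """

    def __init__(
        self,
        failure_threshold: int = 3,      # hypothetical default
        reset_timeout: float = 30.0,     # hypothetical cool-down in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the breaker is closed

    def allow(self) -> bool:
        """Return True if a request may be sent to this processor."""
        if self.opened_at is None:
            return True
        # Half-open: permit a trial request once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_timeout

    def record_success(self) -> None:
        """A successful call closes the breaker and clears the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) the breaker at the threshold."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def pick_processor(breakers: dict[str, CircuitBreaker]) -> Optional[str]:
    """Return the first processor whose breaker allows traffic.

    Dict insertion order is treated as routing priority (an assumption).
    """
    for name, breaker in breakers.items():
        if breaker.allow():
            return name
    return None
```

In this sketch, a router would call `pick_processor` before each transaction and report the outcome with `record_success` or `record_failure`, so a failing processor stops receiving traffic until its cool-down expires.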
