How Discord Rebuilt Its Voice Engine for Latency

The CTO Podcast with Fexingo · 2026-06-06 · 8 min

Episode notes

In this episode of The CTO Podcast, Lucas and Luna dig into Discord's architectural overhaul of its real-time voice system. They explore how the team cut latency from hundreds of milliseconds to under 50 milliseconds by switching from a traditional client-server model to a mesh-based WebRTC architecture. The discussion covers the trade-offs of running in-house media servers versus outsourcing, the engineering challenge of synchronizing 50 users in a single voice channel without a central coordinator, and how Discord handled the transition without disrupting its 150 million monthly active users. Lucas explains the key insight: rather than optimizing the existing pipeline, Discord rethought the entire signaling and media routing layer around a 'selective forwarding unit' (SFU) pattern. Luna presses on the operational cost of running proprietary infrastructure at scale, and Lucas shares the surprising finding that the rewrite reduced server spend by 30 percent. The episode closes with a reflection on when to rebuild versus patch.
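
For listeners who want a feel for the routing trade-off before pressing play, here is a minimal sketch comparing stream counts in a full mesh with those in a selective forwarding unit (SFU) setup for the 50-person channel mentioned above. The function names are hypothetical and the numbers are generic topology arithmetic, not Discord's implementation or figures quoted in the episode.

```python
def mesh_streams(participants: int) -> dict:
    """Full mesh: every client connects directly to every other client."""
    peers = participants - 1
    return {
        "connections": participants * peers // 2,  # one link per pair
        "uploads_per_client": peers,
        "downloads_per_client": peers,
    }


def sfu_streams(participants: int) -> dict:
    """SFU: each client sends one stream to a server, which forwards it on."""
    return {
        "connections": participants,  # one link per client to the server
        "uploads_per_client": 1,
        "downloads_per_client": participants - 1,
    }


if __name__ == "__main__":
    n = 50  # channel size taken from the episode notes
    print("mesh:", mesh_streams(n))
    print("sfu: ", sfu_streams(n))
```

In this simplified model the download side is the same in both layouts; what changes is that each client uploads once instead of once per peer, which is the usual argument for forwarding through a server in larger channels.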
