How Discord Rebuilt Its Voice Engine for Latency
The CTO Podcast with Fexingo · 2026-06-06 · 8 min
Episode notes
In this episode of The CTO Podcast, Lucas and Luna dive into Discord's architectural overhaul of its real-time voice system. They explore how the team reduced latency from hundreds of milliseconds to under 50 milliseconds by switching from a traditional client-server model to a mesh-based WebRTC architecture. The discussion covers the trade-offs of running their own media servers versus outsourcing, the engineering challenge of synchronizing 50 users in a single voice channel without a central coordinator, and how Discord handled the transition without disrupting its 150 million monthly active users. Lucas explains the key insight: rather than optimizing the existing pipeline, Discord rethought the entire signaling and media routing layer around a 'selective forwarding unit' pattern. Luna presses on the operational cost of running proprietary infrastructure at scale, and Lucas shares the surprising finding that the rewrite actually reduced server spend by 30 percent. The episode closes with a reflection on when to rebuild versus when to patch.
More from The CTO Podcast with Fexingo
- How Airbnb Rebuilt Search for 8 Million Listings
- How GitLab Built a Single Codebase for One Million CI Pipelines
- How Slack Rebuilt Its Search Index for 10 Million Daily Queries
- How Notion Rebuilt Its Sync Engine for Offline-First
- How Notion Rebuilt Its Block Engine for Hybrid Local-Sync