How Spotify Rebuilt Its Recommender System for 600 Million Users
The CTO Podcast with Fexingo · 2026-06-19 · 13 min
Episode notes
In this episode of The CTO Podcast, Lucas and Luna dive into how Spotify rebuilt its core recommender engine, moving from a batch-based collaborative filtering system to a real-time graph neural network serving 600 million users. They explore the specific architectural decisions behind Spotify's migration from Apache Spark and nightly model retraining to a streaming pipeline with TensorFlow and graph embeddings. Lucas explains why the team chose to model user listening sessions as dynamic graphs, how they reduced cold-start latency from hours to under 30 seconds, and the trade-offs they made between compute cost and recommendation freshness. Luna presses on the practical challenges of A/B testing recommender changes at scale and how Spotify balanced personalization with exploration. The episode also touches on engineering org decisions, including how Spotify structured cross-functional squads around product outcomes rather than model components. By the end, listeners will understand why graph neural networks are becoming the standard for recommendation at tech giants and what it takes to deploy them in production.
More from The CTO Podcast with Fexingo
- How Airbnb Rebuilt Search for 8 Million Listings
- How GitLab Built a Single Codebase for One Million CI Pipelines
- How Slack Rebuilt Its Search Index for 10 Million Daily Queries
- How Notion Rebuilt Its Sync Engine for Offline-First
- How Notion Rebuilt Its Block Engine for Hybrid Local-Sync