How Spotify Rebuilt Its Recommender System for 600 Million Users

The CTO Podcast with Fexingo · 2026-06-19 · 13 min

Episode notes

In this episode of The CTO Podcast, Lucas and Luna dive into how Spotify rebuilt its core recommender engine, moving from a batch-based collaborative filtering system to a real-time graph neural network serving 600 million users. They explore the specific architectural decisions behind Spotify's migration from Apache Spark and nightly model retraining to a streaming pipeline with TensorFlow and graph embeddings. Lucas explains why the team chose to model user listening sessions as dynamic graphs, how they reduced cold-start latency from hours to under 30 seconds, and the trade-offs they made between compute cost and recommendation freshness. Luna presses on the practical challenges of A/B testing recommender changes at scale and how Spotify balanced personalization with exploration. The episode also touches on engineering org decisions, including how Spotify structured cross-functional squads around product outcomes rather than model components. By the end, listeners will understand why graph neural networks are becoming the standard for recommendation at tech giants and what it takes to deploy them in production.
