Design Netflix Recommendation Engine
Offline candidate generation + online ranking, "row-of-rows" homepage, A/B testing infra, personalization signals.
Theory
Explanation
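The homepage is a stack of rows of titles. Each row is generated by a different "row algorithm" (Continue Watching, Because You Watched X, Trending Now, and so on). Each row algorithm picks and ranks its own candidate titles, then a meta-ranker orders the rows themselves. Personalization compounds through both layers: which titles appear inside a row, and which rows appear near the top of the page.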
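The two layers can be sketched in a few lines of Python. This is a minimal illustration, not a description of Netflix's actual code: the row functions, the `GLOBAL_TRENDING` list, the profile fields, and the `row_affinity` scores (standing in for a learned meta-ranker) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Row:
    row_id: str
    titles: list[str]
    score: float  # how relevant the whole row is to this profile

# Hypothetical global list; a real system would compute this from recent plays.
GLOBAL_TRENDING = ["t9", "t3", "t7"]

def continue_watching(profile: dict) -> list[str]:
    # Layer 1: a row algorithm returns its own ranked candidates.
    return list(profile["in_progress"])

def trending_now(profile: dict) -> list[str]:
    # Trending is global, but still filtered per profile.
    return [t for t in GLOBAL_TRENDING if t not in profile["finished"]]

ROW_ALGORITHMS = {
    "continue_watching": continue_watching,
    "trending_now": trending_now,
}

def build_homepage(profile: dict, row_affinity: dict[str, float],
                   max_rows: int = 2, max_titles: int = 3) -> list[Row]:
    rows = []
    for row_id, algo in ROW_ALGORITHMS.items():
        titles = algo(profile)[:max_titles]
        if titles:  # skip rows with nothing to show
            rows.append(Row(row_id, titles, row_affinity.get(row_id, 0.0)))
    # Layer 2: the meta-ranker orders the rows themselves.
    rows.sort(key=lambda r: r.score, reverse=True)
    return rows[:max_rows]

profile = {"in_progress": ["t1", "t2"], "finished": {"t3"}}
page = build_homepage(profile, {"continue_watching": 0.9, "trending_now": 0.4})
for row in page:
    print(row.row_id, row.titles)
```

Keeping row generation and row ordering as separate steps means a new row type can be added without touching the meta-ranker.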
Offline: a nightly batch job ranks all titles per profile using matrix factorization and neural collaborative filtering. The results are stored per (profile_id, row_id) in a materialized cache.
Online: at homepage load, the service fetches the row materializations, applies real-time signals (recency, fresh releases), and picks the top-K rows and the top-K titles per row.
Experimentation: an A/B framework allocates traffic to experiments, and metrics dashboards measure long-term retention, not just CTR.
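A rough sketch of the online step, assuming the cache is a plain dict keyed by (profile_id, row_id) and holding (title_id, offline_score) pairs. The freshness boost, the "just watched" filter, and the use of the best title score as the row score are illustrative assumptions, not details from the design above.

```python
DAY_S = 86_400

Cache = dict[tuple[str, str], list[tuple[str, float]]]

def rerank_row(cached: list[tuple[str, float]], release_ts: dict[str, float],
               just_watched: set[str], now: float, k: int,
               fresh_days: int = 7, fresh_boost: float = 0.3):
    """Apply real-time signals on top of offline scores, keep the top k."""
    adjusted = []
    for title_id, score in cached:
        if title_id in just_watched:
            continue  # recency signal: hide what was watched this session
        age_days = (now - release_ts.get(title_id, 0.0)) / DAY_S
        if age_days <= fresh_days:
            score += fresh_boost  # fresh-release signal
        adjusted.append((title_id, score))
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return adjusted[:k]

def serve_homepage(cache: Cache, profile_id: str, row_ids: list[str],
                   release_ts: dict[str, float], just_watched: set[str],
                   now: float, k_rows: int, k_titles: int):
    page = []
    for row_id in row_ids:
        cached = cache.get((profile_id, row_id), [])
        top = rerank_row(cached, release_ts, just_watched, now, k_titles)
        if top:
            # Stand-in for the meta-ranker: score a row by its best title.
            row_score = max(score for _, score in top)
            page.append((row_id, row_score, [t for t, _ in top]))
    page.sort(key=lambda row: row[1], reverse=True)
    return [(row_id, titles) for row_id, _, titles in page[:k_rows]]

cache = {
    ("p1", "because_you_watched"): [("t1", 0.9), ("t2", 0.8), ("t3", 0.6)],
    ("p1", "trending_now"): [("t4", 0.7), ("t5", 0.6)],
}
now = 100 * DAY_S
print(serve_homepage(cache, "p1", ["trending_now", "because_you_watched"],
                     {"t3": 99 * DAY_S}, {"t1"}, now, k_rows=2, k_titles=2))
```

The expensive scoring stays in the nightly job; the request path only filters, adds small adjustments, and sorts short lists.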
When to use
Content discovery, e-commerce homepages.
When not to
When a strict ordering is required, such as intent-driven search results.
flowchart LR
  Watch[Watch History] --> Offline[Offline Pipeline · nightly]
  Offline --> Models[Row Models · CF, content, trending]
  Models --> Cache[(Materialized Row Cache · per profile)]
  HomeAPI[Homepage API] --> Cache
  HomeAPI --> RT[Real-time Signals]
  HomeAPI --> Meta[Meta-Ranker · row order]
  Meta --> Client[Client]
  AB[A/B Framework] -.flag.-> Meta
  AB --> Metrics[(Long-term Retention)]
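The A/B Framework node in the flowchart needs a way to decide which meta-ranker variant a profile sees. One common approach, shown here as an assumption rather than as the design's stated method, is deterministic hash bucketing: the same profile always lands in the same variant of a given experiment, and no assignment table is needed. The function name and the experiment name are hypothetical.

```python
import hashlib

def assign_variant(profile_id: str, experiment: str,
                   variants: tuple[str, ...] = ("control", "treatment")) -> str:
    """Deterministically map a profile to one variant of an experiment."""
    # Salting with the experiment name decorrelates buckets across experiments.
    key = f"{experiment}:{profile_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) % len(variants)
    return variants[bucket]

variant = assign_variant("profile-123", "meta_ranker_v2")
print(variant)  # same answer on every call for this profile and experiment
```

Because assignment is stable over time, long-term metrics such as retention can be compared between variants weeks after the experiment starts.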
Key insights
- Two stages: each row algorithm picks its candidates, then the meta-ranker orders the rows. Both stages are personalized.
- Offline materialization saves online compute; the real-time overlay keeps results fresh.
- A/B success metrics are retention and engagement, not just CTR.
- Diversity injection prevents filter bubbles (for example, always show one row outside the profile's comfort zone).
- Cold start is handled by content-based recommendations plus an onboarding survey.
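The last two insights can be illustrated with a small sketch. Both functions are hypothetical: `inject_diversity` reserves the last visible slot for a row outside the profile's usual genres, and `blended_score` shifts weight from a content-based score to a collaborative-filtering score as watch history accumulates. The linear blend and the `warm_after` threshold are assumptions chosen for simplicity.

```python
def inject_diversity(ranked_rows: list[tuple[str, str]],
                     comfort_genres: set[str], k: int) -> list[tuple[str, str]]:
    """ranked_rows holds (row_id, genre) pairs, best first. Returns k rows."""
    if k <= 0:
        return []
    top = ranked_rows[:k]
    if any(genre not in comfort_genres for _, genre in top):
        return top  # already contains something outside the comfort zone
    for row in ranked_rows[k:]:
        if row[1] not in comfort_genres:
            return top[:-1] + [row]  # swap out the lowest-ranked visible row
    return top  # nothing diverse available

def blended_score(cf_score: float, content_score: float,
                  n_watched: int, warm_after: int = 20) -> float:
    """Cold start: rely on content-based scores until history builds up."""
    w = min(1.0, n_watched / warm_after)
    return w * cf_score + (1.0 - w) * content_score

rows = [("r1", "action"), ("r2", "action"), ("r3", "thriller"), ("r4", "documentary")]
print(inject_diversity(rows, {"action", "thriller"}, k=3))
print(blended_score(cf_score=0.9, content_score=0.2, n_watched=0))
```

For a brand-new profile, the content-based score could be seeded from onboarding survey answers, since there is no watch history yet.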