system design · system-design
Design Google Search Autocomplete (Type-Ahead)
Trie at scale, top-K ranking per prefix, freshness, multi-language. Most-asked Google + Amazon SDI.
Theory
Explanation
Intuition first, formal definition second. Skim the bullets if you already know this; read the prose if you don't.
Suggest top 10 queries that start with what the user typed. Must return in <100ms over billions of queries. Trie pre-computed offline holds top-K per prefix; online lookup is a single hash + read.
Offline: aggregate query logs daily, count queries per (prefix, query) pair, store top-K per prefix in a trie. Trie serialized to KV store. Online: client sends prefix → server hashes → fetches top-K from Redis → ranks/filters → returns. Personalization layer optional (user history feeds into reranking). Freshness via incremental updates from stream: trending queries promoted in near-real-time.
When to use
Search boxes, e-commerce product search, email recipient autocomplete.
When not to
Tiny corpora, just client-side filter. Personal contacts, local index.
flowchart LR Client([Client]) -->|prefix| API[Autocomplete API] API --> Cache[(Redis · prefix → top-K)] Cache -.miss.-> KV[(Persistent KV)] Stream[Query Stream] --> Aggr[Daily Aggregator] Aggr --> Build[Trie Builder] Build --> KV Build --> Cache Trending[Trending Stream] -.real-time.-> Cache
Key insights
- Pre-compute top-K per prefix offline; online is read-only.
- Trie need not be sent to client, server-side lookup with hot prefix in Redis is faster.
- Misspelling tolerance via edit-distance variants pre-computed for top queries.
- Personalization adds latency; cap at +50ms with per-user lookup.
- Freshness via incremental stream merge, trending events promoted within minutes.