system design · system-design

Design Google Search Autocomplete (Type-Ahead)

Trie at scale, top-K ranking per prefix, freshness, multi-language. Most-asked Google + Amazon SDI.

hard3hgeneralredissystem-design
Ask GPTConfidence

Theory

Explanation

Intuition first, formal definition second. Skim the bullets if you already know this; read the prose if you don't.

Suggest top 10 queries that start with what the user typed. Must return in <100ms over billions of queries. Trie pre-computed offline holds top-K per prefix; online lookup is a single hash + read.

Offline: aggregate query logs daily, count queries per (prefix, query) pair, store top-K per prefix in a trie. Trie serialized to KV store. Online: client sends prefix → server hashes → fetches top-K from Redis → ranks/filters → returns. Personalization layer optional (user history feeds into reranking). Freshness via incremental updates from stream: trending queries promoted in near-real-time.

When to use

Search boxes, e-commerce product search, email recipient autocomplete.

When not to

Tiny corpora, just client-side filter. Personal contacts, local index.

flowchart LR
  Client([Client]) -->|prefix| API[Autocomplete API]
  API --> Cache[(Redis · prefix → top-K)]
  Cache -.miss.-> KV[(Persistent KV)]
  Stream[Query Stream] --> Aggr[Daily Aggregator]
  Aggr --> Build[Trie Builder]
  Build --> KV
  Build --> Cache
  Trending[Trending Stream] -.real-time.-> Cache

Key insights

  • Pre-compute top-K per prefix offline; online is read-only.
  • Trie need not be sent to client, server-side lookup with hot prefix in Redis is faster.
  • Misspelling tolerance via edit-distance variants pre-computed for top queries.
  • Personalization adds latency; cap at +50ms with per-user lookup.
  • Freshness via incremental stream merge, trending events promoted within minutes.