Design a Distributed Cache (Memcached / mcrouter)
Consistent hashing, mcrouter routing, gutter pool, lease tokens. A signature infrastructure design question at Meta E6+ level.
Theory
Explanation
Intuition first, formal definition second. Skim the bullets if you already know this; read the prose if you don't.
A single Memcached node gives O(1) GET/SET, but a cluster of them has to cope with failure modes that Meta engineered around: hot keys, thundering herds, stale reads after backend writes, and regional failover. mcrouter is the proxy layer that hides these from clients.
Clients talk to mcrouter, typically deployed as a sidecar. mcrouter handles consistent hashing of keys to memcached nodes, a gutter pool as the fallback for failed nodes, lease tokens that stop a stale value from being set after a DB write, intra-pool replication for hot keys, and cross-region invalidation via an async event log. When a node fails, the gutter pool serves its requests with a short TTL while the primary node is repaired in the background.
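The consistent-hashing step can be sketched as a hash ring with virtual nodes. This is a generic illustration, not a reproduction of mcrouter's own hash function; the class name `HashRing`, the node names, and the `vnodes=100` default are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on a 64-bit ring (MD5 is used for spread, not security)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns many points so load spreads evenly around the ring.
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["mc1", "mc2", "mc3"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.remove("mc2")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
# Only keys that lived on the removed node are remapped.
assert all(before[k] == "mc2" for k in moved)
```

The property worth stating in an interview is the last assertion: removing a node remaps only that node's keys, whereas `hash(key) % N` would reshuffle most of the keyspace.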
When to use
Read-heavy workloads at scale (social, e-commerce, ads).
When not to
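When strong consistency is required (read from the DB instead). Sub-key access patterns, where one key holds structured data such as hashes or lists that are read or updated piecewise (use Redis).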
flowchart LR
    App[App Server] --> MC[mcrouter sidecar]
    MC -->|consistent hash| Pool[Memcached Pool]
    Pool --> Node1[(Node 1)]
    Pool --> Node2[(Node 2)]
    Pool --> NodeN[(Node N)]
    MC -.on failure.-> Gutter[Gutter Pool · short TTL]
    App --> DB[(Backing DB)]
    DB -.invalidate.-> MC
    Region1[Region 1] -.async invalidate.-> Region2[Region 2]
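The read path in the diagram (primary pool first, gutter pool on failure, DB on a miss) might look roughly like the toy model below. Everything here is an assumption for illustration: `Node`, `pick`, `read_through`, and the 10-second `GUTTER_TTL_SECONDS` are hypothetical names and values, and `pick` uses plain modulo hashing as a stand-in for consistent hashing.

```python
import time
import zlib

GUTTER_TTL_SECONDS = 10  # assumed value; the point is only that it is short


class Node:
    """In-memory stand-in for one memcached server."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self._data = {}  # key -> (value, expires_at or None)

    def get(self, key, now=None):
        if not self.up:
            raise ConnectionError(self.name)
        now = time.monotonic() if now is None else now
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and now >= expires_at:
            del self._data[key]  # lazily expire the entry
            return None
        return value

    def set(self, key, value, ttl=None, now=None):
        if not self.up:
            raise ConnectionError(self.name)
        now = time.monotonic() if now is None else now
        self._data[key] = (value, None if ttl is None else now + ttl)


def pick(pool, key):
    # Stand-in for consistent hashing: plain modulo over the pool.
    return pool[zlib.crc32(key.encode()) % len(pool)]


def read_through(key, primary_pool, gutter_pool, load_from_db):
    """Return (value, source), where source is 'primary', 'gutter', or 'db'."""
    tiers = ((primary_pool, None, "primary"),
             (gutter_pool, GUTTER_TTL_SECONDS, "gutter"))
    for pool, ttl, label in tiers:
        node = pick(pool, key)
        try:
            value = node.get(key)
        except ConnectionError:
            continue  # node is down: fall through to the next tier
        if value is not None:
            return value, label
        value = load_from_db(key)
        node.set(key, value, ttl=ttl)  # gutter fills carry the short TTL
        return value, "db"
    return load_from_db(key), "db"  # both tiers unreachable
```

With the primary node down, the first read for a key falls through to the DB and fills the gutter; repeat reads are then served from the gutter until the short TTL expires, which is what keeps the miss storm away from the DB.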
Key insights
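- Lease tokens prevent stale sets: on a miss, the cache hands the client a lease token, and only the holder of a still-valid token may set the value. An invalidation voids the token, so a fill computed from pre-write data is rejected. Clients that miss while a lease is outstanding wait and retry instead of hitting the DB.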
- The gutter pool absorbs traffic for failed nodes during a partial outage, so the resulting miss storm does not reach the DB.
- A hot key is replicated to N nodes within the pool; each read picks a random replica, which spreads the load.
- Cross-region invalidation travels over an async log, so regions are only eventually consistent, but reads from the local regional cache stay fast.
- mcrouter can double as a feature-flag layer: its routing rules can send specific keys to a different pool during migrations.
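The lease-token rule from the first insight can be sketched as a toy single-node cache. This is not the memcached wire protocol; `LeaseCache`, `lease_set`, and the integer tokens are hypothetical, and the sketch omits lease expiry and rate limiting.

```python
import itertools


class LeaseCache:
    """Toy cache with lease tokens, for illustration only."""

    def __init__(self):
        self._data = {}
        self._leases = {}  # key -> outstanding token
        self._next_token = itertools.count(1)

    def get(self, key):
        """Return (value, token).

        Hit: (value, None). First miss: (None, token), caller should fill.
        Miss while a lease is outstanding: (None, None), caller should retry.
        """
        if key in self._data:
            return self._data[key], None
        if key in self._leases:
            return None, None
        token = next(self._next_token)
        self._leases[key] = token
        return None, token

    def lease_set(self, key, value, token):
        """Store value only if token is still the valid lease for key."""
        if token is None or self._leases.get(key) != token:
            return False
        del self._leases[key]
        self._data[key] = value
        return True

    def delete(self, key):
        """Invalidate after a DB write: drop the value and void any lease."""
        self._data.pop(key, None)
        self._leases.pop(key, None)


cache = LeaseCache()
_, token_a = cache.get("user:1")  # client A misses and receives a lease
_, token_b = cache.get("user:1")  # client B misses, gets no token, backs off
assert token_b is None

# A reads the old row from the DB; meanwhile a writer updates the DB
# and invalidates the key, which voids A's lease.
cache.delete("user:1")
assert cache.lease_set("user:1", "old-row", token_a) is False  # stale set rejected

_, token_c = cache.get("user:1")  # the next miss gets a fresh lease
assert cache.lease_set("user:1", "new-row", token_c) is True
assert cache.get("user:1") == ("new-row", None)
```

The same mechanism addresses both problems named in the notes: the voided token blocks the stale set, and handing out one token per key means only one client goes to the DB on a miss.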