Module Review: Caching

[!IMPORTANT] As a Senior Staff Engineer, I expect you to master the following to pass this module:

  1. Quantify Latency: Discern the difference between an L1 hit (1ns) and a WAN trip (150ms).
  2. Defend the DB: Choose the right eviction and write strategy for high-throughput systems.
  3. Hardware Intuition: Explain how Pointer Overhead, Write Amplification, and Light-Speed constraints affect software design.

1. Cheat Sheet: The Numbers You Must Know

| Operation | Layer | Latency (Approx) | Analogy |
| --- | --- | --- | --- |
| L1/L2 Cache | CPU Internal | < 10ns | Heartbeat |
| RAM (Redis) | Memory Access | ~100ns | Brushing Teeth |
| SSD Read | Fast Disk | ~150,000ns (150µs) | Weekend Trip |
| Network (LAN) | Data Center | ~500,000ns (0.5ms) | 1 Week Vacation |
| Network (WAN) | Cross-Region | ~150,000,000ns (150ms) | 4 Years (University) |

2. Flashcards: Test Your Knowledge

What is the "Thundering Herd" problem?

Tap to reveal

Concurrency Spike

When a popular cache key expires, thousands of requests hit the database simultaneously to regenerate it. Solved by Mutex Locks or Refresh-Ahead.

LRU vs LFU?

Tap to reveal

Recency vs Frequency

LRU: Evicts least recently used (Good for Recency bias).
LFU: Evicts least frequently used (Good for long-term popularity, resistant to scans).
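The recency half of this trade-off fits in a few lines. Here is a minimal LRU sketch built on Python's `OrderedDict` (an illustration of the policy only — Redis actually uses an *approximated* LRU that samples random keys):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently *used* key on overflow."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # a read counts as a "use"
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used key
```

`move_to_end` is what makes a read count as a "use"; an LFU variant would instead keep a per-key frequency counter and evict the smallest one.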

Write-Through vs Write-Back?

Tap to reveal

Safety vs Speed

Write-Through: Writes to DB + Cache synchronously (Safe).
Write-Back: Writes to Cache, async to DB (Fast, risk of data loss).
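The two strategies can be contrasted in a toy sketch (plain dicts stand in for the real cache and database; class names are illustrative):

```python
class WriteThroughCache:
    """Synchronous: every write hits cache AND backing store before returning."""
    def __init__(self, db):
        self.db, self.cache = db, {}

    def put(self, key, value):
        self.cache[key] = value
        self.db[key] = value        # durable before we acknowledge the write

class WriteBackCache:
    """Asynchronous: writes land in the cache; a later flush persists them.
    Fast, but dirty entries are lost if the process dies before flush()."""
    def __init__(self, db):
        self.db, self.cache, self.dirty = db, {}, set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)         # deferred: the DB is not touched yet

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

The gap between `put` and `flush` in the write-back version is exactly the data-loss window the flashcard warns about.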

What is Consistent Hashing?

Tap to reveal

Distributed Scaling

A technique that maps keys to nodes arranged on a hash ring. When a node is added or removed, only about 1/N of the keys are remapped, minimizing cache misses.
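A hedged sketch of the ring (the MD5-based hash and the virtual-node count are illustrative choices, not what any particular client library uses):

```python
import bisect
import hashlib

def _hash(value):
    # Stable 64-bit hash so a key maps to the same ring position on every run.
    return int(hashlib.md5(value.encode()).hexdigest()[:16], 16)

class HashRing:
    """Consistent-hash ring with virtual nodes to smooth key distribution."""
    def __init__(self, nodes, vnodes=100):
        self.ring = sorted((_hash(f"{n}#{i}"), n)
                           for n in nodes for i in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def node_for(self, key):
        # First ring point clockwise of the key's hash (wraps at the top).
        idx = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

Adding a fourth node to a three-node ring remaps only the keys whose arcs the new node's virtual points now own — roughly a quarter of them — while every other key keeps its old node.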

Redis Persistence: RDB vs AOF?

Tap to reveal

Snapshot vs Log

RDB: Periodic snapshots (Compact, fast restart, data loss window).
AOF: Append-only log (Durable, large file, slower restart).

What is Cache Penetration?

Tap to reveal

Querying Non-Existent Data

Users query keys that don't exist in DB (e.g. id=-1). Bypasses cache and hits DB. Solved by Bloom Filters or caching Nulls.
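The null-caching fix fits in one function. A minimal cache-aside lookup with negative caching (the sentinel value and the `db.get` interface are illustrative; a production version would give the sentinel a short TTL rather than caching it forever):

```python
NULL = "<null>"  # sentinel meaning "confirmed absent in the DB"

def get_user(user_id, cache, db):
    """Cache-aside read that also caches misses ("negative caching")."""
    key = f"user:{user_id}"
    if key in cache:
        value = cache[key]
        return None if value == NULL else value
    value = db.get(user_id)              # the one real DB hit per key
    # Cache the absence too, so repeated probes for id=-1 never reach the DB.
    cache[key] = NULL if value is None else value
    return value
```

A Bloom filter is the complementary defense: it sits in front of the cache and rejects keys that were never written, at the cost of occasional false positives.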



2.5 War Story: The Thundering Herd at Ticketmaster

Imagine Taylor Swift concert tickets go on sale. Millions of users hit the homepage. The cache key concert:taylor_swift:details is extremely hot.

Suddenly, the cache TTL expires.

  1. The Thundering Herd: In the exact millisecond the cache misses, 50,000 requests bypass the cache and hit the database simultaneously to fetch the exact same concert details.
  2. The Meltdown: The database, unable to handle 50,000 complex joins simultaneously, spikes in CPU, runs out of connections, and crashes.
  3. The Fix (Mutex Lock):
    • Request 1 arrives, gets a cache miss, and acquires a Mutex Lock (e.g., Redis SETNX).
    • Request 1 goes to the database.
    • Requests 2-50,000 arrive, see the lock is held, and sleep for 50ms, then retry the cache.
    • Request 1 updates the cache and releases the lock.
    • Requests 2-50,000 wake up, hit the now-populated cache, and the database survives.
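Within a single process, the same choreography can be sketched with a `threading.Lock` standing in for the distributed `SETNX` lock (the 50ms "database query" is simulated with `time.sleep`; the key name follows the story above):

```python
import threading
import time

cache = {}
db_hits = []
rebuild_lock = threading.Lock()   # stand-in for Redis SETNX in one process

def fetch_concert(key):
    while True:
        if key in cache:                          # fast path: cache hit
            return cache[key]
        if rebuild_lock.acquire(blocking=False):  # "SETNX": one winner rebuilds
            try:
                if key not in cache:              # re-check under the lock
                    db_hits.append(key)           # the single expensive DB query
                    time.sleep(0.05)
                    cache[key] = {"artist": "Taylor Swift"}
                return cache[key]
            finally:
                rebuild_lock.release()
        time.sleep(0.01)                          # losers back off, retry cache
```

Run 50 concurrent callers against an empty cache and `db_hits` ends up with exactly one entry: the winner repopulates the key before releasing the lock, so every loser's retry lands on the fast path.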

3. Decision Matrix: Redis vs Memcached

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Types | Strings, Lists, Sets, Hashes, Bitmaps | Strings Only (Binary Blobs) |
| Architecture | Single-Threaded (Event Loop) | Multi-Threaded |
| Persistence | Yes (RDB + AOF) | No (Pure In-Memory) |
| Clustering | Native (Hash Slots) | Client-Side Only (Consistent Hashing) |
| Best For | Complex Apps, Queues, Leaderboards | Simple KV Caching, Scaling Vertically Across CPU Cores |

Key Takeaway: Protect the DB by using Bloom Filters for Cache Penetration and Mutex Locks for Thundering Herds.



4. Anatomy of a Cache Entry (LRU Example)

To understand Pointer Overhead, let’s dissect what a single Redis string key actually looks like in memory (simplified):

| Component | Size | Purpose |
| --- | --- | --- |
| Key String | Variable (e.g., 10 bytes) | The actual key (user:123). |
| Value String | Variable (e.g., 20 bytes) | The cached payload ({"name":"Alice"}). |
| Metadata (TTL) | 8 bytes | Expiration timestamp. |
| LRU Pointers (Prev/Next) | 16 bytes (8 bytes each) | Maintain the doubly linked list for the LRU eviction queue. |
| Hash Table Pointer | 8 bytes | Pointer from the main hash slot to this entry. |
| Total Overhead | ~32+ bytes | Extra memory consumed per entry! |

Key Insight: Caching 1 million tiny 5-byte values doesn’t take 5MB of RAM; due to Pointer Overhead and metadata, it might take 40MB+. This is why hardware intuition matters!
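You can watch the same effect in CPython (the overhead constants differ from Redis's, but the multiplier is the lesson; 100,000 entries instead of a million just keeps the demo quick):

```python
import sys

payload = b"12345"                   # a "tiny" 5-byte value
entries = {f"user:{i}".encode(): payload for i in range(100_000)}

# The bytes we *meant* to store: key bytes + value bytes.
raw = sum(len(k) + len(payload) for k in entries)

# What CPython actually allocates: hash-table slots plus a full object
# header per key (all values share one object here, so count it once).
actual = (sys.getsizeof(entries)
          + sum(sys.getsizeof(k) for k in entries)
          + sys.getsizeof(payload))

print(f"raw payload : {raw / 1e6:.1f} MB")
print(f"actual RAM  : {actual / 1e6:.1f} MB")
```

On a typical 64-bit CPython the actual figure comes out at several times the raw payload, for the same reason Redis does: per-entry headers and hash-table slots dominate when values are tiny.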

5. Hardware-First System Design Checklist

| Layer | Item | Why? |
| --- | --- | --- |
| CPU | Are we using arrays for spatial locality? | Hits L1/L2 cache lines (64 bytes). |
| RAM | Have we accounted for Pointer Overhead? | LRU metadata can increase RAM usage by 40%+. |
| Network | Is BGP Anycast routing users to the nearest PoP? | Minimizes the speed-of-light penalty. |
| Disk | Do we have Battery-Backed (BBU) RAID? | Safely use Write-Back for massive throughput. |
| Kernel | Is vm.dirty_ratio tuned correctly? | Optimizes Write Coalescing and reduces SSD wear. |

6. Staff Engineer Challenge

[!IMPORTANT] The Micro-Caching Paradox: Imagine you have a highly dynamic “Real-time Stock Price” service. You can’t cache the prices for 60 seconds because data staleness would be illegal. However, your database is catching fire under 100,000 requests per second.

Challenge: How would you use Micro-Caching (TTL < 1s) and Request Collapsing to save the hardware without violating business requirements?

Hint: Think about the “Thundering Herd” and how many requests arrive within the same 10ms window.


7. Glossary

Need to review terms? Check out the System Design Glossary.