Module Review: Caching

[!IMPORTANT] As a Senior Staff Engineer, I expect you to master the following to pass this module:

  1. Quantify Latency: Discern the difference between an L1 hit (1ns) and a WAN trip (150ms).
  2. Defend the DB: Choose the right eviction and write strategy for high-throughput systems.
  3. Hardware Intuition: Explain how Pointer Overhead, Write Amplification, and Light-Speed constraints affect software design.

1. Cheat Sheet: The Numbers You Must Know

| Operation | Layer | Latency (Approx) | Analogy |
| --- | --- | --- | --- |
| L1/L2 Cache | CPU Internal | < 10ns | Heartbeat |
| RAM (Redis) | Memory Access | ~100ns | Brushing Teeth |
| SSD Read | Fast Disk | ~150,000ns (150µs) | Weekend Trip |
| Network (LAN) | Data Center | ~500,000ns (0.5ms) | 1 Week Vacation |
| Network (WAN) | Cross-Region | ~150,000,000ns (150ms) | 4 Years (University) |

2. Flashcards: Test Your Knowledge

What is the "Thundering Herd" problem?

Tap to reveal

Concurrency Spike

When a popular cache key expires, thousands of requests hit the database simultaneously to regenerate it. Solved by Mutex Locks or Refresh-Ahead.

LRU vs LFU?

Tap to reveal

Recency vs Frequency

LRU: Evicts least recently used (Good for Recency bias).
LFU: Evicts least frequently used (Good for long-term popularity, resistant to scans).
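The recency half of this trade-off fits in a few lines. Here is a minimal LRU sketch built on Python's `OrderedDict` (an illustration of the policy only — Redis actually uses an *approximated* LRU that samples random keys):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently *used* key on overflow."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # a read counts as a "use"
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used key
```

`move_to_end` is what makes a read count as a "use"; an LFU variant would instead keep a per-key frequency counter and evict the smallest one.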

Write-Through vs Write-Back?

Tap to reveal

Safety vs Speed

Write-Through: Writes to DB + Cache synchronously (Safe).
Write-Back: Writes to Cache, async to DB (Fast, risk of data loss).
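The two strategies can be contrasted in a toy sketch (plain dicts stand in for the real cache and database; class names are illustrative):

```python
class WriteThroughCache:
    """Synchronous: every write hits cache AND backing store before returning."""
    def __init__(self, db):
        self.db, self.cache = db, {}

    def put(self, key, value):
        self.cache[key] = value
        self.db[key] = value        # durable before we acknowledge the write

class WriteBackCache:
    """Asynchronous: writes land in the cache; a later flush persists them.
    Fast, but dirty entries are lost if the process dies before flush()."""
    def __init__(self, db):
        self.db, self.cache, self.dirty = db, {}, set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)         # deferred: the DB is not touched yet

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

The gap between `put` and `flush` in the write-back version is exactly the data-loss window the flashcard warns about.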

What is Consistent Hashing?

Tap to reveal

Distributed Scaling

A technique that maps keys to nodes arranged on a hash ring. When a node is added or removed, only about 1/N of the keys are remapped, minimizing cache misses.
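A hedged sketch of the ring (the MD5-based hash and the virtual-node count are illustrative choices, not what any particular client library uses):

```python
import bisect
import hashlib

def _hash(value):
    # Stable 64-bit hash so a key maps to the same ring position on every run.
    return int(hashlib.md5(value.encode()).hexdigest()[:16], 16)

class HashRing:
    """Consistent-hash ring with virtual nodes to smooth key distribution."""
    def __init__(self, nodes, vnodes=100):
        self.ring = sorted((_hash(f"{n}#{i}"), n)
                           for n in nodes for i in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def node_for(self, key):
        # First ring point clockwise of the key's hash (wraps at the top).
        idx = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

Adding a fourth node to a three-node ring remaps only the keys whose arcs the new node's virtual points now own — roughly a quarter of them — while every other key keeps its old node.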

Redis Persistence: RDB vs AOF?

Tap to reveal

Snapshot vs Log

RDB: Periodic snapshots (Compact, fast restart, data loss window).
AOF: Append-only log (Durable, large file, slower restart).

What is Cache Penetration?

Tap to reveal

Querying Non-Existent Data

Users query keys that don't exist in DB (e.g. id=-1). Bypasses cache and hits DB. Solved by Bloom Filters or caching Nulls.
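The null-caching fix fits in one function. A minimal cache-aside lookup with negative caching (the sentinel value and the `db.get` interface are illustrative; a production version would give the sentinel a short TTL rather than caching it forever):

```python
NULL = "<null>"  # sentinel meaning "confirmed absent in the DB"

def get_user(user_id, cache, db):
    """Cache-aside read that also caches misses ("negative caching")."""
    key = f"user:{user_id}"
    if key in cache:
        value = cache[key]
        return None if value == NULL else value
    value = db.get(user_id)              # the one real DB hit per key
    # Cache the absence too, so repeated probes for id=-1 never reach the DB.
    cache[key] = NULL if value is None else value
    return value
```

A Bloom filter is the complementary defense: it sits in front of the cache and rejects keys that were never written, at the cost of occasional false positives.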



2.5 War Story: The Thundering Herd at Ticketmaster

Imagine Taylor Swift concert tickets go on sale. Millions of users hit the homepage. The cache key concert:taylor_swift:details is extremely hot.

Suddenly, the cache TTL expires.

  1. The Thundering Herd: In the exact millisecond the cache misses, 50,000 requests bypass the cache and hit the database simultaneously to fetch the exact same concert details.
  2. The Meltdown: The database, unable to handle 50,000 complex joins simultaneously, spikes in CPU, runs out of connections, and crashes.
  3. The Fix (Mutex Lock):
    • Request 1 arrives, gets a cache miss, and acquires a Mutex Lock (e.g., Redis SETNX).
    • Request 1 goes to the database.
    • Requests 2-50,000 arrive, see the lock is held, and sleep for 50ms, then retry the cache.
    • Request 1 updates the cache and releases the lock.
    • Requests 2-50,000 wake up, hit the now-populated cache, and the database survives.
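Within a single process, the same choreography can be sketched with a `threading.Lock` standing in for the distributed `SETNX` lock (the 50ms "database query" is simulated with `time.sleep`; the key name follows the story above):

```python
import threading
import time

cache = {}
db_hits = []
rebuild_lock = threading.Lock()   # stand-in for Redis SETNX in one process

def fetch_concert(key):
    while True:
        if key in cache:                          # fast path: cache hit
            return cache[key]
        if rebuild_lock.acquire(blocking=False):  # "SETNX": one winner rebuilds
            try:
                if key not in cache:              # re-check under the lock
                    db_hits.append(key)           # the single expensive DB query
                    time.sleep(0.05)
                    cache[key] = {"artist": "Taylor Swift"}
                return cache[key]
            finally:
                rebuild_lock.release()
        time.sleep(0.01)                          # losers back off, retry cache
```

Run 50 concurrent callers against an empty cache and `db_hits` ends up with exactly one entry: the winner repopulates the key before releasing the lock, so every loser's retry lands on the fast path.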

3. Decision Matrix: Redis vs Memcached

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Types | Strings, Lists, Sets, Hashes, Bitmaps | Strings Only (Binary Blobs) |
| Architecture | Single-Threaded (Event Loop) | Multi-Threaded |
| Persistence | Yes (RDB + AOF) | No (Pure In-Memory) |
| Clustering | Native (Hash Slots) | Client-Side Only (Consistent Hashing) |
| Best For | Complex Apps, Queues, Leaderboards | Simple KV Caching, Scaling Vertically Across CPU Cores |

Key Takeaway: Protect the DB by using Bloom Filters for Cache Penetration and Mutex Locks for Thundering Herds.



4. Anatomy of a Cache Entry (LRU Example)

To understand Pointer Overhead, let’s dissect what a single Redis string key actually looks like in memory (simplified):

| Component | Size | Purpose |
| --- | --- | --- |
| Key String | Variable (e.g., 10 bytes) | The actual key (user:123). |
| Value String | Variable (e.g., 20 bytes) | The cached payload ({"name":"Alice"}). |
| Metadata (TTL) | 8 bytes | Expiration timestamp. |
| LRU Pointers (Prev/Next) | 16 bytes (8 bytes each) | Maintain the doubly linked list for the LRU eviction queue. |
| Hash Table Pointer | 8 bytes | Pointer from the main hash slot to this entry. |
| Total Overhead | ~32+ bytes | Extra memory consumed per entry! |

Key Insight: Caching 1 million tiny 5-byte values doesn’t take 5MB of RAM; due to Pointer Overhead and metadata, it might take 40MB+. This is why hardware intuition matters!
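You can watch the same effect in CPython (the overhead constants differ from Redis's, but the multiplier is the lesson; 100,000 entries instead of a million just keeps the demo quick):

```python
import sys

payload = b"12345"                   # a "tiny" 5-byte value
entries = {f"user:{i}".encode(): payload for i in range(100_000)}

# The bytes we *meant* to store: key bytes + value bytes.
raw = sum(len(k) + len(payload) for k in entries)

# What CPython actually allocates: hash-table slots plus a full object
# header per key (all values share one object here, so count it once).
actual = (sys.getsizeof(entries)
          + sum(sys.getsizeof(k) for k in entries)
          + sys.getsizeof(payload))

print(f"raw payload : {raw / 1e6:.1f} MB")
print(f"actual RAM  : {actual / 1e6:.1f} MB")
```

On a typical 64-bit CPython the actual figure comes out at several times the raw payload, for the same reason Redis does: per-entry headers and hash-table slots dominate when values are tiny.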

5. Hardware-First System Design Checklist

| Layer | Item | Why? |
| --- | --- | --- |
| CPU | Are we using arrays for spatial locality? | Hits L1/L2 cache lines (64 bytes). |
| RAM | Have we accounted for Pointer Overhead? | LRU metadata can increase RAM usage by 40%+. |
| Network | Is BGP Anycast routing users to the nearest PoP? | Minimizes the speed-of-light penalty. |
| Disk | Do we have Battery-Backed (BBU) RAID? | Safely use Write-Back for massive throughput. |
| Kernel | Is vm.dirty_ratio tuned correctly? | Optimizes Write Coalescing and reduces SSD wear. |

6. Staff Engineer Challenge

[!IMPORTANT] The Micro-Caching Paradox: Imagine you have a highly dynamic “Real-time Stock Price” service. You can’t cache the prices for 60 seconds because data staleness would be illegal. However, your database is catching fire under 100,000 requests per second.

Challenge: How would you use Micro-Caching (TTL < 1s) and Request Collapsing to save the hardware without violating business requirements?

Hint: Think about the “Thundering Herd” and how many requests arrive within the same 10ms window.


7. Glossary

Need to review terms? Check out the System Design Glossary.