Case Study: The YouTube View Counter

In December 2014, “Gangnam Style” by Psy outgrew YouTube’s view counter. The video was closing in on 2,147,483,647 views, the maximum value of a 32-bit signed integer, and engineers had to hotfix the system while it was live, migrating the counter to a 64-bit integer without any downtime before it could overflow and wrap to a negative number. The problem wasn’t storage; it was architecture: a single database row with a view_count column that every view tried to UPDATE ... SET view_count = view_count + 1. At 50,000 views/second, every API call fights for the same row lock. This is the “Hot Key” problem, and the solution isn’t a bigger row lock. It is rethinking the entire counter architecture.

Important: In this lesson, you will master (PEDALS framework):

  1. The Hot Key Problem: Why a single Redis key or DB row can become the bottleneck for your entire system.
  2. CRDT Mechanisms: Using G-Counters and PN-Counters to achieve eventual consistency without central locks.
  3. Counter Sharding: Splitting one counter into N sub-counters to distribute writes across independent resources.
  4. Write-Behind Caching: Absorbing 300k RPS with local buffers and flushing to Redis every 5 seconds.

Note: This case study follows the PEDALS framework and is optimized for interview preparation.

1. Process (Clarification)

Problem: Design a Distributed Counter service that tracks Video Views at massive scale.
Real-World Example: “Despacito” on YouTube has 8 Billion views. During its peak, it was receiving > 50,000 views/second.
The Challenge: A standard database row cannot handle 50k updates/second due to lock contention.

2. Estimate (Capacity)

  • DAU: 1 Billion Users.
  • Average Views: 5 videos / user / day = 5 Billion views / day.
  • QPS (Average): 5 × 10⁹ / 86,400 ≈ 58,000 writes/sec.
  • Peak QPS: Assume 5x multiplier = 300,000 writes/sec.
  • Storage:
  • Video ID (8 bytes) + Count (8 bytes) = 16 bytes per video.
  • 1 Billion Videos = 10⁹ × 16 bytes ≈ 16 GB (Hot data fits in Memory).
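A quick back-of-envelope check in Python, using only the figures above (handy to rehearse for the interview):

daily_views = 1_000_000_000 * 5        # 1B DAU × 5 views/user/day
avg_qps = daily_views / 86_400         # seconds per day → ≈ 57,870 ≈ 58k
peak_qps = avg_qps * 5                 # 5x multiplier → ≈ 289k, call it 300k
storage_bytes = 1_000_000_000 * 16     # 1B videos × 16 bytes each
print(f"{avg_qps:,.0f} avg w/s, {peak_qps:,.0f} peak w/s, {storage_bytes / 1e9:.0f} GB")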

3. Design (High-Level)

The architecture splits the “Write Path” (High Throughput) from the “Read Path” (High Availability).

Client → Load Balancer → API Service → Redis Cluster (Sharded Counters)   [INCR, hot path]
Redis Cluster (Sharded Counters) → Flush Worker → SQL DB                  [async read, persist]
  1. Client sends increment request to Load Balancer.
  2. API Service receives request.
  3. Fast Path: API Service increments a counter in Redis Cluster.
  4. Slow Path: A background worker (the “Flusher”) reads from Redis and updates the SQL Database every N seconds (see the sketch below).
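A minimal sketch of the Flusher under a few assumptions: a redis-py style client with decoded responses, a DB-API cursor db, a Postgres-style upsert, a hypothetical video_stats table, and a dirty:videos set that the write path populates with each video ID it touches:

import time

FLUSH_INTERVAL = 5  # the "every N seconds" from step 4
N_SHARDS = 10       # must match the write path's shard count (see Deep Dive)

def flush_worker(redis, db):
    while True:
        time.sleep(FLUSH_INTERVAL)
        # Drain the set of recently-updated videos, one ID at a time
        while (vid := redis.spop("dirty:videos")) is not None:
            # Sum this video's sharded sub-counters
            total = sum(
                int(redis.get(f"video:{vid}:views:{i}") or 0)
                for i in range(N_SHARDS)
            )
            # Hypothetical schema: video_stats(video_id PRIMARY KEY, view_count)
            db.execute(
                "INSERT INTO video_stats (video_id, view_count) VALUES (%s, %s) "
                "ON CONFLICT (video_id) DO UPDATE SET view_count = EXCLUDED.view_count",
                (vid, total),
            )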

4. Articulate (Deep Dive)

The “Hot Key” Problem

The naive approach uses a simple Redis INCR:

INCR video:123:views

Problem: Redis executes commands on a single thread, and a single key lives on exactly one node of the cluster. One node can handle roughly 50k–100k RPS. If “Despacito” alone receives 300k RPS, that one node melts (CPU at 100%) while the rest of the cluster sits idle.

The “Elite” Pattern: CRDT counters

For multi-region deployments, sharding isn’t enough. You need CRDTs (Conflict-free Replicated Data Types).

  • G-Counter (Grow-only): Each node maintains its own counter. To merge, you take the MAX(local, remote) for each node’s entry and sum them. This handles concurrent updates across the globe without a central lock.
  • PN-Counter (Positive-Negative): Similar to G-Counter but allows decrements by using two internal G-Counters (increments and decrements).
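A minimal G-Counter sketch in Python to make the merge rule concrete (an illustration, not any particular CRDT library):

class GCounter:
    """Grow-only counter: one slot per node; merge = element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node_id -> that node's local count

    def increment(self, n=1):
        # A node only ever bumps its own slot, so no coordination is needed
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other):
        # Commutative, associative, idempotent: replicas converge in any order
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        # The total is the sum of every node's slot
        return sum(self.counts.values())

us, eu = GCounter("us-east"), GCounter("eu-west")
us.increment(); eu.increment(); eu.increment()
us.merge(eu)  # e.g., via gossip between regions
assert us.value() == 3

A PN-Counter is then just two of these: value = increments.value() - decrements.value().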

Solution: Counter Sharding

We split the single counter into N sub-counters.

  • Key: video:123:views → Shards: video:123:views:0 … video:123:views:9
  • Write: Pick a random shard i and INCR.
  • Read: GET all shards and sum them up.
import random

N = 10  # number of sub-counters per video

def increment_views(redis, vid):
    # Write Path (O(1)): pick a random shard so writes spread across N keys
    shard_id = random.randint(0, N - 1)
    redis.incr(f"video:{vid}:views:{shard_id}")

def read_views(redis, vid):
    # Read Path (O(N)): fetch every shard and sum
    total = 0
    for i in range(N):
        total += int(redis.get(f"video:{vid}:views:{i}") or 0)
    return total
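On a viral video the O(N) read itself gets hot. As the closing mnemonic suggests, you can cache the sum in Redis with a short TTL; a sketch building on read_views above (the cache key name is illustrative):

CACHE_TTL = 60  # seconds of acceptable staleness for a view count

def read_views_cached(redis, vid):
    cache_key = f"video:{vid}:views:total"
    cached = redis.get(cache_key)
    if cached is not None:
        return int(cached)
    total = read_views(redis, vid)             # O(N) shard sum, at most ~once/min
    redis.set(cache_key, total, ex=CACHE_TTL)  # serve this sum for the next 60s
    return total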

Write-Behind (Buffering)

For 300k RPS, even a sharded Redis cluster gets expensive: every single view still costs a network round-trip. Strategy:

  1. Local Buffer: Each API Server keeps a local Map<VideoID, Count>.
  2. Batch: It accumulates writes for 5 seconds.
  3. Flush: Every 5 seconds, it sends one INCRBY command to Redis with the aggregated value (e.g., +50), as sketched below.
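A minimal per-server sketch, assuming a redis-py style client (class and method names are illustrative):

import threading
import time
from collections import defaultdict

FLUSH_INTERVAL = 5  # seconds

class WriteBehindCounter:
    def __init__(self, redis):
        self.redis = redis
        self.buffer = defaultdict(int)  # the local Map<VideoID, Count>
        self.lock = threading.Lock()

    def increment(self, vid):
        # Hot path: in-memory bump, zero network round-trips
        with self.lock:
            self.buffer[vid] += 1

    def flush_loop(self):
        # Background thread: drain the buffer every FLUSH_INTERVAL seconds
        while True:
            time.sleep(FLUSH_INTERVAL)
            with self.lock:
                pending, self.buffer = self.buffer, defaultdict(int)
            for vid, delta in pending.items():
                # One INCRBY (e.g., +50) replaces thousands of INCRs;
                # target a random shard here to combine both techniques
                self.redis.incrby(f"video:{vid}:views", delta)

counter = WriteBehindCounter(redis)  # assumes a connected client named redis
threading.Thread(target=counter.flush_loop, daemon=True).start()

The Speed vs Safety trade-off from the summary is visible here: if the server crashes, up to 5 seconds of buffered counts are lost.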

5. List (Trade-offs) & Interactive Decision Visualizer

Compare Global Lock vs Sharded Counters vs Write-Behind. Monitor the System Health (Error Rate & Latency).

[Interactive visualizer: choose a strategy (DB Row Lock, Sharded Counters, Write-Behind) and a traffic level; live readouts show processed requests, elapsed time, RPS, and errors.]

6. Scale (Summary)

CORE ARCHITECTURE

  • Client → LB → API
  • API → Redis (Sharded)
  • Worker → SQL (Async)

KEY TRADE-OFFS

  • Write-Behind: Speed vs Safety.
  • Sharding: Scale vs Complexity.
  • Redis: Latency vs Cost.

Staff Engineer Tip: The Atomic CPU Trick. On a single physical machine, for ultra-high-performance counters (like a Load Balancer’s internal metrics), don’t use locks at all. Use CPU atomic instructions (like LOCK XADD on x86). The increment happens directly in the CPU’s L1/L2 cache line, with no mutex and no trip through the kernel scheduler. At 50M ops/sec, the “Lock Contention” isn’t in your software; it’s in the Cache Coherency Protocol (MESI) on the CPU bus.

Mnemonic: “Count = Shard + Buffer + Flush.” Hot Key? Shard the counter (N random sub-keys). Write rate still too high? Buffer locally (Write-Behind, 5s flush). Too many shards to sum? Cache the sum in Redis with a 60s TTL (see the cached-read sketch above). Multi-region? Use G-Counter CRDTs to avoid global locks. Scalability ladder: Single Row → Redis INCR → Sharded Counter → Write-Behind Buffer → Distributed CRDT.