Case Study: The “YouTube View Counter”

Note: This case study follows the PEDALS framework and is optimized for interview preparation.

1. Process (Clarification)

Problem: Design a Distributed Counter service that tracks Video Views at massive scale.

Real-World Example: "Despacito" on YouTube has 8 Billion views. At its peak, it was receiving > 50,000 views/second.

The Challenge: A standard database row cannot handle 50,000 updates/second, because every increment contends for the same row-level lock.
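
To see why, picture the naive approach: every view runs the same single-row UPDATE, so all writers serialize behind one row lock. A minimal sketch, using sqlite3 purely as a stand-in for the SQL database (the videos table and record_view helper are hypothetical, not from the original design):

import sqlite3

conn = sqlite3.connect("views.db")
conn.execute("CREATE TABLE IF NOT EXISTS videos (id INTEGER PRIMARY KEY, views INTEGER NOT NULL DEFAULT 0)")
conn.execute("INSERT OR IGNORE INTO videos (id) VALUES (123)")

def record_view(video_id):
    # Every single view competes for the lock on this one row.
    conn.execute("UPDATE videos SET views = views + 1 WHERE id = ?", (video_id,))
    conn.commit()

record_view(123)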

2. Estimate (Capacity)

  • DAU: 1 Billion Users.
  • Average Views: 5 videos / user / day = 5 Billion views / day.
  • QPS (Average): 5 × 10^9 / 86,400 ≈ 58,000 writes/sec.
  • Peak QPS: Assume a 5× multiplier ≈ 300,000 writes/sec.
  • Storage:
    • Video ID (8 bytes) + Count (8 bytes) = 16 bytes per video.
    • 1 Billion Videos = 10^9 × 16 bytes ≈ 16 GB (Hot data fits in Memory).
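
These back-of-the-envelope numbers check out in a few lines of Python (the 5× peak multiplier and the 16-byte record size are the assumptions stated above):

dau = 1_000_000_000                 # daily active users
views_per_user = 5                  # average videos watched per user per day
daily_views = dau * views_per_user  # 5 billion views/day

avg_qps = daily_views / 86_400      # ≈ 57,870 writes/sec, i.e. ~58k
peak_qps = 5 * avg_qps              # ≈ 290k, rounded up to ~300k writes/sec

record_size = 8 + 8                 # 8-byte video ID + 8-byte counter
storage = 1_000_000_000 * record_size  # 16 GB of hot data

print(f"{avg_qps:,.0f} avg w/s, {peak_qps:,.0f} peak w/s, {storage / 10**9:.0f} GB")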

3. Design (High-Level)

The architecture splits the “Write Path” (High Throughput) from the “Read Path” (High Availability).

Architecture: Client -> Load Balancer -> API Service -> Redis Cluster (Sharded Counters) -> Flush Worker -> SQL DB. (INCR on the hot path; async read + persist on the slow path.)
  1. Client sends increment request to Load Balancer.
  2. API Service receives request.
  3. Fast Path: API Service increments a counter in Redis Cluster.
  4. Slow Path: A background worker ("the flusher") reads the counters from Redis every N seconds and persists them to the SQL Database (see the sketch below).
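
A minimal sketch of that flusher, assuming a redis-py client, sqlite3 as a stand-in for the SQL database, and the single video:<id>:views key shown in the deep dive (with sharded counters the worker would sum the shards before persisting). The video_views table name and 5-second interval are illustrative:

import time
import sqlite3
import redis

r = redis.Redis(decode_responses=True)
db = sqlite3.connect("views.db")
db.execute("CREATE TABLE IF NOT EXISTS video_views (video_id TEXT PRIMARY KEY, views INTEGER NOT NULL)")

FLUSH_INTERVAL = 5  # the "every N seconds" from step 4

def flush_once():
    # Read each live counter out of Redis and upsert the latest total into SQL.
    for key in r.scan_iter(match="video:*:views"):
        video_id = key.split(":")[1]
        count = int(r.get(key) or 0)
        db.execute(
            "INSERT INTO video_views (video_id, views) VALUES (?, ?) "
            "ON CONFLICT(video_id) DO UPDATE SET views = excluded.views",
            (video_id, count),
        )
    db.commit()

if __name__ == "__main__":
    while True:
        flush_once()
        time.sleep(FLUSH_INTERVAL)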

4. Articulate (Deep Dive)

The “Hot Key” Problem

The naive approach uses a simple Redis INCR:

INCR video:123:views

Problem: Redis is single-threaded. A single key lives on a single Redis node. That node can handle ~50k - 100k RPS. If “Despacito” gets 300k RPS, the single Redis node will melt (CPU 100%).

Solution: Counter Sharding

We split the single counter into N sub-counters.

  • Key: video:123:views -> Shards: video:123:views:0 … video:123:views:9
  • Write: Pick a random shard i and INCR.
  • Read: GET all shards and sum them up.
import random
import redis

r = redis.Redis(decode_responses=True)
N = 10  # number of sub-counters per video

# Write Path (O(1)): increment one randomly chosen shard.
def record_view(vid):
    shard_id = random.randint(0, N - 1)
    r.incr(f"video:{vid}:views:{shard_id}")

# Read Path (O(N)): read every shard and sum.
def get_views(vid):
    total = 0
    for i in range(N):
        total += int(r.get(f"video:{vid}:views:{i}") or 0)
    return total

Write-Behind (Buffering)

At 300k RPS, even a sharded Redis cluster gets expensive: every view still costs one network round trip. Strategy:

  1. Local Buffer: Each API Server keeps a local Map<VideoID, Count>.
  2. Batch: It accumulates writes for 5 seconds.
  3. Flush: Every 5 seconds, it sends one INCRBY command per video to Redis with the aggregated value (e.g., +50), as sketched below.
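
A minimal sketch of this write-behind path, assuming redis-py and the sharded keys from the previous section. The LocalCounterBuffer class, the lock, and the timer-based flush loop are illustrative choices, not part of the original design:

import threading
import random
from collections import defaultdict

import redis

r = redis.Redis(decode_responses=True)
N = 10               # sub-counters per video (from the sharding section)
FLUSH_INTERVAL = 5   # seconds

class LocalCounterBuffer:
    """Per-API-server buffer: absorb increments in memory, flush in batches."""

    def __init__(self):
        self._counts = defaultdict(int)
        self._lock = threading.Lock()

    def increment(self, vid):
        # O(1) in-process update; no network call per view.
        with self._lock:
            self._counts[vid] += 1

    def flush(self):
        # Swap out the buffer, then send one INCRBY per video (e.g. +50),
        # spread across the sharded counters.
        with self._lock:
            counts, self._counts = self._counts, defaultdict(int)
        for vid, delta in counts.items():
            shard_id = random.randint(0, N - 1)
            r.incrby(f"video:{vid}:views:{shard_id}", delta)

buffer = LocalCounterBuffer()

def flush_loop():
    buffer.flush()
    threading.Timer(FLUSH_INTERVAL, flush_loop).start()

flush_loop()

This is the "Speed vs Safety" trade-off from the summary: any counts still sitting in the buffer when a server crashes are lost, so the flush interval bounds how many views can be dropped.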

5. List (Trade-offs)

Compare the three strategies, Global Lock vs Sharded Counters vs Write-Behind, against system health under load (error rate and latency): the single DB row lock saturates first, sharded counters scale writes at the cost of O(N) reads, and write-behind absorbs the highest throughput but can lose buffered counts on a crash.


6. Scale (Summary)

CORE ARCHITECTURE
  • Client -> LB -> API
  • API -> Redis (Sharded)
  • Worker -> SQL (Async)
KEY TRADE-OFFS
  • **Write-Behind**: Speed vs Safety.
  • **Sharding**: Scale vs Complexity.
  • **Redis**: Latency vs Cost.