Design a Real-Time Gaming Leaderboard

1. What is a Real-Time Leaderboard?

In competitive gaming (Fortnite, League of Legends), players crave instant feedback. When you win a match, you want to see your rank jump from #50 to #45 immediately.

This is a Write-Heavy and Read-Heavy problem. Sorting 10 million rows in a SQL database on every update (ORDER BY score DESC) is O(N log N) and will crush the CPU. We need a data structure that keeps elements sorted as they are inserted.

[!TIP] Use Cases:

  • Gaming: Global Rank, Clan Rank.
  • Social: “Most Active Users”.
  • Sales: “Top Performers”.

2. Requirements & Goals

2.1 Functional Requirements

  1. Update Score: Player completes a level -> Score increases.
  2. Get Rank: “What is my global rank?”
  3. Get Top K: “Who are the top 10 players?”
  4. Get Neighbors: “Who is just above and below me?” (Relative Rank).

2.2 Non-Functional Requirements

  1. Low Latency: Rank updates must be visible < 20ms.
  2. High Throughput: Handle 50,000 score updates/sec during tournaments.
  3. Real-time Consistency: No caching delay.
  4. Accuracy: Ranks must be exact for the Top 1000. Approximations are okay for rank #5,000,000.

3. Capacity Estimation

3.1 Traffic Analysis

  • Total Users: 50 Million.
  • DAU: 10 Million.
  • Writes: 10 matches/user/day -> 100 Million updates/day.
    • QPS: 100M / 86400 &approx; 1,150 updates/sec (Avg).
    • Peak: 50x (Tournament End) -> 50,000 updates/sec.

3.2 Memory Estimation

Redis is the standard choice here.

  • Entry Size: UserID (8 bytes) + Score (8 bytes) + Overhead (pointers) &approx; 100 bytes.
  • Total RAM: 50M Users * 100 bytes = 5 GB.
  • Conclusion: Fits easily in a single Redis instance (typically 32GB+). The bottleneck is CPU, not RAM.

4. System APIs

REST API for client interactions.

Method Endpoint Description
POST /v1/scores Updates a user’s score. Payload: { userId: "u1", score: 2500 }
GET /v1/leaderboard/top Returns top 10. Response: [{user: "u1", rank: 1}, ...]
GET /v1/leaderboard/me Returns current user’s rank and neighbors.

5. Database Design

We use the Write-Aside Cache pattern, but with a twist: Redis is the primary source for Ranking, while SQL is the primary source for Durable History.

5.1 Redis (The Rank Engine)

  • Data Structure: Sorted Set (ZSET).
  • Key: leaderboard:global
  • Value: UserId
  • Score: Points

5.2 SQL (The History)

  • Table: user_scores
  • Columns: user_id, score, updated_at.

6. High-Level Architecture

System Architecture: Real-Time Leaderboard
Redis ZSET (In-Memory Ranking) | SQL Source of Truth | Write-Aside Cache
Update Path (ZADD)
Query Path (Rank/Top-K)
Durable Persistence
Score Service
Node 1
Node 2
Redis: Leaderboard
#1 Player_A 2,500
#2 Player_B 2,450
#3 Player_C 2,100
...
Sorted Set (ZSET)
PostgreSQL
Durable User Scores
POST /score ZADD (O(log N)) Async Persistence ZREVRANK / ZRANGE GET /rank

7. Component Design (Deep Dive)

7.1 Redis Sorted Sets & Skiplists

Why Redis ZSET?

  • Hash Map: O(1) lookups, but unordered. Cannot get “Top 10”.
  • B-Tree: Good for ordering, but complex to implement lock-free in memory.
  • Skiplist: Used by Redis. A probabilistic data structure that allows O(log N) search, insert, and delete. It works by having multiple “levels” of linked lists, allowing us to skip over many elements at once.

Python Implementation

import redis

r = redis.Redis(host='localhost', port=6379, db=0)

def update_score(user_id, score):
    # ZADD leaderboard 1500 "PlayerA"
    r.zadd('leaderboard', {user_id: score})

def get_rank(user_id):
    # ZREVRANK leaderboard "PlayerA" (0-based index)
    rank = r.zrevrank('leaderboard', user_id)
    return rank + 1 if rank is not None else None

def get_top_k(k=10):
    # ZRANGE leaderboard 0 9 WITHSCORES
    return r.zrevrange('leaderboard', 0, k-1, withscores=True)

8. Data Partitioning & Sharding

If we have 500 Million users, a single Redis node might hit CPU limits (Redis is single-threaded).

8.1 Strategy A: Sharding by Score Range (Partitioning)

  • Shard 1: Score 0 - 1000
  • Shard 2: Score 1001 - 2000
  • Pros: Distributes load.
  • Cons: Hotspots (everyone is in Shard 1 initially). Moving users between shards is hard.

8.2 Strategy B: Bucket Ranking (Approximation)

For truly global scale (1B users), we don’t need exact ranks for everyone.

  • Top 1000: Precise Redis ZSET.
  • The Rest: Store in “Buckets” (Histogram) in SQL/Redis.
    • Bucket 100-200: 50k users.
    • Bucket 200-300: 20k users.
  • My Rank = Count(All Higher Buckets) + RankInMyBucket.

9. Reliability, Caching, & Load Balancing

9.1 Handling Ties

In games, 1000 players might have a score of 500. Who is #1?

  • Standard: Lexicographical (Alphabetical). Not fair.
  • Time-Based: The player who reached the score first should be higher.
  • Formula: Score = Points + (1 / Timestamp).
    • Actually, since higher is better, we want Score = Points + (MAX_TIME - Timestamp) / MAX_TIME.
    • This appends a decimal fraction representing time.

10. Interactive Decision Visualizer: Redis ZSET Simulator

Add players and watch the Sorted Set automatically re-order them in O(log N) time.

Redis ZSET Live Demo

Simulate `ZADD` and `ZREVRANGE`

RANK / PLAYER SCORE

11. System Walkthrough

Let’s trace the flow of updating a score and fetching a rank.

Scenario: Updating Score

  1. User Action: Player “Faker” wins a match. Score +50.
  2. API Call: POST /scores
    { "userId": "u_faker", "score": 2550 }
    
  3. Redis Write:
    • Command: ZADD leaderboard:global 2550 "u_faker"
    • Complexity: O(log N).
    • Internals: Redis updates the Skiplist, moving the node for “u_faker” to its new sorted position.
  4. Persistence: API asynchronously updates the SQL DB.

Scenario: Fetching Top 10

  1. Client: GET /leaderboard/top.
  2. Redis Read:
    • Command: ZREVRANGE leaderboard:global 0 9 WITHSCORES
    • Complexity: O(log N + M).
    • Response: ["u_faker", 2550, "u_s1mple", 2450, ...]
  3. Hydration: API fetches user metadata (Avatar, Name) for these IDs from Memcached/SQL.
  4. Response: Returns rich JSON to client.

12. Low-Level Optimizations

12.1 Skiplist Memory Overhead

Redis ZSETs use a Skiplist + Hash Map.

  • Overhead: Each element has multiple pointers (forward/backward) for the Skiplist levels. This adds ~80 bytes per element overhead.
  • Impact: 100M users = 10GB data + 8GB overhead.
  • Optimization: For small leaderboards (e.g., Clan Rank < 100 members), Redis automatically uses a Ziplist (compressed array) instead of a Skiplist to save memory. Configure zset-max-ziplist-entries 128.

13. Interview Gauntlet

Q1: How do you handle regional leaderboards?

  • Answer: Create separate ZSET keys: leaderboard:na, leaderboard:eu. When updating score, write to both leaderboard:global and leaderboard:na.

Q2: What if the score is floating point?

  • Answer: Redis ZSET scores are doubles. However, precision issues can occur. It’s safer to store as Integers (multiply by 100) or use lexicographical tie-breaking manually.

Q3: Can we use a Relational DB for Top 10?

  • Answer: Yes, if we use an index on (score DESC). SELECT * FROM scores ORDER BY score DESC LIMIT 10 is fast (O(log N)). The problem is Rank Retrieval for a specific user (SELECT count(*) FROM scores WHERE score > my_score), which is O(N) and slow.

Q4: How do you archive monthly leaderboards?

  • Answer: Use the RENAME command to atomically move leaderboard:current to leaderboard:jan_2024 at midnight. Start a fresh ZSET.

Q5: What is the time complexity of ZCOUNT?

  • Answer: O(log N). It traverses the Skiplist to find the start and end scores, then calculates the rank difference.

14. Summary: The Whiteboard Strategy

1. Requirements

  • Func: Update Score, Get Rank.
  • Non-Func: Real-time (<20ms).

2. Architecture

[Client] -> [API] -> [Redis ZSET]
|
(Async) -> [SQL DB]

* Redis: Primary for Rank.
* SQL: Primary for History.

3. Data & API

ZADD leaderboard 500 "User1"
ZREVRANK leaderboard "User1"

4. Deep Dives

  • Skiplists: The magic behind O(log N).
  • Ties: Use timestamp decimal.