Design Facebook Messenger (Chat)

[!IMPORTANT] In this lesson, you will master:

  1. System Architecture: Designing a real-time message routing system.
  2. Connection Protocols: Understanding WebSockets vs. Long Polling.
  3. Data Model: Handling high-throughput message storage using wide-column stores.
  4. Reliability: Message synchronization and End-to-End Encryption (E2EE).

1. What is a Chat System?

A chat system allows users to send text, images, and videos to each other in real-time. It supports 1-on-1 conversations and Group chats.

Real-World Examples

  • WhatsApp / Signal: Store messages on device (mostly). End-to-end encrypted. Architecture is “Store-and-Forward”.
  • FB Messenger / Slack: Store messages on server. Multi-device sync is critical. We will focus on this Cloud-Based architecture.

Try it yourself

[!TIP] Try it yourself: Open WhatsApp Web and your phone. Send a message from your phone. It appears on the Web instantly. Turn off your phone’s internet. The Web version says “Phone not connected” (for WhatsApp) or continues working (for Messenger). Why?


2. Process Requirements

Functional Requirements

  • 1-on-1 Chat: Low latency delivery.
  • Group Chat: Up to 500 members.
  • Presence: “Online”, “Last Seen”, “Typing…”.
  • Receipts: Sent, Delivered, Read.
  • Multi-Device: Sync messages across phone and desktop.

Non-Functional Requirements

  • Low Latency: Real-time experience is key (< 100ms).
  • Consistency: Messages must appear in order (FIFO).
  • Availability: High.
  • Security:
      • TLS: Encryption in transit.
      • E2EE (End-to-End Encryption): Using the Signal Protocol (Double Ratchet Algorithm). Keys are stored on user devices, not servers. The server only sees encrypted blobs.

3. Estimate

  • DAU: 2 Billion.
  • Messages: 100 Billion per day.
  • Storage:
      • Avg message size = 100 bytes.
      • 100 billion msgs/day × 100 bytes = 10 TB/day.
      • 5 years ≈ 18 PB.
      • We need a massive, write-heavy database.
  • Bandwidth: 10TB / 86400 seconds ≈ 115 MB/s on average. Peak traffic might be 2x or 3x this, reaching ~345 MB/s.
  • Hardware Reality (Concurrent Connections): This is the real bottleneck of real-time systems. If 10% of our 2B DAU are online simultaneously, that is 200 Million concurrent WebSockets. A heavily optimized Linux server (with tuned ulimit and ephemeral ports) can hold around 500k connections. We will need at least 200M / 500k = 400 massive, memory-heavy connection servers just to maintain these open sockets!
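The envelope math above fits in a few lines. This sketch just reproduces the arithmetic; the 10% concurrency ratio and the 500k-sockets-per-server figure are the assumptions stated in the bullets, not measured numbers:

```python
# Back-of-envelope estimates for the chat system (assumptions from the text).
DAU = 2_000_000_000                  # 2 billion daily active users
MSGS_PER_DAY = 100_000_000_000       # 100 billion messages/day
AVG_MSG_BYTES = 100

storage_per_day = MSGS_PER_DAY * AVG_MSG_BYTES      # bytes/day -> 10 TB
storage_5y_pb = storage_per_day * 365 * 5 / 1e15    # petabytes over 5 years

bandwidth_mb_s = storage_per_day / 86_400 / 1e6     # average MB/s

concurrent = int(DAU * 0.10)                        # assume 10% online at once
servers = concurrent // 500_000                     # ~500k sockets per server

print(f"{storage_per_day / 1e12:.0f} TB/day, {storage_5y_pb:.2f} PB over 5y")
print(f"~{bandwidth_mb_s:.0f} MB/s average, {servers} connection servers")
```

Changing any single assumption (message size, concurrency ratio, sockets per box) shifts the fleet size linearly, which is why interviewers care about stating the assumptions out loud.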

4. Data Model

System APIs

We need a mix of REST (for profile/auth) and WebSocket (for chat).

REST API (HTTP)

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /v1/chat/create | Start a new chat (returns chat_id). |
| POST | /v1/group/add_member | Add user to group. |

WebSocket Events (Bidirectional)

  • sendMessage(chat_id, content)
  • receiveMessage(sender_id, content)
  • userStatusChanged(user_id, status)
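These events travel as frames over the socket, so each one needs a serializable envelope. A minimal sketch, assuming a JSON wire format; the field names (including the client-proposed `temp_id` used later for server-assigned IDs) are illustrative, not a real Messenger protocol:

```python
import json

# Hypothetical wire format for the WebSocket events above. A real protocol
# would also carry auth tokens, a protocol version, and delivery acks.
def send_message_event(chat_id: str, content: str, client_temp_id: int) -> str:
    """Serialize a sendMessage event as one JSON frame."""
    return json.dumps({
        "type": "sendMessage",
        "chat_id": chat_id,
        "content": content,
        "temp_id": client_temp_id,  # client-proposed ID; server assigns the final one
    })

event = send_message_event("chat_123", "hello", 1)
decoded = json.loads(event)
print(decoded["type"])   # sendMessage
```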

Database Design

We need a database that handles extremely high write throughput and efficient range queries (to load chat history).

MySQL?

  • Possible, but index maintenance becomes heavy at 100B writes/day.

HBase / Cassandra (Wide Column Store)

  • Winner: the best fit for this workload.
  • Key: chat_id (Partition Key).
  • Clustering Key: message_id (Sort Key, Snowflake).
  • Values: sender_id, content_text, media_url, created_at.
  • Allows very fast “Get me the last 50 messages for Chat 123” queries by scanning the sequential clustering keys within a single partition.
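The access pattern above can be simulated in memory to make it concrete: one partition per `chat_id`, rows kept sorted by the clustering key, so "last 50 messages" is a tail slice of a single partition. This is a toy stand-in for a wide-column store, not a real HBase/Cassandra client:

```python
from bisect import insort
from collections import defaultdict

# Toy wide-column table: partition key = chat_id,
# clustering key = message_id (kept sorted within the partition).
table: dict[str, list[tuple[int, dict]]] = defaultdict(list)

def write(chat_id: str, message_id: int, row: dict) -> None:
    insort(table[chat_id], (message_id, row))   # sorted insert by clustering key

def last_n(chat_id: str, n: int) -> list[dict]:
    """'Get me the last N messages for Chat X' = tail of one partition."""
    return [row for _, row in table[chat_id][-n:]]

for i in range(1, 101):
    write("chat_123", i, {"sender_id": "alice", "content_text": f"msg {i}"})

history = last_n("chat_123", 50)
print(history[0]["content_text"], "...", history[-1]["content_text"])
# msg 51 ... msg 100
```

Because all rows for a chat are colocated and sorted, the real database serves this query with one sequential read instead of a scatter-gather across nodes.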

5. Architecture

Real-Time Message Routing Architecture.

Analogy: The Post Office vs. The Dedicated Courier
Traditional HTTP requests are like sending a letter through the post office—you drop it off, wait days, and periodically check your mailbox (polling). WebSockets, however, are like having a dedicated courier standing at your door with a direct radio link to the sender. The moment a message is dispatched, the courier hands it to you instantly. The challenge in system design is managing millions of these dedicated couriers without running out of "radio frequencies" (server ports and memory).
Diagram: Real-Time Chat System. Sender and Receiver clients connect over WebSocket through a Load Balancer to stateful Chat Servers (Chat Server 1 holds User A's socket, Chat Server 2 holds User B's). Zookeeper handles service discovery (which server holds which user's connection) and HBase handles message storage. A message is routed: Sender → Chat Server 1 → Zookeeper lookup → Chat Server 2 → Receiver.

6. Localized Details

Connection Protocols

How do we keep the connection open? There are three options: Polling, Long Polling, and WebSockets.

A. Polling (The Old Way)

  • Client asks “Any new messages?” every 1 second.
  • Pros: Simple HTTP.
  • Cons: Wasted resources. Server load is high even if no one is talking. High latency.

B. Long Polling

  • Client asks “Any new messages?”. Server holds the connection open until a message arrives (or timeout).
  • Pros: Less load than polling.
  • Cons: Connection setup overhead is still there.

C. WebSockets (The Modern Way)

  • Bidirectional, persistent TCP connection.
  • Server can push to Client instantly.
  • Pros: Lowest latency, lowest overhead.
  • Cons: Needs stateful servers (Server must know who is connected to it).
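That "stateful" con is worth making concrete: each chat server must hold a map from user to open socket, and a push only works if the user is connected to this server. A minimal sketch, with sockets simulated as lists of delivered frames:

```python
# Minimal sketch of the state a chat server must keep: which users have
# open sockets HERE. Sockets are simulated as lists of delivered frames.
class ChatServer:
    def __init__(self) -> None:
        self.sockets: dict[str, list[str]] = {}   # user_id -> fake socket

    def connect(self, user_id: str) -> None:
        self.sockets[user_id] = []

    def disconnect(self, user_id: str) -> None:
        self.sockets.pop(user_id, None)

    def push(self, user_id: str, frame: str) -> bool:
        """Instant server->client push; fails if the user is not connected here."""
        if user_id not in self.sockets:
            return False      # user lives on another server: consult discovery
        self.sockets[user_id].append(frame)
        return True

server = ChatServer()
server.connect("user_b")
print(server.push("user_b", "hi"))   # True
print(server.push("user_c", "hi"))   # False -> route via Zookeeper lookup
```

The `False` branch is exactly where service discovery (Zookeeper in this design) comes in: the sender's server looks up which server holds the recipient's socket and forwards the message there.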

Group Chat Optimization

Group chats introduce a “Fanout” problem.

Small Groups (< 500 members)

  • Push Model: When User A sends a message, the server loops through all 500 members, finds their WebSocket connections, and pushes the message.
  • Why: 500 lookups is fast.

Large Groups / Channels (> 5000 members)

  • Pull Model (or Hybrid): We don’t push to everyone.
  • Online users might be listening to a Pub/Sub channel (e.g., Kafka topic).
  • Inactive users will just fetch the history when they open the app next time.
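The push-vs-pull decision above boils down to a size check at send time. A hedged sketch, with the delivery functions injected so the routing logic stays visible; the 500-member threshold comes from the requirements, and the real system would batch the per-member lookups:

```python
# Hybrid fanout decision: push directly for small groups,
# publish once to a channel for large ones.
PUSH_THRESHOLD = 500

def fanout(group_members: list[str], online: set[str], message: str,
           push_fn, publish_fn) -> str:
    if len(group_members) <= PUSH_THRESHOLD:
        # Push model: look up each member's socket and push directly.
        for member in group_members:
            if member in online:
                push_fn(member, message)
        return "push"
    # Pull/hybrid model: publish once; online users subscribe to the
    # channel, offline users fetch history on next app open.
    publish_fn(message)
    return "pub-sub"

delivered = []
strategy = fanout(["a", "b", "c"], {"a", "c"}, "hi",
                  push_fn=lambda user, msg: delivered.append(user),
                  publish_fn=lambda msg: None)
print(strategy, delivered)   # push ['a', 'c']
```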

Message Synchronization & Reliability

How do we ensure messages are delivered in order and synced across devices?

Sequence IDs & Multi-Device Sync

We cannot rely on timestamp (clock skew). We use Sequence IDs per chat.

The Sync Protocol:

  1. State: Each device (Phone, Laptop) maintains a local LastReadID.
  2. Reconnect: When a device connects, it sends SYNC(LastReadID).
  3. Fetch: Server queries HBase/Cassandra: SELECT * FROM msgs WHERE chat_id=123 AND msg_id > LastReadID.
  4. Forward: Server pushes missing messages to that specific socket.
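The four steps above can be sketched end to end, with the server's store simulated as a per-chat list of `(msg_id, content)` pairs instead of a real HBase/Cassandra table:

```python
# Sketch of the sync protocol. The server store is a stand-in for
# the wide-column table, sorted by msg_id within each chat.
server_store = {
    "chat_123": [(1, "hey"), (2, "hi"), (3, "how are you"), (4, "good!")],
}

def sync(chat_id: str, last_read_id: int) -> list[tuple[int, str]]:
    """Step 3: SELECT * FROM msgs WHERE chat_id=? AND msg_id > last_read_id."""
    return [m for m in server_store[chat_id] if m[0] > last_read_id]

# Steps 1-2: a reconnecting laptop holds LastReadID = 2 and sends SYNC(2).
missing = sync("chat_123", 2)
# Step 4: the server pushes only the missing messages to that socket.
print(missing)   # [(3, 'how are you'), (4, 'good!')]
```

Because each device tracks its own `LastReadID`, the same protocol handles both a flaky mobile reconnect and a brand-new desktop login (which simply syncs from 0).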

[!TIP] War Story: The New Year's Eve Outage. WhatsApp once faced massive delays during New Year's Eve due to ID-generation bottlenecks. By moving from a global sequence generator to localized sequence IDs (scoped to each chat room), they eliminated the global lock and handled millions of messages per second seamlessly.

The Sequence ID Solution

Why not just use time? Because server clocks drift.

War Story: The Time-Drift Phantom
In early versions of distributed chat systems, relying on NTP (Network Time Protocol) for timestamping led to a notorious bug: messages arriving "from the future" or out of order. If Server A's clock was 50ms ahead of Server B's, a reply processed by Server A could get a timestamp earlier than the original message processed by Server B. This forced engineering teams to abandon absolute timestamps for ordering, shifting entirely to logical clocks and sequence IDs to guarantee causal ordering.

  • Logical Clocks: A Sequence ID is a counter (1, 2, 3…) unique to a Chat Room.
  • Implementation: This is tricky. You can’t use a global counter (slow).
  • Solution: Since one person writes at a time (usually), the client can propose a temp ID, and the server assigns the final authoritative ID.
  • K-V Store: We store Current_Max_ID in Redis for each Chat ID. INCR is atomic.
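The per-chat counter can be sketched like this. In production the counter would be a Redis `INCR` on a key such as `seq:<chat_id>` (Redis `INCR` is atomic); here a dict of counters simulates it:

```python
import itertools
from collections import defaultdict

# Per-chat sequence generator. Stand-in for Redis INCR on "seq:<chat_id>":
# each chat room gets its own independent, monotonically increasing counter.
counters: dict[str, itertools.count] = defaultdict(lambda: itertools.count(1))

def next_seq(chat_id: str) -> int:
    return next(counters[chat_id])

# Two chats advance independently: no global lock, no cross-chat ordering.
print(next_seq("chat_1"), next_seq("chat_1"), next_seq("chat_2"))
# 1 2 1
```

Scoping the counter to the chat room is precisely the fix from the New Year's Eve war story: contention is limited to one conversation instead of the whole system.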

Active Session Sync:

  • If User A is online on both Phone and Laptop.
  • The Chat Server detects two active WebSocket connections for UserA.
  • When a message arrives for UserA, the server fans out the message to all active sockets.
  • Both devices receive the message instantly.
Example: a client reconnects holding LastReadID = 5, sends SYNC(5), and the server replies with the missing messages [6, 7, 8] from the database.

Presence Service

  • Heartbeat: Client sends a heartbeat every 5s over the WebSocket.
  • Redis: Presence Service updates “Last Seen” in Redis with TTL = 10s.
  • If Heartbeat stops, TTL expires → User is Offline.
  • Fanout: When User A comes online, we fanout this status to all their friends (Pub/Sub).
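The heartbeat-plus-TTL mechanism can be sketched with plain timestamps. In production the "last seen" value would live in Redis with a 10-second expiry (e.g. `SET presence:<user> <ts> EX 10`); here the TTL check is done manually so the expiry logic is visible:

```python
# Presence via heartbeat + TTL, simulated with explicit timestamps.
HEARTBEAT_TTL = 10.0   # seconds; heartbeats arrive every 5s, so one miss is tolerated

last_seen: dict[str, float] = {}

def heartbeat(user_id: str, now: float) -> None:
    """Client pings over its WebSocket; refresh the presence key."""
    last_seen[user_id] = now

def is_online(user_id: str, now: float) -> bool:
    """Online iff a heartbeat arrived within the TTL window."""
    ts = last_seen.get(user_id)
    return ts is not None and (now - ts) < HEARTBEAT_TTL

t0 = 1000.0
heartbeat("alice", t0)
print(is_online("alice", t0 + 5))    # True  (heartbeat 5s ago)
print(is_online("alice", t0 + 15))   # False (TTL expired -> offline)
```

Letting the key expire on its own means a crashed phone goes "offline" automatically, with no explicit disconnect event required.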

End-to-End Encryption (Signal Protocol)

How does WhatsApp ensure End-to-End Encryption (E2EE) where even the server cannot read the messages?

It uses the Signal Protocol, which combines X3DH (Extended Triple Diffie-Hellman) and the Double Ratchet Algorithm.

1. Key Generation (X3DH)

When Alice installs the app, she generates keys and uploads the public parts to the server:

  • Identity Key (IK): Long-term identity.
  • Signed PreKey (SPK): Medium-term key.
  • One-Time PreKeys (OPK): A stack of single-use keys.

2. Session Setup (Async)

When Bob wants to message Alice:

  1. Bob fetches Alice’s public keys (IK, SPK, OPK) from the server.
  2. Bob performs X3DH to derive a shared secret SK.
  3. Bob encrypts the first message using SK and sends it (along with his public keys).
  4. Crucially: Alice does NOT need to be online.

3. Double Ratchet (Forward Secrecy)

For every new message, the encryption key changes.

  • KDF Chain: A Key Derivation Function generates a new key for each message.
  • Forward Secrecy: If a hacker steals the key for Message 50, they cannot work backwards to derive the key for Message 49, because the KDF is one-way; past messages stay safe.
  • Self-Healing (Future Secrecy): If a chain key is compromised, the next Diffie-Hellman ratchet step injects fresh randomness and resets the chain, locking the attacker out of future messages.
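The one-way KDF-chain idea can be demonstrated with a toy symmetric ratchet built from HMAC-SHA256. This shows only the chain-key-to-message-key derivation; the real Double Ratchet also mixes in fresh Diffie-Hellman outputs at each round trip (the self-healing part), which this sketch omits:

```python
import hashlib
import hmac

# Toy KDF chain: each step hashes the chain key forward and derives a
# one-off message key. SHA-256/HMAC cannot be run backwards, so knowing
# message key N reveals nothing about message key N-1.
def kdf_step(chain_key: bytes) -> tuple[bytes, bytes]:
    next_chain = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    return next_chain, message_key

chain = b"placeholder shared secret"   # stand-in for a real X3DH output
keys = []
for _ in range(3):
    chain, mk = kdf_step(chain)
    keys.append(mk)

# Every message gets a distinct 32-byte key.
print(len(set(keys)) == 3)   # True
```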

[!TIP] Interview Tip: Mention “Double Ratchet” for E2EE questions. It’s the gold standard (used by WhatsApp, Signal, Facebook Secret Conversations).


Interactive Decision Visualizer

Polling vs WebSocket Simulator

[!TIP] Try it yourself: Click the buttons to compare network overhead. Notice how Polling creates many requests (packets) even when no data is available.

  • HTTP Polling 🐢: the request counter climbs with every poll, even when the server has nothing to return.
  • WebSocket ⚡: the counter stays at 1 (the initial upgrade handshake); all subsequent messages flow over that single persistent connection.

7. Scale

Observability (RED Method)

  • Rate: Messages Sent per Second.
  • Errors: Failed Message Deliveries / WebSocket Disconnects.
  • Duration: End-to-End Latency (Sender → Server → Receiver).

Key Metrics:

  • active_connections: Number of open WebSockets per server (Capacity Planning).
  • message_queue_depth: If using Kafka for async tasks.

Deployment Strategy

  • Connection Draining: When deploying a new Chat Server, we cannot just kill the old one (it has active WebSockets). We must stop accepting new connections and wait for old ones to close (or force reconnect).
  • Blue/Green: Essential for zero-downtime updates of stateful services.

8. Summary & Evaluation

Requirements Traceability

| Requirement | Design Decision | Justification |
|-------------|-----------------|---------------|
| Real-Time Delivery | WebSockets | Only persistent connections can achieve <100ms latency. |
| Message Ordering | Sequence IDs | Time-sortable IDs (Snowflake) per chat ensure FIFO. |
| Scalability (Storage) | HBase / Cassandra | Wide-column stores handle billions of small writes efficiently. |
| Group Chat | Hybrid Fanout | Push for small groups, Pull for large channels. |
| Presence | Redis + Heartbeat | Ephemeral keys with TTL map perfectly to "Online Status". |

Interview Gauntlet

Q1: How do you handle “Read Receipts” in a Group of 500?

  • Answer: We do NOT send a separate event for every read. We aggregate them. The client sends “Read up to ID 100”. The server updates the “Read Watermark” for that user in that group.
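The watermark idea reduces N read events to one integer per member. A minimal sketch, assuming the watermark only ever moves forward (late or duplicate acks are ignored):

```python
# Read-receipt aggregation: one "read watermark" per (group, user)
# instead of one event per message per reader.
watermarks: dict[tuple[str, str], int] = {}   # (group_id, user_id) -> max read msg_id

def mark_read(group_id: str, user_id: str, up_to_id: int) -> None:
    key = (group_id, user_id)
    watermarks[key] = max(watermarks.get(key, 0), up_to_id)   # monotonic

def read_count(group_id: str, msg_id: int) -> int:
    """How many members have read message msg_id? One comparison per member."""
    return sum(1 for (g, _), w in watermarks.items() if g == group_id and w >= msg_id)

mark_read("g1", "alice", 100)
mark_read("g1", "bob", 80)
mark_read("g1", "alice", 90)    # stale ack: watermark stays at 100
print(read_count("g1", 90), read_count("g1", 50))   # 1 2
```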

Q2: What happens if a user is offline?

  • Answer: The message is stored in the DB (HBase). When the user connects, they send their LastReadID. The server queries HBase for WHERE chat_id = X AND msg_id > LastReadID.

Q3: How do you store images?

  • Answer: Never in the WebSocket. Upload to S3 (HTTP) → Get URL → Send URL in WebSocket message.

Whiteboard Summary

Real-Time Arch

  • Protocol: WebSockets (Stateful).
  • Discovery: Zookeeper (User → Server Map).
  • Sync: Sequence IDs (Logical Clock).

Storage & Ops

  • DB: HBase (Write-Heavy, Range Scan).
  • Presence: Redis (Heartbeat TTL).
  • Media: Store-and-Forward (S3).