Real-Time Communication Strategies
[!TIP] Interview Tip: “How do you scale a Chat App?” is a trick question. The hard part isn’t storing messages (Database); it’s routing them to the right user, who might be connected to a different server. Answer: Redis Pub/Sub (See Pub/Sub Pattern).
1. The Options
A. Short Polling (The “Are we there yet?” Kid)
Client asks every 2 seconds: “New msg?”
- Pros: Simple. Works on everything.
- Cons: High latency (up to one polling interval), constant server load and HTTP header overhead. Wastes battery.
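A minimal short-polling sketch in browser TypeScript. The `/api/messages?since=` endpoint is a hypothetical API used only for illustration:

```typescript
// Short polling: ask every 2 seconds, regardless of whether anything changed.
// "/api/messages?since=..." is a hypothetical endpoint, not from this article.
async function shortPoll(since: number): Promise<void> {
  const res = await fetch(`/api/messages?since=${since}`);
  const messages: { id: number; text: string }[] = await res.json();
  messages.forEach((m) => console.log("new message:", m.text));

  const latest = messages.length ? messages[messages.length - 1].id : since;
  // Ask again in 2 seconds, even if this round returned nothing (the waste).
  setTimeout(() => void shortPoll(latest), 2000);
}

void shortPoll(0);
```

Every empty response still pays a full request/response round trip with headers, which is the overhead the Protocol Racer demo below visualizes in red.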
B. Long Polling (The “Wait for it” approach)
Client asks “New msg?”. Server holds the connection open until data arrives (or timeout).
- Pros: Lower latency and far fewer wasted requests than short polling; still plain HTTP.
- Cons: Re-establishes a connection after every response, so there is still per-message header overhead, and holding many open requests ties up server resources.
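A sketch of the client loop, assuming a hypothetical `/api/messages/poll` endpoint that holds the request open until data arrives and returns 204 on timeout:

```typescript
// Long polling: the request stays open until the server has data or times out.
// "/api/messages/poll" and the 204-on-timeout convention are assumptions.
async function longPoll(): Promise<void> {
  let since = 0;
  while (true) {
    try {
      const res = await fetch(`/api/messages/poll?since=${since}`);
      if (res.status === 200) {
        const messages: { id: number; text: string }[] = await res.json();
        messages.forEach((m) => console.log("new message:", m.text));
        if (messages.length) since = messages[messages.length - 1].id;
      }
      // 204: the server timed out with nothing new; just loop and re-ask.
    } catch {
      await new Promise((r) => setTimeout(r, 1000)); // brief back-off on network errors
    }
  }
}

void longPoll();
```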
C. WebSockets (The “Phone Call”)
A persistent, bi-directional TCP connection.
- Pros: Instant, low overhead (after handshake), Full Duplex (Send & Receive).
- Cons: Stateful. If server crashes, connection dies. Hard to scale (Load Balancers need Sticky Sessions).
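A minimal browser-side sketch; the URL and message shape are illustrative:

```typescript
// One persistent, full-duplex connection. "wss://chat.example.com/ws" is illustrative.
const socket = new WebSocket("wss://chat.example.com/ws");

socket.addEventListener("open", () => {
  // After the single handshake we can send at any time...
  socket.send(JSON.stringify({ type: "join", room_id: "101" }));
});

socket.addEventListener("message", (event: MessageEvent) => {
  // ...and the server can push to us at any time, with no per-message headers.
  console.log("received:", JSON.parse(event.data));
});

socket.addEventListener("close", () => {
  // Stateful: if the server dies, the connection dies with it and WE must reconnect.
  console.log("connection lost");
});
```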
D. Server-Sent Events (SSE) (The “Radio”)
Server pushes data to Client over HTTP. Client cannot push back (must use regular POST).
- Pros: Simple HTTP, Auto-reconnect, Firewall friendly.
- Cons: One-way only.
- (Perfect for Live Notification Systems).
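A minimal `EventSource` sketch; the endpoint is illustrative:

```typescript
// Server-Sent Events: plain HTTP, server push only. "/api/notifications" is illustrative.
const source = new EventSource("/api/notifications");

source.onmessage = (event: MessageEvent) => {
  // One-way: to send data back we would issue a normal fetch/POST.
  console.log("notification:", event.data);
};

source.onerror = () => {
  // The browser reconnects automatically (one of SSE's main selling points).
  console.log("connection interrupted, browser will retry");
};
```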
E. WebRTC (The “Walkie Talkie”)
Peer-to-Peer (P2P) communication directly between browsers (Audio/Video/Data).
- Pros: Lowest latency (UDP). Offloads server bandwidth.
- Cons: Complex setup (ICE, STUN, TURN). Hard to record/monitor.
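A sketch of a data channel between two peers. Signaling (how the offer/answer and ICE candidates reach the other peer) is deliberately omitted, and the public STUN URL is only an example:

```typescript
// P2P data channel sketch. The offer still has to be delivered to the other
// peer over your own signaling channel (WebSocket, HTTP, etc.).
const pc = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.l.google.com:19302" }], // example STUN server
});

const channel = pc.createDataChannel("game-state");
channel.onopen = () => channel.send("hello, peer");        // travels directly peer-to-peer
channel.onmessage = (e) => console.log("peer says:", e.data);

export async function startCall(): Promise<RTCSessionDescriptionInit> {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  return offer; // hand this to the remote peer via signaling; their answer comes back the same way
}
```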
Interactive Demo: Protocol Racer
See the difference in “Traffic Shape”.
- Short Polling: Spammy. Lots of Red (Overhead).
- WebSockets: One Green Line (Persistent).
2. Scaling WebSockets (The Hard Part)
WebSockets are Stateful.
- User A connects to Server 1.
- User B connects to Server 2.
- User A sends “Hello”. Server 1 receives it.
- Problem: Server 1 doesn’t know about User B. User B is on Server 2.
The Solution: Pub/Sub (Redis)
We need a “Message Bus” connecting all servers.
- Server 1 receives message from User A.
- Server 1 publishes to Redis channel `room_1`.
- Server 2 (subscribed to `room_1`) receives the event.
- Server 2 pushes message to User B.
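A sketch of that hop using the `ioredis` client (an assumption; any Redis client with publish/subscribe works). Each app server runs both halves, and the socket registry (`localSockets`) is a hypothetical in-memory map:

```typescript
import Redis from "ioredis";

// One connection for PUBLISH; Redis needs a dedicated connection for SUBSCRIBE.
const publisher = new Redis();
const subscriber = new Redis();

// Hypothetical registry of the WebSockets connected to THIS server instance.
interface ClientSocket { send(data: string): void; }
const localSockets = new Map<string, Set<ClientSocket>>(); // channel -> sockets

// Server 1: User A's message arrives on a local WebSocket -> publish it.
export function onLocalMessage(roomId: string, payload: string): void {
  void publisher.publish(`room_${roomId}`, payload);
}

// Server 2: subscribed to the same channel -> deliver only to its own sockets.
void subscriber.subscribe("room_1");
subscriber.on("message", (channel: string, payload: string) => {
  for (const ws of localSockets.get(channel) ?? []) {
    ws.send(payload); // User B gets the message even though it originated on Server 1
  }
});
```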
2.5 The Load Balancer Trap: Sticky Sessions
If Client A connects to Server 1 via WebSocket, that connection is persistent. If the connection drops and Client A reconnects, the Load Balancer might send them to Server 2.
- Problem: Server 2 doesn’t know who Client A is (Session data is on Server 1).
- Solution: Sticky Sessions (Session Affinity). The LB hashes the Client IP and ensures they always go to the same server.
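A toy sketch of what IP-based affinity (e.g. nginx’s `ip_hash` policy) boils down to: hash the client IP and always map it to the same backend. The hash function and server names are illustrative only:

```typescript
// Session affinity by source IP: the same client IP always maps to the same backend.
const backends = ["server-1:8080", "server-2:8080", "server-3:8080"];

function hashIp(ip: string): number {
  // Tiny non-cryptographic hash, purely for illustration.
  let h = 0;
  for (const ch of ip) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return h;
}

function pickBackend(clientIp: string): string {
  return backends[hashIp(clientIp) % backends.length];
}

// "Client Red" can reconnect as often as it likes and still lands on the same server.
console.log(pickBackend("203.0.113.7"));
console.log(pickBackend("203.0.113.7")); // same backend again
```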
Interactive Demo: Sticky vs Round Robin
- Sticky OFF: Clients bounce between servers.
- Sticky ON: Client Red always goes to Server 1. Client Blue always goes to Server 2.
[!WARNING] The Thundering Herd: When a server restarts, 1 Million connected WebSocket clients will instantly try to reconnect. This DDoS attack (by your own users) can take down your Auth Service and Load Balancer. Fix: Add a random “Jitter” (delay) to the client reconnection logic (e.g., reconnect in `Random(0, 30)` seconds).
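A client-side sketch of that fix; the URL is illustrative and the 30-second window matches the `Random(0, 30)` suggestion above:

```typescript
// Reconnect with jitter: spread one million reconnects across a 30-second window
// instead of having them all land at T=0.
function scheduleReconnect(reconnect: () => void): void {
  const jitterMs = Math.random() * 30_000; // anywhere in [0, 30) seconds
  setTimeout(reconnect, jitterMs);
}

function connect(): void {
  const ws = new WebSocket("wss://chat.example.com/ws"); // illustrative URL
  ws.addEventListener("close", () => scheduleReconnect(connect));
}

connect();
```

Production clients usually combine jitter with exponential backoff so repeated failures do not hammer the server either.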
Interactive Demo: The Thundering Herd
Simulate a server crash and reconnection strategy.
- No Jitter: Everyone reconnects at `T=0`. Server spikes to 100% CPU.
- With Jitter: Reconnections spread out. Server stays stable.
Interactive Demo: Distributed Chat
Visualize the “Pub/Sub” flow.
- Alice is on Server A. Bob is on Server B.
- Type a message for Alice.
- Watch it travel: Alice -> Server A -> Redis -> Server B -> Bob.
System Walkthrough: The Life of a Chat Message
How does a message get from Alice to Bob, Charlie, and Dave? This is the Fanout pattern.
- Client (Alice): Sends JSON to Server A (Persistent WebSocket):
  `{ "type": "msg", "room_id": "101", "text": "Hello Everyone" }`
- Server A: Does NOT look for Bob. It simply Publishes to Redis:
  `PUBLISH room:101 '{"u":"Alice","t":"Hello Everyone"}'`
- Redis: Fanout. It checks who is listening to `room:101`.
  - Server B is listening.
  - Server C is listening.
- Server B: Receives event. Checks its local WebSocket list.
- “Ah, Bob is connected to me on Socket ID 99.” -> Pushes data to Bob.
- Server C: Receives event. Checks its local WebSocket list.
- “Ah, Charlie is here.” -> Pushes data to Charlie.
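What “checks its local WebSocket list” means in code: each server keeps an in-memory registry of only its own sockets, keyed by room. The names (`joinRoom`, `deliverLocally`) are illustrative, not a specific library:

```typescript
// Per-server registry: Server B only knows about Bob, Server C only about Charlie.
interface ClientSocket { id: number; send(data: string): void; }

const roomsOnThisServer = new Map<string, Set<ClientSocket>>();

// Called when a client joins a room over its WebSocket (e.g. Bob on Server B).
export function joinRoom(roomId: string, ws: ClientSocket): void {
  const members = roomsOnThisServer.get(roomId) ?? new Set<ClientSocket>();
  members.add(ws);
  roomsOnThisServer.set(roomId, members);
}

// Called when the Redis subscriber delivers an event for room 101.
export function deliverLocally(roomId: string, rawEvent: string): void {
  for (const ws of roomsOnThisServer.get(roomId) ?? []) {
    ws.send(rawEvent); // "Bob is connected to me on Socket ID 99" -> push it down
  }
}

// Server C runs identical code and simply finds Charlie in its own map.
```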
3. Keeping it Alive: The Heartbeat
WebSockets are persistent. But if the WiFi drops silently, the Server might think the connection is open for hours (wasting resources).
- The Solution: Application-Level Pings.
- Client: Sends `PING` every 30s.
- Server: Replies `PONG`.
- If the Server misses 3 PINGs -> Close Socket.
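A server-side sketch of that rule, abstracted over the socket library (the `LiveSocket` wrapper is hypothetical); the 30-second interval and 3-miss limit match the list above:

```typescript
// Application-level liveness: expect a PING from the client roughly every 30s;
// after 3 missed intervals, assume the client vanished and reclaim the socket.
interface LiveSocket {
  send(data: string): void;
  close(): void;
  onMessage(handler: (data: string) => void): void; // hypothetical wrapper API
}

const PING_INTERVAL_MS = 30_000;
const MISS_LIMIT = 3;

export function watchLiveness(ws: LiveSocket): void {
  let missed = 0;

  ws.onMessage((data) => {
    if (data === "PING") {
      missed = 0;       // the client is alive
      ws.send("PONG");  // ...and now it knows we are too
    }
  });

  const timer = setInterval(() => {
    missed += 1;
    if (missed >= MISS_LIMIT) {
      clearInterval(timer);
      ws.close(); // silent WiFi drop: stop wasting memory and file descriptors
    }
  }, PING_INTERVAL_MS);
}
```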
4. The Future: WebTransport (HTTP/3)
WebSockets are built on TCP. This means they suffer from Head-of-Line Blocking (if one packet is lost, everything behind it waits for the retransmission). WebTransport is the modern alternative built on HTTP/3 (QUIC).
Why WebTransport?
- Datagrams: You can send fire-and-forget UDP-like packets (great for gaming).
- Streams: You can open multiple reliable streams (like HTTP/2). If one stream stalls, others keep going.
- Single Handshake: It reuses the HTTP/3 connection. No separate TCP handshake.
[!NOTE] Adoption: WebTransport is still new but rapidly gaining support for high-performance use cases (Cloud Gaming, Real-Time Trading).
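A browser-side sketch, assuming an HTTP/3 server at an illustrative URL and a TypeScript environment whose DOM typings include WebTransport (otherwise a declaration file is needed):

```typescript
// WebTransport: one QUIC connection carrying both datagrams and independent streams.
async function connectTransport(): Promise<void> {
  const transport = new WebTransport("https://game.example.com:4433/session"); // illustrative URL
  await transport.ready; // single QUIC handshake, no separate TCP + WebSocket upgrade

  // Datagrams: unreliable, unordered, fire-and-forget (great for position updates).
  const datagramWriter = transport.datagrams.writable.getWriter();
  await datagramWriter.write(new TextEncoder().encode("pos:10,42"));

  // Streams: reliable and ordered, but independent of each other, so one stalled
  // stream does not block the rest (no TCP-style head-of-line blocking).
  const stream = await transport.createBidirectionalStream();
  const streamWriter = stream.writable.getWriter();
  await streamWriter.write(new TextEncoder().encode("chat: hello"));
}

void connectTransport();
```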
Comparison Table
| Feature | Short Polling | WebSockets | SSE | WebRTC | WebTransport |
|---|---|---|---|---|---|
| Protocol | HTTP/1.1 | TCP | HTTP/1.1 | UDP/TCP | HTTP/3 (QUIC) |
| Direction | Client Pull | Bidirectional | Server Push | P2P (Bidirectional) | Bidirectional |
| Latency | High | Low | Low | Lowest (UDP) | Low (UDP/QUIC) |
| Complexity | Low | High (Stateful) | Medium | Very High | High |
| Use Case | Dashboards | Chat, Games | Notifications | Zoom/Video | Cloud Gaming, Trading |