Real-Time: Polling, WebSockets, SSE

In 2022, Slack’s engineering team discovered that their WebSocket servers were hitting the Linux file descriptor limit during peak hours — causing silent connection drops for thousands of users. In 2021, Discord handled 2.7 million concurrent WebSocket connections on a single service. The secret? They didn’t just “use WebSockets” — they engineered around the stateful scaling trap that kills most real-time systems.

Real-time features feel simple until your server crashes during the Super Bowl commercial break and 100,000 users reconnect simultaneously. That’s when the engineering really starts.

[!IMPORTANT] In this lesson, you will master:

  1. The Protocol Decision Matrix: When to choose Polling, WebSockets, SSE, or WebRTC — and why getting this wrong at 1M users costs you your SLA.
  2. The Thundering Herd: Handling the “DDoS from your own users” when a server restarts (and the Jitter fix).
  3. Stateful Scaling: Why WebSockets require Redis Pub/Sub and Sticky Sessions, and the exact File Descriptor math to hit 1 Million concurrent connections.

1. The Options

A. Short Polling (The “Are we there yet?” Kid)

Client asks every 2 seconds: “New msg?”

  • Pros: Simple. Works on everything.
  • Cons: High latency (up to one full polling interval). High server load, because every poll carries full HTTP headers even when there is nothing new. Wastes battery.
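To put a number on the server load, here is a rough back-of-the-envelope sketch. The client count and the 800 bytes of headers per poll are assumed figures chosen for illustration, not measurements.

```python
def polling_load(clients: int, interval_s: float, header_bytes: int) -> tuple:
    """Return (requests per second, header bytes per second) for short polling."""
    requests_per_s = clients / interval_s
    return requests_per_s, requests_per_s * header_bytes


# Assumed figures: 100,000 clients polling every 2 seconds,
# ~800 bytes of request + response headers per poll.
rps, overhead = polling_load(clients=100_000, interval_s=2.0, header_bytes=800)
print(f"{rps:,.0f} requests/s, {overhead / 1e6:.0f} MB/s of headers")
```

Under these assumptions the server handles 50,000 requests per second and moves 40 MB/s of headers, most of it for polls that return nothing.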

B. Long Polling (The “Wait for it” approach)

Client asks “New msg?”. Server holds the connection open until data arrives (or timeout).

  • Pros: Lower latency than short polling, and no empty responses while the server has nothing to send.
  • Cons: Still re-establishes connections frequently. Headers overhead per message.
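The "hold until data or timeout" behavior can be sketched with asyncio. This is a minimal in-process simulation, not an HTTP server: the queue stands in for the server's source of new messages, and `long_poll` is a hypothetical helper name.

```python
import asyncio
from typing import List, Optional


async def long_poll(queue: asyncio.Queue, timeout_s: float) -> Optional[str]:
    """Hold the 'request' open until a message arrives or the timeout expires."""
    try:
        return await asyncio.wait_for(queue.get(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None  # empty response; the client would immediately poll again


async def demo() -> List[Optional[str]]:
    queue: asyncio.Queue = asyncio.Queue()
    # A message shows up 0.05s after the client starts waiting.
    asyncio.get_running_loop().call_later(0.05, queue.put_nowait, "hello")
    first = await long_poll(queue, timeout_s=1.0)    # returns as soon as data arrives
    second = await long_poll(queue, timeout_s=0.05)  # nothing arrives, so it times out
    return [first, second]


results = asyncio.run(demo())
```

The first poll returns the message well before its one-second timeout; the second returns `None`, which is the point where a real client would re-establish the request.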

C. WebSockets (The “Phone Call”)

A persistent, bi-directional TCP connection.

  • Pros: Instant, low overhead (after handshake), Full Duplex (Send & Receive).
  • Cons: Stateful. If server crashes, connection dies. Hard to scale (Load Balancers need Sticky Sessions).

D. Server-Sent Events (SSE) (The “Radio”)

Server pushes data to the Client over a single long-lived HTTP response. The Client cannot push back on that channel (it must use a regular request such as POST).
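On the wire, an SSE stream is plain text sent with the `text/event-stream` content type: each event is a few `field: value` lines followed by a blank line. A minimal sketch of a formatter, where `format_sse` is a hypothetical helper name:

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as one "data:" line per line of text.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


frame = format_sse("New msg", event="message", event_id="42")
```

The server writes frames like this to the open response as events occur; the browser's `EventSource` API parses them and reconnects automatically if the stream drops.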

E. WebRTC (The “Walkie Talkie”)

Peer-to-Peer (P2P) communication directly between browsers (Audio/Video/Data).

  • Pros: Lowest latency (UDP). Offloads server bandwidth.
  • Cons: Complex setup (ICE, STUN, TURN). Hard to record/monitor.

2. Interactive Demo: Protocol Racer

See the difference in “Traffic Shape”.

  • Short Polling: Spammy. Lots of Red (Overhead).
  • WebSockets: One Green Line (Persistent).

[Interactive demo: Protocol Racer. Compares the traffic shape of Short Polling (HTTP/1.1, request-response) with WebSockets (full duplex, persistent stream) between CLIENT and SERVER.]

3. Scaling WebSockets (The Hard Part)

WebSockets are Stateful.

  • User A connects to Server 1.
  • User B connects to Server 2.
  • User A sends “Hello”. Server 1 receives it.
  • Problem: Server 1 doesn’t know about User B. User B is on Server 2.

The Solution: Pub/Sub (Redis)

We need a “Message Bus” connecting all servers.

  1. Server 1 receives message from User A.
  2. Server 1 publishes to Redis channel room_1.
  3. Server 2 (subscribed to room_1) receives the event.
  4. Server 2 pushes message to User B.
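  • Related: Sticky Sessions (Session Affinity). An established WebSocket stays on one server for its lifetime, but anything that spans several HTTP requests (such as a long-polling fallback, or per-server session state on reconnect) must keep landing on the same server. The LB hashes the Client IP (or uses a cookie) so that a client always goes to the same server.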
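The four Pub/Sub steps above can be sketched in a single process. `MessageBus` here is an in-memory stand-in for Redis Pub/Sub, and both class names are hypothetical; a real deployment would use a Redis client and actual sockets instead of lists.

```python
from collections import defaultdict


class MessageBus:
    """In-memory stand-in for Redis Pub/Sub (illustration only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in self.subscribers[channel]:
            callback(message)


class ChatServer:
    def __init__(self, name, bus, room):
        self.name = name
        self.bus = bus
        self.room = room
        self.inboxes = {}  # user -> list of delivered messages (stands in for a socket)
        bus.subscribe(room, self.deliver)

    def connect(self, user):
        self.inboxes[user] = []

    def on_client_message(self, sender, text):
        # Steps 1-2: receive from a local socket, publish to the shared channel.
        self.bus.publish(self.room, (sender, text))

    def deliver(self, message):
        # Steps 3-4: each subscribed server pushes to its own local clients.
        sender, text = message
        for user, inbox in self.inboxes.items():
            if user != sender:
                inbox.append(f"{sender}: {text}")


bus = MessageBus()
server_1 = ChatServer("server-1", bus, "room_1")
server_2 = ChatServer("server-2", bus, "room_1")
server_1.connect("user_a")
server_2.connect("user_b")
server_1.on_client_message("user_a", "Hello")
```

Server 1 never needs to know where User B lives; it only publishes to `room_1`, and whichever servers hold members of that room do the local delivery.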

[!NOTE] Hardware-First Intuition: Every WebSocket connection is a File Descriptor (FD). Linux limits the number of open FDs per process (the default soft limit is often 1024). To scale to 1 Million connections, you must raise ulimit -n and manage RAM carefully. A single idle connection can consume 10KB-50KB of RAM for buffers, so 1M connections need roughly 10GB-50GB of RAM just for the sockets!
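The RAM estimate is simple multiplication. A small sketch, taking 1KB as 1,000 bytes and using the per-connection range quoted in the note:

```python
def socket_ram_gb(connections: int, kb_per_connection: float) -> float:
    """Estimate socket buffer memory in GB (1 KB = 1,000 bytes, 1 GB = 1e9 bytes)."""
    return connections * kb_per_connection * 1000 / 1e9


low = socket_ram_gb(1_000_000, 10)   # lean, mostly idle connections
high = socket_ram_gb(1_000_000, 50)  # larger per-socket buffers
print(f"1M connections: {low:.0f} GB to {high:.0f} GB")
```

The spread between the two ends is why per-connection buffer sizes are usually the first thing to tune when connection counts grow.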

B. Backpressure: The Slow Consumer

In a WebSocket, the server can push data faster than the client (e.g., a mobile phone on 3G) can read it.

  • The Problem: The server’s output buffer grows until the server runs out of RAM and crashes (OOM).
  • The Fix: Backpressure. The application must monitor how much data is queued for each client (for example, the connection's bufferedAmount) and stop sending, drop messages, or disconnect the client once it exceeds a threshold (e.g., 1MB).
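A minimal sketch of that check, tracking queued bytes per connection. `ClientConnection` and its methods are hypothetical names, and the policy shown (refuse and count the drop) is only one option; disconnecting the slow client is another.

```python
class ClientConnection:
    """Tracks bytes queued for one client and refuses sends past a threshold."""

    def __init__(self, max_buffered_bytes: int = 1_000_000):
        self.max_buffered_bytes = max_buffered_bytes
        self.buffered_bytes = 0  # queued by the server, not yet read by the client
        self.dropped = 0

    def try_send(self, payload: bytes) -> bool:
        """Queue the payload if there is room; otherwise apply backpressure."""
        if self.buffered_bytes + len(payload) > self.max_buffered_bytes:
            self.dropped += 1
            return False
        self.buffered_bytes += len(payload)
        return True

    def on_drained(self, n: int) -> None:
        """Call when the client has actually consumed n bytes."""
        self.buffered_bytes = max(0, self.buffered_bytes - n)


conn = ClientConnection(max_buffered_bytes=10)
first = conn.try_send(b"x" * 8)   # fits in the buffer
second = conn.try_send(b"x" * 8)  # would exceed the limit, so it is refused
conn.on_drained(8)
third = conn.try_send(b"x" * 8)   # room again after the client catches up
```

The important property is that memory per client is bounded: a phone on a slow link can fall behind, but it cannot grow the server's buffers without limit.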

C. Security Deep Dive: CSWSH

Unlike normal AJAX, WebSockets are not restricted by Same-Origin Policy (SOP).

  • The Threat: Cross-Site WebSocket Hijacking (CSWSH). A malicious site can open a WebSocket to your API using the user’s cookies.
  • The Fix: Always check the Origin header on the server during the WebSocket handshake. If the origin doesn’t match your domain, reject the connection.
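A sketch of the handshake check follows. The allowed origin is a hypothetical placeholder, and `headers` is assumed to be a plain dict of the handshake request headers; real frameworks expose these in their own request objects.

```python
# Hypothetical allowlist; replace with the origins your frontend is served from.
ALLOWED_ORIGINS = {"https://app.example.com"}


def is_allowed_origin(headers: dict) -> bool:
    """Return True only if the handshake's Origin header exactly matches the allowlist."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value for name, value in headers.items()}
    origin = normalized.get("origin")
    return origin is not None and origin in ALLOWED_ORIGINS


ok = is_allowed_origin({"Origin": "https://app.example.com"})
evil = is_allowed_origin({"Origin": "https://app.example.com.evil.test"})
missing = is_allowed_origin({})
```

The comparison is an exact match on purpose: prefix or substring checks would accept lookalike origins such as the second example.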

4. WebRTC: The P2P Powerhouse

While WebSockets use a server, WebRTC attempts to bypass the server entirely for lowest latency.

  1. Signaling: The clients talk to a server (via WebSockets) just to exchange session descriptions (“SDP Offer/Answer”) and candidate IP/Port info.
  2. STUN Server: Used to discover the client’s public IP and port.
  3. TURN Server: If NAT is too restrictive for a direct path, the TURN server relays the traffic (expensive, but necessary for roughly 20% of users).
  4. Security: WebRTC mandates encryption. Media is protected with SRTP (keyed via DTLS) and data channels run over DTLS; there is no unencrypted mode.

Interactive Demo: Sticky vs Round Robin

  • Sticky OFF: Clients bounce between servers.
  • Sticky ON: Client Red always goes to Server 1. Client Blue always goes to Server 2.
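IP-hash affinity, the behavior behind "Sticky ON", can be sketched in a few lines. `pick_server` is a hypothetical helper and the server names are placeholders; production load balancers implement this internally.

```python
import hashlib


def pick_server(client_ip: str, servers: list) -> str:
    """Map a client IP to a server deterministically (IP-hash affinity)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["srv-1", "srv-2"]
first_visit = pick_server("203.0.113.7", servers)
reconnect = pick_server("203.0.113.7", servers)  # same IP, same server
```

One caveat of plain modulo hashing: adding or removing a server changes the result for most clients, which is why consistent hashing is often preferred when the pool size changes frequently.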

[Interactive demo: Sticky Sessions vs. Round Robin. Two clients reconnect through a load balancer (LB) to SRV 1 or SRV 2, showing how stateful connections stay on the correct server.]

[!IMPORTANT] The Thundering Herd: When a server restarts, 1 Million connected WebSocket clients will instantly try to reconnect. This DDoS attack (by your own users) can take down your Auth Service and Load Balancer. Fix: Add a random “Jitter” (delay) to the client reconnection logic (e.g., reconnect in Random(0, 30) seconds).
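A minimal sketch of jittered reconnection on the client. The uniform random delay is the fix described above; combining it with exponential backoff (the `base_s * 2 ** attempt` ceiling) is an added assumption, a common companion technique rather than something the lesson prescribes.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0,
                    rng=random) -> float:
    """Pick a random delay in [0, ceiling], where the ceiling grows with each attempt.

    The ceiling is min(cap_s, base_s * 2 ** attempt), so early retries are fast
    and later retries spread out over up to cap_s seconds.
    """
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0, ceiling)


# Seeded generator so the demo is repeatable; real clients would use the default.
rng = random.Random(0)
delays = [reconnect_delay(attempt, rng=rng) for attempt in range(8)]
```

With a 30 second cap, one million clients reconnect at an average rate of roughly 33,000 per second instead of all arriving at T=0.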

Interactive Demo: The Thundering Herd

Simulate a server crash and reconnection strategy.

  • No Jitter: Everyone reconnects at T=0. Server Spikes to 100% CPU.
  • With Jitter: Reconnections spread out. Server stays stable.

[Interactive demo: Thundering Herd reconnection spike. Simulates the instant CPU load (processing auth and handshakes) of 1 Million simultaneous reconnects against a DDoS threshold.]

5. Interactive Demo: Distributed Chat

Visualize the “Pub/Sub” flow.

  1. Alice is on Server A. Bob is on Server B.
  2. Type a message as Alice.
  3. Watch it travel: Alice → Server A → Redis → Server B → Bob.

[Interactive demo: Distributed Chat. Alice (connected to Server A) and Bob (connected to Server B) exchange messages through Redis Pub/Sub, the message bus that bridges the stateful servers.]

System Walkthrough: Fanout at Scale

  1. Fanout: When 1 user sends a message to a room with 10,000 users, the server must replicate that message 10,000 times.
  2. The Bottleneck: Replicating bits is cheap, but context switching and interrupt handling for 10,000 TCP writes is expensive.
  3. The Fix: Use a Gateway Service (like Uber’s “Morpheus”) that specializes in high-fanout delivery, offloading the business logic servers.

6. Keeping it Alive: The Heartbeat

WebSockets are persistent. But if the WiFi drops silently, the Server might think the connection is open for hours (wasting resources).

  • The Solution: Application-Level Pings.
  • Client: Sends PING every 30s.
  • Server: Replies PONG.
  • If Server misses 3 PINGs → Close Socket.
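The server side of this scheme can be sketched as a small bookkeeping class. `HeartbeatMonitor` is a hypothetical name, and timestamps are passed in explicitly so the logic is easy to follow; a real server would use a monotonic clock and a periodic sweep.

```python
class HeartbeatMonitor:
    """Records the last PING per connection and reports connections gone stale."""

    def __init__(self, interval_s: float = 30.0, max_missed: int = 3):
        self.interval_s = interval_s
        self.max_missed = max_missed
        self.last_ping = {}  # connection id -> timestamp of the last PING

    def on_ping(self, conn_id: str, now: float) -> str:
        self.last_ping[conn_id] = now
        return "PONG"

    def stale_connections(self, now: float) -> list:
        """Connections silent for max_missed intervals or longer should be closed."""
        deadline = self.interval_s * self.max_missed
        return [cid for cid, seen in self.last_ping.items() if now - seen >= deadline]

    def close(self, conn_id: str) -> None:
        self.last_ping.pop(conn_id, None)


monitor = HeartbeatMonitor(interval_s=30.0, max_missed=3)
monitor.on_ping("alice", now=0.0)
monitor.on_ping("bob", now=60.0)
stale = monitor.stale_connections(now=95.0)  # alice has been silent for 95s
```

At t=95 only "alice" has missed three 30 second intervals, so only her socket is reported for closing; "bob" pinged 35 seconds ago and stays open.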

[Interactive demo: Active Heartbeat. Shows liveness detection between an edge client and the socket server, identifying stale sockets before they waste server resources.]

7. The Future: WebTransport (HTTP/3)

WebSockets are built on TCP. This means they suffer from Head-of-Line Blocking (if one packet is lost, everything waits). WebTransport is the modern alternative built on HTTP/3 (QUIC).

Why WebTransport?

  1. Datagrams: You can send fire-and-forget UDP-like packets (great for gaming).
  2. Streams: You can open multiple reliable streams (like HTTP/2). If one stream stalls, others keep going.
  3. Single Handshake: It reuses the HTTP/3 connection. No separate TCP handshake.

8. Mobile Considerations: Battery Life

Why does WhatsApp use a persistent connection with a custom protocol instead of just short polling?

  • Radio Power States: Mobile radios (4G/5G) have “High Power” and “Low Power” (Idle) states.
  • Polling Cost: Waking up the radio every 2 seconds keeps it in “High Power” mode, draining battery.
  • The Fix: Use Persistent Connections (WebSockets/Push Notifications). The radio can stay in a low-power “listening” mode until data arrives.

9. Comparison Table

| Feature | Short Polling | WebSockets | SSE | WebRTC | WebTransport |
|---|---|---|---|---|---|
| Protocol | HTTP/1.1 | TCP | HTTP/1.1 | UDP/TCP | HTTP/3 (QUIC) |
| Direction | Client Pull | Bidirectional | Server Push | P2P (Bidirectional) | Bidirectional |
| Latency | High | Low | Low | Lowest (UDP) | Low (UDP/QUIC) |
| Complexity | Low | High (Stateful) | Medium | Very High | High |
| Use Case | Dashboards | Chat, Games | Notifications | Zoom/Video | Cloud Gaming, Trading |

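Staff Engineer Tip: Tuning for 1M Connections. To hit 1M concurrent connections on a single Linux node, you must tune the kernel:

  1. TCP Port Range: sysctl -w net.ipv4.ip_local_port_range="1024 65535" widens the ephemeral port range used for outbound connections. Inbound WebSocket connections all share the server's listening port, so this mainly matters for proxies, load generators, and the server's own connections to backends such as Redis.
  2. File Descriptors: ulimit -n 1048576 raises the per-process limit. The system-wide ceilings (fs.nr_open and fs.file-max) must be at least as high.
  3. TCP Memory: sysctl -w net.ipv4.tcp_mem='768432 1024576 1536864' sets the low, pressure, and high thresholds for total TCP memory. The values are in pages (typically 4KB), not bytes, so this example caps TCP memory at roughly 6GB.
  4. TIME_WAIT Reuse: net.ipv4.tcp_tw_reuse=1 lets the kernel reuse sockets in TIME_WAIT for new outbound connections, which helps hosts that open many short-lived connections.

At this scale, Interrupt Coalescing on your NIC (Network Interface Card) becomes important to prevent the CPU from spending most of its time just handling “Packet Received” signals.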