Real-Time: Polling, WebSockets, SSE
In 2022, Slack’s engineering team discovered that their WebSocket servers were hitting the Linux file descriptor limit during peak hours — causing silent connection drops for thousands of users. In 2021, Discord handled 2.7 million concurrent WebSocket connections on a single service. The secret? They didn’t just “use WebSockets” — they engineered around the stateful scaling trap that kills most real-time systems.
Real-time features feel simple until your server crashes during the Super Bowl commercial break and 100,000 users reconnect simultaneously. That’s when the engineering really starts.
[!IMPORTANT] In this lesson, you will master:
- The Protocol Decision Matrix: When to choose Polling, WebSockets, SSE, or WebRTC — and why getting this wrong at 1M users costs you your SLA.
- The Thundering Herd: Handling the “DDoS from your own users” when a server restarts (and the Jitter fix).
- Stateful Scaling: Why WebSockets require Redis Pub/Sub and Sticky Sessions, and the exact File Descriptor math to hit 1 Million concurrent connections.
1. The Options
A. Short Polling (The “Are we there yet?” Kid)
Client asks every 2 seconds: “New msg?”
- Pros: Simple. Works on everything.
- Cons: High latency; heavy server load (full HTTP headers on every request); wastes battery.
B. Long Polling (The “Wait for it” approach)
Client asks “New msg?”. Server holds the connection open until data arrives (or timeout).
- Pros: Near-instant delivery with far fewer wasted requests than short polling; still plain HTTP.
- Cons: Still re-establishes connections frequently. Headers overhead per message.
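The long-polling loop can be sketched as follows. The transport is injected as a function (a stand-in for an HTTP request to a hypothetical `/updates` endpoint) so the loop logic is visible without a real server; `rounds` bounds the loop for illustration where a real client would run forever:

```typescript
// Minimal long-polling client loop (sketch).
// A resolved `null` models the server timing out with no new data.
type Fetcher = () => Promise<string | null>;

async function longPoll(
  fetcher: Fetcher,
  onMessage: (msg: string) => void,
  rounds: number
): Promise<void> {
  for (let i = 0; i < rounds; i++) {
    // The server holds this request open until data arrives or it times out.
    const msg = await fetcher();
    if (msg !== null) onMessage(msg);
    // Then the client immediately re-issues the request. This per-message
    // re-request (with full headers) is the overhead long polling pays
    // compared to a persistent WebSocket.
  }
}
```

Note the trade-off encoded in the comment: latency is close to real-time, but every delivered message still costs a fresh HTTP round trip.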
C. WebSockets (The “Phone Call”)
A persistent, bi-directional TCP connection.
- Pros: Instant, low overhead (after handshake), Full Duplex (Send & Receive).
- Cons: Stateful. If server crashes, connection dies. Hard to scale (Load Balancers need Sticky Sessions).
D. Server-Sent Events (SSE) (The “Radio”)
Server pushes data to Client over HTTP. Client cannot push back (must use regular POST).
- Pros: Simple HTTP, Auto-reconnect, Firewall friendly.
- Cons: One-way only.
- (Perfect for Live Notification Systems).
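The SSE wire format is simple enough to show directly: each event is plain text over the HTTP response, terminated by a blank line. The encoder below is an illustrative sketch (the field names `id`, `event`, and `data` come from the SSE specification):

```typescript
// Encode one Server-Sent Event frame (sketch of the SSE wire format).
function encodeSseEvent(data: string, eventName?: string, id?: string): string {
  let frame = "";
  if (id !== undefined) frame += `id: ${id}\n`; // lets the client resume via Last-Event-ID
  if (eventName !== undefined) frame += `event: ${eventName}\n`;
  // Multi-line payloads become multiple `data:` lines.
  for (const line of data.split("\n")) frame += `data: ${line}\n`;
  return frame + "\n"; // the blank line terminates the event
}
```

On the client side, the browser consumes this stream with `new EventSource("/stream")`, which handles parsing and auto-reconnect for you.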
E. WebRTC (The “Walkie Talkie”)
Peer-to-Peer (P2P) communication directly between browsers (Audio/Video/Data).
- Pros: Lowest latency (UDP). Offloads server bandwidth.
- Cons: Complex setup (ICE, STUN, TURN). Hard to record/monitor.
2. Interactive Demo: Protocol Racer
See the difference in “Traffic Shape”.
- Short Polling: Spammy. Lots of Red (Overhead).
- WebSockets: One Green Line (Persistent).
Protocol Racer: Traffic Shape
Polling (Request-Response) vs WebSockets (Persistent Stream)
3. Scaling WebSockets (The Hard Part)
WebSockets are Stateful.
- User A connects to Server 1.
- User B connects to Server 2.
- User A sends “Hello”. Server 1 receives it.
- Problem: Server 1 doesn’t know about User B. User B is on Server 2.
A. The Solution: Pub/Sub (Redis)
We need a “Message Bus” connecting all servers.
- Server 1 receives message from User A.
- Server 1 publishes to Redis channel `room_1`.
- Server 2 (subscribed to `room_1`) receives the event.
- Server 2 pushes message to User B.
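This flow can be sketched in a few lines. An in-memory `Bus` class stands in for Redis here so the example is self-contained; in production each server would call Redis `SUBSCRIBE`/`PUBLISH` instead, and the per-user subscription below would be per-room:

```typescript
// In-memory message bus standing in for Redis Pub/Sub (sketch).
type Handler = (msg: string) => void;

class Bus {
  private channels = new Map<string, Handler[]>();
  subscribe(channel: string, h: Handler): void {
    const list = this.channels.get(channel) ?? [];
    list.push(h);
    this.channels.set(channel, list);
  }
  publish(channel: string, msg: string): void {
    for (const h of this.channels.get(channel) ?? []) h(msg);
  }
}

// Each ChatServer holds only its OWN local connections (modeled here as
// user -> inbox), and bridges rooms through the shared bus.
class ChatServer {
  private inboxes = new Map<string, string[]>();
  constructor(private bus: Bus) {}
  connect(user: string, room: string): void {
    this.inboxes.set(user, []);
    this.bus.subscribe(room, (msg) => this.inboxes.get(user)!.push(msg));
  }
  send(room: string, msg: string): void {
    this.bus.publish(room, msg); // never write to other servers directly
  }
  inbox(user: string): string[] {
    return this.inboxes.get(user)!;
  }
}
```

The key design point: Server 1 never talks to Server 2. Both only talk to the bus, so adding a third server requires no topology changes.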
- The Second Piece: Sticky Sessions (Session Affinity). The load balancer hashes the client IP (or a cookie) so a given client always lands on the same server — necessary because the connection state lives only on that server.
[!NOTE] Hardware-First Intuition: Every WebSocket connection is a File Descriptor (FD). Linux has a default limit (e.g., 1024 or 65k). To scale to 1 Million connections, you must raise `ulimit -n` and manage RAM carefully. A single idle connection can consume 10KB-50KB of RAM for buffers. 1M connections ≈ 50GB of RAM just for the sockets!
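The arithmetic in the note is worth making explicit. A sketch of the capacity math, using the illustrative 10KB-50KB per-connection range (not measured values):

```typescript
// Back-of-the-envelope socket memory: connections x per-connection buffer
// bytes, expressed in binary gigabytes. Inputs are illustrative estimates.
function socketRamGb(connections: number, bytesPerConn: number): number {
  return (connections * bytesPerConn) / 1024 ** 3;
}

// 1M connections at the 50KB upper bound lands just under 48 GiB --
// the "~50GB" figure in the note, rounded.
const worstCase = socketRamGb(1_000_000, 50 * 1024);
```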
B. Backpressure: The Slow Consumer
In a WebSocket, the server can push data faster than the client (e.g., a mobile phone on 3G) can read it.
- The Problem: The server’s output buffer grows until the server runs out of RAM and crashes (OOM).
- The Fix: Backpressure. The application must monitor the client’s `bufferedAmount` and stop sending data if it exceeds a threshold (e.g., 1MB).
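A minimal sketch of that check, assuming a socket interface with the same `bufferedAmount` property the browser WebSocket exposes (bytes queued but not yet flushed to the network):

```typescript
// Backpressure guard: refuse to enqueue more data for a slow consumer.
interface Sendable {
  bufferedAmount: number; // bytes queued but not yet sent on the wire
  send(data: string): void;
}

const HIGH_WATER_MARK = 1024 * 1024; // the 1MB threshold from the text

// Returns true if the message was sent; false means the consumer is too
// slow and the caller should pause, coalesce, or drop frames.
function sendWithBackpressure(socket: Sendable, data: string): boolean {
  if (socket.bufferedAmount > HIGH_WATER_MARK) return false;
  socket.send(data);
  return true;
}
```

What to do on `false` is application-specific: a stock ticker might drop stale frames, while a chat app would pause and retry, since every message matters.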
C. Security Deep Dive: CSWSH
Unlike normal AJAX, the WebSocket handshake is not restricted by the Same-Origin Policy (SOP) — the browser will send it, cookies attached, from any origin.
- The Threat: Cross-Site WebSocket Hijacking (CSWSH). A malicious site can open a WebSocket to your API using the user’s cookies.
- The Fix: Always check the `Origin` header on the server during the WebSocket handshake. If the origin doesn’t match your domain, reject the connection.
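The check itself is small. A sketch with a hypothetical allow-list (`https://app.example.com` is a placeholder for your real origin):

```typescript
// Origin allow-list check for the WebSocket handshake (sketch).
const ALLOWED_ORIGINS = new Set(["https://app.example.com"]);

function isHandshakeAllowed(originHeader: string | undefined): boolean {
  // Non-browser clients may omit Origin entirely; whether to allow that
  // is a policy decision -- this sketch rejects to stay strict.
  if (originHeader === undefined) return false;
  return ALLOWED_ORIGINS.has(originHeader);
}
```

Use exact matching against a fixed set, never a substring or regex test — `https://app.example.com.evil.net` must not pass.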
4. WebRTC: The P2P Powerhouse
While WebSockets use a server, WebRTC attempts to bypass the server entirely for lowest latency.
- Signaling: The clients talk to a server (via WebSockets) just to exchange their public IP/Port info (“SDP Offer/Answer”).
- STUN Server: Used to discover the client’s public IP.
- TURN Server: If NAT is too restrictive, the server acts as a relay (expensive, but necessary for ~20% of users).
- Security: WebRTC mandates encryption (DTLS/SRTP) — the spec offers no unencrypted mode.
Interactive Demo: Sticky vs Round Robin
- Sticky OFF: Clients bounce between servers.
- Sticky ON: Client Red always goes to Server 1. Client Blue always goes to Server 2.
Sticky Sessions vs. Round Robin
Ensuring Stateful connections stay on the correct server
[!IMPORTANT] The Thundering Herd: When a server restarts, 1 Million connected WebSocket clients will instantly try to reconnect. This DDoS attack (by your own users) can take down your Auth Service and Load Balancer. Fix: Add a random “Jitter” (delay) to the client reconnection logic (e.g., reconnect in `Random(0, 30)` seconds).
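A common refinement of `Random(0, 30)` is "full jitter" over an exponential backoff window: early retries stay fast, repeated failures back off, and the cap matches the 30s from the callout. A sketch (the base and cap values are illustrative defaults):

```typescript
// Full-jitter reconnect delay (sketch): uniform random over an
// exponentially growing window, capped at 30s. `attempt` starts at 0.
function reconnectDelayMs(attempt: number, capMs = 30_000, baseMs = 1_000): number {
  const windowMs = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * windowMs; // uniform in [0, windowMs)
}
```

The uniform spread is the whole point: with 1M clients, expected reconnect load per second is flat across the window instead of a single spike at `T=0`.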
Interactive Demo: The Thundering Herd
Simulate a server crash and reconnection strategy.
- No Jitter: Everyone reconnects at `T=0`. Server spikes to 100% CPU.
- With Jitter: Reconnections spread out. Server stays stable.
Thundering Herd: Reconnection Spike
Simulating CPU impact of 1 Million simultaneous reconnects
5. Interactive Demo: Distributed Chat
Visualize the “Pub/Sub” flow.
- Alice is on Server A. Bob is on Server B.
- Type a message for Alice.
- Watch it travel: Alice → Server A → Redis → Server B → Bob.
Distributed Chat: Scaling State
Using Redis Pub/Sub to bridge stateful servers
System Walkthrough: Fanout at Scale
- Fanout: When 1 user sends a message to a room with 10,000 users, the server must replicate that message 10,000 times.
- The Bottleneck: Replicating bits is cheap, but context switching and interrupt handling for 10,000 TCP writes is expensive.
- The Fix: Use a Gateway Service (like Uber’s “Morpheus”) that specializes in high-fanout delivery, offloading the business logic servers.
6. Keeping it Alive: The Heartbeat
WebSockets are persistent. But if the WiFi drops silently, the Server might think the connection is open for hours (wasting resources).
- The Solution: Application-Level Pings.
- Client: Sends `PING` every 30s.
- Server: Replies `PONG`.
- If Server misses 3 PINGs → Close Socket.
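The server-side bookkeeping for this rule fits in a tiny class. A sketch, assuming some timer calls `onInterval()` every 30s per socket:

```typescript
// Per-socket liveness tracker (sketch): count missed PINGs and flag the
// connection dead after 3 in a row, matching the rule above.
class Heartbeat {
  private missed = 0;
  constructor(private maxMissed = 3) {}

  onPing(): void {
    this.missed = 0; // client checked in: reset the counter
  }

  // Called by a 30s timer. Returns true when the socket should be closed.
  onInterval(): boolean {
    this.missed++;
    return this.missed >= this.maxMissed;
  }
}
```

Closing the socket promptly matters because every stale connection still holds a file descriptor and its buffer RAM.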
Active Heartbeat: Liveness Detection
Identifying stale sockets before they waste server resources
7. The Future: WebTransport (HTTP/3)
WebSockets are built on TCP. This means they suffer from Head-of-Line Blocking (if one packet is lost, everything waits). WebTransport is the modern alternative built on HTTP/3 (QUIC).
Why WebTransport?
- Datagrams: You can send fire-and-forget UDP-like packets (great for gaming).
- Streams: You can open multiple reliable streams (like HTTP/2). If one stream stalls, others keep going.
- Single Handshake: It reuses the HTTP/3 connection. No separate TCP handshake.
8. Mobile Considerations: Battery Life
Why does WhatsApp use a custom protocol (or Long Polling) instead of just short polling?
- Radio Power States: Mobile radios (4G/5G) have “High Power” and “Low Power” (Idle) states.
- Polling Cost: Waking up the radio every 2 seconds keeps it in “High Power” mode, draining battery.
- The Fix: Use Persistent Connections (WebSockets/Push Notifications). The radio can stay in a low-power “listening” mode until data arrives.
9. Comparison Table
| Feature | Short Polling | WebSockets | SSE | WebRTC | WebTransport |
|---|---|---|---|---|---|
| Protocol | HTTP/1.1 | TCP | HTTP/1.1 | UDP/TCP | HTTP/3 (QUIC) |
| Direction | Client Pull | Bidirectional | Server Push | P2P (Bidirectional) | Bidirectional |
| Latency | High | Low | Low | Lowest (UDP) | Low (UDP/QUIC) |
| Complexity | Low | High (Stateful) | Medium | Very High | High |
| Use Case | Dashboards | Chat, Games | Notifications | Zoom/Video | Cloud Gaming, Trading |
Staff Engineer Tip: Tuning for 1M Connections. To hit 1M concurrent connections on a single Linux node, you must tune the kernel:
- TCP Port Range:
sysctl -w net.ipv4.ip_local_port_range="1024 65535"allows more outbound connections. - File Descriptors:
ulimit -n 1048576increases the per-process limit. - TCP Memory:
sysctl -w net.ipv4.tcp_mem='768432 1024576 1536864'ensures the kernel has enough RAM for socket buffers. - Ephemeral Port Reuse:
net.ipv4.tcp_tw_reuse=1allows rapid reconnection from the same source. At this scale, Interrupt Coalescing on your NIC (Network Interface Card) becomes mandatory to prevent the CPU from spending 100% of its time just handling “Packet Received” signals.