Module Review: Load Balancing

🧠 Flashcards

L4 Load Balancing

Transport Layer

Routes on IP address and port only. Fast but "dumb": it never inspects the payload and does not decrypt TLS (TCP passthrough). High-performance implementations often use eBPF/XDP for speed.
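
As a rough illustration of routing on the connection tuple alone, the sketch below hashes the 5-tuple to pick a backend. The backend addresses and the function name `pick_backend` are hypothetical, and real L4 balancers do this in the kernel or in hardware rather than in Python.

```python
import hashlib

# Hypothetical backend addresses, for illustration only.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def pick_backend(src_ip, src_port, dst_ip, dst_port, proto="tcp", backends=BACKENDS):
    """Choose a backend from the connection 5-tuple, never from the payload."""
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Every packet of the same flow hashes to the same index.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Because the choice depends only on the tuple, all packets of one TCP connection land on the same backend without the balancer reading or decrypting anything.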

L7 Load Balancing

Application Layer

Routes on URL, headers, and cookies. Smart but CPU-heavy, and for HTTPS traffic it requires TLS termination (decryption) at the load balancer.
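
A minimal sketch of content-based routing, assuming the request has already been decrypted and parsed. The route table reuses the `/api` and `/payment` paths from the quiz below; the pool names and the `route` function are hypothetical.

```python
# Hypothetical route table: URL path prefix -> backend pool name.
ROUTES = [("/api", "service-a"), ("/payment", "service-b")]


def route(path, routes=ROUTES, default="default-pool"):
    """Return the pool whose prefix is the longest match for the URL path."""
    best = None
    for prefix, pool in routes:
        # Match the prefix exactly or on a path-segment boundary,
        # so "/apix" does not match "/api".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, pool)
    return best[1] if best else default
```

An L4 balancer cannot make this decision because the path lives inside the (usually encrypted) HTTP request.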

Maglev Hashing

Google's Consistent Hashing

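Each backend fills a fixed-size lookup table (a prime number of slots) in its own permutation order, giving O(1) lookup when distributing packets. Spreads load more evenly than ring hashing at scale, at the cost of slightly more remapping when backends change.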
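
The table-filling idea can be sketched as follows. This is a simplified illustration rather than Google's implementation: the hash function, the salts, the table size of 251, and the names `build_table` and `lookup` are all assumptions.

```python
import hashlib


def _h(data, salt):
    """Salted 64-bit hash (SHA-256 is an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.sha256((salt + data).encode()).digest()[:8], "big")


def build_table(backends, size=251):
    """Fill a lookup table of `size` slots; `size` must be prime."""
    if not backends:
        raise ValueError("need at least one backend")
    # Each backend walks the slots in its own order: offset, offset+skip, ...
    offsets = [_h(b, "offset") % size for b in backends]
    skips = [_h(b, "skip") % (size - 1) + 1 for b in backends]
    next_idx = [0] * len(backends)
    table = [None] * size
    filled = 0
    while True:
        # Backends take turns claiming their next preferred empty slot.
        for i, backend in enumerate(backends):
            slot = (offsets[i] + next_idx[i] * skips[i]) % size
            while table[slot] is not None:
                next_idx[i] += 1
                slot = (offsets[i] + next_idx[i] * skips[i]) % size
            table[slot] = backend
            next_idx[i] += 1
            filled += 1
            if filled == size:
                return table


def lookup(table, flow_key):
    """O(1) lookup: hash the flow key and index into the table."""
    return table[_h(flow_key, "flow") % len(table)]
```

Because backends claim slots in turns, each ends up with an almost equal share of the table, and the per-packet work is a single hash plus an array index.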

Active-Passive

High Availability

One LB handles all traffic while the other stands by. The passive node monitors the active node's heartbeat (e.g., VRRP via Keepalived) and takes over the virtual IP if the heartbeats stop.
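
The failover decision on the standby can be sketched as a heartbeat timeout. This is a toy model, not the VRRP protocol itself: the class name `StandbyMonitor`, the 3-second timeout, and the explicit `now` timestamps are assumptions made for illustration.

```python
class StandbyMonitor:
    """Runs on the passive LB and promotes it when heartbeats stop."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout        # seconds of silence tolerated (assumed value)
        self.last_heartbeat = None
        self.role = "passive"

    def on_heartbeat(self, now):
        """Record a heartbeat received from the active LB."""
        self.last_heartbeat = now

    def tick(self, now):
        """Check the timer; take over if the active LB has gone silent."""
        silent = self.last_heartbeat is not None and now - self.last_heartbeat > self.timeout
        if self.role == "passive" and silent:
            # A real deployment would claim the virtual IP at this point.
            self.role = "active"
        return self.role
```

Passing `now` in explicitly keeps the sketch deterministic; a real monitor would read a monotonic clock.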

Connection Pooling

Latency Optimization

The LB keeps connections to the backends open (Keep-Alive) and reuses them, so it does not pay the TCP 3-way handshake (and any TLS handshake) cost on every request.
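
A minimal sketch of the reuse logic, assuming a caller-supplied `connect` factory that opens one backend connection. The class name, the `max_idle` limit, and the `opened` counter are hypothetical and exist only to make the saving visible.

```python
class ConnectionPool:
    """Hands out idle backend connections before opening new ones."""

    def __init__(self, connect, max_idle=8):
        self._connect = connect   # hypothetical factory that opens one connection
        self._max_idle = max_idle
        self._idle = []
        self.opened = 0           # number of times the handshake cost was paid

    def acquire(self):
        """Reuse an idle connection if one exists, otherwise open a new one."""
        if self._idle:
            return self._idle.pop()
        self.opened += 1
        return self._connect()

    def release(self, conn):
        """Return a connection to the pool, or close it if the pool is full."""
        if len(self._idle) < self._max_idle:
            self._idle.append(conn)
        elif hasattr(conn, "close"):
            conn.close()
```

With sequential requests, a hundred acquire/release cycles open only one connection instead of a hundred.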

Consistent Hashing

Scaling Strategy

A hash-ring strategy that minimizes data movement when servers are added or removed: only about 1/N of the keys move, rather than nearly all of them as with hash(key) mod N. Crucial for distributed caches.
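
A small hash ring with virtual nodes, as a sketch of the idea. The class name `HashRing`, the use of MD5, and the default of 100 virtual nodes per server are illustrative assumptions.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a point on the ring (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        """Drop all of this node's points; other nodes' points are untouched."""
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def get(self, key):
        """Return the node owning the first point clockwise from the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        i = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]
```

When a node is added, the only keys that change owner are the ones the new node takes over; every other key stays where it was.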

Sidecar Proxy

Service Mesh

A proxy deployed alongside every service instance (e.g., Envoy) that handles its inbound and outbound traffic. Handles mTLS, retries, and observability.

Peak EWMA

Stability Metric

Peak-sensitive Exponentially Weighted Moving Average of latency. The estimate jumps immediately to a new latency peak and then decays gradually. Used by Linkerd to steer traffic away from slow servers quickly.
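
A minimal sketch of a peak-sensitive EWMA, loosely modelled on the behaviour described for Finagle-style balancers: the estimate jumps straight to any latency sample above it and decays exponentially otherwise. The class name, the 10-second decay window, and the explicit `now` argument are assumptions.

```python
import math


class PeakEwma:
    """Latency estimate that rises instantly and falls slowly."""

    def __init__(self, decay_seconds=10.0):
        self.decay = decay_seconds  # assumed decay window
        self.cost = 0.0             # current latency estimate, in seconds
        self.stamp = None           # time of the previous sample

    def observe(self, rtt, now):
        """Fold one round-trip time into the estimate and return it."""
        if self.stamp is None or rtt > self.cost:
            # Peak sensitivity: a slow response is believed immediately.
            self.cost = rtt
        else:
            # Older history fades the longer it has been since the last sample.
            w = math.exp(-max(now - self.stamp, 0.0) / self.decay)
            self.cost = self.cost * w + rtt * (1.0 - w)
        self.stamp = now
        return self.cost
```

For example, after samples of 10 ms and then 500 ms, the estimate is 500 ms; a following 10 ms sample one second later only pulls it down to roughly 450 ms, so the server keeps looking expensive until it proves itself again.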

QUIC (HTTP/3)

UDP Protocol

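Modern transport protocol running over UDP. Challenges L4 LBs because a connection is identified by its Connection ID (CID) rather than the IP/port tuple, so the LB must route on CIDs to keep a connection on the same backend when the client's address changes.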

TLS Fingerprinting

Security (JA3)

Identifying clients (e.g., bots vs. browsers) by analyzing the parameters of their TLS ClientHello: version, cipher suites, extensions, elliptic curves, and point formats.
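
The JA3 fingerprint is an MD5 hash over five ClientHello fields written as decimal numbers. The sketch below assumes those fields have already been parsed out of the handshake; the function name `ja3` and the sample values are hypothetical.

```python
import hashlib

# GREASE placeholder values (0x0a0a, 0x1a1a, ..., 0xfafa) are skipped,
# since clients insert them at random.
GREASE = {0x0A0A + i * 0x1010 for i in range(16)}


def ja3(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, md5_hex) for already-parsed ClientHello fields."""

    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    # Field order: version, ciphers, extensions, curves, point formats.
    ja3_string = ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )
    return ja3_string, hashlib.md5(ja3_string.encode()).hexdigest()
```

A client can change its User-Agent header freely, but its TLS library still offers the same ciphers and extensions in the same order, which is what the fingerprint captures.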

SNI

Server Name Indication

A TLS extension that carries the requested hostname in plaintext in the ClientHello, allowing L4 load balancers to read the hostname during the TLS handshake without decrypting the traffic.

Thundering Herd

Concurrency Problem

When many processes wake up simultaneously to handle an event (or many clients reconnect at once), overwhelming the system. Mitigated by adding jitter (randomized delays) to retries and timers.
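
One common way to apply jitter is exponential backoff with a fully random delay. The sketch below assumes a base delay of 0.1 s and a cap of 30 s; both values and the function name are illustrative.

```python
import random


def backoff_with_jitter(attempt, base=0.1, cap=30.0, rng=random):
    """Return a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    ceiling = min(cap, base * (2 ** attempt))
    # Each client draws its own delay, so retries spread out over time
    # instead of arriving in one synchronized burst.
    return rng.uniform(0, ceiling)
```

A client would sleep for `backoff_with_jitter(attempt)` seconds before reconnect attempt number `attempt`, counting from zero.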

GSLB

Global Server Load Balancing

Distributing traffic across data centers worldwide using DNS (GeoDNS) or Anycast (BGP) to reduce latency.

Bounded Load

Consistent Hashing Optimization

A technique to prevent hot shards: each node's load is capped at a fixed multiple of the average, and a request that hashes to a node at its cap is passed to the next peer on the ring.
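
A sketch of the overflow rule on a hash ring. The cap is computed as a balance factor times the average load; the factor of 1.25, the class name `BoundedRing`, and the in-memory load counters are assumptions made for illustration.

```python
import bisect
import hashlib
import math


def _hash(key):
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class BoundedRing:
    def __init__(self, nodes, balance_factor=1.25, vnodes=50):
        self.c = balance_factor           # allowed multiple of the average load
        self.loads = {n: 0 for n in nodes}
        self._ring = sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))

    def assign(self, key):
        """Assign one request, skipping nodes that are already at their cap."""
        total = sum(self.loads.values()) + 1   # include the incoming request
        cap = math.ceil(self.c * total / len(self.loads))
        start = bisect.bisect(self._ring, (_hash(key), ""))
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if self.loads[node] < cap:
                self.loads[node] += 1
                return node
        raise RuntimeError("no node below cap")  # cannot happen when c >= 1

    def release(self, node):
        """Call when a request finishes so the node's load drops again."""
        self.loads[node] -= 1
```

Even if every request carries the same key, no node ends up with more than the cap; the overflow spills onto the next nodes along the ring.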

📝 Scenario Quiz

1. You are designing a video streaming service (like Netflix). You need maximum throughput for video chunks. Which LB do you choose?

2. You have a Microservices architecture where `/api` goes to Service A and `/payment` goes to Service B. Which LB is required?

3. Your backend servers have varying hardware specs (some fast, some slow). Which algorithm is BEST?

4. You need to process 10M packets per second for a DDoS scrubber. The standard Linux Kernel is too slow. What technology do you use?

5. You want to detect if a client is a Bot or a real Chrome browser, even if they spoof the User-Agent. What technique helps?

📋 Cheat Sheet

L4 vs L7

Feature	L4 Load Balancer	L7 Load Balancer
Layer	Transport (TCP/UDP)	Application (HTTP)
Visibility	IP & Port (Envelope)	URL, Headers, Body (Content)
Speed	Very high (kernel-level forwarding, e.g. eBPF/XDP)	Slower (CPU-intensive parsing and TLS)
Decryption	No (pass-through)	Yes (TLS termination)
Caching	Not possible (payload is opaque)	Possible (e.g. static files)

Concepts

Concept	Definition
SPOF	Single Point of Failure. If the LB dies, the site dies.
Sticky Session	Ensuring a user’s requests always go to the same server (via IP Hash or Cookie).
Maglev	Google’s Consistent Hashing algorithm for O(1) lookups.
Least Conn	Smart routing to the server with fewest active connections.
P2C	Power of Two Choices. Pick 2 servers at random and send the request to the less loaded of the two. O(1) per decision.
Peak EWMA	Peak Exponential Weighted Moving Average. Reacts quickly to latency spikes.
Active-Passive	High Availability setup where a backup LB takes over if the primary fails.
Sidecar Proxy	A helper proxy (Envoy) that runs alongside a service to handle network logic.
Connection Pooling	Reusing persistent TCP connections to avoid handshake overhead.
GSLB	Global Server Load Balancing. Using DNS or Anycast to route users to the closest datacenter.
Bounded Load	Consistent Hashing optimization to prevent hot shards.
JA3	TLS fingerprinting method used to identify the client application (e.g., bot vs browser).
QUIC	UDP-based transport protocol underlying HTTP/3. Improves performance but complicates L4 load balancing.
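
The P2C entry in the table above can be sketched in a few lines. The `loads` mapping (server name to active connection count) and the function name `p2c` are hypothetical.

```python
import random


def p2c(loads, rng=random):
    """Pick two distinct servers at random and return the less loaded one."""
    servers = list(loads)
    if len(servers) < 2:
        return servers[0]
    a, b = rng.sample(servers, 2)
    # Comparing just two candidates avoids scanning every server,
    # yet the most loaded server never wins a comparison.
    return a if loads[a] <= loads[b] else b
```

Compared with Least Conn, this needs no global scan of the fleet, which is why the cost per decision stays constant as the number of servers grows.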

Technology Choice

Tool	Best For
Nginx	General purpose web server, Static files, Simple L7 LB.
HAProxy	High performance, pure LB. Best for massive scale TCP/HTTP.
Envoy	Service Mesh (Sidecar). Observability, Distributed Tracing.
Traefik	Kubernetes/Docker Ingress. Auto-discovery.
Katran	Facebook’s (Meta’s) open-source L4 load balancer built on XDP/eBPF.

🏗️ Whiteboard Summary

1. The Problem

Vertical Scaling fails (Kitchen Fire).
DNS Round Robin fails (Caching).
Need a Single VIP entry point.

2. Architecture

L4: Fast, Encrypted, Dumb.
L7: Smart, Decrypted, Slow.
Active-Passive: For HA.

3. Algorithms

Round Robin: Simple.
Least Conn: Variable workloads.
P2C: Hyperscale (O(1)).
Maglev: Google Scale.

4. Optimization

Health Checks: Deep vs Shallow.
Conn Pooling: Reduce Handshakes.
Draining: Zero Downtime Deploy.
Security: WAF & JA3.