L4 vs L7: The Intelligence Gap
In June 2020, Cloudflare mitigated one of the largest packet-rate DDoS attacks it had seen: a flood that peaked at 754 million packets per second from more than 316,000 IP addresses. Its L4 edge, built on XDP and eBPF, dropped the attack traffic automatically before it could reach customer origin servers. Why? Because with XDP, packets are inspected in the NIC driver's receive path, before the kernel's networking stack allocates any state for them, so the attack traffic never becomes visible to running applications. Conversely, the same year Netflix reportedly ran an A/B test and discovered that routing /video requests and /api requests through the same backend server (because their L4 LB couldn’t distinguish them) was responsible for 15% of their streaming quality degradation during peak hours. One problem calls for a blazing-fast blind forwarder, the other for a smart proxy. Both are correct, but for entirely different problems.
[!IMPORTANT] In this lesson, you will master:
- The OSI Boundary: Why L4 is “Blind & Fast” while L7 is “Smart & Slow”.
- Kernel Fast-Path: How eBPF and XDP allow L4 to bypass the bottleneck of the Linux kernel's networking stack.
- Hardware Intuition: Understanding the CPU tax of TLS Termination and the “New Connection” overhead.
1. The Mailroom Analogy
How much does your Load Balancer know about the traffic it handles? The difference between Layer 4 and Layer 7 load balancing is best understood with an analogy.
Layer 4 (Transport): The Fast Mail Sorter
Imagine a mail sorter at a post office.
- Action: He looks at the Envelope (IP Address & Port).
- Knowledge: He sees “To: 123 Main St”. He throws it into the “Zone A” bin.
- Blindness: He has no idea what is inside. Is it a bill? A love letter? A bomb? He doesn’t know and doesn’t care.
- Result: He is incredibly Fast and uses very little brain power (CPU).
Layer 7 (Application): The Executive Assistant
Imagine a CEO’s assistant.
- Action: She opens the letter.
- Knowledge: She reads the content (HTTP Headers, URL, Cookies, Data).
- Intelligence:
  - “This is a bill” → Send to Finance Dept.
  - “This is a fan letter” → Send to PR Dept.
  - “This is junk mail” → Throw it away (WAF).
- Result: She is smarter but Slower (opening letters takes time).
2. Layer 4 (Transport Layer)
At L4, the LB operates at the TCP/UDP level (refer to the OSI Model).
- What it sees: `Source IP:Port` → `Dest IP:Port`.
- Encryption: It usually performs TCP Passthrough. It does NOT decrypt SSL/TLS traffic. It just forwards the encrypted stream of bytes to the backend.
- Pros: Ultra-high throughput (millions of packets/sec), low CPU usage, no need to manage certificates on the LB.
- Cons: Cannot route based on content (e.g., can’t send `/video` to a video server).
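The routing decision described above can be sketched as a hash over the connection's 5-tuple. This is a minimal illustration, not any particular load balancer's algorithm; the `FiveTuple` type, the `pick_backend` helper, and the backend addresses are hypothetical names chosen for the example.

```python
import hashlib
from typing import List, NamedTuple


class FiveTuple(NamedTuple):
    """Everything an L4 balancer knows about a flow (hypothetical type)."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical backend pool, for illustration only.
BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]


def pick_backend(flow: FiveTuple, backends: List[str]) -> str:
    """Map a flow to a backend using only header fields, never the payload."""
    key = "|".join(str(field) for field in flow).encode("ascii")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


flow = FiveTuple("tcp", "203.0.113.7", 51234, "198.51.100.1", 443)
# Every packet of the same flow hashes to the same backend.
print(pick_backend(flow, BACKENDS))
```

Plain modulo hashing remaps many flows whenever the pool changes size, which is why production balancers typically use some form of consistent hashing instead.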
[!TIP] SNI (Server Name Indication): Modern L4 LBs can sometimes peek at the server name that the client sends in plaintext in the TLS ClientHello. Nothing is decrypted, yet this allows basic routing (e.g., `api.com` vs `web.com`).
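To make the SNI peek concrete, here is a rough sketch of reading the server name out of a ClientHello without decrypting anything. It assumes the whole ClientHello arrives in a single, unfragmented TLS record and that the name is not hidden by an extension such as Encrypted Client Hello; `extract_sni` is a hypothetical helper, not a library function.

```python
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from one TLS ClientHello record, or None."""
    try:
        # Byte 0: record type 0x16 (handshake). Byte 5: handshake type 0x01 (ClientHello).
        if record[0] != 0x16 or record[5] != 0x01:
            return None
        pos = 5 + 4                      # record header + handshake type/length
        pos += 2 + 32                    # client version + random
        pos += 1 + record[pos]           # session id
        pos += 2 + int.from_bytes(record[pos:pos + 2], "big")  # cipher suites
        pos += 1 + record[pos]           # compression methods
        ext_end = pos + 2 + int.from_bytes(record[pos:pos + 2], "big")
        pos += 2
        while pos + 4 <= ext_end:
            ext_type = int.from_bytes(record[pos:pos + 2], "big")
            ext_len = int.from_bytes(record[pos + 2:pos + 4], "big")
            pos += 4
            if ext_type == 0:            # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_len = int.from_bytes(record[pos + 3:pos + 5], "big")
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, UnicodeDecodeError):
        return None
    return None
```

An L4 balancer using this approach would look up the returned name in a routing table and then forward the still-encrypted byte stream unchanged.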
Modern L4: eBPF & XDP
In hyperscale environments (Facebook, Cloudflare), even standard L4 (Kernel-based) is too slow.
- eBPF (Extended Berkeley Packet Filter): Allows running sandboxed programs inside the Linux Kernel.
- XDP (eXpress Data Path): Allows the LB to process packets in the network driver's receive path, before the kernel's regular networking stack ever sees them.
- Result: LBs like Katran (Facebook) can process millions of packets per second with minimal CPU.
Staff Engineer Tip: Why Per-Packet Overhead Matters at L4. Standard L4 LBs (e.g., IPVS) already run in “Kernel Space”, but every packet still pays for an interrupt, a socket-buffer (sk_buff) allocation, and a walk through the kernel's networking layers before the 5-tuple hash is computed. At 1Gbps, this is fine. At 100Gbps, that per-packet overhead can consume most of the CPU. XDP avoids it by running the load balancing logic inside the NIC driver's receive path. The packet is either dropped or redirected before the kernel allocates any networking state for it.
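A quick back-of-the-envelope calculation shows why per-packet cost dominates at high line rates. The sketch below assumes worst-case minimum-size Ethernet frames (64 bytes, plus 20 bytes of preamble and inter-frame gap on the wire) and a single core handling the whole link; the function names are illustrative.

```python
def packets_per_second(link_bps: float, frame_bytes: int = 64) -> float:
    """Maximum packet rate for a link, counting on-the-wire overhead."""
    wire_bytes = frame_bytes + 20  # 8 bytes preamble/SFD + 12 bytes inter-frame gap
    return link_bps / (wire_bytes * 8)


def ns_per_packet(link_bps: float, frame_bytes: int = 64) -> float:
    """Time budget per packet, in nanoseconds, if one core handles the link."""
    return 1e9 / packets_per_second(link_bps, frame_bytes)


for gbps in (1, 100):
    rate = packets_per_second(gbps * 1e9)
    budget = ns_per_packet(gbps * 1e9)
    print(f"{gbps} Gbps: {rate / 1e6:.2f} Mpps, {budget:.2f} ns per packet")
```

At 1 Gbps the budget is roughly 672 ns per packet; at 100 Gbps it shrinks to under 7 ns, which is why real deployments spread packets across many cores and trim every avoidable step from the receive path.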
[!NOTE] Hardware-First Intuition: In L4 load balancing, the LB often uses a technique called DSR (Direct Server Return). The LB only processes the incoming request. The backend server is configured to reply directly to the client, bypassing the LB for the outgoing (usually much larger) response. When responses dominate the traffic, this can cut the LB’s bandwidth load by 90% or more, and it removes the LB from the “Data Path” for responses.
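The size of the DSR saving depends entirely on the request-to-response ratio. The following sketch works through the arithmetic under a simplifying assumption: only payload bytes are counted, and client ACKs that still flow through the LB are ignored. The sizes are made-up examples.

```python
def lb_bytes(request_bytes: int, response_bytes: int, dsr: bool) -> int:
    """Bytes the load balancer must carry for one request/response pair."""
    return request_bytes if dsr else request_bytes + response_bytes


def dsr_savings(request_bytes: int, response_bytes: int) -> float:
    """Fraction of LB bandwidth saved by Direct Server Return."""
    without_dsr = lb_bytes(request_bytes, response_bytes, dsr=False)
    with_dsr = lb_bytes(request_bytes, response_bytes, dsr=True)
    return 1 - with_dsr / without_dsr


# Hypothetical workload: 1 KB requests, 9 KB responses.
print(f"{dsr_savings(1_000, 9_000):.0%}")
```

With a 1:9 ratio the saving is 90%; for a chatty API where requests and responses are similar in size, it drops to about half.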
3. Layer 7 (Application Layer)
At L7, the LB speaks HTTP/HTTPS/gRPC.
- What it sees: URL (`/api/v1`), Headers (`User-Agent: iPhone`), Cookies (`session_id=xyz`), and Payload.
- Encryption: It MUST decrypt SSL (TLS Termination) to read the content. It then re-encrypts (or sends plain HTTP) to the backend.
- Pros:
  - Smart Routing: Route `/api` to Microservice A and `/static` to Microservice B.
  - Caching: It can cache static assets (images, CSS) (see Caching).
  - Security: Can block SQL injection attacks (WAF).
- Cons: High CPU usage (Decryption is expensive).
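Once the request is decrypted, content-based routing is little more than a lookup on the path. Here is a minimal sketch of longest-prefix routing; the route table, pool names, and `route_request` helper are hypothetical and only mirror the `/api` and `/static` example above.

```python
from typing import List, Tuple

# Hypothetical route table: (path prefix, backend pool).
ROUTES: List[Tuple[str, str]] = [
    ("/api", "microservice-a"),
    ("/static", "microservice-b"),
]
DEFAULT_POOL = "default-pool"


def route_request(path: str) -> str:
    """Pick a backend pool by longest matching path prefix."""
    for prefix, pool in sorted(ROUTES, key=lambda r: len(r[0]), reverse=True):
        # Match whole path segments so "/apiary" does not match "/api".
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return DEFAULT_POOL


print(route_request("/api/v1/users"))     # microservice-a
print(route_request("/static/logo.png"))  # microservice-b
```

An L4 balancer cannot run this logic at all, because the path only exists after TLS termination and HTTP parsing.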
Staff Engineer Tip: To mitigate the “L7 Penalty”, use SSL Offloading. Modern CPUs have instructions like AES-NI that speed up bulk encryption and decryption, so the costly part is usually the asymmetric math in each new TLS handshake. High-traffic gateways may offload that work to dedicated SSL acceleration cards (and keep private keys in Hardware Security Modules, HSMs), leaving the main CPU free for routing logic.
The Gateway Service (BFF) Pattern
L7 Load Balancers are the foundation of the API Gateway pattern and of its client-specific variant, Backend For Frontend (BFF).
- Protocol Conversion: Accept REST (HTTP/1.1) from the browser, convert to gRPC for internal microservices.
- Authentication: Validate JWT tokens at the edge so internal services don’t have to.
- Rate Limiting: Throttling users based on their API Key (Header inspection).
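As one illustration of header-based throttling, the sketch below keeps a token bucket per API key. It is a simplified, single-process example: the `X-API-Key` header name, the class names, and the returned status codes are assumptions for the demo, and a real gateway would need shared state and eviction of idle keys.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Refills at `rate_per_sec` tokens per second, up to `capacity`."""

    def __init__(self, rate_per_sec: float, capacity: float, now: float) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ApiKeyRateLimiter:
    """Throttle requests per API key, read from a (hypothetical) header."""

    def __init__(self, rate_per_sec: float, burst: float) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.buckets: Dict[str, TokenBucket] = {}

    def check(self, headers: Dict[str, str], now: Optional[float] = None) -> int:
        """Return an HTTP status: 200 allowed, 401 no key, 429 throttled."""
        now = time.monotonic() if now is None else now
        key = headers.get("X-API-Key")
        if key is None:
            return 401
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.rate, self.burst, now)
        return 200 if self.buckets[key].allow(now) else 429
```

The `now` parameter exists only so the behaviour can be exercised deterministically; in normal use the limiter falls back to `time.monotonic()`.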
4. Deep Dive: The New Challenger (QUIC & HTTP/3)
Historically, the web ran on TCP. Layer 4 was TCP, Layer 7 was HTTP. HTTP/3 changes the game: it runs over QUIC, which runs on UDP, not TCP.
- The Challenge: Most legacy L4 Load Balancers are optimized for TCP. They see UDP packets and might drop them or handle them poorly (e.g., no connection tracking).
- Connection ID: Unlike TCP (which identifies a connection by its source and destination IP:Port pairs), QUIC identifies a connection by a Connection ID (CID). This allows a user to switch from Wi-Fi to 4G without dropping the connection.
- L4 Complexity: An L4 LB supporting QUIC must understand the CID to route packets to the same backend server, even if the client’s IP changes.
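The difference between tuple-based and CID-based routing can be sketched in a few lines. This is an illustration only: it hashes an opaque CID, whereas real QUIC-aware balancers may encode routing information in the CID itself. The helper names and addresses are hypothetical.

```python
import hashlib
from typing import List

# Hypothetical backend pool.
BACKENDS = ["quic-1", "quic-2", "quic-3", "quic-4"]


def _bucket(key: bytes, size: int) -> int:
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % size


def route_by_tuple(src_ip: str, src_port: int, backends: List[str]) -> str:
    """Classic L4 routing: depends on the client's current address."""
    return backends[_bucket(f"{src_ip}:{src_port}".encode("ascii"), len(backends))]


def route_by_cid(connection_id: bytes, backends: List[str]) -> str:
    """QUIC-aware routing: depends only on the Connection ID."""
    return backends[_bucket(connection_id, len(backends))]


cid = bytes.fromhex("8a3f5c0d11e2b7a9")
# The client moves from Wi-Fi to cellular: the address changes, the CID does not.
wifi = ("192.0.2.10", 50000)
cellular = ("198.51.100.77", 41000)
print(route_by_tuple(*wifi, BACKENDS), route_by_tuple(*cellular, BACKENDS))
print(route_by_cid(cid, BACKENDS), route_by_cid(cid, BACKENDS))
```

Tuple-based routing may send the migrated connection to a different backend that holds no state for it, while CID-based routing keeps landing on the same one.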
5. Deep Dive: Observability Differences
Monitoring L4 vs L7 requires different mindsets.
L4 Observability (Packet Level)
- Metrics: Bandwidth, Packets Per Second (PPS), TCP Retransmissions, Active Connections.
- Blind Spot: You cannot see why a connection was closed (e.g., did the app return 500 or 404?). You only see TCP RST or FIN.
L7 Observability (Request Level)
- Metrics: Requests Per Second (RPS), HTTP Status Codes (2xx, 4xx, 5xx), Latency per URL.
- Distributed Tracing: L7 LBs can inject Trace IDs (e.g., `X-Trace-Id`) into headers, allowing you to trace a request across your entire microservice fleet.
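Trace ID injection amounts to "add the header if it is missing, otherwise pass it through". A minimal sketch, assuming headers are held in a plain dictionary and using a random UUID as the ID format (both are choices made for this example):

```python
import uuid
from typing import Dict


def ensure_trace_id(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the headers that always carries an X-Trace-Id."""
    out = dict(headers)
    # Header names are case-insensitive, so compare in lowercase.
    if not any(name.lower() == "x-trace-id" for name in out):
        out["X-Trace-Id"] = uuid.uuid4().hex
    return out


incoming = {"Host": "api.com", "User-Agent": "iPhone"}
forwarded = ensure_trace_id(incoming)
print(forwarded["X-Trace-Id"])
```

Preserving an existing ID matters: if an upstream hop already started the trace, overwriting it would split one request into two unrelated traces.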
6. Decision Matrix: L4 vs L7
| Feature | Layer 4 (Transport) | Layer 7 (Application) |
|---|---|---|
| Speed | ⚡️ Extremely Fast (eBPF) | 🐢 Slower (Decryption cost) |
| Complexity | Low (Set & Forget) | High (Certs, Rules) |
| Protocols | Any TCP/UDP | HTTP, gRPC, WebSocket |
| Routing | IP & Port Only | URL, Headers, Cookies |
| Security | IP Allow/Block | WAF (SQLi, XSS) |
| Use Case | DNS, Databases, Massive Ingress | API Gateways, Microservices |
7. The Lost IP Problem & PROXY Protocol
When an L7 LB proxies a request, it terminates the connection and opens a new connection to the backend server. The Backend Server sees the LB’s IP as the source.
Solution 1: HTTP Headers (L7 Only)
The LB injects X-Forwarded-For: ClientIP. Works great for HTTP.
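Because several proxies may sit in the path, the header is conventionally a comma-separated list to which each hop appends the address it saw. A small sketch of that append step (the helper name is hypothetical):

```python
from typing import Dict


def add_forwarded_for(headers: Dict[str, str], client_ip: str) -> Dict[str, str]:
    """Append the connecting client's IP to X-Forwarded-For."""
    out = dict(headers)
    existing = out.get("X-Forwarded-For")
    out["X-Forwarded-For"] = f"{existing}, {client_ip}" if existing else client_ip
    return out


print(add_forwarded_for({}, "203.0.113.7"))
print(add_forwarded_for({"X-Forwarded-For": "203.0.113.7"}, "10.0.0.2"))
```

Clients can send this header themselves, so a backend should only trust the entries added by proxies it controls.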
Solution 2: The PROXY Protocol (L4 & L7)
What if you aren’t using HTTP? What if it’s a database connection? Developed by HAProxy, the PROXY Protocol adds a small header at the beginning of the TCP connection containing the original client IP.
- The requirement: The backend application MUST be configured to understand this header (e.g., Nginx `listen 80 proxy_protocol;`).
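For intuition, version 1 of the PROXY protocol is a single human-readable line sent before any application bytes. The sketch below builds and parses that line for TCP over IPv4 or IPv6 only; it does not cover the `UNKNOWN` form or the binary version 2, and the function names are hypothetical.

```python
import ipaddress
from typing import Tuple


def build_proxy_v1(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build the header a load balancer would prepend to the backend connection."""
    family = "TCP4" if ipaddress.ip_address(src_ip).version == 4 else "TCP6"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def parse_proxy_v1(data: bytes) -> Tuple[Tuple[str, int], bytes]:
    """Return ((client_ip, client_port), remaining_payload)."""
    line, sep, rest = data.partition(b"\r\n")
    if not sep:
        raise ValueError("incomplete PROXY header")
    parts = line.decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a supported PROXY v1 header")
    _, _, src_ip, _dst_ip, src_port, _dst_port = parts
    return (src_ip, int(src_port)), rest


header = build_proxy_v1("203.0.113.7", "10.0.0.5", 51234, 5432)
client, payload = parse_proxy_v1(header + b"<database protocol bytes>")
print(client)
```

This also shows why the backend must opt in: a server that is not expecting the extra line would try to interpret it as the start of its own protocol and fail.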
8. Interactive Demo: The Packet Inspector
[!TIP] Try it yourself: Visualize how the LB processes a packet in L4 vs L7 mode.
- L4 Mode: The packet is a “Locked Box”. Low CPU. The LB routes blindly, using only a simple hash of the addresses and ports.
- L7 Mode: The packet is “Unlocked”. We can see Headers like `User-Agent`. High CPU (Decryption). The LB routes intelligently.
9. Summary
- L4 is for speed and raw TCP handling. eBPF makes it even faster.
- L7 is for business logic and smart routing but costs CPU.
- TLS Termination offloads decryption from the backends but concentrates that CPU cost on the LB, which can become the bottleneck.
- PROXY Protocol is the bridge that allows L4 LBs to pass client IPs.
Mnemonic — “L4 is Blind, L7 is Smart”: L4 = Fast Mail Sorter (sees envelope only, never opens). L7 = Executive Assistant (reads content, routes intelligently, but slower). Default to L4 for DDoS absorption and TCP databases; use L7 where routing intelligence is required.
Staff Engineer Tip: Deploy L4 + L7 in Two-Tier Architecture. The ideal production setup: L4 (eBPF/XDP) as the outermost tier absorbs volumetric attacks and handles raw packet throughput. L7 sits behind it handling TLS termination, URL routing, and auth. Your intelligent L7 proxy is never directly exposed to DDoS traffic, and your L4 tier never needs certificates. Placing an AWS NLB in front of an ALB follows the same shape, and large edge networks such as Cloudflare layer their defenses in a similar way.