Latency and Jitter

[!NOTE] This module explores the core principles of Latency and Jitter, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. The Real-World Hook: Why Do Voice Calls Lag?

Imagine you’re on a critical international Zoom call, pitching a system design to an executive team. Suddenly, your colleague’s voice turns robotic, speeds up like a chipmunk, and then cuts out entirely for two seconds before resuming mid-sentence. You check your internet connection—you have 500 Mbps of bandwidth. So why is the call failing?

The answer lies in two network metrics that extra bandwidth cannot fix: Latency and Jitter. In system design, especially for real-time applications like VoIP, live video streaming, or high-frequency trading, understanding these metrics is just as vital as scaling your databases.


2. Latency (Delay)

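Latency is the time it takes for a packet to travel from the source to the destination (the one-way delay). In practice it is usually measured as Round Trip Time (RTT), the time for a packet to reach the destination and for the reply to come back, because RTT can be measured from a single machine without synchronized clocks.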

The Anatomy of Latency

Latency isn’t just one monolithic delay; it’s an aggregate of four distinct hardware and physical constraints. When a user in Tokyo clicks a button to query a database in Virginia, the packet experiences:

  1. Propagation Delay (The Physics Limit): The time it takes for the signal to travel through the physical medium (fiber optic cables). This is strictly limited by the speed of light in glass (roughly 200,000 km/s, about one-third slower than in a vacuum).
    • Example: The absolute minimum propagation delay across the Pacific Ocean (~10,000 km) is 10,000 km ÷ 200,000 km/s = 50ms one-way, so the RTT can never drop below about 100ms. You cannot optimize past physics.
  2. Serialization Delay (The Pipeline Width): The time it takes a network interface card (NIC) to physically push the bits onto the wire. It equals packet size divided by link rate: a 1500-byte packet (12,000 bits) takes 12 microseconds on a 1 Gbps link and 1.2 microseconds on a 10 Gbps link.
  3. Queuing Delay (The Traffic Jam): The time a packet spends waiting in a router’s output buffer because the link is currently congested with other traffic. This is highly variable.
  4. Processing Delay (The Bouncer): The time a router takes to read the packet header, check its routing table, and determine the output interface. Modern ASICs have reduced this to microseconds, but complex firewalls or NAT can increase it.
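
The four components above simply add up, and the first two can be computed directly from the numbers already quoted. Here is a minimal Python sketch of that arithmetic. The function names, the single-hop simplification, and the default values are illustrative assumptions, not part of any real library; queuing and processing delay are passed in as plain numbers because they depend on live traffic and on the device.

```python
# Back-of-the-envelope model of the four delay components.
# All names and sample figures are illustrative assumptions.

FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in glass


def propagation_delay_ms(distance_km: float) -> float:
    """Time for the signal to cross the medium, in milliseconds."""
    return distance_km / FIBER_SPEED_KM_PER_S * 1000


def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to clock every bit of one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000


def one_way_latency_ms(distance_km: float, packet_bytes: int, link_bps: float,
                       queuing_ms: float = 0.0, processing_ms: float = 0.0) -> float:
    """Sum of the four components, treating the path as a single hop."""
    return (propagation_delay_ms(distance_km)
            + serialization_delay_ms(packet_bytes, link_bps)
            + queuing_ms
            + processing_ms)


if __name__ == "__main__":
    print(propagation_delay_ms(10_000))        # trans-Pacific: about 50 ms
    print(serialization_delay_ms(1500, 1e9))   # 1 Gbps: about 0.012 ms
    print(serialization_delay_ms(1500, 10e9))  # 10 Gbps: about 0.0012 ms
```

On a long-haul path the propagation term dominates: upgrading the link from 1 Gbps to 10 Gbps saves roughly 11 microseconds per packet, while the 50ms of propagation delay does not move at all.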

War Story: High-Frequency Trading (HFT) firms care so deeply about propagation delay that a company called Spread Networks reportedly spent about $300 million laying a straighter fiber-optic route between Chicago and New Jersey, just to shave roughly 3 milliseconds off the RTT.


3. Jitter (Delay Variation)

If Latency is the total travel time, Jitter is the variation in that travel time.

  • If three consecutive packets each take 10ms, 10ms, and 10ms to arrive, jitter is 0. (Ideal)
  • If they take 10ms, 40ms, and 5ms, jitter is high. (Problematic)
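
One way to put a number on "high" is to look at how much the delay changes from one packet to the next. The Python sketch below shows two common approaches, assuming we already have a list of per-packet one-way delays in milliseconds: a plain average of the consecutive differences, and a smoothed running estimate in the style of the interarrival jitter formula from RFC 3550 (J += (|D| - J) / 16). The function names are hypothetical.

```python
from statistics import mean


def mean_jitter_ms(delays_ms):
    """Average absolute change in delay between consecutive packets."""
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return mean(diffs) if diffs else 0.0


def smoothed_jitter_ms(delays_ms, gain=1 / 16):
    """Running estimate in the style of RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    for prev, cur in zip(delays_ms, delays_ms[1:]):
        jitter += (abs(cur - prev) - jitter) * gain
    return jitter


if __name__ == "__main__":
    print(mean_jitter_ms([10, 10, 10]))  # steady stream: 0
    print(mean_jitter_ms([10, 40, 5]))   # swings of 30 and 35 ms: 32.5
    print(smoothed_jitter_ms([10, 40, 5]))
```

The smoothed version reacts slowly on purpose: a single late packet nudges the estimate rather than spiking it, which makes it a steadier input for sizing a receive buffer.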

Why Jitter Destroys Real-Time Streams

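High jitter is the ultimate enemy of real-time communication. In a file download, TCP retransmits lost packets and puts out-of-order ones back in sequence before handing data to the application, and a short stall goes unnoticed. A VoIP application (typically running over UDP) has no such luxury: it needs a steady stream of audio data to play to the speaker.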

If packets bunch up or arrive late due to varying Queuing Delays across the internet, the receiver can’t predict when the next piece of audio will arrive. This results in:

  • Clipping/Stuttering: The application runs out of audio to play while waiting for a late packet.
  • Over-running: A clump of delayed packets arrives all at once, forcing the application to discard them or speed up playback (the “chipmunk” effect).

4. Jitter and the Jitter Buffer

To mitigate jitter, receivers employ a Jitter Buffer.

  • Instead of playing an audio packet the exact millisecond it arrives, the receiver holds it in a small buffer (e.g., 30ms to 50ms).
  • This adds a small amount of fixed latency to the call, but it “smooths out” the unpredictable arrival times, allowing steady playback.
  • The Tradeoff: If a packet is delayed by more than the buffer can absorb, it arrives after its playback slot and is discarded. If you enlarge the buffer too much to prevent this, the added latency becomes noticeable in conversation (the awkward “you talk, no you talk” collision).
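
The tradeoff can be seen in a few lines of Python. The sketch below models a fixed-size jitter buffer under simplifying assumptions: the sender emits one packet every 20ms, the receiver somehow knows the minimum network delay, and each packet is scheduled for playback at its send time plus that minimum delay plus the buffer size. The function name and the delay values are made up for illustration; real receivers typically adapt the buffer size over time.

```python
def simulate_jitter_buffer(delays_ms, buffer_ms, interval_ms=20):
    """Return (played, discarded) packet counts for a fixed playout delay.

    Packet i is sent at i * interval_ms and arrives delays_ms[i] later.
    It is scheduled for playback at send_time + min(delays_ms) + buffer_ms;
    a packet that arrives after its slot is discarded as too late.
    """
    base_delay = min(delays_ms)  # assumption: receiver knows the minimum delay
    played = discarded = 0
    for i, delay in enumerate(delays_ms):
        send_time = i * interval_ms
        arrival = send_time + delay
        playout = send_time + base_delay + buffer_ms
        if arrival <= playout:
            played += 1
        else:
            discarded += 1
    return played, discarded


if __name__ == "__main__":
    delays = [10, 12, 45, 11, 30, 10, 70, 12]  # made-up one-way delays in ms
    for buffer_ms in (0, 20, 40, 60):
        played, discarded = simulate_jitter_buffer(delays, buffer_ms)
        total_delay = min(delays) + buffer_ms
        print(f"buffer={buffer_ms}ms total_delay={total_delay}ms "
              f"played={played} discarded={discarded}")
```

With these sample delays, no buffer at all discards six of the eight packets, a 20ms buffer discards two, and only a 60ms buffer plays everything, at the cost of 70ms of end-to-end delay instead of 10ms.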
