Advanced Configuration & Tuning
This module covers advanced Kafka producer configuration: deriving settings from your workload's durability, throughput, and latency requirements rather than accepting the defaults. Once you understand the basics, you need to tune the producer for your specific workload. Are you optimizing for throughput (massive log ingestion) or latency (real-time trading)?
1. Reliability vs Availability
min.insync.replicas (Broker Setting)
This is the most critical setting for data durability. It defines the minimum number of replicas that must acknowledge a write when acks=all.
- Scenario: replication factor = 3, `min.insync.replicas` = 2.
- Success: 2 or 3 brokers alive.
- Failure: only 1 broker alive. The producer receives `NotEnoughReplicasException`.
With `min.insync.replicas=2` on a cluster where only 2 replicas remain, losing one more broker stops your producer entirely. This setting favors Consistency over Availability (CP in CAP-theorem terms).
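The producer-side half of this durability contract is `acks=all`. A minimal sketch of the matching producer properties (the keys are standard Kafka producer config names; `min.insync.replicas` itself is set on the broker or topic, not here):

```java
import java.util.Properties;

// Producer-side settings that pair with min.insync.replicas=2 on the broker.
Properties props = new Properties();
props.put("acks", "all");                 // wait for all in-sync replicas to acknowledge
props.put("retries", Integer.toString(Integer.MAX_VALUE)); // ride out transient NotEnoughReplicas errors
props.put("enable.idempotence", "true");  // retries won't create duplicate records
```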
2. Throughput vs Latency
The two main knobs are batch.size and linger.ms.
The “Bus Station” Analogy
Think of the Kafka Producer as a bus station.
- `batch.size` is the capacity of the bus (e.g., 64 seats).
- `linger.ms` is the maximum time the bus will wait at the station for passengers before leaving, even if it's not full.
If the bus fills up (batch.size reached) before the time is up (linger.ms), it leaves immediately. If the time is up (linger.ms reached), it leaves regardless of how many passengers are on board.
| Goal | linger.ms | batch.size | Result |
|---|---|---|---|
| Low Latency | 0 ms | 16 KB (default) | Send immediately. High network overhead. |
| High Throughput | 20-100 ms | 64-128 KB | Wait to fill batches. Efficient network usage. |
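The two rows of the table translate directly into producer properties. A sketch of both profiles (the values follow the table; treat them as starting points, not universal answers):

```java
import java.util.Properties;

// Low-latency profile: dispatch each record as soon as possible.
Properties lowLatency = new Properties();
lowLatency.put("linger.ms", "0");
lowLatency.put("batch.size", Integer.toString(16 * 1024)); // 16 KB (the default)

// High-throughput profile: wait briefly so batches fill up.
Properties highThroughput = new Properties();
highThroughput.put("linger.ms", "50");                         // mid-range of 20-100 ms
highThroughput.put("batch.size", Integer.toString(64 * 1024)); // 64 KB
```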
Compression (compression.type)
Compressing batches reduces network bandwidth but increases CPU usage on the Producer and Broker.
- `snappy`: low CPU, good compression (Google).
- `lz4`: extremely fast compression/decompression, low CPU.
- `zstd`: high compression (like gzip) with low CPU (like lz4) (Facebook).
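None of these codecs ship with the JDK, but `java.util.zip`'s `Deflater` (the DEFLATE algorithm behind gzip) illustrates the same trade-off on a stand-in basis: a batch of repetitive log lines shrinks dramatically once compressed, which is exactly why compressed batches save network bandwidth.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Simulate a producer batch of repetitive log lines.
byte[] batch = "2024-01-01 INFO request served in 12ms\n".repeat(1000)
        .getBytes(StandardCharsets.UTF_8);

Deflater deflater = new Deflater(Deflater.BEST_SPEED); // cheap-on-CPU setting
deflater.setInput(batch);
deflater.finish();
byte[] out = new byte[batch.length];
int compressedSize = deflater.deflate(out); // bytes actually written
deflater.end();
// compressedSize is a small fraction of batch.length for repetitive data
```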
3. Interactive: Tuning Simulator
(In the interactive version of this module, a slider shows how increasing `linger.ms` raises throughput at the cost of latency.)
4. Buffer Memory (buffer.memory)
The “Water Tank” Case Study
Imagine the producer’s buffer memory as a water tank. Your application is a hose pouring water into the tank (producing messages), and the network is a pipe draining water out of the tank (sending messages to brokers).
The producer holds messages in heap memory before sending.
- `buffer.memory`: total bytes of memory the producer can use to buffer records waiting to be sent (default: 32 MB).
- `max.block.ms`: if the buffer is full (the sender thread can't keep up), `producer.send()` will block for up to this long (default: 60 s).
Scenario: Your application produces 100 MB/sec. Your network can only send 50 MB/sec.
- The buffer fills up in ~0.64 seconds (32 MB ÷ 50 MB/s surplus).
- `send()` starts blocking.
- Application throughput drops to 50 MB/sec (backpressure).
- If the network stops completely, `send()` throws `TimeoutException` after 60 s.
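The scenario's numbers check out with a little arithmetic (the rates are the example's assumptions, not measurements):

```java
// Water-tank arithmetic for the scenario above.
double produceRate = 100.0; // MB/s flowing into the buffer
double drainRate   = 50.0;  // MB/s flowing out to the brokers
double bufferMb    = 32.0;  // default buffer.memory

double surplusRate      = produceRate - drainRate; // 50 MB/s accumulates
double secondsUntilFull = bufferMb / surplusRate;  // 32 / 50 = 0.64 s
```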
5. Configuration Code
Java

```java
import java.util.Properties;

Properties props = new Properties();

// 0. Required basics (broker address and serializers)
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

// 1. Throughput tuning
props.put("linger.ms", "20");
props.put("batch.size", Integer.toString(32 * 1024)); // 32 KB
props.put("compression.type", "zstd");

// 2. Buffer memory (increase for high throughput)
props.put("buffer.memory", Integer.toString(64 * 1024 * 1024)); // 64 MB

// 3. Timeout (don't block forever)
props.put("max.block.ms", "3000"); // 3 seconds
```
Go

```go
// Using the github.com/segmentio/kafka-go client.
w := &kafka.Writer{
	Addr:  kafka.TCP("localhost:9092"),
	Topic: "analytics",

	// Throughput
	BatchTimeout: 20 * time.Millisecond, // analogous to linger.ms
	BatchSize:    1000,                  // max records per batch
	Compression:  kafka.Zstd,

	// Buffer limits (approximated in kafka-go via batch bytes)
	BatchBytes: 1024 * 1024, // 1 MB batch limit

	// With Async: false (the default), WriteMessages blocks until the
	// batch is delivered or fails; set Async: true for fire-and-forget,
	// in which case errors surface via the Completion callback instead.
	Async: false,
}
```