Welcome to the Apache Kafka Glossary. Here you will find definitions for common abbreviations and technical terms used throughout the course.
Core Concepts
| Term | Full Name | Definition |
| --- | --- | --- |
| Topic | Topic | A named, logical stream of events (e.g., “orders”, “clicks”). Topics are the primary way of organizing messages in Kafka. |
| Partition | Partition | The unit of scalability and parallelism in Kafka. A topic is split into one or more partitions, which can be distributed across different brokers. Ordering is guaranteed only within a partition. |
| Offset | Offset | A sequential integer ID assigned to each message as it is appended to a partition, representing its position in the log. Offsets are unique only within a partition. |
| Broker | Kafka Broker | A single Kafka server. Brokers receive messages from producers, store them on disk, and serve them to consumers. |
| Producer | Producer | A client application that publishes (writes) events to Kafka topics. |
| Consumer | Consumer | A client application that subscribes to (reads) events from Kafka topics. |
| Consumer Group | Consumer Group | A set of consumers that share the work of consuming a topic. Each partition is assigned to exactly one consumer in the group at a time; one consumer may own several partitions, and consumers beyond the partition count sit idle. |
| ZooKeeper | Apache ZooKeeper | A centralized service for maintaining configuration information, naming, distributed synchronization, and group services. Older Kafka versions used it for metadata management; Kafka 4.0 removed ZooKeeper support. |
| KRaft | Kafka Raft Metadata Mode | The Raft-based consensus protocol built into Kafka that replaces ZooKeeper. A quorum of controller nodes manages cluster metadata and controller election internally. |
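To make the partition, offset, and consumer group definitions concrete, here is a minimal in-memory sketch. It is not the Kafka client API: `ToyTopic`, `choose_partition`, and `assign_partitions` are hypothetical names, CRC32 stands in for the murmur2 hash that Kafka's default partitioner uses, and the round-robin assignment is only one of several possible strategies.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    # Assumption: CRC32 is a stand-in hash. Kafka's default partitioner
    # hashes the key with murmur2, so real partition numbers will differ.
    return zlib.crc32(key) % num_partitions


class ToyTopic:
    """A topic as a list of partitions; each partition is an append-only log."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: bytes, value: str):
        partition = choose_partition(key, len(self.partitions))
        log = self.partitions[partition]
        offset = len(log)  # offsets are sequential within one partition
        log.append((key, value))
        return partition, offset


def assign_partitions(num_partitions: int, consumers):
    """Round-robin sketch: every partition goes to exactly one group member."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for partition in range(num_partitions):
        assignment[members[partition % len(members)]].append(partition)
    return assignment


topic = ToyTopic(num_partitions=3)
first = topic.append(b"user-1", "created")
second = topic.append(b"user-1", "paid")
# Same key, same partition, consecutive offsets: ordering holds per partition.
print(first[0] == second[0], first[1], second[1])  # True 0 1
print(assign_partitions(3, ["c1", "c2"]))          # {'c1': [0, 2], 'c2': [1]}
print(assign_partitions(2, ["a", "b", "c"]))       # {'a': [0], 'b': [1], 'c': []}
```

The last line shows why adding consumers beyond the partition count does not add throughput: the extra member receives no partitions.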
Replication & Consistency
| Term | Full Name | Definition |
| --- | --- | --- |
| Leader | Partition Leader | The replica that handles all writes and, by default, all reads for a partition. Since Kafka 2.4, consumers can optionally fetch from followers. |
| Follower | Partition Follower | A replica that passively replicates the log by fetching from the leader. Followers exist for fault tolerance. |
| ISR | In-Sync Replicas | The set of replicas (the leader plus any followers that have kept up with it within replica.lag.time.max.ms). Only members of the ISR are eligible to become the new leader if the current leader fails, unless unclean leader election is enabled. |
| High Watermark | High Watermark | The offset up to which every message has been replicated to all ISR members, i.e. the minimum log end offset across the ISR. Messages below this offset are considered “committed” and visible to consumers. |
| LEO | Log End Offset | The offset that the next appended message will receive, i.e. one past the last message in a replica’s log (regardless of replication status). Each replica has its own LEO. |
| acks | Acknowledgments | A producer configuration that determines how many replicas must acknowledge a write before it is considered successful: acks=0 waits for none, acks=1 waits for the leader only, and acks=all waits for every in-sync replica. |
| min.insync.replicas | Minimum In-Sync Replicas | A broker or topic configuration that applies to acks=all writes: the write is rejected unless the ISR contains at least N replicas (including the leader). |
| Unclean Leader Election | Unclean Leader Election | A configuration (unclean.leader.election.enable) allowing a non-ISR replica to become leader, potentially causing data loss but preserving availability. |
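The relationship between LEO, the ISR, the high watermark, and `min.insync.replicas` can be sketched with a toy model. This is an illustration under simplifying assumptions, not broker code: `ToyPartition` and its methods are hypothetical, each replica is reduced to its LEO, and the model does not reproduce the fact that a real `acks=all` request waits for the high watermark to pass the new offset before the broker responds.

```python
class ToyPartition:
    """Toy model: each replica's log is represented only by its LEO."""

    def __init__(self, replicas, leader, min_insync_replicas=1):
        self.leo = {replica: 0 for replica in replicas}
        self.leader = leader
        self.isr = set(replicas)  # everyone starts in sync
        self.min_insync_replicas = min_insync_replicas

    def high_watermark(self) -> int:
        # Committed messages are those every ISR member already holds.
        return min(self.leo[replica] for replica in self.isr)

    def produce(self, acks) -> int:
        # min.insync.replicas is only enforced for acks=all writes.
        if acks == "all" and len(self.isr) < self.min_insync_replicas:
            raise RuntimeError("not enough in-sync replicas")
        offset = self.leo[self.leader]  # the LEO is the next offset to assign
        self.leo[self.leader] += 1
        return offset

    def replicate(self, follower) -> None:
        # The follower fetches everything up to the leader's LEO.
        self.leo[follower] = self.leo[self.leader]

    def shrink_isr(self, follower) -> None:
        # Stands in for the leader dropping a follower that lagged too long.
        self.isr.discard(follower)


partition = ToyPartition(["b1", "b2", "b3"], leader="b1", min_insync_replicas=2)
partition.produce(acks="all")      # leader appends offset 0
print(partition.high_watermark())  # 0: followers have not fetched yet
partition.replicate("b2")
partition.replicate("b3")
print(partition.high_watermark())  # 1: offset 0 is now committed
```

If both followers then fall out of the ISR, `produce(acks="all")` raises in this model while `produce(acks=1)` still succeeds, which mirrors the durability versus availability trade-off described in the table.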
Storage & Internals
| Term | Full Name | Definition |
| --- | --- | --- |
| Segment | Log Segment | A physical file on disk that stores a portion of a partition’s data. The file is named after the offset of its first message, zero-padded to 20 digits (e.g., 00000000000000000000.log). Partitions are split into segments for easier management and deletion. |
| Index | Offset Index | A sparse file (.index) that maps offsets to physical file positions in the log segment, enabling fast lookups. |
| TimeIndex | Timestamp Index | A file (.timeindex) that maps timestamps to offsets, allowing lookups by time. |
| Log Compaction | Log Compaction | A cleanup policy (cleanup.policy=compact) where Kafka retains at least the last known value for each message key, rather than deleting old messages based on time. |
| Zero Copy | Zero Copy | A technique (the sendfile system call on Linux) that lets Kafka send data from the page cache directly to the network socket without copying it into application memory, maximizing throughput. |
| Page Cache | Page Cache | The operating system’s main memory cache used to store file data. Kafka relies heavily on the page cache for performance. |
| Sticky Partitioner | Sticky Partitioner | A producer strategy for messages without a key. Instead of spreading them across partitions one by one, the producer sends them all to a single partition until the current batch is complete, then switches to another partition. The fuller batches reduce latency and request load. |
| Rebalancing | Group Rebalancing | The process where a Consumer Group redistributes partitions among its members (e.g., when a consumer joins or leaves). |
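The segment, offset index, and log compaction entries can be illustrated with a few small functions. This is a sketch rather than Kafka's on-disk format: the function names are hypothetical, the index here stores absolute offsets (real `.index` files store offsets relative to the segment's base offset), and the compaction function ignores details such as tombstones and the uncompacted active segment.

```python
import bisect


def segment_file_name(base_offset: int) -> str:
    # Segment files are named after their first offset, padded to 20 digits.
    return f"{base_offset:020d}.log"


def find_segment(base_offsets, target_offset: int) -> int:
    """Return the base offset of the segment that would hold target_offset."""
    i = bisect.bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise KeyError(f"offset {target_offset} is below the first segment")
    return base_offsets[i]


def lookup_position(index_entries, target_offset: int) -> int:
    """Sparse index lookup: the file position of the closest indexed offset
    at or before the target. A reader scans forward from that position."""
    offsets = [offset for offset, _ in index_entries]
    i = bisect.bisect_right(offsets, target_offset) - 1
    return index_entries[i][1] if i >= 0 else 0


def compact(records):
    """Keep only the latest (offset, key, value) per key, in offset order."""
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, key, value)
    return sorted(latest.values())


print(segment_file_name(0))                    # 00000000000000000000.log
print(find_segment([0, 170, 350], 200))        # 170
index = [(0, 0), (40, 4096), (85, 8192)]       # (offset, byte position)
print(lookup_position(index, 60))              # 4096
log = [(0, "k1", "a"), (1, "k2", "b"), (2, "k1", "c")]
print(compact(log))                            # [(1, 'k2', 'b'), (2, 'k1', 'c')]
```

Note that compaction keeps the original offsets of the surviving messages, so offsets in a compacted log are no longer contiguous.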