Kafka Broker Properties
Broker Properties in Kafka
broker.id
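For reference, a minimal sketch of this property in the same server.properties style as the example below; the value 0 is only illustrative, each broker in a cluster must be given a different integer, and in KRaft mode the equivalent setting is node.id.
# The id of the broker. Must be a unique integer for each broker in the cluster.
broker.id=0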
listeners
Example Config
# The address the socket server listens on. If not configured, the host name will be equal to the value of
# java.net.InetAddress.getCanonicalHostName(), with PLAINTEXT listener name, and port 9092.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://localhost:9092
# Name of listener used for communication between brokers.
inter.broker.listener.name=PLAINTEXT
# Listener name, hostname and port the broker will advertise to clients.
# If not set, it uses the value for "listeners".
advertised.listeners=PLAINTEXT://localhost:9092
# A comma-separated list of the names of the listeners used by the controller.
# This is required if running in KRaft mode. On a node with `process.roles=broker`, only the first listed listener will be used by the broker.
controller.listener.names=CONTROLLER
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
zookeeper.connect
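A minimal sketch in the same style, assuming a single local ZooKeeper instance (only relevant when the broker runs in ZooKeeper mode, not KRaft); the host, port, and any chroot suffix are illustrative.
# Comma-separated host:port pairs of the ZooKeeper ensemble, optionally followed by a chroot path.
zookeeper.connect=localhost:2181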
log.dirs
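A sketch with illustrative paths; the value is a comma-separated list of directories, typically one per physical disk so partitions are spread across drives.
# Comma-separated list of directories under which partition data (log segments) is stored.
log.dirs=/var/kafka-logs-1,/var/kafka-logs-2,/var/kafka-logs-3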
num.recovery.threads.per.data.dir
Alright, let's simplify this.
What's happening?
- Kafka stores messages on disk in log segments (files).
- When a broker starts or shuts down, it needs to open, check, or close all these log files.
- To do this work, Kafka uses a pool of threads.
Where threads are used:
- Normal startup → open each partition's log files.
- Startup after a crash → carefully check + truncate log files (takes longer).
- Shutdown → close log files properly.
Default setting
- By default, Kafka uses 1 thread per log directory.
- Example: if you have 3 log directories → 3 threads total.
Why increase threads?
- If you have many partitions and a broker crashed, recovery can take hours with just 1 thread per directory.
- Increasing threads allows parallel recovery → much faster startup.
Important note
The config is called num.recovery.threads.per.data.dir, and it applies per log directory (see the sketch below):
- If you set it to 8 and you have 3 log.dirs, the total is 8 × 3 = 24 threads.
- More threads → faster startup/recovery.
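As a server.properties sketch of that calculation (the directory paths are illustrative, and the default for the thread setting is 1):
# Three log directories → with 8 recovery threads per directory, 8 × 3 = 24 threads are used
# for loading logs at startup, recovery after an unclean shutdown, and flushing at shutdown.
log.dirs=/var/kafka-logs-1,/var/kafka-logs-2,/var/kafka-logs-3
num.recovery.threads.per.data.dir=8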
Layman analogy: Imagine you have 10,000 books (partitions) to put back on shelves after a storm (broker crash).
- With 1 librarian per shelf (default), it takes hours.
- With 8 librarians per shelf (more threads), all books are sorted much faster.
Why Truncate Log Segments?
Great question. Let's break it down simply.
Why truncation is needed after a crash
- Kafka writes data to disk in log segments.
- Each segment has an ordered sequence of messages.
- When a broker crashes (power cut, OOM, kill -9, etc.), some data may have been partially written (corrupted, incomplete).
What happens after restart
- Kafka reopens the log files.
- It checks the last segment for incomplete or corrupted records.
- If it finds bad records at the end of the file → it truncates (cuts off) the broken part so only valid data remains.
Why this is important
- Ensures data consistency: no half-written messages are exposed to consumers.
- Keeps the log index aligned with the actual data.
- Avoids strange errors like "message length mismatch" or "invalid checksum."
Example
Imagine writing messages to a notebook, and the power goes out while you are halfway through a sentence on Page 3.
When you reopen, Kafka erases the half-written sentence on Page 3, so the notebook only contains complete entries.
Safety net
- Kafka only truncates data that wasn't fully acknowledged (not committed).
- So producers/consumers won't lose confirmed messages, only the garbage left behind by the crash.
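To make "fully acknowledged" concrete, here is a minimal producer-side sketch (client settings in producer.properties, not broker settings; both keys are standard producer configs):
# Wait for the leader and all in-sync replicas to acknowledge before treating a send as successful.
acks=all
# Retries after a broker failure won't introduce duplicates.
enable.idempotence=true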
In short: Truncation after a crash = clean up the mess at the end of the log so the broker can continue safely.