Kafka Broker Properties
Broker Properties in Kafka
broker.id
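For reference, a minimal sketch of this property in the same server.properties style as the example below; the value 0 is only illustrative, each broker in a cluster must be given a different integer, and in KRaft mode the equivalent setting is node.id.
# The id of the broker. Must be a unique integer for each broker in the cluster.
broker.id=0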
listeners
Example Config
# The address the socket server listens on. If not configured, the host name will be equal to the value of
# java.net.InetAddress.getCanonicalHostName(), with PLAINTEXT listener name, and port 9092.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://localhost:9092
# Name of listener used for communication between brokers.
inter.broker.listener.name=PLAINTEXT
# Listener name, hostname and port the broker will advertise to clients.
# If not set, it uses the value for "listeners".
advertised.listeners=PLAINTEXT://localhost:9092
# A comma-separated list of the names of the listeners used by the controller.
# This is required if running in KRaft mode. On a node with `process.roles=broker`, only the first listed listener will be used by the broker.
controller.listener.names=CONTROLLER
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
zookeeper.connect
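A minimal sketch in the same style, assuming a single local ZooKeeper instance (only relevant when the broker runs in ZooKeeper mode, not KRaft); the host, port, and any chroot suffix are illustrative.
# Comma-separated host:port pairs of the ZooKeeper ensemble, optionally followed by a chroot path.
zookeeper.connect=localhost:2181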
log.dirs
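A sketch with illustrative paths; the value is a comma-separated list of directories, typically one per physical disk so partitions are spread across drives.
# Comma-separated list of directories under which partition data (log segments) is stored.
log.dirs=/var/kafka-logs-1,/var/kafka-logs-2,/var/kafka-logs-3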
num.recovery.threads.per.data.dir
Alright, let's simplify this.
What's happening?
- Kafka stores messages on disk in log segments (files).
- When a broker starts or shuts down, it needs to open, check, or close all these log files.
- To do this work, Kafka uses a pool of threads.
Where threads are used:
- Normal startup → open each partition's log files.
- Startup after a crash → carefully check + truncate log files (takes longer).
- Shutdown → close log files properly.
Default setting
- By default, Kafka uses 1 thread per log directory.
- Example: if you have 3 log directories → 3 threads total.
Why increase threads?
- If you have many partitions and a broker crashed, recovery can take hours with just 1 thread per directory.
- Increasing threads allows parallel recovery → much faster startup.
Important note
The config is called num.recovery.threads.per.data.dir, and it applies per log directory (see the sketch below):
- If you set it to 8 and you have 3 log.dirs, the total is 8 × 3 = 24 threads.
- More threads → faster startup/recovery.
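As a server.properties sketch of that calculation (the directory paths are illustrative, and the default for the thread setting is 1):
# Three log directories → with 8 recovery threads per directory, 8 × 3 = 24 threads are used
# for loading logs at startup, recovery after an unclean shutdown, and flushing at shutdown.
log.dirs=/var/kafka-logs-1,/var/kafka-logs-2,/var/kafka-logs-3
num.recovery.threads.per.data.dir=8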
Layman analogy: Imagine you have 10,000 books (partitions) to put back on shelves after a storm (broker crash).
- With 1 librarian per shelf (default), it takes hours.
- With 8 librarians per shelf (more threads), all books are sorted much faster.
Why Truncate Log Segments?
Great question. Let's break it down simply.
Why truncation is needed after a crash
- Kafka writes data to disk in log segments.
- Each segment has an ordered sequence of messages.
- When a broker crashes (power cut, OOM, kill -9, etc.), some data may have been partially written (corrupted, incomplete).
What happens after restart
- Kafka reopens the log files.
- It checks the last segment for incomplete or corrupted records.
- If it finds bad records at the end of the file → it truncates (cuts off) the broken part so only valid data remains.
Why this is important
- Ensures data consistency: no half-written messages are exposed to consumers.
- Keeps the log index aligned with the actual data.
- Avoids strange errors like "message length mismatch" or "invalid checksum."
Example
Imagine writing messages to a notebook, and the power goes out while you are halfway through a sentence on Page 3.
When you reopen, Kafka erases the half-written sentence on Page 3, so the notebook only contains complete entries.
Safety net
- Kafka only truncates data that wasn't fully acknowledged (not committed).
- So producers/consumers won't lose confirmed messages, only the garbage left behind by the crash.
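To make "fully acknowledged" concrete, here is a minimal producer-side sketch (client settings in producer.properties, not broker settings; both keys are standard producer configs):
# Wait for the leader and all in-sync replicas to acknowledge before treating a send as successful.
acks=all
# Retries after a broker failure won't introduce duplicates.
enable.idempotence=true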
In short: Truncation after a crash = clean up the mess at the end of the log so the broker can continue safely.