Skip to content

How does Producer Guarantee Exactly Once Semantics in Kafka?🔗

Excellent question — this gets to the core of Kafka’s delivery guarantees and offset management model.

Let’s go step-by-step: first, we’ll clarify who controls offsets, and then explain how the producer achieves exactly-once semantics (EOS) using producer IDs, epochs, and transactional commits.


🧩 1. Who maintains offsets — broker or producer?🔗

Offsets are maintained by the broker.🔗

Let’s unpack that clearly.


Producer’s view🔗

When a producer sends messages to a Kafka topic-partition, it does not know the offsets in advance. Offsets are assigned by the partition leader broker when it writes the batch to its log.

Process:

  1. Producer creates a batch of messages.
  2. It sets the base offset = 0 temporarily (a placeholder).
  3. Sends the batch to the leader broker for that partition.
  4. The leader writes the batch to the partition’s log file, assigning real offsets sequentially.
  5. The broker then returns a ProduceResponse to the producer that includes:

  6. The base offset of the first record.

  7. Any error codes.

So, the producer never generates offsets — the broker does.


Consumer’s view🔗

For consumers, offsets represent their position in the log — how far they’ve read.

  • Consumers read offsets assigned by brokers.
  • They commit offsets (to Kafka or external store) to mark progress.

So, offsets are a broker-managed sequence that both producers and consumers depend on.


⚙️ 2. How Kafka assigns offsets (summary)🔗

Stage Who assigns Description
Produce Broker (leader) Assigns offsets sequentially per partition as batches are appended.
Consume Broker provides Consumer fetch requests specify the starting offset, broker returns records and their offsets.
Commit Consumer Saves last processed offset to Kafka (__consumer_offsets topic).

Offsets are strictly increasing and immutable within a partition — ensuring total order for that partition.


🧠 3. How producers ensure exactly-once semantics (EOS)🔗

Kafka’s exactly-once semantics were introduced in Kafka 0.11 through idempotent producers and transactions.

Let’s break this into two layers:


Layer 1 — Idempotent producer (no duplicates)🔗

Normally, when a producer retries (due to a temporary network failure), the same message could be written multiple times.

Example:

  1. Producer sends message M1.
  2. Broker receives it, writes it, but the response back to producer is lost.
  3. Producer retries M1.
  4. Without EOS, the broker writes M1 again — duplicates.

To fix this, Kafka introduced the idempotent producer, which ensures:

"No message will be written twice to the same topic-partition, even if retries happen."


🧩 How idempotent producers work🔗

Each producer is assigned three key identifiers:

Field Description
Producer ID (PID) Unique 64-bit ID assigned by the broker when producer starts.
Producer Epoch Incremented when producer restarts (used to detect old sessions).
Sequence Number Incremented for each record sent to a partition.

Each partition the producer writes to has its own sequence counter.


🔁 Example🔗

Message PID Partition Sequence
M1 500 0 0
M2 500 0 1
M3 500 0 2

If a retry occurs, and M2 is sent again with the same PID and sequence, the broker checks:

“Have I already seen (PID=500, partition=0, seq=1)?”

If yes → duplicate ignored. If no → accept and append.

Thus, retries no longer create duplicates.


✅ Key guarantees from idempotent producer🔗

  • Each (PID, partition, sequence) tuple is unique and ordered.
  • The broker uses this metadata to detect and ignore duplicate writes.
  • This works automatically when:
enable.idempotence=true

Layer 2 — Transactions (atomic multi-partition writes)🔗

Idempotent producers guarantee no duplicates per partition, but what if a producer writes to multiple partitions or topics as part of one logical operation?

Example:

Producer writes:
  - Message to topic A, partition 0
  - Message to topic B, partition 2

We want either both messages committed, or neither (atomicity).

This is achieved with Kafka transactions.


🧩 How transactions work🔗

Each transaction groups multiple produce requests into a single atomic unit.

Steps:

  1. Producer begins a transaction:

producer.initTransactions();
producer.beginTransaction();
2. Producer sends records to multiple partitions. 3. When done, producer calls:

producer.commitTransaction();

or rolls back:

producer.abortTransaction();
4. The broker uses a transaction coordinator (one per producer) to manage state. 5. Kafka marks the affected records as:

  • Committed (visible to consumers)
  • Aborted (hidden from consumers)

🧱 Transaction metadata on broker🔗

Brokers store transaction state in an internal topic:

__transaction_state

It contains info about:

  • Transaction IDs
  • Producers’ current epochs
  • Whether transactions are ongoing, committed, or aborted

Consumers that are transactional-aware (using isolation level read_committed) only see messages from committed transactions.


⚖️ Combining the two layers🔗

Feature Provided by Ensures
Idempotence Producer ID + sequence numbers No duplicates per partition
Transactions Transaction coordinator Atomicity across multiple partitions
Read committed isolation Consumers Consumers only see committed data

Together, these provide exactly-once semantics (EOS) end-to-end.


✅ Exactly-once delivery flow (step-by-step)🔗

Step Action Who handles it Guarantee
1 Producer sends messages with PID + seq Producer → Broker Broker detects duplicates
2 Broker appends unique messages Broker No duplicate writes
3 Producer commits transaction Producer → Coordinator Atomic commit across partitions
4 Consumer reads messages Consumer Sees each committed record exactly once

💡 4. Putting it all together🔗

Concept Controlled by Purpose
Offsets Broker Identify message position in partition log
Producer ID, Epoch, Sequence Producer + Broker Deduplicate writes (idempotence)
Transaction Coordinator Broker (special role) Group multiple writes into a single atomic commit
Consumer offsets Consumer (stored in __consumer_offsets) Track progress per consumer group
Exactly-once semantics (EOS) Producer + Broker + Consumer Ensure one logical event = one delivery, no duplicates, no loss

🔐 5. Summary🔗

Question Answer
Who assigns offsets? The broker assigns offsets sequentially per partition. Producers do not set them.
Who maintains offsets? Brokers maintain partition offsets; consumers maintain their read offsets separately.
How does the producer ensure exactly-once? By using idempotence (PID + sequence numbers) and transactions (atomic commits).
How are duplicates prevented? Broker checks (PID, partition, sequence) to detect and ignore duplicates.
How are multi-partition writes atomic? Kafka’s transaction coordinator manages commit or abort for all partitions in the transaction.

In simple terms:

  • Offsets are always assigned and managed by the broker.
  • Producers achieve exactly-once semantics using a combination of:

  • Idempotent producers (no duplicates),

  • Transactions (atomic commits), and
  • Read-committed consumers (see only committed data).

Together, these make Kafka capable of true end-to-end exactly-once delivery — even across failures, retries, and multi-partition writes.