How does Kafka Ensure Partial Idempotence?
1. The problem it solves: duplicate messages during retries
Without idempotence, a Kafka producer can accidentally send the same message multiple times under certain failure conditions.
Here's a common scenario:
- The producer sends a batch to the broker.
- The broker successfully writes it to the log.
- The acknowledgment from the broker to the producer is lost (e.g., due to a network glitch).
- The producer assumes the send failed and retries the message.
- The broker accepts the retry as a new message and writes it again.
Result: The same message is stored twice, with two different offsets. Consumers downstream will see the record twice, which can cause problems in financial systems, metrics aggregation, and any system requiring strict consistency.
That's where idempotence comes in.
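The failure scenario above can be reproduced with a toy model. This is a minimal sketch, not Kafka code: `NaiveBroker` and `send_with_retry` are hypothetical names, and a Python list stands in for the partition log.

```python
class NaiveBroker:
    """Toy broker with no idempotence: every produce request is appended."""

    def __init__(self):
        self.log = []  # list index stands in for the offset

    def append(self, record):
        self.log.append(record)
        return len(self.log) - 1  # offset of the record just written


def send_with_retry(broker, record, ack_lost_on_first_try):
    """Send once; if the ack is lost, the producer cannot tell and retries."""
    offset = broker.append(record)
    if ack_lost_on_first_try:
        # The first write succeeded, but the producer never heard about it,
        # so it sends the same record again.
        offset = broker.append(record)
    return offset


broker = NaiveBroker()
send_with_retry(broker, "payment-42", ack_lost_on_first_try=True)
print(broker.log)  # ['payment-42', 'payment-42']
```

The record ends up at two different offsets, which is exactly the duplicate a downstream consumer would observe.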
2. What idempotence means
The term idempotent means that performing the same operation multiple times yields the same result as performing it once.
In Kafka:
Enabling idempotence ensures that even if a producer retries the same message, the broker will only commit it once, never creating duplicates.
So, no matter how many times the producer resends due to transient failures, the broker will de-duplicate it automatically.
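To make the definition concrete, here is a small illustration outside of Kafka. The two functions are hypothetical examples chosen for this sketch: one is idempotent, the other is not.

```python
def set_balance(account, amount):
    """Idempotent: repeating the call leaves the same state as calling once."""
    account["balance"] = amount


def add_to_balance(account, amount):
    """Not idempotent: every repeated call changes the state again."""
    account["balance"] += amount


a = {"balance": 0}
set_balance(a, 100)
set_balance(a, 100)  # a "retry" has no further effect

b = {"balance": 0}
add_to_balance(b, 100)
add_to_balance(b, 100)  # a "retry" is applied a second time

print(a["balance"], b["balance"])  # 100 200
```

A plain append to a log behaves like `add_to_balance`; the idempotent producer protocol makes it behave like `set_balance` with respect to retries.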
3. How it works internally
When enable.idempotence=true, Kafka activates the Idempotent Producer Protocol, introduced in Kafka 0.11.0.
Here's what happens under the hood:
Step 1: Assigning a unique Producer ID (PID)
- When a producer connects to the cluster, the Kafka broker assigns it a unique 64-bit Producer ID (PID).
- This PID identifies that producer session.
Step 2: Sequence numbers per partition
- Each time the producer sends a batch to a partition, it includes a monotonically increasing sequence number for that partition.
- Example:
  - Partition 0 → sequence numbers 0, 1, 2, ...
  - Partition 1 → sequence numbers 0, 1, 2, ... (separate sequence per partition)
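A rough sketch of the producer-side bookkeeping might look like the following. `SequenceTracker` is a hypothetical name, and the sketch assumes sequences start at 0 and advance by the number of records in each batch.

```python
from collections import defaultdict


class SequenceTracker:
    """Keeps an independent sequence counter for every partition."""

    def __init__(self):
        self._next = defaultdict(int)  # partition -> next unused sequence

    def assign(self, partition, record_count):
        """Reserve sequence numbers for a batch; returns (base_seq, last_seq)."""
        base = self._next[partition]
        self._next[partition] += record_count
        return base, base + record_count - 1


tracker = SequenceTracker()
print(tracker.assign(partition=0, record_count=3))  # (0, 2)
print(tracker.assign(partition=1, record_count=2))  # (0, 1)
print(tracker.assign(partition=0, record_count=1))  # (3, 3)
```

Partition 1 starts again at 0 because each partition has its own counter.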
Step 3: Deduplication on the broker
- When the broker receives a batch, it checks:
  - The Producer ID
  - The sequence number
- If it has already seen that sequence number from the same producer, it recognizes the message as a duplicate and discards it.
This means that if the producer retries the same batch (after a timeout, network error, or leader election), Kafka guarantees exactly one write to the topic log.
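The broker-side check can be modelled in a few lines. This is a simplified sketch under stated assumptions: `IdempotentBroker` and its string return values are invented for illustration, one record is sent per request, and the real broker tracks more state than a single last sequence number.

```python
from collections import defaultdict


class IdempotentBroker:
    """Toy broker that deduplicates on (producer ID, partition, sequence)."""

    def __init__(self):
        self.logs = defaultdict(list)  # partition -> list of records
        self.last_seq = {}             # (pid, partition) -> last sequence written

    def append(self, pid, partition, seq, record):
        key = (pid, partition)
        last = self.last_seq.get(key, -1)
        if seq <= last:
            return "duplicate"         # already written: drop the retry
        if seq != last + 1:
            return "out_of_order"      # a gap means an earlier send is missing
        self.logs[partition].append(record)
        self.last_seq[key] = seq
        return "appended"


broker = IdempotentBroker()
first = broker.append(pid=7, partition=0, seq=0, record="R1")
retry = broker.append(pid=7, partition=0, seq=0, record="R1")
other = broker.append(pid=7, partition=1, seq=0, record="R1")
print(first, retry, other)  # appended duplicate appended
```

The third call succeeds because sequence numbers are tracked per partition, so sequence 0 on partition 1 is unrelated to sequence 0 on partition 0.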
4. Key configuration interactions
When enable.idempotence=true, Kafka automatically adjusts certain other producer parameters to safe defaults:
| Setting | Automatically adjusted value | Purpose |
|---|---|---|
| acks | all | Waits for acknowledgment from all in-sync replicas. |
| retries | Integer.MAX_VALUE | Retries indefinitely until success or timeout. |
| max.in.flight.requests.per.connection | ≤ 5 | Limits concurrent in-flight requests to maintain ordering guarantees. |
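If you set conflicting configurations manually (for example, acks=1), the outcome depends on how idempotence was enabled: when enable.idempotence=true is set explicitly, the producer fails at startup with a ConfigException; when idempotence is only on by default (Kafka 3.0 and later), newer clients disable idempotence instead.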
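The constraints in the table can be expressed as a small pre-flight check. This is a hypothetical helper written for this article, not the client's own validation code; it only mirrors the three documented requirements (acks=all, retries greater than 0, at most 5 in-flight requests).

```python
def idempotence_conflicts(config):
    """Return the settings in `config` that conflict with enable.idempotence=true."""
    conflicts = []
    if str(config.get("acks", "all")) not in ("all", "-1"):
        conflicts.append("acks must be 'all'")
    if int(config.get("retries", 2147483647)) <= 0:
        conflicts.append("retries must be greater than 0")
    if int(config.get("max.in.flight.requests.per.connection", 5)) > 5:
        conflicts.append("max.in.flight.requests.per.connection must be at most 5")
    return conflicts


good = {"enable.idempotence": "true", "acks": "all"}
bad = {"enable.idempotence": "true", "acks": "1",
       "max.in.flight.requests.per.connection": "10"}

print(idempotence_conflicts(good))  # []
print(len(idempotence_conflicts(bad)))  # 2
```

Running a check like this in a deployment script surfaces a conflicting override before the producer is ever constructed.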
5. What idempotence guarantees (and what it doesn't)
| Property | Guarantee with enable.idempotence=true |
|---|---|
| No duplicates (producer → broker) | Guaranteed (no matter how many retries occur). |
| Preserved ordering per partition | Guaranteed, as sequence numbers track per-partition order. |
| Exactly-once delivery across retries | Achieved within a single producer session. |
| Deduplication across producer restarts | Not guaranteed (new PID assigned on restart). |
| Transactions across multiple partitions/topics | Not guaranteed (use transactions for that). |
In other words:
- Idempotence = Exactly-once per producer session per partition.
- Transactions = Exactly-once across sessions and partitions.
6. Limitations
While enable.idempotence=true is powerful, it has scope boundaries:
- PID is lost when the producer restarts. After a restart, a new PID is generated, and the sequence numbers reset. The broker treats the new producer as distinct, so it can't deduplicate messages across sessions.
- Only applies per partition. Kafka ensures ordering and deduplication independently for each partition, not across multiple partitions.
- Consumer-side duplication is still possible if the consumer reprocesses messages without proper offset management (for example, after failure or replay).
For complete exactly-once semantics across multiple topics and sessions, you need transactional producers (initTransactions() / commitTransaction()).
7. Example timeline
| Event | Without idempotence | With idempotence |
|---|---|---|
| Producer sends record R1 | Broker writes R1 | Broker writes R1 |
| Acknowledgment lost | Producer retries R1 | Producer retries R1 |
| Broker receives retry | Broker writes R1 again → duplicate | Broker sees same sequence → skips duplicate |
| Consumer reads | Sees R1 twice | Sees R1 once |
8. Recommended configuration
For strong reliability and ordering guarantees, use the following settings:
acks=all
enable.idempotence=true
retries=2147483647
max.in.flight.requests.per.connection=1
delivery.timeout.ms=120000
linger.ms=5
batch.size=32768
This ensures:
- No duplicates (idempotence).
- Ordered writes.
- Infinite retries for transient errors.
- Reasonable batching and throughput.
If you also need exactly-once across multiple topics or partitions (for example, when using Kafka Streams), enable transactions by setting a transactional.id on the producer.
9. Summary
| Concept | Description |
|---|---|
| Goal | Prevent duplicate messages caused by retries or network errors. |
| Mechanism | Unique producer ID (PID) and sequence numbers per partition. |
| Effect | Broker deduplicates duplicate sends automatically. |
| Scope | Guarantees exactly-once per producer session, per partition. |
| Next level | Combine with transactions for exactly-once semantics across multiple partitions and sessions. |
In short:
Setting enable.idempotence=true ensures that even if a producer retries due to timeouts, disconnects, or leader re-elections, Kafka will store each message exactly once per partition.
It eliminates duplicate writes caused by retries and is the foundation for Kafka's exactly-once delivery guarantees.
Where is PID Stored?
1. Where the Producer ID (PID) comes from
When a Kafka producer with enable.idempotence=true starts, it doesn't generate its own PID.
Instead, it requests a unique Producer ID from the Kafka cluster by sending an InitProducerId request to a broker.
Step-by-step:
- The producer connects to any broker.
- That broker takes the next ID from a block of producer IDs reserved for it. In recent Kafka versions these blocks are handed out by the cluster controller (a special broker responsible for coordination tasks); older ZooKeeper-based clusters reserved them through ZooKeeper.
- The broker returns the 64-bit Producer ID (PID) to the producer.
- The producer caches this PID locally in memory.
From then on, the producer includes this PID with every produce request it sends.
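One way to keep IDs unique across the whole cluster without a round trip for every producer is to hand them out in blocks. The following is a sketch of that idea only: `IdAllocator`, `Broker`, and the block size are assumptions made for illustration, not Kafka's internal classes.

```python
class IdAllocator:
    """Stands in for the central component that reserves ID blocks."""

    def __init__(self, block_size=1000):
        self.block_size = block_size
        self._next_start = 0

    def next_block(self):
        start = self._next_start
        self._next_start += self.block_size
        return iter(range(start, start + self.block_size))


class Broker:
    """Serves producer IDs from its current block, fetching a new one when empty."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._block = iter(())  # empty until the first request arrives

    def init_producer_id(self):
        try:
            return next(self._block)
        except StopIteration:
            self._block = self._allocator.next_block()
            return next(self._block)


allocator = IdAllocator(block_size=2)
b1, b2 = Broker(allocator), Broker(allocator)
ids = [b1.init_producer_id(), b2.init_producer_id(),
       b1.init_producer_id(), b1.init_producer_id()]
print(ids)  # [0, 2, 1, 4]
```

Because blocks never overlap, two brokers can answer requests independently and still never issue the same PID twice.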
Where it's stored:
- Producer side:
  - Stored in producer memory (inside the producer client instance).
  - Not persisted to disk by default.
  - Lost when the producer application restarts or crashes.
- Broker side:
  - Each broker maintains the PID → sequence mapping in memory (part of its replication log state).
  - The mapping is also persisted in the partition log to survive broker restarts (see next sections).
2. What the broker stores: the PID and sequence mapping
Each broker tracks, for every partition it manages:
- Each PID that has recently written to that partition.
- The last sequence number seen from each of those PIDs.
This information is maintained so that the broker can detect duplicates if a producer retries a batch.
Internally:
When a broker receives a record batch, it checks:
- The PID in the request header.
- The sequence number range of the batch (e.g., 120–125).
- The last sequence number it has already written for that PID.
If the new batch's sequence number overlaps with an existing range, or if it's exactly equal to one already committed, the broker recognizes it as a duplicate and discards it.
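The range comparison described above can be written as a pure function. This is a simplified sketch: `classify_batch` and its return labels are hypothetical, and it treats any overlap with already-written sequences as a duplicate, following the description in the text.

```python
def classify_batch(last_written_seq, base_seq, last_seq):
    """Decide what to do with a batch covering sequences base_seq..last_seq.

    Returns (decision, new_last_written_seq).
    """
    if last_seq < base_seq:
        raise ValueError("last_seq must not be smaller than base_seq")
    if base_seq <= last_written_seq:
        return "duplicate", last_written_seq   # overlaps what is already written
    if base_seq == last_written_seq + 1:
        return "append", last_seq              # continues exactly where we left off
    return "out_of_order", last_written_seq    # gap: something in between is missing


last_written = 125
print(classify_batch(last_written, 120, 125))  # ('duplicate', 125)
print(classify_batch(last_written, 126, 130))  # ('append', 130)
print(classify_batch(last_written, 130, 135))  # ('out_of_order', 125)
```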
Where brokers keep this metadata:
This depends on what type of state it is.
| State | Storage location | Volatility | Description |
|---|---|---|---|
| PID → sequence mapping (active) | Broker memory (per-partition state) | Lost if broker restarts | Used for fast duplicate detection during normal operation |
| PID and last sequence metadata (committed) | Kafka log segment (on disk) | Persistent | Stored as part of the message batch header and as control records |
| Transaction-related PIDs | __transaction_state internal topic | Persistent | Tracks PID ownership and transaction status for EOS producers |
3. Persistence of PID and sequence metadata
Kafka persists this metadata indirectly, through the batch headers written into the partition log itself.
Each message batch written by an idempotent producer includes in its header:
- producerId (PID)
- producerEpoch
- baseSequence
- lastSequence
- isTransactional flag
These headers are stored inside the log segment on disk along with the actual messages.
So if a broker restarts:
- It replays the log from disk during recovery.
- During this replay, it reconstructs the PID → last sequence number mapping in memory.
- After recovery, duplicate detection continues working as if nothing happened.
That's why Kafka's idempotence remains consistent across broker restarts, but not across producer restarts (since the producer gets a new PID each time).
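The recovery step amounts to a single pass over the stored batch headers. The sketch below assumes each header is available as a dictionary with the field names listed above; `rebuild_producer_state` is a hypothetical helper, and the real broker keeps more detail per producer.

```python
def rebuild_producer_state(log_batches):
    """Replay batch headers in log order and return {producerId: lastSequence}."""
    state = {}
    for batch in log_batches:
        # Later batches overwrite earlier ones, so the final value per
        # producer is the last sequence number present in the log.
        state[batch["producerId"]] = batch["lastSequence"]
    return state


partition_log = [
    {"producerId": 12345, "baseSequence": 0, "lastSequence": 9},
    {"producerId": 777, "baseSequence": 0, "lastSequence": 4},
    {"producerId": 12345, "baseSequence": 10, "lastSequence": 14},
]

state = rebuild_producer_state(partition_log)
print(state)  # {12345: 14, 777: 4}
```

With this table back in memory, a retried batch from PID 12345 ending at sequence 14 is again recognizable as a duplicate.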
4. What happens on producer restart
When the producer application restarts:
- The old PID is gone (it was in memory only).
- Kafka assigns a new PID to the new instance.
- Because it's a new PID, the broker treats this as an entirely new producer identity.
- The old PID's sequence tracking remains on the broker for a time but is unrelated to the new producer.
This is why idempotence guarantees are session-scoped: they only apply while the same producer instance (and PID) is alive.
If you want to maintain idempotence across restarts, you need transactions with a transactional.id.
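The session scope is easy to see in a toy simulation. Everything here is illustrative: `Broker` and `Producer` are hypothetical classes, a single partition is assumed, and a "restart" is modelled by simply creating a second producer object.

```python
import itertools


class Broker:
    def __init__(self):
        self.log = []
        self.last_seq = {}                 # pid -> last sequence written
        self._pids = itertools.count(1)

    def new_pid(self):
        return next(self._pids)

    def append(self, pid, seq, record):
        if seq <= self.last_seq.get(pid, -1):
            return False                   # duplicate within the same session
        self.log.append(record)
        self.last_seq[pid] = seq
        return True


class Producer:
    def __init__(self, broker):
        self.broker = broker
        self.pid = broker.new_pid()        # fresh PID for every instance
        self.next_seq = 0                  # sequence numbers start over

    def send(self, record):
        seq = self.next_seq
        self.next_seq += 1
        return seq, self.broker.append(self.pid, seq, record)

    def retry(self, seq, record):
        return self.broker.append(self.pid, seq, record)


broker = Broker()
p1 = Producer(broker)
seq, _ = p1.send("order-1")
same_session_retry = p1.retry(seq, "order-1")  # deduplicated: returns False

p2 = Producer(broker)                          # "restart": new PID, sequence 0 again
_, after_restart = p2.send("order-1")          # broker cannot tell this is a resend

print(same_session_retry, after_restart, broker.log)
# False True ['order-1', 'order-1']
```

The retry inside the first session is dropped, while the resend after the restart is written again because it arrives under a different PID.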
5. Transactional producers and PID persistence
When you set a transactional.id on the producer, Kafka upgrades from an idempotent producer to a transactional producer.
In this case:
- The PID is no longer ephemeral.
- It's stored persistently in Kafka's internal topic __transaction_state, which maps each transactional.id to a PID and epoch number.
- If the producer restarts with the same transactional.id, it retrieves the same PID (with incremented epoch).
- This allows Kafka to continue deduplication across producer restarts, a key part of exactly-once semantics (EOS).
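The mapping from transactional.id to PID and epoch can be pictured as a small lookup table. This is a sketch under stated assumptions: `TransactionCoordinator` here is a hypothetical in-memory class, whereas the real mapping is persisted in an internal topic.

```python
class TransactionCoordinator:
    """Toy model of the transactional.id -> (PID, epoch) mapping."""

    def __init__(self):
        self._state = {}    # transactional.id -> (pid, epoch)
        self._next_pid = 0

    def init_producer_id(self, transactional_id):
        if transactional_id in self._state:
            # Known id: keep the PID and bump the epoch for the new session.
            pid, epoch = self._state[transactional_id]
            self._state[transactional_id] = (pid, epoch + 1)
        else:
            # First time this id is seen: allocate a new PID at epoch 0.
            self._state[transactional_id] = (self._next_pid, 0)
            self._next_pid += 1
        return self._state[transactional_id]


coordinator = TransactionCoordinator()
print(coordinator.init_producer_id("orders-writer"))    # (0, 0)
print(coordinator.init_producer_id("orders-writer"))    # (0, 1)  after a restart
print(coordinator.init_producer_id("payments-writer"))  # (1, 0)
```

The restarted producer keeps its identity because the lookup is keyed by the stable transactional.id rather than by anything held in process memory.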
6. Summary: where PID and sequence numbers live
| Component | Data Stored | Where Stored | Persistence | Purpose |
|---|---|---|---|---|
| Producer | PID, per-partition sequence numbers | Producer memory | Lost on restart | Identifies producer session; generates sequence numbers |
| Broker (active memory) | Last sequence number per PID per partition | Broker memory | Reconstructed after restart | Used for duplicate detection during runtime |
| Broker (disk) | PID, baseSequence, lastSequence (in batch headers) | Partition log segments | Persistent | Replay on restart to rebuild state |
| Transactional producers | PID ↔ transactional.id mapping | __transaction_state internal topic | Persistent | Enables idempotence across restarts and partitions |
7. Illustrated data flow example
- Producer starts → requests a PID from the cluster → receives PID = 12345.
- Producer writes to partition 0:
  - Batch header: { PID=12345, baseSeq=0, lastSeq=9 }
  - Broker appends batch to log segment file /kafka-logs/topic-0/000000000001.log
  - Broker updates its in-memory table: PID=12345 → lastSeq=9
- Producer retries the same batch { PID=12345, baseSeq=0, lastSeq=9 } due to timeout.
- Broker checks and sees it's already written → discards duplicate.
- Broker restarts later → reads log segment → rebuilds PID mapping.
8. In summary
- PID (Producer ID): Assigned by the Kafka cluster, stored in producer memory; identifies the producer session.
- Sequence numbers: Maintained by producer per partition; embedded in message batch headers.
- Broker state: Maintains last sequence per PID in memory for duplicate detection; reconstructs it from the partition log after restart.
- Transactions: Persist PID ↔ transactional.id mapping in __transaction_state, enabling exactly-once delivery across restarts.
Bottom line:
enable.idempotence=true activates a coordinated system where:
- The producer tracks sequence numbers.
- The broker tracks what it has already accepted per producer.
- Both use persistent log metadata so that duplicate suppression continues even after broker restarts.
If you also configure transactional.id, Kafka persists the PID mapping cluster-wide, extending these guarantees across producer sessions and enabling true exactly-once semantics.