Kafka Consumer Offset Management
Author: Venkata Sudhakar
A Kafka offset is a number that tracks how far a consumer has read in a partition. When a consumer commits an offset, it tells Kafka "I have successfully processed everything up to this point." If the consumer crashes and restarts, it resumes from the last committed offset. When you commit relative to processing determines whether a crash causes messages to be processed again (at-least-once) or skipped (at-most-once). Exactly-once processing additionally requires idempotent processing or transactions, which offset commits alone do not provide.

Kafka provides two commit modes. Auto-commit (enable.auto.commit=true) commits offsets periodically, every 5 seconds by default (auto.commit.interval.ms=5000). It is simple, but the commit is not tied to your processing: for example, if records are handed off to another thread or processed asynchronously, an offset can be committed before processing is done, and those messages are lost on restart. Manual commit (enable.auto.commit=false) gives you full control: you call commitSync() or commitAsync() only after you have successfully processed the messages, which gives at-least-once delivery. The example below contrasts auto-commit with manual commit and shows why the timing of the commit matters.
It gives the following output:
Processing: {"orderId":"ORD-1001","amount":99.99}
Processing: {"orderId":"ORD-1002","amount":49.99}
# With manual commit:
# - Offset committed AFTER both messages processed
# - If app crashes mid-batch, both messages are re-processed on restart
# - At-least-once delivery guaranteed
# With auto-commit:
# - Offset committed at 5s mark regardless of processing state
# - If app crashes after commit but before processing, messages are lost
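The two outcomes listed above can be reproduced in a small simulation. This is a minimal sketch that does not use a Kafka client: the partition is a plain list, the committed offset is an integer, and the names `ToyPartition`, `consume` and `crash_then_restart` are hypothetical, chosen only for illustration. The "commit before processing" branch models the worst case for auto-commit, not its behaviour in every client.

```python
from dataclasses import dataclass


@dataclass
class ToyPartition:
    """Hypothetical stand-in for one partition and its committed offset."""
    records: list
    committed: int = 0  # offset the consumer resumes from after a restart


def consume(partition, commit_before_processing, crash_at=None):
    """Read from the committed offset to the end of the log.

    Returns the offsets that were processed. If crash_at is set, the run
    stops just before processing that offset, simulating a crash.
    """
    start, end = partition.committed, len(partition.records)
    processed = []
    if commit_before_processing:
        # Worst case for auto-commit: the offset advances ahead of processing.
        partition.committed = end
    for offset in range(start, end):
        if offset == crash_at:
            return processed  # crash: the commit below is never reached
        processed.append(offset)
    if not commit_before_processing:
        # Manual commit: advance only after the whole batch succeeded.
        partition.committed = end
    return processed


def crash_then_restart(commit_before_processing):
    """Crash before offset 1, restart, and report what each run processed."""
    partition = ToyPartition(records=["ORD-1001", "ORD-1002"])
    first_run = consume(partition, commit_before_processing, crash_at=1)
    second_run = consume(partition, commit_before_processing)
    return first_run, second_run


early = crash_then_restart(commit_before_processing=True)
late = crash_then_restart(commit_before_processing=False)
print("commit before processing:", early)  # ([0], [])      offset 1 is lost
print("commit after processing: ", late)   # ([0], [0, 1])  offset 0 is repeated
```

With the early commit, the restarted consumer starts at offset 2 and never sees ORD-1002. With the late commit, it starts again at offset 0, so ORD-1001 is processed twice but nothing is lost.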
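Use manual commitSync() for critical workloads where losing a message is not acceptable. commitAsync() does not block, so it is faster, but it does not retry a failed commit, because a retry could overwrite a newer offset that was committed in the meantime. A common pattern is to call commitAsync() inside the poll loop and commitSync() once on shutdown or after a complete batch, so that the final position is committed reliably. The tradeoff is between safety (manual sync) and throughput (auto or async).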
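The async-in-the-loop, sync-on-shutdown pattern can be sketched as follows. This is again a stdlib-only simulation under stated assumptions, not real client code: `ToyConsumer`, `commit_async`, `commit_sync` and `failing_attempts` are hypothetical names. They mimic the behaviour described above: the async commit drops a failed attempt, while the sync commit retries until it succeeds or gives up.

```python
class ToyConsumer:
    """Hypothetical consumer whose commit attempts can fail transiently."""

    def __init__(self, failing_attempts):
        # 1-based attempt numbers that should fail (assumption for the demo).
        self._failing_attempts = set(failing_attempts)
        self._attempt = 0
        self.committed = 0

    def _try_commit(self, offset):
        self._attempt += 1
        if self._attempt in self._failing_attempts:
            return False
        self.committed = offset
        return True

    def commit_async(self, offset):
        # Fire and forget: a failed attempt is dropped, never retried.
        self._try_commit(offset)

    def commit_sync(self, offset, max_attempts=5):
        # Blocks and retries until the commit succeeds or attempts run out.
        for _ in range(max_attempts):
            if self._try_commit(offset):
                return
        raise RuntimeError("commit failed after retries")


def consume_batches(consumer, batches):
    """Commit asynchronously per batch, then synchronously on the way out."""
    position = 0
    try:
        for batch in batches:
            for _record in batch:
                position += 1  # stand-in for real processing
            consumer.commit_async(position)
        after_loop = consumer.committed
    finally:
        # Final commit must not be lost, so use the retrying variant.
        consumer.commit_sync(position)
    return after_loop, consumer.committed


# Attempts 3 and 4 fail: the last async commit and the first sync retry.
consumer = ToyConsumer(failing_attempts={3, 4})
result = consume_batches(consumer, [["a", "b"], ["c", "d"], ["e"]])
print(result)  # (4, 5)
```

After the loop the committed offset is still 4 because the last async commit failed silently. The final synchronous commit retries and brings it to 5. With the Java client the same shape is a try/finally around the poll loop, with commitAsync() inside the loop and commitSync() in the finally block.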