Processing vs Event Time in Flink


1. Processing Time (System Time)

Processing time means:

“Use the time of the machine where the event is being processed.”

So the timestamp comes from the system clock of the TaskManager that runs the operator.

Characteristics

  • Lowest latency and overhead
  • No need for watermarks
  • No waiting for late events
  • Results depend on system load, network delays, and backpressure, so reruns over the same data can differ

Example

If an event arrives at 12:00:10 PM system time, that’s its processing time, even if that event was actually produced at 11:59:55 AM.

Use cases

  • Basic dashboards
  • Monitoring where slight inaccuracies are okay
  • High-throughput counters
  • No out-of-order handling needed

2. Event Time

Event time means:

“Use the timestamp inside the event, the time when it actually occurred in the real world.”

Flink reads this timestamp through the timestamp assigner of the WatermarkStrategy you pass to:

assignTimestampsAndWatermarks(...)
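
To make the difference between the two timestamp sources concrete, here is a minimal plain-Java sketch. It models the concept only and is not Flink API: the `Reading` class and the two assigner constants are hypothetical names introduced for illustration.

```java
import java.util.function.ToLongFunction;

final class TimestampSources {
    /** Hypothetical event type that carries its own creation time. */
    static final class Reading {
        final String sensorId;
        final long eventTimeMillis; // set by the producer when the event occurred

        Reading(String sensorId, long eventTimeMillis) {
            this.sensorId = sensorId;
            this.eventTimeMillis = eventTimeMillis;
        }
    }

    /** Event time: read the timestamp stored inside the event. */
    static final ToLongFunction<Reading> EVENT_TIME = r -> r.eventTimeMillis;

    /** Processing time: ignore the event and read the local machine clock. */
    static final ToLongFunction<Reading> PROCESSING_TIME = r -> System.currentTimeMillis();
}
```

In a Flink job, a function like `EVENT_TIME` plays the role of the timestamp assigner supplied with the `WatermarkStrategy`; with processing time no extractor is needed at all.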

Key characteristics

  • Handles late and out-of-order events
  • Uses watermarks to decide when to close windows
  • Most accurate: results reflect when events actually happened
  • Higher latency, because a window emits only after the watermark passes its end

Example

Suppose a sensor generated data at 11:59:55 AM but reached Flink at 12:00:10 PM.

  • Event time = 11:59:55
  • Processing time = 12:00:10

Event time gives “real-world correctness” instead of “arrival-time correctness.”
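
The gap between the two clocks in this example can be computed directly. The sketch below uses only `java.time`; the class and method names are hypothetical and chosen for illustration.

```java
import java.time.Duration;
import java.time.LocalTime;

final class ArrivalLag {
    /** How long after it occurred the event was seen by the processor. */
    static Duration between(LocalTime eventTime, LocalTime processingTime) {
        return Duration.between(eventTime, processingTime);
    }

    public static void main(String[] args) {
        Duration lag = between(LocalTime.of(11, 59, 55), LocalTime.of(12, 0, 10));
        System.out.println(lag.getSeconds()); // prints 15
    }
}
```

A processing-time job never sees this 15-second lag; an event-time job has to tolerate it, which is what watermarks are for.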


3. Why event time is critical

Real streaming data is almost always out-of-order.

Events can arrive late due to:

  • Network delays
  • Clock skew
  • Mobile devices going offline
  • Batch uploads
  • Retries in Kafka

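If you use processing time, windows close by the wall clock, so an event that arrives late is counted in whichever window is open at that moment rather than the one it belongs to.

If you use event time, windows only close when watermarks say “we have seen almost all events.”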


4. Watermarks (core to event time)

Event time requires watermarks. These tell Flink:

“No more events older than this timestamp should arrive.”

Watermarks = progress indicators of event time.

Example:

If watermark = 12:00:00, Flink believes:

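  • All events with timestamps ≤ 12:00:00 have arrived
  • Safe to close windows up to 12:00:00

An event that arrives after the watermark has already closed its window is late. By default the window operator drops it; allowedLateness keeps windows around longer, and sideOutputLateData routes late events to a side output.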
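
The watermark idea can be sketched in a few lines of plain Java. This is a simplified model, not Flink's implementation: it assumes the watermark trails the highest timestamp seen so far by a fixed bound, and the class name `BoundedWatermark` is hypothetical.

```java
final class BoundedWatermark {
    private final long maxOutOfOrdernessMillis;
    private long maxTimestampSeen = Long.MIN_VALUE;

    BoundedWatermark(long maxOutOfOrdernessMillis) {
        this.maxOutOfOrdernessMillis = maxOutOfOrdernessMillis;
    }

    /** Record an event's timestamp; older timestamps never move the maximum back. */
    void onEvent(long eventTimeMillis) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimeMillis);
    }

    /** Current watermark: highest timestamp seen minus the allowed disorder. */
    long currentWatermark() {
        if (maxTimestampSeen == Long.MIN_VALUE) {
            return Long.MIN_VALUE; // no events yet, so no progress to report
        }
        return maxTimestampSeen - maxOutOfOrdernessMillis;
    }

    /** An event is behind the watermark if its timestamp is at or before it. */
    boolean isLate(long eventTimeMillis) {
        return eventTimeMillis <= currentWatermark();
    }
}
```

With a 5-second bound, seeing an event stamped 60 000 ms puts the watermark at 55 000 ms. A later event stamped 56 000 ms is still on time, while one stamped 54 000 ms is late. Because the watermark is derived from a running maximum, an out-of-order event can never move it backwards.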


5. Event Time vs Processing Time: Side-by-Side Comparison

Feature               | Processing Time             | Event Time
Timestamp source      | System clock                | Inside event
Handles out-of-order? | No                          | Yes
Accuracy              | Low                         | High
Throughput            | High                        | Medium
Requires watermarks?  | No                          | Yes
Windows               | Close based on arrival time | Close based on event time
Late event support    | Not possible                | Supported
Use cases             | Monitoring, quick stats     | Analytics, correctness-critical

6. Example with a window

Let’s say you have a 1-minute window.

If using processing time:

Window 12:00:00–12:01:00 closes at 12:01:00 system time. If an event with timestamp 11:59:50 arrives at 12:00:50, it goes into the running window, because only the arrival time counts. If it arrives at 12:01:05, it has missed that window and is counted in the next one (12:01:00–12:02:00).
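
The processing-time case can be sketched in plain Java: the window is chosen purely from the arrival clock time, and the timestamp inside the event plays no part. The class and method names are hypothetical.

```java
import java.time.LocalTime;

final class ProcessingTimeWindows {
    /** Start of the tumbling window that contains the given arrival time. */
    static LocalTime windowStart(LocalTime arrival, int windowSeconds) {
        int secondOfDay = arrival.toSecondOfDay();
        // Round down to the nearest multiple of the window size.
        return LocalTime.ofSecondOfDay(secondOfDay - (secondOfDay % windowSeconds));
    }
}
```

For a 60-second window, an arrival at 12:00:50 maps to the window starting at 12:00:00, and an arrival at 12:01:05 maps to the window starting at 12:01:00, regardless of the 11:59:50 timestamp the event carries.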

If using event time:

Window 12:00:00–12:01:00 closes when watermark passes 12:01:00.

If watermark = highest event time seen so far − 5 seconds:

  • Out-of-order events up to 5 seconds late are accepted
  • Windows close correctly based on event timestamps
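
The event-time case can be sketched the same way. This is a simplified model under the 5-second rule stated above, not Flink's window operator; the names are hypothetical, and times are milliseconds relative to 12:00:00 to keep the numbers small.

```java
final class EventTimeWindow {
    static final long WINDOW_MILLIS = 60_000L;      // 1-minute tumbling window
    static final long MAX_DISORDER_MILLIS = 5_000L; // the 5-second bound from the text

    /** Start of the window an event belongs to, taken from its own timestamp. */
    static long windowStart(long eventTimeMillis) {
        return eventTimeMillis - Math.floorMod(eventTimeMillis, WINDOW_MILLIS);
    }

    /** A window may close once the watermark has reached the window's end. */
    static boolean canClose(long windowStartMillis, long maxEventTimeSeenMillis) {
        long watermark = maxEventTimeSeenMillis - MAX_DISORDER_MILLIS;
        return watermark >= windowStartMillis + WINDOW_MILLIS;
    }
}
```

An event stamped 12:00:58 (58 000 ms) belongs to the window starting at 0 no matter when it arrives. After an event stamped 12:01:02 the watermark is only 12:00:57, so the window stays open; it can close once an event stamped 12:01:05 or later has been seen.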

7. When to use what?

Use event time when:

  • Data arrives late or out of order (common in Kafka, IoT, logs)
  • Need accurate results (analytics, billing, fraud detection)
  • You use windows, joins, aggregations

Use processing time when:

  • You need ultra-low latency
  • Out-of-order is not a concern
  • Simple transformations, monitoring, counters

8. One-line summary

  • Processing time = use machine clock → fast but inaccurate
  • Event time = use timestamp inside event → accurate, requires watermarks