Shuffle = Map Phase + Reduce Phase
MAP PHASE (Shuffle Write)        REDUCE PHASE (Shuffle Read)
---------------------------      ----------------------------
Process partitions               Fetch shuffled data
        ↓                                ↓
Local aggregation (optional)     Merge data from all mappers
        ↓                                ↓
Partition by key                 Group by key
        ↓                                ↓
Write shuffle files              Final aggregation
1. Map Phase (Shuffle Write)
Runs on the executors that already hold the input partitions.
Steps
a. Process Input Partition
Example:
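A minimal sketch in plain Python (not Spark code), assuming a word-count-style job in which each record of a partition is mapped to a `(key, 1)` pair. The partition contents and the `process_partition` helper are hypothetical and only illustrate the step.

```python
# Hypothetical contents of one input partition (illustrative only).
partition = ["A", "A", "A", "B"]


def process_partition(records):
    """Turn each record into a (key, 1) pair, as a map task might."""
    return [(record, 1) for record in records]


pairs = process_partition(partition)
print(pairs)  # [('A', 1), ('A', 1), ('A', 1), ('B', 1)]
```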
b. Local Aggregation (Map-Side Combine)
Applies only to operations that can combine values on the map side, such as reduceByKey:
Important:
- Happens within a single partition only
- Reduces data before shuffle
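The two points above can be sketched in plain Python. This is an illustration under assumed data, not Spark's implementation: `map_side_combine` is a hypothetical helper that sums values per key inside one partition.

```python
from collections import defaultdict

# Hypothetical (key, 1) pairs produced from a single partition.
pairs = [("A", 1), ("A", 1), ("A", 1), ("B", 1)]


def map_side_combine(pairs):
    """Sum values per key, looking only at this one partition."""
    partial = defaultdict(int)
    for key, value in pairs:
        partial[key] += value
    return dict(partial)


combined = map_side_combine(pairs)
print(combined)  # {'A': 3, 'B': 1}
```

In this example four records shrink to two before anything is shuffled.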
c. Partitioning
Spark decides which reducer gets which key:
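A common way to make this decision is hash partitioning: take a hash of the key modulo the number of reduce partitions. The sketch below shows the idea in plain Python; `target_reducer` and `NUM_REDUCERS` are hypothetical names, and `zlib.crc32` is used only as a stable stand-in for the key's hash code.

```python
import zlib

NUM_REDUCERS = 3  # hypothetical number of reduce partitions


def target_reducer(key, num_reducers=NUM_REDUCERS):
    """Map a key to a reducer index: stable hash modulo reducer count."""
    # zlib.crc32 stands in for the key's hash code; it is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


for key in ["A", "B", "A"]:
    print(key, "->", target_reducer(key))
```

The property that matters is that the same key always maps to the same reducer, regardless of which map task produced it.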
d. Write Shuffle Files
- Data written to disk
- Organized per target reducer
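As a rough illustration of "organized per target reducer", the sketch below groups one map task's output into one bucket per reducer. In-memory lists stand in for the on-disk shuffle files, and the partitioning function is a toy assumption that is deterministic for single-letter keys.

```python
from collections import defaultdict

NUM_REDUCERS = 2  # hypothetical number of reduce partitions


def target_reducer(key):
    # Toy stand-in for a hash partitioner (assumption, not Spark's function).
    return ord(key[0]) % NUM_REDUCERS


def write_shuffle_output(combined):
    """Group one map task's output into one bucket per target reducer."""
    buckets = defaultdict(list)
    for key, value in combined.items():
        buckets[target_reducer(key)].append((key, value))
    return dict(buckets)


map_output = write_shuffle_output({"A": 3, "B": 1})
print(map_output)  # {1: [('A', 3)], 0: [('B', 1)]}
```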
2. Reduce Phase (Shuffle Read)
Runs on the executors that compute the output partitions.
a. Fetch Data
Each reducer:
- Pulls its partition data from all map tasks
- This is where network transfer happens
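The fetch step can be sketched as follows, assuming each map task's shuffle output is a dictionary from reducer index to a list of `(key, partial_value)` pairs. The data and the `fetch` helper are hypothetical; in a real cluster this loop corresponds to reads over the network.

```python
# Hypothetical shuffle output of two map tasks: {reducer_id: [(key, partial)]}.
map_outputs = [
    {1: [("A", 3)], 0: [("B", 1)]},  # map task 0
    {1: [("A", 2)], 0: [("B", 2)]},  # map task 1
]


def fetch(reducer_id, map_outputs):
    """Collect this reducer's bucket from every map task's output."""
    fetched = []
    for output in map_outputs:
        fetched.extend(output.get(reducer_id, []))
    return fetched


fetched_0 = fetch(0, map_outputs)
fetched_1 = fetch(1, map_outputs)
print(fetched_1)  # [('A', 3), ('A', 2)]
print(fetched_0)  # [('B', 1), ('B', 2)]
```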
b. Merge + Group by Key
Example incoming data:
Grouped as:
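With hypothetical incoming pairs such as the ones below, merging and grouping can be sketched in plain Python. The values are assumed for illustration, and `group_by_key` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical pairs fetched by one reducer from several map tasks.
incoming = [("A", 3), ("B", 1), ("A", 2), ("B", 2)]


def group_by_key(pairs):
    """Collect all values that share a key into one list."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


grouped = group_by_key(incoming)
print(grouped)  # {'A': [3, 2], 'B': [1, 2]}
```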
c. Final Aggregation
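Once values are grouped by key, the reducer applies the aggregation function to each group. A minimal sketch, assuming a sum and hypothetical grouped partial counts:

```python
# Hypothetical grouped partial counts held by a reducer.
grouped = {"A": [3, 2], "B": [1, 2]}


def final_aggregate(grouped):
    """Reduce each key's partial values to a single final value."""
    return {key: sum(values) for key, values in grouped.items()}


totals = final_aggregate(grouped)
print(totals)  # {'A': 5, 'B': 3}
```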
Critical Question:
If we already have ("A", 3), why do we need a reducer?
Because that ("A", 3) is not a global result; it is local to a single partition.
Example Across Partitions
Partition 1
Partition 2
After Map Phase
At this point:
- Same key exists in multiple partitions
- Results are partial
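To make this concrete, the sketch below counts keys in two partitions independently. The partition contents are assumed for illustration; only the ("A", 3) partial result comes from the discussion above.

```python
from collections import Counter

# Hypothetical input split across two partitions.
partition_1 = ["A", "A", "A", "B"]
partition_2 = ["A", "A", "B", "B"]

# Map-side combine: each partition is counted on its own.
partial_1 = dict(Counter(partition_1))
partial_2 = dict(Counter(partition_2))

print(partial_1)  # {'A': 3, 'B': 1}
print(partial_2)  # {'A': 2, 'B': 2}
```

Both outputs contain "A" and "B", and neither count is the global total.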
Reducer's Job
Bring all "A" values together:
This requires:
- Shuffle (to co-locate keys)
- Reduce phase (to finalize aggregation)
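A small sketch of that job, assuming hypothetical partial results from two map tasks. Filtering by key stands in for the shuffle, which routes every "A" pair to the same reducer.

```python
# Hypothetical partial results emitted by two map tasks.
partials = [("A", 3), ("B", 1), ("A", 2), ("B", 2)]

# Shuffle: every pair with key "A" ends up on the same reducer.
a_values = [value for key, value in partials if key == "A"]

# Reduce: combine the co-located partial values into the final result.
a_total = sum(a_values)

print(a_values)  # [3, 2]
print(a_total)   # 5
```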
What If We Skip Reduce Phase?
You would get:
Which is:
- Incorrect
- Not a true aggregation
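The effect can be shown with assumed per-partition results: without a reduce phase the partial outputs are only concatenated, so the same key appears more than once.

```python
# Hypothetical per-partition results after map-side combine.
partial_1 = {"A": 3, "B": 1}
partial_2 = {"A": 2, "B": 2}

# No reduce phase: the partial outputs are simply concatenated.
without_reduce = list(partial_1.items()) + list(partial_2.items())
print(without_reduce)  # [('A', 3), ('B', 1), ('A', 2), ('B', 2)]

# The same key shows up more than once, so no entry is a global total.
keys = [key for key, _ in without_reduce]
has_duplicates = len(keys) != len(set(keys))
print(has_duplicates)  # True
```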
Key Insight
| Stage | Scope | Role |
|---|---|---|
| Map-side combine | Within partition | Partial aggregation |
| Reduce phase | Across partitions | Final aggregation |
Why This Design Matters
Map Phase
- Reduces data early
- Minimizes shuffle size
Reduce Phase
- Ensures correctness
- Combines distributed partial results
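Both roles can be seen in one small simulation. This is a plain-Python sketch with assumed data, not a measurement of Spark itself: it counts how many records would be shuffled with and without map-side combine, then merges the partial counts as the reduce phase would.

```python
from collections import Counter

# Hypothetical input split across two partitions.
partitions = [["A", "A", "A", "B"], ["A", "A", "B", "B"]]

# Without map-side combine: every (key, 1) pair is shuffled.
records_without_combine = sum(len(p) for p in partitions)

# With map-side combine: one record per distinct key per partition.
partials = [Counter(p) for p in partitions]
records_with_combine = sum(len(c) for c in partials)

# Reduce phase: merge the partial counts across partitions.
totals = sum(partials, Counter())

print(records_without_combine)  # 8
print(records_with_combine)     # 4
print(dict(totals))             # {'A': 5, 'B': 3}
```

The final totals are the same either way; map-side combine only changes how much data has to move.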
Final Takeaway
Map-side aggregation reduces data locally, but reducers are required to combine results across partitions and produce the final correct output.