Salting in PySpark
Consider a join on id that returns 15 records.
The key id=1 is skewed on the left side, and assume that table2 is larger than 10 MB, so it is not broadcast automatically.
Now salt the data: append a random number between 1 and 10 to the id on both sides.
The salted keys are now spread across different partitions instead of all landing in one, as before.
This creates an evident problem: the join returns only 3 records instead of the actual 15, because a left row finds its match only when both sides happened to draw the same salt. To fix this, generate all 10 salted keys for each id in table2, so that every salted key on the left has a matching row and the join stays complete.
[Figures: Before Salting, After Salting]
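
The record counts above can be reproduced with a minimal sketch in plain Python (not Spark, so it runs anywhere). The table contents, the names `table1` and `table2`, and the helper functions are hypothetical, chosen only so that the unsalted join returns 15 rows. In PySpark the same idea would typically use a random salt column on the skewed side and an exploded array of all salt values on the other side.

```python
import random

SALT_FACTOR = 10  # salts 1..10, as in the text

# Hypothetical data: id=1 is the skewed key on the left side.
table1 = [(1, f"left_{i}") for i in range(13)] + [(2, "left_13"), (3, "left_14")]
table2 = [(1, "a"), (2, "b"), (3, "c")]


def inner_join(left, right):
    """Inner join of (key, value) rows on the key."""
    right_index = {}
    for key, value in right:
        right_index.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in right_index.get(key, [])]


def salt_randomly(rows, rng):
    """Append one random salt to each key, e.g. 1 -> '1_7'."""
    return [(f"{key}_{rng.randint(1, SALT_FACTOR)}", value) for key, value in rows]


def replicate_with_all_salts(rows):
    """Emit every row once per salt value, so any salted key can match."""
    return [
        (f"{key}_{salt}", value)
        for key, value in rows
        for salt in range(1, SALT_FACTOR + 1)
    ]


rng = random.Random(42)
baseline = inner_join(table1, table2)
# Random salt on both sides: rows match only when the salts happen to agree.
naive = inner_join(salt_randomly(table1, rng), salt_randomly(table2, rng))
# Random salt on the left, all salts on the right: no row is lost.
fixed = inner_join(salt_randomly(table1, rng), replicate_with_all_salts(table2))

print(len(baseline), len(naive), len(fixed))
```

The naive variant can only lose rows relative to the baseline, while the replicated variant always returns the full 15, at the cost of making the right table `SALT_FACTOR` times larger.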
2. Where Memory Comes Into Play
Salting helps with shuffle balance, but OOM can still happen depending on join strategy:
- Broadcast Hash Join (BHJ)
  - Spark broadcasts the smaller table to all executors.
  - If the second table is large (multiple GB), broadcasting it will cause OOM on the driver or executors.
  - Spark usually falls back to Sort-Merge Join (SMJ) when the table is bigger than `spark.sql.autoBroadcastJoinThreshold` (default 10 MB).
- Sort-Merge Join (SMJ)
  - Both tables are shuffled and sorted.
  - Salting works well here because it reduces shuffle skew.
  - No broadcast, so less chance of OOM.
  - But if partitions are still unbalanced (or too wide), you can hit OOM during shuffle spill or in the sort buffer.
- Shuffle Hash Join (rare)
  - If memory per executor is low, building the hash table can cause OOM.
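
As a rough illustration of the threshold check described for Broadcast Hash Join, the sketch below models it in plain Python. The function name and the example sizes are hypothetical, and this is a simplification: the real planner works from size estimates derived from table statistics, which can be inaccurate.

```python
# 10 MB, the documented default of spark.sql.autoBroadcastJoinThreshold.
DEFAULT_AUTO_BROADCAST_THRESHOLD = 10 * 1024 * 1024


def is_auto_broadcast_candidate(
    estimated_bytes, threshold_bytes=DEFAULT_AUTO_BROADCAST_THRESHOLD
):
    """Simplified model: a table qualifies for automatic broadcast only if
    its estimated size does not exceed the threshold. A negative threshold
    (for example -1) turns automatic broadcast off entirely."""
    if threshold_bytes < 0:
        return False
    return 0 <= estimated_bytes <= threshold_bytes


small_table = 5 * 1024 * 1024          # 5 MB lookup table (hypothetical)
large_table = 2 * 1024 * 1024 * 1024   # 2 GB second table (hypothetical)

print(is_auto_broadcast_candidate(small_table))      # within the threshold
print(is_auto_broadcast_candidate(large_table))      # above the threshold
print(is_auto_broadcast_candidate(small_table, -1))  # broadcast disabled
```

Under this model a multi-GB table is never broadcast automatically; the risk comes from an underestimated size or from an explicit broadcast hint, which bypasses the check.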
3. So, If the Second Table is Large...
- Yes, you can get OOM if Spark mistakenly tries to broadcast it (or if you force a broadcast with a hint).
- If Spark chooses Sort-Merge Join, OOM is less likely, but still possible if:
  - The salted distribution is still skewed (bad salt choice).
  - Shuffle partitions are too few (`spark.sql.shuffle.partitions` too low).
  - Executor memory is too small for the sort buffers.
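
To see how the salt choice affects partition balance, here is a small simulation in plain Python. It is only a sketch: `toy_partition` is a hypothetical stand-in for Spark's hash partitioner, the round-robin salt replaces a random one so the result is reproducible, and the key counts are invented.

```python
from collections import Counter


def toy_partition(key, salt, num_partitions):
    """Hypothetical stand-in for a hash partitioner on the salted key."""
    return (key * 31 + salt) % num_partitions


def partition_sizes(keys, salt_factor, num_partitions):
    """Count how many rows land in each shuffle partition."""
    sizes = Counter()
    for row_number, key in enumerate(keys):
        # Round-robin salt; salt_factor=1 means no salting at all.
        salt = row_number % salt_factor
        sizes[toy_partition(key, salt, num_partitions)] += 1
    return sizes


# Hypothetical data: key 1 is heavily skewed, keys 2 and 3 are not.
keys = [1] * 1000 + [2] * 10 + [3] * 10

unsalted = partition_sizes(keys, salt_factor=1, num_partitions=8)
salted = partition_sizes(keys, salt_factor=10, num_partitions=8)

print("largest partition without salting:", max(unsalted.values()))
print("largest partition with salting:   ", max(salted.values()))
```

Without salting, every row of the skewed key lands in one partition. With ten salts the largest partition shrinks considerably, but it is still about twice the size of the smallest ones, because ten salted keys cannot spread evenly over eight partitions. That is the kind of residual imbalance a poor salt factor or too few shuffle partitions leaves behind.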
4. How to Avoid OOM with a Large Second Table
Best practices:
- Do not broadcast a large table
  - Check the plan with `explain()`.
  - Disable automatic broadcast for large tables by setting `spark.sql.autoBroadcastJoinThreshold` to `-1`.
- Use Sort-Merge Join + Salting
  - Works best for large joins.
- Tune partitions
  - Increase shuffle partitions (`spark.sql.shuffle.partitions`).
  - Repartition the large table on the join key before the join.
- Use Adaptive Query Execution (AQE)
  - Spark 3+ can dynamically coalesce or split skewed partitions.
  - Enable it with `spark.sql.adaptive.enabled` and `spark.sql.adaptive.skewJoin.enabled`.
- Check the salt factor
  - Too small: still skewed.
  - Too big: data explosion (Cartesian effect).
  - Rule of thumb: `salt_factor ≈ (skewed_key_rows / avg_rows_per_key)`.
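
The settings from the list above can be collected in one place, as in the sketch below. The dictionary name, the helper functions, and the value `400` for shuffle partitions are assumptions for illustration, not recommendations; the configuration keys themselves are standard Spark SQL settings. `apply_settings` expects an existing `SparkSession` and only calls `spark.conf.set`.

```python
import math

# Assumed example values; tune them for the actual workload.
OOM_SAFE_SETTINGS = {
    # -1 disables automatic broadcast joins.
    "spark.sql.autoBroadcastJoinThreshold": "-1",
    # More shuffle partitions means smaller partitions (400 is arbitrary).
    "spark.sql.shuffle.partitions": "400",
    # Adaptive Query Execution and its skew-join handling (Spark 3+).
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def apply_settings(spark, settings=OOM_SAFE_SETTINGS):
    """Apply each setting to an existing SparkSession."""
    for key, value in settings.items():
        spark.conf.set(key, value)


def estimate_salt_factor(skewed_key_rows, avg_rows_per_key):
    """Rule of thumb from the list: skewed_key_rows / avg_rows_per_key,
    rounded up and never below 1 (1 means no salting is needed)."""
    return max(1, math.ceil(skewed_key_rows / avg_rows_per_key))


# Hypothetical numbers: the hot key has 1000 rows, a typical key has 100.
print(estimate_salt_factor(1000, 100))
```

With a live session this would be used as `apply_settings(spark)` before running the join, followed by `explain()` on the joined DataFrame to confirm that the plan no longer contains a broadcast.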
Conclusion:
- If the second table is large, Spark will not broadcast it by default, so OOM is unlikely in a normal SMJ with salting.
- You'll mainly hit OOM if:
  - You force a broadcast on a large table, or
  - Shuffle/sort partitions are misconfigured.
- With AQE + proper salting, Spark can handle large second tables safely.