Spark Unique Sorted Records

Original Data

Distinct Data

Distinct Based on certain columns

⚠️ distinct() takes no arguments, so we need to select the columns first and then apply distinct().

Note that the DataFrame manager_df itself is unchanged; the result only shows the records that remain after duplicates have been dropped.

Descending order

Sorting on multiple columns

Here the records are first arranged by salary in descending order; records with the same salary are then arranged by name in ascending order.