Spark Agg Functions
Lecture 22: Aggregate functions
Count as both Action and Transformation
⚠️ When we count a single column with the count() function, e.g. count('name'), nulls in that column are not included in the count. When we count all columns, e.g. count('*') or df.count(), every row is counted, including rows that contain nulls.
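In this example, df.count() counts every row, including duplicate rows and rows with nulls, so we get 10 records. df.select(count('name')) gives 8, because there are two nulls in the name column. Note that df.select('name').count() would still return 10, since DataFrame.count() counts rows, not non-null values.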
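A job is created in the first case, because df.count() is an action. No job is created in the second case, because df.select(count('name')) is a transformation and is evaluated lazily; it only runs when an action such as show() is called on it.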