Lecture 4: Spark Ecosystem

High-level API: we can write SQL queries from Python, Java, and other supported languages. The ML and GraphX libraries also sit at this level.

Low-level API: we can create RDDs and work on them directly. This API is also available in several languages.

Where does Spark run?

The Spark engine needs memory to run its transformations, and it gets that memory from a cluster.

  • Suppose it needs 4 worker nodes with 20 GB each and a driver node with 20 GB.
  • Spark goes to the cluster manager and asks for a total of 100 GB of memory (4 × 20 GB + 20 GB). If that much is available, the manager allocates it.
  • The cluster manager can be YARN, Kubernetes (K8s), or Spark's own standalone manager.
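The arithmetic behind that request can be sketched in plain Python. `ResourceRequest` and `can_allocate` are hypothetical helpers written for these notes, not part of Spark. The sketch also assumes one executor per worker node and ignores the extra memory overhead a real cluster manager adds on top of each container. The configuration keys shown (`spark.executor.instances`, `spark.executor.memory`, `spark.driver.memory`) are real Spark properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    """Hypothetical description of what a Spark application asks for."""

    num_workers: int
    worker_memory_gb: int
    driver_memory_gb: int

    def total_memory_gb(self) -> int:
        # Memory for all workers plus memory for the single driver.
        return self.num_workers * self.worker_memory_gb + self.driver_memory_gb

    def to_spark_conf(self) -> dict:
        # Assumption: one executor per worker node.
        return {
            "spark.executor.instances": str(self.num_workers),
            "spark.executor.memory": f"{self.worker_memory_gb}g",
            "spark.driver.memory": f"{self.driver_memory_gb}g",
        }


def can_allocate(request: ResourceRequest, free_memory_gb: int) -> bool:
    """Simplified stand-in for the cluster manager's availability check."""
    return request.total_memory_gb() <= free_memory_gb


if __name__ == "__main__":
    request = ResourceRequest(num_workers=4, worker_memory_gb=20, driver_memory_gb=20)
    print(request.total_memory_gb())   # 4 * 20 + 20 = 100
    print(can_allocate(request, 128))  # True: 100 GB fits in 128 GB
    print(request.to_spark_conf())
```

The same settings are what we would pass to Spark when submitting a job, regardless of which cluster manager sits underneath.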