Spark Ecosystem
Lecture 4 : Spark Ecosystem
High Level API : We can write SQL queries from Python, Java, and other supported languages. There are also ML and GraphX libraries.
Low Level API : We can create RDDs and work on them directly.
Where does Spark run?
The Spark engine needs memory to run transformations.
- Suppose it needs 4 worker nodes with 20 GB each and a driver node with 20 GB.
- It goes to the cluster manager and asks for a total of 100 GB of memory (4 x 20 GB + 20 GB). If that much is available, the manager assigns it.
- The cluster manager can be YARN, Kubernetes (K8S), or Spark's own Standalone manager.
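
The sizing in the bullets above can be sketched in plain Python. This is only an illustration: the helper names `total_memory_gb` and `build_spark_conf` are made up for these notes, while the property keys (`spark.master`, `spark.executor.instances`, `spark.executor.memory`, `spark.driver.memory`) are standard Spark configuration properties. The sketch also ignores the extra memory overhead that a real deployment usually adds on top of each executor, so treat the 100 GB as a rough figure.

```python
# Rough sketch of the memory request described above.
# Helper names are hypothetical; only the config keys are real Spark properties.

def total_memory_gb(num_workers: int, worker_gb: int, driver_gb: int) -> int:
    """Total memory Spark would ask the cluster manager for, in GB."""
    return num_workers * worker_gb + driver_gb


def build_spark_conf(master: str, num_workers: int, worker_gb: int, driver_gb: int) -> dict:
    """Map the sizing onto Spark configuration properties.

    `master` selects the cluster manager, e.g. "yarn" for YARN.
    """
    return {
        "spark.master": master,
        "spark.executor.instances": str(num_workers),
        "spark.executor.memory": f"{worker_gb}g",
        "spark.driver.memory": f"{driver_gb}g",
    }


if __name__ == "__main__":
    # 4 workers x 20 GB + 1 driver x 20 GB
    print(total_memory_gb(4, 20, 20))
    for key, value in build_spark_conf("yarn", 4, 20, 20).items():
        print(f"{key}={value}")
```

Each key/value pair could be passed to `spark-submit` with `--conf key=value`, assuming a cluster where YARN is the manager.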