Spark Basics


Concepts

Spark cluster components

A Spark cluster (more precisely, each Spark application running on it) consists of one driver and N executors:

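  • driver: the process that runs the application's main program, creates the SparkContext, and schedules tasks
  • executor: a worker process that runs the tasks assigned by the driver and holds cached data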
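
The division of labor between the driver and the executors can be illustrated with a toy model in plain Python. This is not Spark API: run_job is a hypothetical helper, and a thread pool merely stands in for the executors. It is only a sketch of the idea that the driver splits the work, the executors process their share, and the driver combines the results.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(data, num_executors, task_fn):
    """Toy 'driver': partition the data, run one task per partition, combine results.

    Hypothetical helper for illustration only; real Spark does this across processes
    and machines, not threads.
    """
    # Driver side: split the input into one partition per executor (round-robin).
    partitions = [data[i::num_executors] for i in range(num_executors)]

    # "Executors": worker threads, each running the task on its own partition.
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partial_results = list(pool.map(task_fn, partitions))

    # Driver side: combine the partial results into the final answer.
    return sum(partial_results)


if __name__ == "__main__":
    # Sum the numbers 1..10 using three stand-in executors.
    print(run_job(list(range(1, 11)), num_executors=3, task_fn=sum))  # prints 55
```

The real driver does much more (building the execution plan, retrying failed tasks), but the one-coordinator, many-workers shape is the same.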

spark cluster

Launch modes (deploy modes)

Different launch modes mean different placements of the driver and the executors.

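  • local: the driver and the executors run in a single process on one machine (for example, --master local[*]); no cluster is involved
  • cluster:
    • cluster-client: spark-submit --deploy-mode client. The driver runs outside the physical cluster (inside the spark-submit process on the submitting machine); the executors run inside the cluster and are managed by the resource manager
    • cluster-cluster: spark-submit --deploy-mode cluster. The driver and the executors all run inside the physical cluster and are managed by the resource manager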
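
As a rough sketch of how these modes map to the command line, the hypothetical helper below assembles a spark-submit argument list. The --master and --deploy-mode flags are real spark-submit options; the function name build_spark_submit and the application file my_app.py are assumptions made for illustration.

```python
def build_spark_submit(app, master, deploy_mode=None):
    """Build a spark-submit argument list (hypothetical helper, illustration only)."""
    if deploy_mode not in (None, "client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode!r}")
    # spark-submit rejects cluster deploy mode when the master is local.
    if master.startswith("local") and deploy_mode == "cluster":
        raise ValueError("cluster deploy mode is not compatible with a local master")

    cmd = ["spark-submit", "--master", master]
    if deploy_mode is not None:
        cmd += ["--deploy-mode", deploy_mode]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # local: everything runs on one machine
    print(" ".join(build_spark_submit("my_app.py", "local[*]")))
    # cluster-client: driver on the submitting machine, executors in the cluster
    print(" ".join(build_spark_submit("my_app.py", "yarn", "client")))
    # cluster-cluster: driver and executors both inside the cluster
    print(" ".join(build_spark_submit("my_app.py", "yarn", "cluster")))
```

When --deploy-mode is omitted, spark-submit defaults to client mode.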

Cluster resource management

Cluster manager types

  • Standalone – a simple cluster manager included with Spark that makes it easy to set up a cluster.
  • Apache Mesos – a general cluster manager that can also run Hadoop MapReduce and service applications.
  • Hadoop YARN – the resource manager in Hadoop 2.
  • Kubernetes – an open-source system for automating deployment, scaling, and management of containerized applications.
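
In practice, the cluster manager is selected by the master URL passed to --master. The sketch below maps the common URL forms to the manager types listed above; cluster_manager_for is a hypothetical helper, and the host names and ports in the examples are placeholders.

```python
def cluster_manager_for(master):
    """Return the cluster manager implied by a Spark master URL (illustrative helper)."""
    if master == "local" or master.startswith("local["):
        return "local (no cluster manager)"
    if master.startswith("spark://"):
        return "Standalone"
    if master.startswith("mesos://"):
        return "Apache Mesos"
    if master == "yarn":
        return "Hadoop YARN"
    if master.startswith("k8s://"):
        return "Kubernetes"
    raise ValueError(f"unrecognized master URL: {master!r}")


if __name__ == "__main__":
    # Placeholder hosts and ports, for illustration only.
    for url in [
        "local[*]",
        "spark://master-host:7077",
        "mesos://mesos-host:5050",
        "yarn",
        "k8s://https://k8s-apiserver:6443",
    ]:
        print(f"{url} -> {cluster_manager_for(url)}")
```

With YARN, the master URL is just the literal yarn; the ResourceManager address is read from the Hadoop configuration rather than from the URL.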