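Now that we have covered the basic communication principles in Spark, let's walk through the application submission flow to see how endpoints actually talk to each other. Using Standalone deploy mode as the example, we will look at how the driver registers with the master, how it submits an application, and so on. It helps to read the StandaloneAppClient source first; the analysis that follows will then be much easier to follow.
Reading the StandaloneAppClient source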
import java.util.concurrent._
import java.util.concurrent.{Future => JFuture, ScheduledFuture => JScheduledFuture}
import java.util.concurrent.atomic.{AtomicBoolean, AtomicReference}
import scala.concurrent.Future
import scala.util.{Failure, Success}
import scala.util.control.NonFatal
import org.apache.spark.SparkConf
import org.apache.spark.deploy.{ApplicationDescription, ExecutorState}
import org.apache.spark.deploy.DeployMessages._
import org.apache.spark.deploy.master.Master
import org.apache.spark.internal.Logging
import org.apache.spark.rpc._
import org.apache.spark.scheduler.ExecutorDecommissionInfo
import org.apache.spark.util.{RpcUtils, ThreadUtils}
/**
* Interface allowing applications to speak with a Spark standalone cluster manager.
*
* Takes a master URL, an app description, and a listener for cluster events, and calls
* back the listener when various events occur
*
* @param masterUrls Each url should look like spark://host:port.
*/
private[spark] class StandaloneAppClient(
rpcEnv: RpcEnv,
masterUrls: Array[String],
appDescription: ApplicationDescription,
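//Callbacks invoked by the deploy client when various events occur. There are currently five events:
//connecting to the cluster, disconnecting, being given an executor, having an executor removed (due to failure or revocation), and having a worker removed.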
listener: StandaloneAppClientListener,
conf: SparkConf)
extends Logging {
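//Resolve each master URL into an RPC address; a high-availability cluster usually has an active and a standby master
//map transforms every element of the array
//_ stands for each String element of the array
private val masterRpcAddresses = masterUrls.map(RpcAddress.fromSparkURL(_))
//Registration timeout, in seconds
private val REGISTRATION_TIMEOUT_SECONDS = 20
//Maximum number of registration attempts
private val REGISTRATION_RETRIES = 3
//Reference to this client's own endpoint
private val endpoint = new AtomicReference[RpcEndpointRef]
//Application ID assigned by the Master on registration
private val appId = new AtomicReference[String]
//Whether registration with a Master has succeeded
private val registered = new AtomicBoolean(false)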
private class ClientEndpoint(override val rpcEnv: RpcEnv) extends ThreadSafeRpcEndpoint
with Logging {
//Reference to the current master
private var master: Option[RpcEndpointRef] = None
// To avoid calling listener.disconnected() multiple times
//(tracks whether the disconnect has already been reported)
private var alreadyDisconnected = false
// To avoid calling listener.dead() multiple times
//(tracks whether the client has already been marked dead)
private val alreadyDead = new AtomicBoolean(false)
//AtomicReference provides an object reference whose reads and writes are atomic.
//Atomic means that several threads changing the same AtomicReference (for example with compare-and-swap) cannot leave it in an inconsistent state.
//Here it holds the Futures of the registration tasks submitted for each Master so that they can be cancelled later
private val registerMasterFutures = new AtomicReference[Array[JFuture[_]]]
//Handle to the scheduled registration-retry task
private val registrationRetryTimer = new AtomicReference[JScheduledFuture[_]]
// A thread pool for registering with masters. Because registering with a master is a blocking
// action, this thread pool must be able to create "masterRpcAddresses.size" threads at the same
// time so that we can register with all masters.
private val registerMasterThreadPool = ThreadUtils.newDaemonCachedThreadPool(
"appclient-register-master-threadpool",
masterRpcAddresses.length // Make sure we can register with all masters at the same time
)
// A scheduled executor for scheduling the registration actions
//(it drives the retry attempts)
private val registrationRetryThread =
ThreadUtils.newDaemonSingleThreadScheduledExecutor("appclient-registration-retry-thread")
override def onStart(): Unit = {
try {
//First attempt to register with the Master
registerWithMaster(1)
} catch {
case e: Exception =>
logWarning("Failed to connect to master", e)
markDisconnected()
stop()
}
}
/**
* Register with all masters asynchronously and returns an array `Future`s for cancellation.
*/
private def tryRegisterAllMasters(): Array[JFuture[_]] = {
//Iterate over every master address in the cluster and register with each; a high-availability cluster usually has a standby master
//for combined with yield collects one result per element, here into an Array
//Each JFuture is a handle to a submitted task and can be used to cancel it
for (masterAddress <- masterRpcAddresses) yield {
registerMasterThreadPool.submit(new Runnable {
override def run(): Unit = try {
//Return immediately if registration has already succeeded
if (registered.get) {
return
}
logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
//Obtain a reference to the master endpoint
val masterRef = rpcEnv.setupEndpointRef(masterAddress, Master.ENDPOINT_NAME)
//Send the registration message to the master through that reference
masterRef.send(RegisterApplication(appDescription, self))
} catch {
case ie: InterruptedException => // Cancelled
case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
}
})
}
}
/**
* Register with all masters asynchronously. It will call `registerWithMaster` every
* REGISTRATION_TIMEOUT_SECONDS seconds until exceeding REGISTRATION_RETRIES times.
* Once we connect to a master successfully, all scheduling work and Futures will be cancelled.
*
* nthRetry means this is the nth attempt to register with master.
*/
private def registerWithMaster(nthRetry: Int): Unit = {
//Register with all masters and keep the Futures of the submitted tasks
registerMasterFutures.set(tryRegisterAllMasters())
registrationRetryTimer.set(registrationRetryThread.schedule(new Runnable {
override def run(): Unit = {
//On success the master replies with RegisteredApplication, which receive handles by setting registered to true
if (registered.get) {
//Already registered, so cancel the outstanding registration tasks
registerMasterFutures.get.foreach(_.cancel(true))
//Shut down the registration thread pool
registerMasterThreadPool.shutdownNow()
} else if (nthRetry >= REGISTRATION_RETRIES) {//Give up once the third attempt has also timed out
markDead("All masters are unresponsive! Giving up.")
} else {
//Cancel the current registration tasks
registerMasterFutures.get.foreach(_.cancel(true))
//Try to register again
registerWithMaster(nthRetry + 1)
}
}
}, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS))
}
/**
* Send a message to the current master. If we have not yet registered successfully with any
* master, the message will be dropped.
*/
private def sendToMaster(message: Any): Unit = {
//Pattern match on the Option
master match {
//Some() wraps a value that is present:
//if the master reference exists, send the message through it
case Some(masterRef) => masterRef.send(message)
//If there is no master reference, i.e. registration has not succeeded yet, the message is dropped
case None => logWarning(s"Drop $message because has not yet connected to master")
}
}
//Check whether the address belongs to one of the configured masters
private def isPossibleMaster(remoteAddress: RpcAddress): Boolean = {
masterRpcAddresses.contains(remoteAddress)
}
//Handles messages that do not need a reply
override def receive: PartialFunction[Any, Unit] = {
//Registration-success message sent back by the Master
case RegisteredApplication(appId_, masterRef) =>
// FIXME How to handle the following cases?
// 1. A master receives multiple registrations and sends back multiple
// RegisteredApplications due to an unstable network.
// 2. Receive multiple RegisteredApplication from different masters because the master is
// changing.
//Store the application ID returned by the Master
appId.set(appId_)
//Record that registration has succeeded
registered.set(true)
//Store the reference to the Master
master = Some(masterRef)
//Notify the listener that the client is connected
listener.connected(appId.get)
//The application has been removed
case ApplicationRemoved(message) =>
//The Master removed the application, so mark the client dead
markDead("Master removed our application: %s".format(message))
//Stop the client
stop()
//An executor has been added
case ExecutorAdded(id: Int, workerId: String, hostPort: String, cores: Int, memory: Int) =>
val fullId = appId + "/" + id
logInfo("Executor added: %s on %s (%s) with %d core(s)".format(fullId, workerId, hostPort,
cores))
//Notify the listener that an executor was added
listener.executorAdded(fullId, workerId, hostPort, cores, memory)
//An executor's state has changed
case ExecutorUpdated(id, state, message, exitStatus, workerHost) =>
val fullId = appId + "/" + id
val messageText = message.map(s => " (" + s + ")").getOrElse("")
logInfo("Executor updated: %s is now %s%s".format(fullId, state, messageText))
//If the executor has finished
if (ExecutorState.isFinished(state)) {
//Notify the listener that the executor was removed
listener.executorRemoved(fullId, message.getOrElse(""), exitStatus, workerHost)
} else if (state == ExecutorState.DECOMMISSIONED) {//The executor has been decommissioned
listener.executorDecommissioned(fullId,
ExecutorDecommissionInfo(message.getOrElse(""), workerHost))
}
//A worker has been removed
case WorkerRemoved(id, host, message) =>
logInfo("Master removed worker %s: %s".format(id, message))
//Notify the listener that the worker was removed
listener.workerRemoved(id, host, message)
//The Master has changed; masterRef refers to the new active Master
case MasterChanged(masterRef, masterWebUiUrl) =>
logInfo("Master has changed, new master is at " + masterRef.address.toSparkURL)
//Switch to the new Master
master = Some(masterRef)
//Reset the disconnected flag
alreadyDisconnected = false
//Acknowledge the master change to the new Master
masterRef.send(MasterChangeAcknowledged(appId.get))
}
//Handles messages that need a reply
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
//Request to stop the client
case StopAppClient =>
//Mark the application as stopped
markDead("Application has been stopped.")
//Ask the Master to unregister the application
sendToMaster(UnregisterApplication(appId.get))
//Reply to the sender that the client has been stopped
context.reply(true)
//Stop the client endpoint
stop()
//Request executors
case r: RequestExecutors =>
master match {
//If the Master reference is set, i.e. registration succeeded, forward r to the Master and relay its answer to the caller asynchronously
case Some(m) => askAndReplyAsync(m, context, r)
//Not registered with a master yet
case None =>
logWarning("Attempted to request executors before registering with Master.")
//Reply false straight away
context.reply(false)
}
//Request to kill executors
case k: KillExecutors =>
master match {
//If the Master reference is set, i.e. registration succeeded, forward k to the Master and relay its answer to the caller asynchronously
case Some(m) => askAndReplyAsync(m, context, k)
case None =>
logWarning("Attempted to kill executors before registering with Master.")
context.reply(false)
}
}
//Ask the endpoint and relay its answer asynchronously
private def askAndReplyAsync[T](
//Target endpoint reference
endpointRef: RpcEndpointRef,
//Call context used to reply to the original sender
context: RpcCallContext,
//Message to send
msg: T): Unit = {
// Ask a message and create a thread to reply with the result. Allow thread to be
// interrupted during shutdown, otherwise context must be notified of NonFatal errors.
//Send the message through the endpoint reference
endpointRef.ask[Boolean](msg).andThen {
//The ask succeeded: pass the answer back to the caller
case Success(b) => context.reply(b)
//The ask was interrupted (cancelled): do nothing
case Failure(ie: InterruptedException) => // Cancelled
//The ask failed: report the error to the caller
case Failure(NonFatal(t)) => context.sendFailure(t)
}(ThreadUtils.sameThread)
}
//Called when the connection to address is lost
override def onDisconnected(address: RpcAddress): Unit = {
//If the lost connection is the one to the current Master
if (master.exists(_.address == address)) {
logWarning(s"Connection to $address failed; waiting for master to reconnect...")
//Mark the client as disconnected
markDisconnected()
}
}
//Called on a network error
override def onNetworkError(cause: Throwable, address: RpcAddress): Unit = {
//If the address is one of the configured masters
if (isPossibleMaster(address)) {
//Log that the Master could not be reached, with the cause
logWarning(s"Could not connect to $address: $cause")
}
}
/**
* Notify the listener that we disconnected, if we hadn't already done so before.
*/
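def markDisconnected(): Unit = {
//If the disconnect has not been reported yet
if (!alreadyDisconnected) {
//Notify the listener of the disconnect
listener.disconnected()
//Remember that it has been reported
alreadyDisconnected = true
}
}
//Mark the client as dead
def markDead(reason: String): Unit = {
//If the client has not been marked dead yet
if (!alreadyDead.get) {
//Notify the listener that the application is dead, with the reason
listener.dead(reason)
//Remember that the client is dead
alreadyDead.set(true)
}
}
//Clean up registration state when the endpoint stops
override def onStop(): Unit = {
//If a retry task has been scheduled
if (registrationRetryTimer.get != null) {
//Cancel it
registrationRetryTimer.get.cancel(true)
}
//Shut down the retry scheduler
registrationRetryThread.shutdownNow()
//Cancel every outstanding registration task
registerMasterFutures.get.foreach(_.cancel(true))
//Shut down the registration thread pool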
registerMasterThreadPool.shutdownNow()
}
}
//Start the client
def start(): Unit = {
// Just launch an rpcEndpoint; it will call back into the listener.
endpoint.set(rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)))
}
//Stop the client
def stop(): Unit = {
//If the endpoint reference is set
if (endpoint.get != null) {
try {
//Default Spark timeout for RPC ask operations (120s by default)
val timeout = RpcUtils.askRpcTimeout(conf)
//Wait for the result and return it. If no result arrives within the timeout, an RpcTimeoutException is thrown naming the configuration that controls the timeout.
//StopAppClient is an internal message type asking the client endpoint to stop
timeout.awaitResult(endpoint.get.ask[Boolean](StopAppClient))
} catch {
case e: TimeoutException =>
logInfo("Stop request to Master timed out; it may already be shut down.")
}
//Clear the endpoint reference
endpoint.set(null)
}
}
/**
* Request executors from the Master by specifying the total number desired,
* including existing pending and running executors.
*
* @return whether the request is acknowledged.
*/
def requestTotalExecutors(requestedTotal: Int): Future[Boolean] = {
//If the endpoint is set and the application has an ID
if (endpoint.get != null && appId.get != null) {
//Ask the client endpoint to request executors for the application
endpoint.get.ask[Boolean](RequestExecutors(appId.get, requestedTotal))
} else {
//The driver must be fully initialized before executors can be requested
logWarning("Attempted to request executors before driver fully initialized.")
//Report that the request was not acknowledged
Future.successful(false)
}
}
/**
* Kill the given list of executors through the Master.
* @return whether the kill request is acknowledged.
*/
def killExecutors(executorIds: Seq[String]): Future[Boolean] = {
//If the endpoint is set and the application has an ID
if (endpoint.get != null && appId.get != null) {
//Ask the client endpoint to kill the given executors of the application
endpoint.get.ask[Boolean](KillExecutors(appId.get, executorIds))
} else {
//The driver must be fully initialized before executors can be killed
logWarning("Attempted to kill executors before driver fully initialized.")
//Report that the request was not acknowledged
Future.successful(false)
}
}
}
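When StandaloneAppClient starts, the first thing it does is register with the master. The method responsible is registerWithMaster, and the argument 1 marks this as the first registration attempt:
override def onStart(): Unit = {
try {
//First attempt to register with the Master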
registerWithMaster(1)
} catch {
case e: Exception =>
logWarning("Failed to connect to master", e)
markDisconnected()
stop()
}
}
Next, let's look at how registerWithMaster actually registers:
/**
* Register with all masters asynchronously. It will call `registerWithMaster` every
* REGISTRATION_TIMEOUT_SECONDS seconds until exceeding REGISTRATION_RETRIES times.
* Once we connect to a master successfully, all scheduling work and Futures will be cancelled.
*
* nthRetry means this is the nth attempt to register with master.
*/
private def registerWithMaster(nthRetry: Int): Unit = {
//Register with all masters and keep the Futures of the submitted tasks
registerMasterFutures.set(tryRegisterAllMasters())
registrationRetryTimer.set(registrationRetryThread.schedule(new Runnable {
override def run(): Unit = {
//On success the master replies with RegisteredApplication, which receive handles by setting registered to true
if (registered.get) {
//Already registered, so cancel the outstanding registration tasks
registerMasterFutures.get.foreach(_.cancel(true))
//Shut down the registration thread pool
registerMasterThreadPool.shutdownNow()
} else if (nthRetry >= REGISTRATION_RETRIES) {//Give up once the third attempt has also timed out
markDead("All masters are unresponsive! Giving up.")
} else {
//Cancel the current registration tasks
registerMasterFutures.get.foreach(_.cancel(true))
//Try to register again
registerWithMaster(nthRetry + 1)
}
}
}, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS))
}
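StandaloneAppClient registers with the master asynchronously. By default it makes at most 3 attempts, one every 20 seconds, and gives up if the third attempt also times out. registerMasterFutures holds the Futures of the registration tasks submitted for all masters so that they can be cancelled, and tryRegisterAllMasters does the actual registration with every master: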
/**
* Register with all masters asynchronously and returns an array `Future`s for cancellation.
*/
private def tryRegisterAllMasters(): Array[JFuture[_]] = {
//Iterate over every master address in the cluster and register with each; a high-availability cluster usually has a standby master
//for combined with yield collects one result per element, here into an Array
//Each JFuture is a handle to a submitted task and can be used to cancel it
for (masterAddress <- masterRpcAddresses) yield {
registerMasterThreadPool.submit(new Runnable {
override def run(): Unit = try {
//Return immediately if registration has already succeeded
if (registered.get) {
return
}
logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
//Obtain a reference to the master endpoint
val masterRef = rpcEnv.setupEndpointRef(masterAddress, Master.ENDPOINT_NAME)
//Send the registration message to the master through that reference
masterRef.send(RegisterApplication(appDescription, self))
} catch {
case ie: InterruptedException => // Cancelled
case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
}
})
}
}
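The client iterates over the master addresses it holds and hands each registration to a thread from the registration thread pool. Each task first checks whether registration has already succeeded; if not, it resolves a master reference from the address and sends the registration message to that master through the reference.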
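The fan-out described above can be sketched without any Spark dependency. This is a minimal illustration, not Spark's implementation: `FanOutSketch`, `submitAll`, and the `send` callback are hypothetical names, and a plain `Executors` pool stands in for Spark's `ThreadUtils` helper.

```scala
import java.util.concurrent.{Executors, Future => JFuture}
import java.util.concurrent.atomic.AtomicBoolean

object FanOutSketch {
  // Submits one task per master address and returns the Futures so that
  // a caller could cancel them. `send` is a hypothetical stand-in for
  // resolving the master reference and sending RegisterApplication.
  def submitAll(addresses: Array[String], registered: AtomicBoolean)(
      send: String => Unit): Array[JFuture[_]] = {
    // One thread per address, so every master can be contacted at once.
    val pool = Executors.newFixedThreadPool(math.max(addresses.length, 1))
    try {
      for (address <- addresses) yield {
        pool.submit(new Runnable {
          // Skip the send if another task has already registered.
          override def run(): Unit = if (!registered.get) send(address)
        })
      }
    } finally {
      // Already-submitted tasks still run; the pool just accepts no new ones.
      pool.shutdown()
    }
  }
}
```

Returning the Futures rather than waiting on them is what lets the retry logic cancel a whole round of attempts at once.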
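The scheduled retry task, tracked by registrationRetryTimer, then bounds each attempt by the registration timeout and keeps the number of attempts within the default limit. If registration has already succeeded, the remaining registration tasks are cancelled through registerMasterFutures and the registration thread pool is shut down. If the attempt limit has been reached, the client is marked dead and gives up; otherwise it cancels the current tasks and tries again.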
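To make the retry logic easier to see in isolation, here is a small sketch of the same pattern: try, wait for a timeout, then either stop, give up, or try again. `RetryingRegistrar` and its constructor parameters are hypothetical; the sketch uses milliseconds so it can be run quickly, whereas the source uses a 20-second timeout.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper mirroring the shape of registerWithMaster.
class RetryingRegistrar(
    maxAttempts: Int,
    timeoutMillis: Long,
    tryRegister: Int => Unit,   // fires one round of registration messages
    registered: AtomicBoolean,  // set elsewhere when a master answers
    onGiveUp: String => Unit) {

  private val timer: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor()

  def register(nthAttempt: Int): Unit = {
    tryRegister(nthAttempt)
    // Check the outcome once the timeout has elapsed.
    timer.schedule(new Runnable {
      override def run(): Unit = {
        if (registered.get) {
          timer.shutdown()              // success: nothing left to schedule
        } else if (nthAttempt >= maxAttempts) {
          onGiveUp("All masters are unresponsive! Giving up.")
          timer.shutdown()
        } else {
          register(nthAttempt + 1)      // schedule the next attempt
        }
      }
    }, timeoutMillis, TimeUnit.MILLISECONDS)
  }
}
```

With `maxAttempts = 3` and no master ever answering, `tryRegister` runs three times before `onGiveUp` is called, which matches the `nthRetry >= REGISTRATION_RETRIES` check in the source.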
Once the master receives the RegisterApplication message sent by StandaloneAppClient, it handles it as follows:
//Register the application
case RegisterApplication(description, driver) =>
// TODO Prevent repeated registrations from some driver
//A standby Master does nothing
if (state == RecoveryState.STANDBY) {
// ignore, don't send response
} else {
logInfo("Registering app " + description.name)
//Create the application
val app = createApplication(description, driver)
//Register the application
registerApplication(app)
logInfo("Registered app " + description.name + " with ID " + app.id)
//Add the application to the persistence engine
persistenceEngine.addApplication(app)
//Tell the driver that submitted the application that it has been registered
driver.send(RegisteredApplication(app.id, self))
//Start scheduling
schedule()
}
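Note: as covered in the earlier discussion of the principles, an endpoint handles incoming messages in receive; the master pattern-matches on the type of each message it receives.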
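As a rough illustration of that dispatch style, the following sketch matches on message types with a `PartialFunction`, much as `receive` does. The message classes and `ReceiveSketch` are made up for this example and are not Spark's `DeployMessages`.

```scala
// Hypothetical message types, used only for this illustration.
sealed trait DemoMessage
final case class RegisterApp(name: String) extends DemoMessage
final case class RegisteredApp(appId: String) extends DemoMessage
case object Heartbeat extends DemoMessage

object ReceiveSketch {
  // Like an endpoint's receive: only the listed cases are handled.
  val receive: PartialFunction[Any, String] = {
    case RegisterApp(name)    => s"registering $name"
    case RegisteredApp(appId) => s"registered as $appId"
  }

  // Anything the partial function does not cover falls through to a default.
  def dispatch(message: Any): String =
    receive.applyOrElse(message, (m: Any) => s"unhandled: $m")
}
```

Because the handler is a partial function, a message with no matching case is simply not defined for it, which the caller can detect with `isDefinedAt` or handle through `applyOrElse`.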
The master first checks its own state. A standby master ignores the message; otherwise it creates the application with createApplication:
//Create the application
private def createApplication(desc: ApplicationDescription, driver: RpcEndpointRef):
ApplicationInfo = {
val now = System.currentTimeMillis()
val date = new Date(now)
val appId = newApplicationId(date)
new ApplicationInfo(now, appId, desc, date, driver, defaultCores)
}
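Once the application has been created, it is registered:
//Register the application
private def registerApplication(app: ApplicationInfo): Unit = {
//Use the driver's address as the application's address
val appAddress = app.driver.address
//If an application is already registered at this address
if (addressToApp.contains(appAddress)) {
logInfo("Attempted to re-register application at same address: " + appAddress)
return
}
//Register the application's source with the application metrics system
applicationMetricsSystem.registerSource(app.appSource)
//Add the application to the set of apps
apps += app
//Index the application by ID
idToApp(app.id) = app
//Index the application by driver endpoint
endpointToApp(app.driver) = app
//Index the application by address
addressToApp(appAddress) = app
//Append the application to the list of applications waiting to be scheduled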
waitingApps += app
}
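Registering the application: the address of the driver that submitted the application is used as the application's address, which shows that StandaloneAppClient runs on the driver side and is what submits the application. The master then checks addressToApp (a map from address to application) to see whether an application at that address is registering again, and registers the application with applicationMetricsSystem (the metrics system collects metrics, for example for display in the UI). It then records the application by ID, endpoint, and address, and finally appends it to the list of applications waiting to be scheduled, which a later section returns to.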
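The bookkeeping in registerApplication boils down to a duplicate check followed by a few index updates. The sketch below shows that idea with plain collections; `AppInfo` and `AppRegistry` are hypothetical simplifications (addresses are plain strings, and the metrics and endpoint indexes are left out).

```scala
import scala.collection.mutable

// Hypothetical, stripped-down stand-in for ApplicationInfo.
final case class AppInfo(id: String, driverAddress: String)

class AppRegistry {
  private val idToApp = mutable.HashMap.empty[String, AppInfo]
  private val addressToApp = mutable.HashMap.empty[String, AppInfo]
  private val waitingApps = mutable.ArrayBuffer.empty[AppInfo]

  // Returns false when an application is already registered at the address.
  def register(app: AppInfo): Boolean = {
    if (addressToApp.contains(app.driverAddress)) {
      false
    } else {
      idToApp(app.id) = app
      addressToApp(app.driverAddress) = app
      waitingApps += app // queued until the scheduler picks it up
      true
    }
  }

  def waiting: Seq[AppInfo] = waitingApps.toSeq
}
```

Keying the duplicate check on the driver address rather than the application ID means a second registration from the same driver is rejected even though it would have been given a fresh ID.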
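After registration, the application is added to the persistence engine so that its state is saved, and the master then sends a message to the driver:
//Tell the driver that submitted the application that it has been registered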
driver.send(RegisteredApplication(app.id, self))
When the driver receives the RegisteredApplication message from the master, it handles it as follows:
//Registration-success message sent back by the Master
case RegisteredApplication(appId_, masterRef) =>
// FIXME How to handle the following cases?
// 1. A master receives multiple registrations and sends back multiple
// RegisteredApplications due to an unstable network.
// 2. Receive multiple RegisteredApplication from different masters because the master is
// changing.
//Store the application ID returned by the Master
appId.set(appId_)
//Record that registration has succeeded
registered.set(true)
//Store the reference to the Master
master = Some(masterRef)
//Notify the listener that the client is connected
listener.connected(appId.get)
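On receiving the registration-success message, the driver first stores the application ID, then sets the flag marking it as registered, then stores the reference to the master, and finally notifies its listener through the connected callback.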
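Those four steps can be sketched as a small state holder with a listener callback. `ClientListener` and `ClientState` are hypothetical names for this illustration, and the master is represented by a plain address string instead of an RpcEndpointRef.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicReference}

// Hypothetical, single-callback version of the listener interface.
trait ClientListener {
  def connected(appId: String): Unit
}

class ClientState(listener: ClientListener) {
  val appId = new AtomicReference[String]
  val registered = new AtomicBoolean(false)
  private var master: Option[String] = None

  // Mirrors the order of operations in the RegisteredApplication case.
  def onRegistered(newAppId: String, masterAddress: String): Unit = {
    appId.set(newAppId)           // 1. store the application ID
    registered.set(true)          // 2. mark registration as done
    master = Some(masterAddress)  // 3. remember which master answered
    listener.connected(appId.get) // 4. notify the listener
  }

  def currentMaster: Option[String] = master
}
```

Setting `registered` before invoking the listener means that by the time the callback runs, the retry task already sees the registration as complete.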
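Finally, the master calls schedule() to schedule resources, which the next section covers.
Summary: this section covered how an application is submitted in Standalone mode, including how messages are sent and handled. It touched on part of what the master does when it receives a message; the master source is examined further in the next section. Understanding the full path of submission, handling, scheduling, and execution involves both the master and the worker, so it is best to jump back and forth between them and read the code line by line; otherwise some parts will be hard to follow.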