Parent-child relationships between Kotlin coroutines, part 1: how Jobs are linked


Example

Runtime environment: JDK 11

implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.9.0")
implementation("org.jetbrains.kotlin:kotlin-stdlib:2.0.21")
import java.time.LocalTime
import java.util.concurrent.Executors
import kotlinx.coroutines.*

private fun logX(any: Any?) {
    println("[Time:${LocalTime.now()} Thread:${Thread.currentThread().name}] $any ".trimIndent())
}

private fun wait(timeOut: Int, char: Char = '.') {
    var time = 0L
    val min = 150L
    if (timeOut < min) return
    while (time <= timeOut) {
        Thread.sleep(min)
        time += min
        print(char)
    }
    println()
}

fun main() {
    // Use a single-threaded dispatcher
    val dispatcher = Executors.newSingleThreadScheduledExecutor().asCoroutineDispatcher()
    
    val parentJob = Job()
   
    val scope = CoroutineScope(dispatcher + parentJob)
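    // Coroutine 1
    val coroutine1 = scope.launch {
        logX("coroutine1 start")
        // Coroutine 2
        val coroutine2 = launch {
            logX("coroutine2 start")
            // Suspend the child job for a very long time so we can observe the parent-child job relationship
            delay(Long.MAX_VALUE - 1)
            logX("coroutine2 end")
        }

        // Coroutine 3
        val coroutine3 = launch {
            logX("coroutine3 start")
            // Suspend the child job for a very long time so we can observe the parent-child job relationship
            delay(Long.MAX_VALUE - 1)
            logX("coroutine3 end")
        }

        // Joining coroutine 3 keeps coroutine 1 from finishing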
        coroutine3.join()
        logX("coroutine1 end")
    }

    // Wait for the child jobs
    wait(10000)

    // Cancel
    scope.cancel()
    // Spin until parentJob and all of its children have completed
    while (!parentJob.isCompleted)
        Thread.yield()

    // Shut down the thread pool
    dispatcher.close()
    logX("main end")
}

Output

[Time:18:37:30.848391300 Thread:pool-1-thread-1] coroutine1 start 
[Time:18:37:30.884485100 Thread:pool-1-thread-1] coroutine2 start 
[Time:18:37:30.885488700 Thread:pool-1-thread-1] coroutine3 start 
...................................................................
[Time:18:37:41.289701 Thread:main] main end 

parentJob attaches coroutine 1

public fun Job(parent: Job? = null): CompletableJob = JobImpl(parent)
public interface CompletableJob : Job {
    // Completes this Job
    public fun complete(): Boolean
    
    // Completes this Job with the given exception
    public fun completeExceptionally(exception: Throwable): Boolean
}

parentJob is a CompletableJob. A job of this type can be completed with complete(), or completed exceptionally with completeExceptionally().

internal open class JobImpl(parent: Job?) : JobSupport(true), CompletableJob {
    init { initParentJob(parent) } // parent is null here
    
    /**
     * Returns `true` for job that do not have "body block" to complete and should immediately go into
     * completing state and start waiting for children.
     */
    override val onCancelComplete get() = true

	......
}

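JobImpl also extends JobSupport. This is a very important class: a node that extends it can act both as a child and as a parent.

When Job() is called without arguments, the new job has no parent Job, so it is itself the top-level Job.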
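The behaviour of a standalone CompletableJob can be checked through the public API alone. The following is a minimal sketch, assuming kotlinx-coroutines-core is on the classpath; the function name completeStandaloneJob is made up for this illustration.

```kotlin
import kotlinx.coroutines.Job

// Hypothetical helper: records what a parentless Job() reports before and after complete().
fun completeStandaloneJob(): List<Boolean> {
    val job = Job()                    // no parent argument, so this is a top-level job
    val activeBefore = job.isActive    // Job() is created in the active state
    val firstCall = job.complete()     // true: this call completed the job
    val secondCall = job.complete()    // false: the job was already completed
    return listOf(activeBefore, firstCall, secondCall, job.isCompleted, job.isCancelled)
}

fun main() {
    println(completeStandaloneJob())   // [true, true, false, true, false]
}
```

Because the job has no children, complete() moves it straight to the completed state; with children attached it would first wait for them.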

public fun CoroutineScope.launch(
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    block: suspend CoroutineScope.() -> Unit
): Job {
    val newContext = newCoroutineContext(context)
    val coroutine = if (start.isLazy)
        LazyStandaloneCoroutine(newContext, block) else
        StandaloneCoroutine(newContext, active = true)
    coroutine.start(start, coroutine, block)
    return coroutine
}

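When execution reaches the outermost launch, look at the value of newContext. It is a CombinedContext, i.e. a composite context: its element is the thread-pool dispatcher we defined, and its left points to a Job.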

image-20241030161756829.png
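The same composition can be observed without a debugger by looking elements up by key. This is a small sketch under the assumption that the scope is built as in the example; contextElements is a hypothetical helper name.

```kotlin
import java.util.concurrent.Executors
import kotlin.coroutines.ContinuationInterceptor
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.asCoroutineDispatcher

// Hypothetical helper: checks that the scope's context holds exactly the Job and dispatcher we passed in.
fun contextElements(): Pair<Boolean, Boolean> {
    val dispatcher = Executors.newSingleThreadExecutor().asCoroutineDispatcher()
    try {
        val parentJob = Job()
        val scope = CoroutineScope(dispatcher + parentJob)
        val ctx = scope.coroutineContext
        // A dispatcher is stored under the ContinuationInterceptor key, the job under the Job key.
        return (ctx[Job] === parentJob) to (ctx[ContinuationInterceptor] === dispatcher)
    } finally {
        dispatcher.close()             // release the worker thread
    }
}

fun main() {
    println(contextElements())         // (true, true)
}
```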

Next, StandaloneCoroutine is initialized. StandaloneCoroutine extends AbstractCoroutine:

private open class StandaloneCoroutine(
    parentContext: CoroutineContext,
    active: Boolean
) : AbstractCoroutine<Unit>(parentContext, initParentJob = true, active = active) {
    override fun handleJobException(exception: Throwable): Boolean {
        handleCoroutineException(context, exception)
        return true
    }
}
public abstract class AbstractCoroutine<in T>(
    parentContext: CoroutineContext,
    initParentJob: Boolean,
    active: Boolean
) : JobSupport(active), Job, Continuation<T>, CoroutineScope {
	......
	
	 init {
        // 
        if (initParentJob) initParentJob(parentContext[Job])
    }
	
	.....
}

StandaloneCoroutine always passes initParentJob = true. parentContext is the newContext created above, which contains a Job, so parentContext[Job] is not null. Its value is the parentJob from our example.

 protected fun initParentJob(parent: Job?) {
        assert { parentHandle == null }
        if (parent == null) {
            parentHandle = NonDisposableHandle
            return
        }
        // start() makes sure the parent coroutine is also started when the child starts;
        // it only changes the parent's state
        parent.start() // make sure the parent is started
     
        val handle = parent.attachChild(this)
        parentHandle = handle
        // now check our state _after_ registering (see tryFinalizeSimpleState order of actions)
        if (isCompleted) {
            handle.dispose()
            parentHandle = NonDisposableHandle // release it just in case, to aid GC
        }
    }

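JobSupport has a state field that represents the current state of the Job. On initialization it is an Empty, i.e. EMPTY_ACTIVE or EMPTY_NEW, and it stays Empty as long as the job has no child nodes. With one child, the value becomes a ChildHandleNode. With several children, it becomes a NodeList. NodeList is a doubly linked list whose head is the NodeList itself; the ChildHandleNode entries in the list represent the job's child coroutines. Other kinds of nodes can be in the list too; for example, a ListClosed node is added when the coroutine is about to finish.

If the parent coroutine finishes its own code block before a child does, the parent still has to check whether the child has finished, so it registers a ChildCompletion node in the child's NodeList.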
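ChildCompletion itself is internal, but its effect is visible from outside: a parent whose block has already returned is still not completed while a child is running. The sketch below is only an illustration; gate and bodyDone are hypothetical helpers introduced to make the ordering deterministic.

```kotlin
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: reports parent.isCompleted while the child is still running, and afterwards.
fun parentWaitsForChild(): List<Boolean> = runBlocking {
    val gate = CompletableDeferred<Unit>()      // holds the child open until we release it
    val bodyDone = CompletableDeferred<Unit>()  // signals that the parent's block reached its end
    val parent = launch {
        launch { gate.await() }                 // the child outlives the parent's own block
        bodyDone.complete(Unit)
    }
    bodyDone.await()
    val completedWhileChildRuns = parent.isCompleted  // false: the parent is waiting for its child
    gate.complete(Unit)                         // let the child finish
    parent.join()
    listOf(completedWhileChildRuns, parent.isCompleted)
}

fun main() {
    println(parentWaitsForChild())              // [false, true]
}
```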

private val _state = atomic<Any?>(if (active) EMPTY_ACTIVE else EMPTY_NEW)

internal val state: Any? get() = _state.value
private class ChildHandleNode(
    @JvmField val childJob: ChildJob
) : JobNode(), ChildHandle {
    override val parent: Job get() = job
    override val onCancelling: Boolean get() = true
    override fun invoke(cause: Throwable?) = childJob.parentCancelled(job)
    override fun childCancelled(cause: Throwable): Boolean = job.childCancelled(cause)
}
public final override fun attachChild(child: ChildJob): ChildHandle {
        // Create a ChildHandleNode
        val node = ChildHandleNode(child).also { it.job = this }
        val added = tryPutNodeIntoList(node) { _, list ->
            // Reaching this lambda means the Job's state is already a NodeList or Finishing
            // Append node to the NodeList doubly linked list
            // First, try to add a child along the cancellation handlers
            val addedBeforeCancellation = list.addLast(
                node,
                LIST_ON_COMPLETION_PERMISSION or LIST_CHILD_PERMISSION or LIST_CANCELLATION_PERMISSION
            )
            if (addedBeforeCancellation) {
                // The child managed to be added before the parent started to cancel or complete. Success.
                true
            } else {
                .....
            }
        }
        // Added successfully, return node
        if (added) return node
        /** We can only end up here if [tryPutNodeIntoList] detected a final state. */
        node.invoke((state as? CompletedExceptionally)?.cause)
        return NonDisposableHandle
    }
private inline fun tryPutNodeIntoList(
        node: JobNode,
        tryAdd: (Incomplete, NodeList) -> Boolean
    ): Boolean {
        // Loop, re-reading state on each iteration
        loopOnState { state ->
            // A freshly initialized job has state EMPTY_ACTIVE
            when (state) {
                is Empty -> { // EMPTY_X state -- no completion handlers
                    if (state.isActive) {
                        // try to move to the SINGLE state
                        // state changes from EMPTY_ACTIVE to the ChildHandleNode
                        if (_state.compareAndSet(state, node)) return true
                    } else
                        promoteEmptyToNodeList(state) // that way we can add listener for non-active coroutine
                }
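                // If state is already a ChildHandleNode, the current coroutine already has one child
                // If state is already a NodeList, tryAdd decides whether node can join the list
                is Incomplete -> when (val list = state.list) {
                    // If state is a ChildHandleNode, its list is null, so first create a NodeList,
                    // make it the new state, and add the previous ChildHandleNode to that list.
                    // Nothing is returned here, so the loop comes back around and adds node
                    null -> promoteSingleToNodeList(state as JobNode)
                    
                    // state is already a NodeList, i.e. the current coroutine already has children
                    // Use tryAdd to add node
                    else -> if (tryAdd(state, list)) return true
                }
                // Could not add the node: return false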
                else -> return false
            }
        }
    }

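tryPutNodeIntoList adds node to the parent coroutine. If state is already a ChildHandleNode, the current coroutine already has one child. ChildHandleNode is a subtype of Incomplete and its list is null, so a NodeList is created first, state is set to that NodeList, and the previous ChildHandleNode is added to the NodeList doubly linked list.

If state is already a NodeList, i.e. the current coroutine already has at least one child, adding another node goes through tryAdd.

Looking at parentJob's state, it has become a ChildHandleNode, which means it is linked to one child coroutine: coroutine 1 in the example.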

image-20241030170032262.png
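The internal state field is not accessible from user code, but the public children property reflects the same link. A minimal sketch, assuming Dispatchers.Default in place of the single-threaded dispatcher from the example; childrenOfParentJob is a hypothetical name.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.awaitCancellation
import kotlinx.coroutines.cancel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: returns (number of children of parentJob, whether the only child is coroutine1).
fun childrenOfParentJob(): Pair<Int, Boolean> = runBlocking {
    val parentJob = Job()
    val scope = CoroutineScope(Dispatchers.Default + parentJob)
    val coroutine1 = scope.launch { awaitCancellation() }
    // The child is attached while launch constructs the coroutine, so it is visible right away.
    val children = parentJob.children.toList()
    val result = children.size to (children.single() === coroutine1)
    scope.cancel()                     // cancels parentJob and, through it, coroutine1
    coroutine1.join()
    result
}

fun main() {
    println(childrenOfParentJob())     // (1, true)
}
```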

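Coroutine 1 attaches coroutine 2 and coroutine 3

When launch runs, it creates a StandaloneCoroutine, whose initialization calls initParentJob.

Following the logic above, coroutine 1's state starts out as EMPTY_ACTIVE. When execution reaches the launch of coroutine 2, initParentJob is called again. In other words, attaching is initiated by the child, which asks its parent to attach it:

parent.attachChild(this)

Coroutine 1's state changes from the initial EMPTY_ACTIVE to a ChildHandleNode.

Execution then reaches coroutine 3, which also attaches itself to its parent Job. When the parent job, coroutine 1, attaches coroutine 3, it finds that its own state is a ChildHandleNode. It creates a NodeList, adds the existing ChildHandleNode to it, switches its state to this NodeList, and finally adds coroutine 3 to the NodeList.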
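Seen from the public API, the result of these two attach steps is that coroutine 1 reports both nested coroutines as its children. The sketch below is an illustration only; it runs inside runBlocking rather than on the example's dispatcher, and coroutine1HasBothChildren is a made-up name.

```kotlin
import kotlinx.coroutines.Job
import kotlinx.coroutines.awaitCancellation
import kotlinx.coroutines.cancelChildren
import kotlinx.coroutines.job
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: true if coroutine 1's children are exactly coroutine 2 and coroutine 3.
fun coroutine1HasBothChildren(): Boolean = runBlocking {
    var result = false
    val coroutine1 = launch {
        val coroutine2 = launch { awaitCancellation() }
        val coroutine3 = launch { awaitCancellation() }
        // coroutineContext.job is coroutine 1 itself; both children attached during launch.
        val attached = coroutineContext.job.children.toSet()
        result = attached == setOf<Job>(coroutine2, coroutine3)
        coroutineContext.cancelChildren()   // let coroutine 1 finish
    }
    coroutine1.join()
    result
}

fun main() {
    println(coroutine1HasBothChildren())    // true
}
```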

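Our example, however, calls join(), so coroutine 1 suspends there and waits for coroutine3 to finish. Because coroutine3 delays for an extremely long time, it never finishes. The join call also registers a ChildContinuation node on the current coroutine (coroutine 1), so that when the parent coroutine is cancelled it can notify its children. Let's see how this node gets attached.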

// Same as ChildHandleNode, but for cancellable continuation
private class ChildContinuation(
    @JvmField val child: CancellableContinuationImpl<*>
) : JobNode() {
    
    // true means ChildContinuation is also interested in cancellation events
    // false would mean it is only interested in completion events
    override val onCancelling get() = true

    override fun invoke(cause: Throwable?) {
        child.parentCancelled(child.getContinuationCancellationCause(job))
    }
}
coroutine3.join()
public final override suspend fun join() {
        ....
        return joinSuspend() // slow-path wait
    }

	// cont is the CancellableContinuationImpl
    private suspend fun joinSuspend() = suspendCancellableCoroutine<Unit> { cont ->
        // We have to invoke join() handler only on cancellation, on completion we will be resumed regularly without handlers
        cont.disposeOnCancellation(invokeOnCompletion(handler = ResumeOnCompletion(cont)))
    }
public suspend inline fun <T> suspendCancellableCoroutine(
    crossinline block: (CancellableContinuation<T>) -> Unit
): T =
    suspendCoroutineUninterceptedOrReturn { uCont ->
        val cancellable = CancellableContinuationImpl(uCont.intercepted(), resumeMode = MODE_CANCELLABLE)
        cancellable.initCancellability()
        block(cancellable)
        cancellable.getResult()
    }

In other words, coroutine 1 can only resume once this CancellableContinuationImpl is resumed.

   public override fun initCancellability() {
      
        val handle = installParentHandle()
            ?: return // fast path -- don't do anything without parent
        // now check our state _after_ registering, could have completed while we were registering,
        // but only if parent was cancelled. Parent could be in a "cancelling" state for a while,
        // so we are helping it and cleaning the node ourselves
        if (isCompleted) {
            // Can be invoked concurrently in 'parentCancelled', no problems here
            handle.dispose()
            _parentHandle.value = NonDisposableHandle
        }
    }
private fun installParentHandle(): DisposableHandle? {
    	
        val parent = context[Job] ?: return null // don't do anything without a parent
        // Install the handle
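    	// When the parent coroutine is cancelled it can notify its child, because ChildContinuation holds this reference
    	// this is the CancellableContinuationImpl, which has a delegate, the DispatchedContinuation
    	// That DispatchedContinuation holds a reference to the enclosing coroutine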
        val handle = parent.invokeOnCompletion(handler = ChildContinuation(this))
        _parentHandle.compareAndSet(null, handle)
        return handle
    }

context is coroutine 1's context, so parent is coroutine 1's StandaloneCoroutine.

internal fun Job.invokeOnCompletion(
    invokeImmediately: Boolean = true,
    handler: JobNode,
): DisposableHandle = when (this) {
    is JobSupport -> invokeOnCompletionInternal(invokeImmediately, handler)
    else -> invokeOnCompletion(handler.onCancelling, invokeImmediately, handler::invoke)
}

this is coroutine 1's StandaloneCoroutine, which is a JobSupport.

internal fun invokeOnCompletionInternal(
        invokeImmediately: Boolean,
        node: JobNode
    ): DisposableHandle {
    	// node is the ChildContinuation; its job field is set to coroutine 1's StandaloneCoroutine
        node.job = this
        // Create node upfront -- for common cases it just initializes JobNode.job field,
        // for user-defined handlers it allocates a JobNode object that we might not need, but this is Ok.
        val added = tryPutNodeIntoList(node) { state, list ->
            // ChildContinuation's onCancelling is true, so it can respond to cancellation events
            if (node.onCancelling) {
                // No exception so far, so rootCause is null
                val rootCause = (state as? Finishing)?.rootCause
                if (rootCause == null) {
                    // Add to the list
                    list.addLast(node, LIST_CANCELLATION_PERMISSION or LIST_ON_COMPLETION_PERMISSION)
                } else {
                    .....
                }
            } else {
                ....
            }
        }
        when {
            added -> return node
            invokeImmediately -> node.invoke((state as? CompletedExceptionally)?.cause)
        }
        return NonDisposableHandle
    }

We already know that coroutine 1 has attached coroutine 2 and coroutine 3, so its state is a NodeList. tryPutNodeIntoList(node) therefore adds the node directly to that NodeList.

image-20241030173717839.png
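The practical consequence of registering a ChildContinuation is that a join() suspended inside coroutine 1 is interrupted when coroutine 1 is cancelled. The following sketch shows that effect from user code; it assumes Dispatchers.Default, and the names started and sawCancellation are hypothetical helpers.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.awaitCancellation
import kotlinx.coroutines.cancel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: true if join() inside coroutine 1 ended with a CancellationException.
fun joinIsCancelledWithParent(): Boolean = runBlocking {
    val scope = CoroutineScope(Dispatchers.Default + Job())
    val started = CompletableDeferred<Unit>()
    val sawCancellation = CompletableDeferred<Boolean>()
    scope.launch {                                   // plays the role of coroutine 1
        val coroutine3 = launch { awaitCancellation() }
        started.complete(Unit)
        try {
            coroutine3.join()                        // would wait forever without cancellation
            sawCancellation.complete(false)
        } catch (e: CancellationException) {
            sawCancellation.complete(true)
            throw e                                  // rethrow so cancellation proceeds normally
        }
    }
    started.await()
    scope.cancel()                                   // cancels coroutine 1 and its children
    sawCancellation.await()
}

fun main() {
    println(joinIsCancelledWithParent())             // true
}
```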

State of coroutine 2

Coroutine 2 calls delay:

public suspend fun delay(timeMillis: Long) {
    if (timeMillis <= 0) return // don't delay
   
    return suspendCancellableCoroutine sc@ { cont: CancellableContinuation<Unit> ->
        // if timeMillis == Long.MAX_VALUE then just wait forever like awaitCancellation, don't schedule.
        if (timeMillis < Long.MAX_VALUE) {
            cont.context.delay.scheduleResumeAfterDelay(timeMillis, cont)
        }
    }
}

cont is a CancellableContinuationImpl.

uCont is the coroutine 2 object. It has a completion field, whose value is the StandaloneCoroutine created when launch ran.

public suspend inline fun <T> suspendCancellableCoroutine(
    crossinline block: (CancellableContinuation<T>) -> Unit
): T =
    suspendCoroutineUninterceptedOrReturn { uCont ->
        val cancellable = CancellableContinuationImpl(uCont.intercepted(), resumeMode = MODE_CANCELLABLE)
        cancellable.initCancellability()
        block(cancellable)
        cancellable.getResult()
    }

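After the CPS transformation, a function marked suspend gets a trailing parameter of type Continuation. What we call a coroutine is the code block we write ourselves, but that block is wrapped in several more layers, and those layers are what provide thread switching, coroutine management, and so on.

This is why suspendCoroutineUninterceptedOrReturn has no ordinary source body: it is an intrinsic for which the Kotlin compiler generates the bytecode directly. Its job is to hand us the enclosing Continuation parameter.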
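The role of the completion continuation can be made visible with the standard library alone, by starting a suspend lambda with a hand-written Continuation instead of launch. This is a simplified sketch, not what kotlinx.coroutines does internally; answer and runWithExplicitContinuation are made-up names.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical suspend function: suspendCoroutine exposes the hidden Continuation as cont.
suspend fun answer(): Int = suspendCoroutine { cont -> cont.resume(42) }

// Starts a suspend block with an explicit completion continuation and returns what it received.
fun runWithExplicitContinuation(): Int {
    var result = -1
    val block: suspend () -> Int = { answer() }
    block.startCoroutine(object : Continuation<Int> {
        override val context: CoroutineContext = EmptyCoroutineContext
        // Called once when the block finishes; this plays the part of the completion field.
        override fun resumeWith(r: Result<Int>) {
            result = r.getOrThrow()
        }
    })
    return result                      // the block never really suspends, so this is already set
}

fun main() {
    println(runWithExplicitContinuation())   // 42
}
```

In launch, the object passed as this completion is the StandaloneCoroutine, which is how the finished block reports back to its Job.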

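After initCancellability runs, coroutine 2's state changes from EMPTY_ACTIVE to ChildContinuation.

At the same time, the state of the CancellableContinuationImpl becomes CancelFutureOnCancel. When a cancellation event arrives, coroutine 2 invokes its ChildContinuation node, which cancels the CancellableContinuationImpl; the continuation then calls invoke on its CancelFutureOnCancel handler, which cancels the scheduled delay task in the thread pool.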

override fun scheduleResumeAfterDelay(timeMillis: Long, continuation: CancellableContinuation<Unit>) {
        val future = (executor as? ScheduledExecutorService)?.scheduleBlock(
            ResumeUndispatchedRunnable(this, continuation),
            continuation.context,
            timeMillis
        )
        // If everything went fine and the scheduling attempt was not rejected -- use it
        if (future != null) {
            continuation.invokeOnCancellation(CancelFutureOnCancel(future))
            return
        }
        // Otherwise fallback to default executor
        DefaultExecutor.scheduleResumeAfterDelay(timeMillis, continuation)
    }

image-20241030174449474.png
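From user code, this handler chain shows up as delay being promptly cancellable: the code after delay never runs once the job is cancelled. A minimal sketch, using runBlocking and a 60-second delay as stand-ins for the example's dispatcher and its near-infinite delay; cancelDuringDelay is a hypothetical name.

```kotlin
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Hypothetical helper: returns (job.isCancelled, whether the code after delay ran).
fun cancelDuringDelay(): Pair<Boolean, Boolean> = runBlocking {
    var reachedEnd = false
    val job = launch {
        delay(60_000)                  // long enough that it cannot elapse during the test
        reachedEnd = true
    }
    yield()                            // let the coroutine start and suspend inside delay
    job.cancelAndJoin()                // returns promptly instead of waiting for the delay
    job.isCancelled to reachedEnd
}

fun main() {
    println(cancelDuringDelay())       // (true, false)
}
```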

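State of coroutine 3

Coroutine 3 is basically the same as coroutine 2: it also has a ChildContinuation. In addition, it has a ResumeOnCompletion.

Because coroutine3.join() is called, a ResumeOnCompletion node is registered on coroutine 3: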

private suspend fun joinSuspend() = suspendCancellableCoroutine<Unit> { cont ->
        // We have to invoke join() handler only on cancellation, on completion we will be resumed regularly without handlers
        cont.disposeOnCancellation(invokeOnCompletion(handler = ResumeOnCompletion(cont)))
    }

cont is the CancellableContinuationImpl.

internal fun Job.invokeOnCompletion(
    invokeImmediately: Boolean = true,
    handler: JobNode,
): DisposableHandle = when (this) {
    // this is coroutine3, i.e. a StandaloneCoroutine
    is JobSupport -> invokeOnCompletionInternal(invokeImmediately, handler)
    else -> invokeOnCompletion(handler.onCancelling, invokeImmediately, handler::invoke)
}
internal fun invokeOnCompletionInternal(
        invokeImmediately: Boolean,
        node: JobNode
    ): DisposableHandle {
    	// node is the ResumeOnCompletion; this is coroutine 3's StandaloneCoroutine
        node.job = this
        // Create node upfront -- for common cases it just initializes JobNode.job field,
        // for user-defined handlers it allocates a JobNode object that we might not need, but this is Ok.
        val added = tryPutNodeIntoList(node) { state, list ->
            ......
            
        }
        when {
            added -> return node
            invokeImmediately -> node.invoke((state as? CompletedExceptionally)?.cause)
        }
        return NonDisposableHandle
    }

image-20241030182221513.png

image-20241030182601102.png

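The continuation inside ResumeOnCompletion points to the CancellableContinuationImpl, and the delegate inside that CancellableContinuationImpl is the DispatchedContinuation. This is why coroutine 3 can tell coroutine 1 to resume once it completes: coroutine 1 is reachable through this chain, and calling its resume continues execution with the code after join.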

image-20241030182829940.png
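The normal, non-cancelled path through ResumeOnCompletion can be seen by letting the joined coroutine actually finish. The sketch below shortens the delay to 50 ms for that purpose and records the order of events; joinResumesAfterChild is a made-up name.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: returns the order in which coroutine 3 ends and coroutine 1 resumes.
fun joinResumesAfterChild(): List<String> = runBlocking {
    val events = mutableListOf<String>()   // safe here: runBlocking uses a single thread
    val coroutine1 = launch {
        val coroutine3 = launch {
            delay(50)
            events += "coroutine3 end"
        }
        coroutine3.join()                  // suspends until coroutine 3 completes
        events += "coroutine1 resumed after join"
    }
    coroutine1.join()
    events
}

fun main() {
    println(joinResumesAfterChild())       // [coroutine3 end, coroutine1 resumed after join]
}
```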