Kotlin Coroutines: The launch Source Code

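Preface

I have been using Kotlin coroutines for quite a while, yet I am embarrassed to admit that I still did not know how they are actually implemented. Some say coroutines are more efficient than threads; others say they are really just a thread-pool framework. Which is it? Today let's find out by reading the source.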

The coroutine body

First, an example of using launch:

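import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.GlobalScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

fun main() {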
    GlobalScope.launch {
        val result = add()
        println("add result:$result")
    }
    Thread.sleep(1000)
}

suspend fun add(): Long = withContext(Dispatchers.IO) {
    var sum = 0L
    for (i in 0..100) {
        sum += i
    }
    return@withContext sum
}

launch starts a coroutine for us; we pass in a coroutine body (the block parameter) containing the work the coroutine should do.

public fun CoroutineScope.launch(
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    block: suspend CoroutineScope.() -> Unit
): Job {
//...
}

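That is the suspend CoroutineScope.() -> Unit parameter. How should we understand this coroutine body? Inspecting the bytecode and decompiling it to Java gives the following code for it: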

public final class CoroutineTestKt {
   public static final void main() {
      BuildersKt.launch$default((CoroutineScope)GlobalScope.INSTANCE, (CoroutineContext)null, (CoroutineStart)null, (Function2)(new Function2((Continuation)null) {
         int label;

         @Nullable
         public final Object invokeSuspend(@NotNull Object $result) {
            Object var6 = IntrinsicsKt.getCOROUTINE_SUSPENDED();
            Object var10000;
            switch(this.label) {
            case 0:
               ResultKt.throwOnFailure($result);
               this.label = 1;
               var10000 = CoroutineTestKt.add(this);
               if (var10000 == var6) {
                  return var6;
               }
               break;
            case 1:
               ResultKt.throwOnFailure($result);
               var10000 = $result;
               break;
            default:
               throw new IllegalStateException("call to 'resume' before 'invoke' with coroutine");
            }

            long result = ((Number)var10000).longValue();
            String var4 = "add result:" + result;
            boolean var5 = false;
            System.out.println(var4);
            return Unit.INSTANCE;
         }

         @NotNull
         public final Continuation create(@Nullable Object value, @NotNull Continuation completion) {
            Intrinsics.checkNotNullParameter(completion, "completion");
            Function2 var3 = new <anonymous constructor>(completion);
            return var3;
         }

         public final Object invoke(Object var1, Object var2) {
            return ((<undefinedtype>)this.create(var1, (Continuation)var2)).invokeSuspend(Unit.INSTANCE);
         }
      }), 3, (Object)null);
      Thread.sleep(1000L);
   }

   // $FF: synthetic method
   public static void main(String[] var0) {
      main();
   }

   @Nullable
   public static final Object add(@NotNull Continuation $completion) {
      return BuildersKt.withContext((CoroutineContext)Dispatchers.getIO(), (Function2)(new Function2((Continuation)null) {
         int label;

         @Nullable
         public final Object invokeSuspend(@NotNull Object var1) {
            Object var6 = IntrinsicsKt.getCOROUTINE_SUSPENDED();
            switch(this.label) {
            case 0:
               ResultKt.throwOnFailure(var1);
               long sum = 0L;
               int var4 = 0;

               for(byte var5 = 100; var4 <= var5; ++var4) {
                  sum += (long)var4;
               }

               return Boxing.boxLong(sum);
            default:
               throw new IllegalStateException("call to 'resume' before 'invoke' with coroutine");
            }
         }

         @NotNull
         public final Continuation create(@Nullable Object value, @NotNull Continuation completion) {
            Intrinsics.checkNotNullParameter(completion, "completion");
            Function2 var3 = new <anonymous constructor>(completion);
            return var3;
         }

         public final Object invoke(Object var1, Object var2) {
            return ((<undefinedtype>)this.create(var1, (Continuation)var2)).invokeSuspend(Unit.INSTANCE);
         }
      }), $completion);
   }
}

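The decompiled code only shows which methods the coroutine body contains; the class name and the inheritance chain are not visible. The bytecode, however, shows that the class extends SuspendLambda.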

class com/goach/retrofit/coroutine/CoroutineTestKt$main$1 
extends kotlin/coroutines/jvm/internal/SuspendLambda 
implements kotlin/jvm/functions/Function2 {
}

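What is SuspendLambda? Its source shows the hierarchy: SuspendLambda extends ContinuationImpl, which extends BaseContinuationImpl, which in turn implements Continuation.

[Figure: class hierarchy of SuspendLambda]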

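Now that the class relationships are clear, look back at the methods of the coroutine body:

  1. create: creates a Continuation, namely a new instance of the generated class CoroutineTestKt$main$1.
  2. invoke: calls create to instantiate the generated class CoroutineTestKt$main$1, then calls invokeSuspend on that instance.
  3. invokeSuspend: the work we wrote in the coroutine body runs here. The code is split at each suspend call into branches of a switch, and label decides which segment runs: in the example, case 0 calls add, and case 1 picks up its result.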
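To make the label mechanism concrete, here is a minimal hand-written sketch of such a state machine. It is not what the compiler emits: LaunchBodySketch, addSuspending, pending, and log are hypothetical names, and the fake suspend call simply parks the continuation so that it can be resumed by hand.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.intrinsics.COROUTINE_SUSPENDED

// Hypothetical hand-written stand-in for the compiler-generated class.
class LaunchBodySketch : Continuation<Any?> {
    override val context: CoroutineContext = EmptyCoroutineContext
    private var label = 0
    val log = mutableListOf<String>()

    // Where the fake suspend call parks the continuation it was given.
    var pending: Continuation<Any?>? = null

    // A suspend function in continuation-passing form that always suspends.
    private fun addSuspending(cont: Continuation<Any?>): Any? {
        pending = cont
        return COROUTINE_SUSPENDED
    }

    fun invokeSuspend(result: Result<Any?>): Any? {
        val value: Any?
        when (label) {
            0 -> {
                result.getOrThrow()
                label = 1 // the next resumption continues at case 1
                log += "calling add"
                val returned = addSuspending(this)
                if (returned === COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
                value = returned
            }
            1 -> value = result.getOrThrow()
            else -> throw IllegalStateException("call to 'resume' before 'invoke' with coroutine")
        }
        log += "add result:$value"
        return Unit
    }

    override fun resumeWith(result: Result<Any?>) {
        invokeSuspend(result)
    }
}

fun main() {
    val body = LaunchBodySketch()
    body.invokeSuspend(Result.success(Unit))        // runs case 0, then suspends
    body.pending?.resumeWith(Result.success(5050L)) // resumes at case 1
    println(body.log) // [calling add, add result:5050]
}
```

The first call stops right after the suspend call; only the later resumeWith reaches the code that uses the result, which is the split that label encodes.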

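So how does execution reach the coroutine body, and how is the thread switch done?

How a coroutine is created

To see how a coroutine is created, go to the source of launch: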

public fun CoroutineScope.launch(
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    block: suspend CoroutineScope.() -> Unit
): Job {
    val newContext = newCoroutineContext(context)
    val coroutine = if (start.isLazy)
        LazyStandaloneCoroutine(newContext, block) else
        StandaloneCoroutine(newContext, active = true)
    coroutine.start(start, coroutine, block)
    return coroutine
}

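The third parameter was covered above: it is the coroutine body we pass in. Note the second parameter: if omitted it defaults to DEFAULT; the other values are ATOMIC, UNDISPATCHED, and LAZY, and each starts the coroutine differently. Here we follow DEFAULT. The launch function ends up calling start: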

public fun <R> start(start: CoroutineStart, receiver: R, block: suspend R.() -> T) {
    initParentJob()
    start(block, receiver, this)
}

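This start method then invokes the start parameter itself, that is, CoroutineStart.invoke (the enum overloads the invoke operator), which starts the coroutine according to the enum value. With everything named start, it is easy to mistake this for recursion.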

public operator fun <R, T> invoke(block: suspend R.() -> T, receiver: R, completion: Continuation<T>): Unit =
    when (this) {
        DEFAULT -> block.startCoroutineCancellable(receiver, completion)
        ATOMIC -> block.startCoroutine(receiver, completion)
        UNDISPATCHED -> block.startCoroutineUndispatched(receiver, completion)
        LAZY -> Unit // will start lazily
    }

For DEFAULT, receiver is the StandaloneCoroutine, and the same StandaloneCoroutine is passed as completion in its role as a Continuation. startCoroutineCancellable is an extension function on the coroutine body's function type:

internal fun <R, T> (suspend (R) -> T).startCoroutineCancellable(receiver: R, completion: Continuation<T>) =
    runSafely(completion) {
        createCoroutineUnintercepted(receiver, completion).intercepted().resumeCancellableWith(Result.success(Unit))
    }

Three calls are chained here: createCoroutineUnintercepted, intercepted, and resumeCancellableWith. createCoroutineUnintercepted runs first; it is also an extension function:

public actual fun <R, T> (suspend R.() -> T).createCoroutineUnintercepted(
    receiver: R,
    completion: Continuation<T>
): Continuation<Unit> {
    val probeCompletion = probeCoroutineCreated(completion)
    return if (this is BaseContinuationImpl)
        create(receiver, probeCompletion)
    else {
        createCoroutineFromSuspendFunction(probeCompletion) {
            (this as Function2<R, Continuation<T>, Any?>).invoke(receiver, it)
        }
    }
}

It calls the create method of the coroutine body described above, and create instantiates the coroutine body class. Here is that create method again:

public final Continuation create(@Nullable Object value, @NotNull Continuation completion) {
   Intrinsics.checkNotNullParameter(completion, "completion");
   Function2 var3 = new <anonymous constructor>(completion);
   return var3;
}

The new expression instantiates the class generated for the coroutine body, so createCoroutineUnintercepted returns an instance of the coroutine body. Next comes intercepted, which leads into ContinuationImpl:

@Transient
private var intercepted: Continuation<Any?>? = null

public fun intercepted(): Continuation<Any?> =
    intercepted
        ?: (context[ContinuationInterceptor]?.interceptContinuation(this) ?: this)
            .also { intercepted = it }

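This is the continuation interceptor hook. To do something with a coroutine body's continuation after it has been created, implement ContinuationInterceptor, add it to the CoroutineContext, and return the wrapped Continuation from interceptContinuation.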
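As an illustration of that hook, the sketch below installs a custom interceptor using only the standard library's kotlin.coroutines package. LoggingInterceptor and runIntercepted are hypothetical names made up for this example; the interceptor just records each resumption before forwarding it.

```kotlin
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.Continuation
import kotlin.coroutines.ContinuationInterceptor
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.startCoroutine

// Hypothetical interceptor that records every resumption before forwarding it.
class LoggingInterceptor(private val log: MutableList<String>) :
    AbstractCoroutineContextElement(ContinuationInterceptor), ContinuationInterceptor {

    override fun <T> interceptContinuation(continuation: Continuation<T>): Continuation<T> =
        object : Continuation<T> {
            override val context: CoroutineContext get() = continuation.context
            override fun resumeWith(result: Result<T>) {
                log += "intercepted"
                continuation.resumeWith(result) // hand over to the real coroutine body
            }
        }
}

fun runIntercepted(): List<String> {
    val log = mutableListOf<String>()
    val body: suspend () -> Unit = { log += "body" }
    // The interceptor travels in the completion's context.
    val completion = Continuation<Unit>(LoggingInterceptor(log)) { log += "completed" }
    body.startCoroutine(completion)
    return log
}

fun main() {
    println(runIntercepted()) // [intercepted, body, completed]
}
```

A dispatcher works on the same principle, except that its wrapper hands the resumption to another thread instead of forwarding it inline.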

After intercepted, resumeCancellableWith runs:

public fun <T> Continuation<T>.resumeCancellableWith(result: Result<T>): Unit = when (this) {
    is DispatchedContinuation -> resumeCancellableWith(result)
    else -> resumeWith(result)
}

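The DispatchedContinuation branch is the one used for thread switching, for example by withContext. Note that even though no dispatcher is passed to GlobalScope.launch, newCoroutineContext falls back to Dispatchers.Default when the context contains no ContinuationInterceptor, so this continuation is in fact wrapped in a DispatchedContinuation and handed to a worker thread first; that dispatch path is the same one walked through for withContext in the dispatcher section. Either way, the coroutine body is eventually resumed through resumeWith in BaseContinuationImpl: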

public final override fun resumeWith(result: Result<Any?>) {
    // This loop unrolls recursion in current.resumeWith(param) to make saner and shorter stack traces on resume
    var current = this
    var param = result
    while (true) {
        // ...omitted
        with(current) {
            // ...omitted
            val outcome: Result<Any?> =
                try {
                    val outcome = invokeSuspend(param)
                    if (outcome === COROUTINE_SUSPENDED) return
                    Result.success(outcome)
                } catch (exception: Throwable) {
                    Result.failure(exception)
                }
            // ...omitted
        }
    }
}

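This enters invokeSuspend. label starts at 0, so execution reaches the call to add. Because add is a suspend function that actually suspends here, invokeSuspend runs the code before the suspend call and then returns immediately without running what follows it. Back in the loop, the return value is COROUTINE_SUSPENDED, so resumeWith returns and the loop ends. This is what "suspended" means. When is the coroutine resumed? For that we need to look at add.

To summarize the creation process so far:

A coroutine is built around its coroutine body. Kotlin generates a class for the body that extends SuspendLambda and has create, invokeSuspend, and invoke methods. invokeSuspend splits the code before and after each suspend call into switch branches and uses label to decide which one runs. Calling launch calls create to instantiate the body class, and resumeCancellableWith then leads to invokeSuspend. At that point the code before the suspend call has run; the code after it runs once the suspend function has produced its result and the coroutine is resumed.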
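The same three steps can be tried by hand with the public intrinsics of the standard library. This is only a sketch under simplifying assumptions: startManually is a hypothetical name, the context is empty so intercepted returns the continuation itself, and a plain resume stands in for resumeCancellableWith, which is internal to kotlinx.coroutines.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.intrinsics.createCoroutineUnintercepted
import kotlin.coroutines.intrinsics.intercepted
import kotlin.coroutines.resume

fun startManually(): String {
    var outcome = "not finished"
    val body: suspend () -> String = { "done" }
    // Plays the role that StandaloneCoroutine plays for launch.
    val completion = Continuation<String>(EmptyCoroutineContext) { result ->
        outcome = result.getOrThrow()
    }
    body.createCoroutineUnintercepted(completion) // create(): instantiate the body class
        .intercepted()                            // no interceptor in the context: returns itself
        .resume(Unit)                             // resumeWith -> invokeSuspend
    return outcome
}

fun main() {
    println(startManually()) // done
}
```

Because this body never suspends, the completion is called before resume returns; with a suspending body it would be called later, from whoever resumes the coroutine.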

[Figure: sequence diagram of the coroutine creation steps above]

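If add did not suspend, invokeSuspend would simply continue with the code that uses the result, with no waiting. In our example, though, add does suspend and asks for Dispatchers.IO. So next we step into add.

Coroutine dispatchers

To follow how add executes, we first need to understand coroutine dispatchers. When a coroutine needs to switch threads, we usually use withContext and pass it a CoroutineContext that says which dispatcher to use. The built-in dispatchers are:

  • Dispatchers.Default: the default dispatcher, suited to CPU-intensive work off the main thread.
  • Dispatchers.IO: suited to disk or network I/O off the main thread.
  • Dispatchers.Main: the UI dispatcher.
  • Dispatchers.Unconfined: not confined to any particular thread; the coroutine resumes on whichever thread the suspend function resumes it.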

Dispatchers.Default

Start with its source:

internal val useCoroutinesScheduler = systemProp(COROUTINES_SCHEDULER_PROPERTY_NAME).let { value ->
    when (value) {
        null, "", "on" -> true
        "off" -> false
        else -> error("System property '$COROUTINES_SCHEDULER_PROPERTY_NAME' has unrecognized value '$value'")
    }
}

internal actual fun createDefaultDispatcher(): CoroutineDispatcher =
    if (useCoroutinesScheduler) DefaultScheduler else CommonPool

The system property named by COROUTINES_SCHEDULER_PROPERTY_NAME is not set here, so useCoroutinesScheduler is true and DefaultScheduler is used. It extends ExperimentalCoroutineDispatcher:

@InternalCoroutinesApi
public open class ExperimentalCoroutineDispatcher(
    private val corePoolSize: Int,
    private val maxPoolSize: Int,
    private val idleWorkerKeepAliveNs: Long,
    private val schedulerName: String = "CoroutineScheduler"
) : ExecutorCoroutineDispatcher() {
    public constructor(
        corePoolSize: Int = CORE_POOL_SIZE,
        maxPoolSize: Int = MAX_POOL_SIZE,
        schedulerName: String = DEFAULT_SCHEDULER_NAME
    ) : this(corePoolSize, maxPoolSize, IDLE_WORKER_KEEP_ALIVE_NS, schedulerName)

    @Deprecated(message = "Binary compatibility for Ktor 1.0-beta", level = DeprecationLevel.HIDDEN)
    public constructor(
        corePoolSize: Int = CORE_POOL_SIZE,
        maxPoolSize: Int = MAX_POOL_SIZE
    ) : this(corePoolSize, maxPoolSize, IDLE_WORKER_KEEP_ALIVE_NS)

    override val executor: Executor
        get() = coroutineScheduler

    private var coroutineScheduler = createScheduler()

    private fun createScheduler() = CoroutineScheduler(corePoolSize, maxPoolSize, idleWorkerKeepAliveNs, schedulerName)
}

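The default constructor takes the parameters familiar from thread pools. The executor property is the thread pool we are looking for: it returns a CoroutineScheduler, and CoroutineScheduler implements Executor. The ThreadPoolExecutor we use all the time also ultimately implements Executor. With that in mind, ask again: are Kotlin coroutines a thread-pool framework?

As far as scheduling goes, yes: the dispatchers run on a thread pool that Kotlin implements itself. It must not be mistaken for a wrapper around Java's ThreadPoolExecutor, though; the two work on different principles, and the details live in the core class CoroutineScheduler.

With that settled, here is the relationship between the thread-pool classes above:

[Figure: class diagram of the thread-pool classes above]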

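CoroutineScheduler

To understand the dispatchers further we have to study CoroutineScheduler. Its main job is to distribute dispatched coroutines over worker threads. It stores tasks in two kinds of queues, a per-worker local queue and a global queue, and prefers the local queue, which reduces contention from outside callers. A newly dispatched task is also placed at the head of the local queue, which cuts scheduling latency. Let's check a few methods to see whether that holds: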

class CoroutineScheduler {
    fun dispatch(block: Runnable, taskContext: TaskContext = NonBlockingContext, tailDispatch: Boolean = false) {
        trackTask() // this is needed for virtual time support
        // create a task
        val task = createTask(block, taskContext)
        // try to submit the task to the local queue and act depending on the result
        val currentWorker = currentWorker()
        // notAdded is non-null when the task was not put on the local queue,
        // e.g. a CPU task submitted while this worker is running a blocking task
        val notAdded = currentWorker.submitToLocalQueue(task, tailDispatch)
        if (notAdded != null) {
            // not added to the local queue, so add it to the global queue
            if (!addToGlobalQueue(notAdded)) {
                // Global queue is closed in the last step of close/shutdown -- no more tasks should be accepted
                throw RejectedExecutionException("$schedulerName was terminated")
            }
        }
        val skipUnpark = tailDispatch && currentWorker != null
        // Checking 'task' instead of 'notAdded' is completely okay
        if (task.mode == TASK_NON_BLOCKING) {
            if (skipUnpark) return
            // try to wake up or create a worker thread to run the task
            signalCpuWork()
        } else {
            // try to wake up or create a worker thread to run the task
            // Increment blocking tasks anyway
            signalBlockingWork(skipUnpark = skipUnpark)
        }
    }

    private fun CoroutineScheduler.Worker?.submitToLocalQueue(task: Task, tailDispatch: Boolean): Task? {
        if (this == null) return task
        // the worker has terminated, so do not add to its local queue
        if (state === CoroutineScheduler.WorkerState.TERMINATED) return task
        // TASK_NON_BLOCKING marks a non-IO (CPU) task; a worker that is currently
        // BLOCKING does not take CPU tasks into its local queue
        if (task.mode == TASK_NON_BLOCKING && state === CoroutineScheduler.WorkerState.BLOCKING) {
            return task
        }
        mayHaveLocalTasks = true
        // add to the local queue; returns null when the task was accepted
        return localQueue.add(task, fair = tailDispatch)
    }
    private fun tryCreateWorker(state: Long = controlState.value): Boolean {
        // number of workers created
        val created = createdWorkers(state)
        // number of blocking tasks
        val blocking = blockingTasks(state)
        // workers available for non-blocking (CPU) work
        val cpuWorkers = (created - blocking).coerceAtLeast(0)
        /*
         * We check how many threads are there to handle non-blocking work,
         * and create one more if we have not enough of them.
         */
        // fewer CPU workers than corePoolSize: create a worker
        if (cpuWorkers < corePoolSize) {
            val newCpuWorkers = createNewWorker()
            // If we've created the first cpu worker and corePoolSize > 1 then create
            // one more (second) cpu worker, so that stealing between them is operational
            // when corePoolSize > 1, the first creation also creates a second worker,
            // so the two can steal from each other and the load is spread evenly
            if (newCpuWorkers == 1 && corePoolSize > 1) createNewWorker()
            if (newCpuWorkers > 0) return true
        }
        return false
    }
}

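When a coroutine is dispatched, a task is created first. If the dispatching thread is itself a worker of this scheduler, the task goes to that worker's local queue, unless the worker has terminated or the task is a CPU task and the worker is currently in the BLOCKING state. In every other case, including dispatch from a non-worker thread, the task goes to the global queue, which workers poll in addition to their local queues. Now the source for adding to the local queue: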

class WorkQueue {
    fun add(task: Task, fair: Boolean = false): Task? {
        if (fair) return addLast(task)
        val previous = lastScheduledTask.getAndSet(task) ?: return null
        return addLast(previous)
    }
}
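To see what the slot-plus-buffer arrangement in add does to ordering, here is a simplified single-threaded model. LocalQueueSketch and pollOrder are hypothetical names; the real WorkQueue uses a fixed-size lock-free buffer, and the poll shown here, which checks the slot first, is an assumption about the consumer side that the excerpt does not show.

```kotlin
import java.util.ArrayDeque
import java.util.concurrent.atomic.AtomicReference

// Simplified model: one "last scheduled" slot in front of a FIFO buffer.
class LocalQueueSketch<T : Any> {
    private val lastScheduled = AtomicReference<T?>(null)
    private val buffer = ArrayDeque<T>()

    fun add(task: T, fair: Boolean = false) {
        if (fair) {
            buffer.addLast(task) // fair mode: plain FIFO
            return
        }
        // The newest task takes the slot; the task it displaces goes to the tail.
        val previous = lastScheduled.getAndSet(task) ?: return
        buffer.addLast(previous)
    }

    // Assumed consumer side: the slot is checked before the buffer.
    fun poll(): T? = lastScheduled.getAndSet(null) ?: buffer.pollFirst()
}

fun pollOrder(): List<String> {
    val queue = LocalQueueSketch<String>()
    listOf("A", "B", "C").forEach { queue.add(it) }
    return generateSequence { queue.poll() }.toList()
}

fun main() {
    println(pollOrder()) // [C, A, B]
}
```

The most recently added task C comes out first, and the older tasks keep their FIFO order behind it.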

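As you can see, in non-fair mode the newest task always takes the lastScheduledTask slot, which acts as the head of the local queue, while the task it displaces is appended to the buffer. Now the task is queued and the worker threads have been created, so let's look at the code that executes tasks: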

internal inner class Worker private constructor() : Thread() {
    override fun run() = runWorker()

    @JvmField
    var mayHaveLocalTasks = false

    private fun runWorker() {
        var rescanned = false
        while (!isTerminated && state != CoroutineScheduler.WorkerState.TERMINATED) {
            // look for a task in the local and global queues
            val task = findTask(mayHaveLocalTasks)
            // Task found. Execute and repeat
            if (task != null) {
                rescanned = false
                minDelayUntilStealableTaskNs = 0L
                // run the task
                executeTask(task)
                continue
            } else {
                mayHaveLocalTasks = false
            }
            // no task found; a non-zero minDelayUntilStealableTaskNs means another
            // worker has a task that will become stealable soon
            if (minDelayUntilStealableTaskNs != 0L) {
                if (!rescanned) {
                    // set to true and loop once more to rescan the queues for tasks
                    rescanned = true
                } else {
                    rescanned = false
                    tryReleaseCpu(CoroutineScheduler.WorkerState.PARKING)
                    Thread.interrupted()
                    // park for minDelayUntilStealableTaskNs
                    LockSupport.parkNanos(minDelayUntilStealableTaskNs)
                    minDelayUntilStealableTaskNs = 0L
                }
                continue
            }
            /*
             * 2) Or no tasks available, time to park and, potentially, shut down the thread.
             * Add itself to the stack of parked workers, re-scans all the queues
             * to avoid missing wake-up (requestCpuWorker) and either starts executing discovered tasks or parks itself awaiting for new tasks.
             */
            tryPark()
        }
        tryReleaseCpu(CoroutineScheduler.WorkerState.TERMINATED)
    }

    fun findTask(scanLocalQueue: Boolean): Task? {
        if (tryAcquireCpuPermit()) return findAnyTask(scanLocalQueue)
        // scanLocalQueue is true when tasks may have been added to the local queue
        val task = if (scanLocalQueue) { // take from the local queue first, then from globalBlockingQueue
            localQueue.poll() ?: globalBlockingQueue.removeFirstOrNull()
        } else {
            globalBlockingQueue.removeFirstOrNull()
        }
        return task ?: trySteal(blockingOnly = true)
    }

    // Counterpart to "tryUnpark"
    private fun tryPark() {
        if (!inStack()) {
            parkedWorkersStackPush(this)
            return
        }
        kotlinx.coroutines.assert { localQueue.size == 0 }
        workerCtl.value = CoroutineScheduler.PARKED // Update value once
        while (inStack()) { // Prevent spurious wakeups
            if (isTerminated || state == CoroutineScheduler.WorkerState.TERMINATED) break
            tryReleaseCpu(CoroutineScheduler.WorkerState.PARKING)
            Thread.interrupted() // Cleanup interruptions
            park()
        }
    }

    private fun executeTask(task: Task) {
        val taskMode = task.mode
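        // if the worker was PARKING, switch its state to BLOCKING (only expected for an IO task)
        idleReset(taskMode)
        // for an IO task, which needs little CPU: release the CPU permit and try to wake
        // another worker, creating a new one if none could be woken
        beforeTask(taskMode)
        // run the task
        runSafely(task)
        // after an IO task finishes, reset the worker state to DORMANT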
        afterTask(taskMode)
    }

    private fun idleReset(mode: Int) {
        terminationDeadline = 0L // reset deadline for termination
        if (state == CoroutineScheduler.WorkerState.PARKING) {
            kotlinx.coroutines.assert { mode == TASK_PROBABLY_BLOCKING }
            state = CoroutineScheduler.WorkerState.BLOCKING
        }
    }

    private fun beforeTask(taskMode: Int) {
        if (taskMode == TASK_NON_BLOCKING) return
        // Always notify about new work when releasing CPU-permit to execute some blocking task
        if (tryReleaseCpu(CoroutineScheduler.WorkerState.BLOCKING)) {
            signalCpuWork()
        }
    }

    fun runSafely(task: Task) {
        try {
            task.run()
        } catch (e: Throwable) {
            val thread = Thread.currentThread()
            thread.uncaughtExceptionHandler.uncaughtException(thread, e)
        } finally {
            unTrackTask()
        }
    }

    private fun afterTask(taskMode: Int) {
        if (taskMode == TASK_NON_BLOCKING) return
        decrementBlockingTasks()
        val currentState = state
        // Shutdown sequence of blocking dispatcher
        if (currentState !== CoroutineScheduler.WorkerState.TERMINATED) {
            kotlinx.coroutines.assert { currentState == CoroutineScheduler.WorkerState.BLOCKING } // "Expected BLOCKING state, but has $currentState"
            state = CoroutineScheduler.WorkerState.DORMANT
        }
    }
    }

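After reading the code above, a few questions come up.

How should the worker states be understood?

  1. DORMANT: the initial state
  2. PARKING: the worker is idle
  3. BLOCKING: running an IO (blocking) task
  4. CPU_ACQUIRED: holds a CPU permit and is running a CPU-bound task
  5. TERMINATED: shut down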

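How should work stealing be understood?

From the code above, once a worker has run all of its own tasks and has nothing left, it tries to steal tasks from the queues of other workers. If nothing can be stolen yet, it parks for minDelayUntilStealableTaskNs and then rescans for runnable tasks. The main purpose is to spread tasks evenly across workers.

How are tasks stolen?

A minDelayUntilStealableTaskNs of 0 means there is nothing that could be stolen; a non-zero value means some task will become stealable, and the worker wakes up after that many nanoseconds to look again. Where is minDelayUntilStealableTaskNs assigned? runWorker looks for a task with findTask, and once the worker's own tasks are exhausted, execution reaches trySteal, which looks for tasks to steal: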

private fun trySteal(blockingOnly: Boolean): Task? {
        // ...omitted
        repeat(created) {
            ++currentIndex
            if (currentIndex > created) currentIndex = 1
            val worker = workers[currentIndex]
            if (worker !== null && worker !== this) {
                assert { localQueue.size == 0 }
                val stealResult = if (blockingOnly) {
                    localQueue.tryStealBlockingFrom(victim = worker.localQueue)
                } else {
                    localQueue.tryStealFrom(victim = worker.localQueue)
                }
                if (stealResult == TASK_STOLEN) {
                    return localQueue.poll()
                } else if (stealResult > 0) {
                    minDelay = min(minDelay, stealResult)
                }
            }
        }
        minDelayUntilStealableTaskNs = if (minDelay != Long.MAX_VALUE) minDelay else 0
        return null
    }
}

In this method only tryStealBlockingFrom and tryStealFrom matter; minDelayUntilStealableTaskNs is computed from their return values.

fun tryStealBlockingFrom(victim: WorkQueue): Long {
    assert { bufferSize == 0 }
    var start = victim.consumerIndex.value
    val end = victim.producerIndex.value
    val buffer = victim.buffer

    while (start != end) {
        val index = start and MASK
        if (victim.blockingTasksInBuffer.value == 0) break
        val value = buffer[index]
        if (value != null && value.isBlocking && buffer.compareAndSet(index, value, null)) {
            victim.blockingTasksInBuffer.decrementAndGet()
            add(value)
            return TASK_STOLEN
        } else {
            ++start
        }
    }
    return tryStealLastScheduled(victim, blockingOnly = true)
}

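tryStealBlockingFrom is used when the worker could not acquire a CPU permit: it tries to steal an IO task, moving another worker's task into its own queue. When blockingTasksInBuffer is 0, meaning the victim's buffer holds no more blocking tasks, it falls through to tryStealLastScheduled, which either steals the victim's last scheduled task or returns a delay after which the worker should wake up and look again.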

fun tryStealFrom(victim: WorkQueue): Long {
    assert { bufferSize == 0 }
    val task  = victim.pollBuffer()
    if (task != null) {
        val notAdded = add(task)
        assert { notAdded == null }
        return TASK_STOLEN
    }
    return tryStealLastScheduled(victim, blockingOnly = false)
}

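Once the worker holds a CPU permit it uses tryStealFrom, which steals a task from the victim's buffer directly; when the buffer is empty it likewise falls through to tryStealLastScheduled. Finally, the method that computes the delay: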

private fun tryStealLastScheduled(victim: WorkQueue, blockingOnly: Boolean): Long {
    while (true) {
        // a null last scheduled task means there is nothing to steal
        val lastScheduled = victim.lastScheduledTask.value ?: return NOTHING_TO_STEAL
        // when only IO tasks are wanted and the last task is not one, report nothing to steal
        if (blockingOnly && !lastScheduled.isBlocking) return NOTHING_TO_STEAL

        // TODO time wraparound ?
        // current time
        val time = schedulerTimeSource.nanoTime()
        // how long since submission: current time minus submission time
        val staleness = time - lastScheduled.submissionTime
        // too fresh to steal: return how much longer the thief should wait
        if (staleness < WORK_STEALING_TIME_RESOLUTION_NS) {
            return WORK_STEALING_TIME_RESOLUTION_NS - staleness
        }

        /*
         * If CAS has failed, either someone else had stolen this task or the owner executed this task
         * and dispatched another one. In the latter case we should retry to avoid missing task.
         */
        if (victim.lastScheduledTask.compareAndSet(lastScheduled, null)) {
            add(lastScheduled)
            return TASK_STOLEN
        }
        continue
    }
}
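The time arithmetic is easier to follow in isolation. In the sketch below, stealOutcome and minDelayUntilStealable are hypothetical helpers, and the three constants are illustrative values chosen for the example rather than taken from the excerpt; the actual stealing of the task and the early return on TASK_STOLEN in trySteal are left out.

```kotlin
// Illustrative values for this sketch only.
const val TASK_STOLEN = -1L
const val NOTHING_TO_STEAL = -2L
const val RESOLUTION_NS = 100_000L

// Mirrors the decision in tryStealLastScheduled for one victim.
fun stealOutcome(lastSubmissionNs: Long?, nowNs: Long): Long {
    val submitted = lastSubmissionNs ?: return NOTHING_TO_STEAL
    val staleness = nowNs - submitted
    // A task younger than the resolution is left to its owner for now.
    return if (staleness < RESOLUTION_NS) RESOLUTION_NS - staleness else TASK_STOLEN
}

// Mirrors how trySteal folds the positive results into a single delay.
fun minDelayUntilStealable(outcomes: List<Long>): Long =
    outcomes.filter { it > 0 }.minOrNull() ?: 0L

fun main() {
    val now = 1_000_000L
    val outcomes = listOf(
        stealOutcome(null, now),          // victim has no last scheduled task
        stealOutcome(now - 30_000L, now), // 30 µs old: wait another 70 µs
        stealOutcome(now - 60_000L, now)  // 60 µs old: wait another 40 µs
    )
    println(minDelayUntilStealable(outcomes)) // 40000
}
```

A positive result is therefore a waiting time, not a task count: the thief parks until the freshest candidate is old enough to be taken.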

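Back in runWorker, minDelayUntilStealableTaskNs now makes sense: it is non-zero only when some victim has a last scheduled task that was submitted less than WORK_STEALING_TIME_RESOLUTION_NS ago, and its value is the smallest remaining wait among those tasks. Perhaps this kind of cooperative logic is part of why Kotlin calls them coroutines. The findTask flow is somewhat involved, so here it is laid out:

[Figure: flowchart of findTask]

Depending on whether a CPU permit is held, and following the queue priorities, a worker first takes tasks from localQueue and the global queues. If it gets one, the task is returned and run. If not, it tries to steal; a stolen task is run right away, and if nothing could be stolen the worker parks and, after the delay, wakes up to look for tasks again.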

How is the task run, and how is the result returned?

add specifies Dispatchers.IO. Starting from withContext, execution again reaches startCoroutineCancellable and then intercepted. Dispatchers.IO is a CoroutineDispatcher, which implements ContinuationInterceptor, so interceptContinuation returns a DispatchedContinuation:

ContinuationImpl.kt

public fun intercepted(): Continuation<Any?> =
    intercepted
        ?: (context[ContinuationInterceptor]?.interceptContinuation(this) ?: this)
            .also { intercepted = it }

CoroutineDispatcher.kt

public final override fun <T> interceptContinuation(continuation: Continuation<T>): Continuation<T> =
    DispatchedContinuation(this, continuation)

Keep in mind the continuation held by this DispatchedContinuation: it is the SuspendLambda instance generated for the body of add, and once a worker thread picks the task up, resumeWith is called through it. After intercepted comes resumeCancellableWith:

public fun <T> Continuation<T>.resumeCancellableWith(result: Result<T>): Unit = when (this) {
    is DispatchedContinuation -> resumeCancellableWith(result)
    else -> resumeWith(result)
}

This time the receiver is a DispatchedContinuation, so the first branch calls its own resumeCancellableWith:

inline fun resumeCancellableWith(result: Result<T>) {
    val state = result.toState()
    if (dispatcher.isDispatchNeeded(context)) {
        _state = state
        resumeMode = MODE_CANCELLABLE
        dispatcher.dispatch(context, this)
    } else {
        executeUnconfined(state, MODE_CANCELLABLE) {
            if (!resumeCancelled()) {
                resumeUndispatchedWith(result)
            }
        }
    }
}

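This reaches the dispatch path analyzed above, and the work is handed to the thread pool. The second argument, block, is this, that is, the DispatchedContinuation. It extends DispatchedTask, which is a SchedulerTask (a Task) and therefore a Runnable. In CoroutineScheduler.dispatch, createTask therefore takes the block is Task branch and reuses the object, only stamping the submission time and task context; a plain Runnable would instead be wrapped in a TaskImpl:

fun dispatch(block: Runnable, taskContext: TaskContext = NonBlockingContext, tailDispatch: Boolean = false) {
    val task = createTask(block, taskContext)
    // ...omitted
}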
internal fun createTask(block: Runnable, taskContext: TaskContext): Task {
    val nanoTime = schedulerTimeSource.nanoTime()
    if (block is Task) {
        block.submissionTime = nanoTime
        block.taskContext = taskContext
        return block
    }
    return TaskImpl(block, nanoTime, taskContext)
}

internal class TaskImpl(
    @JvmField val block: Runnable,
    submissionTime: Long,
    taskContext: TaskContext
) : Task(submissionTime, taskContext) {
    override fun run() {
        try {
            block.run()
        } finally {
            taskContext.afterTask()
        }
    }

}

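Creating the task records the submission time, and the task is then queued. Back in the worker's runWorker loop, once the task is fetched its run method is called; for our DispatchedContinuation that is the run method inherited from DispatchedTask.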

internal inner class Worker private constructor() : Thread() {
    private fun runWorker() {
        // ...rest omitted
        executeTask(task)
    }

    private fun executeTask(task: Task) {
        // ...rest omitted
        runSafely(task)
    }
}

fun runSafely(task: Task) {
    try {
        task.run()
    } catch (e: Throwable) {
        val thread = Thread.currentThread()
        thread.uncaughtExceptionHandler.uncaughtException(thread, e)
    } finally {
        unTrackTask()
    }
}

That brings us to the run method of DispatchedContinuation. DispatchedContinuation extends DispatchedTask, which is a SchedulerTask and implements run:

public final override fun run() {
    // ...rest omitted
    if (exception == null && job != null && !job.isActive) {
        val cause = job.getCancellationException()
        cancelResult(state, cause)
        continuation.resumeWithStackTrace(cause)
    } else {
        if (exception != null) continuation.resumeWithException(exception)
        else continuation.resume(getSuccessfulResult(state))
    }
}

public inline fun <T> Continuation<T>.resume(value: T): Unit =
    resumeWith(Result.success(value))

Here continuation.resume is called, which calls resumeWith on the continuation, that is, the resumeWith that SuspendLambda inherits from BaseContinuationImpl.

ContinuationImpl.kt

public final override fun resumeWith(result: Result<Any?>) {
    // ...rest omitted
    val outcome: Result<Any?> =
        try {
            val outcome = invokeSuspend(param)
            if (outcome === COROUTINE_SUSPENDED) return
            Result.success(outcome)
        } catch (exception: Throwable) {
            Result.failure(exception)
        }
    // ...rest omitted
    if (completion is BaseContinuationImpl) {
        // unrolling recursion via loop
        current = completion
        param = outcome
    } else {
        // top-level completion reached -- invoke and return
        completion.resumeWith(outcome)
        return
    }
}

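Two places matter here. One is invokeSuspend, mentioned earlier: this is the invokeSuspend of the class generated for the body of add, and running it produces add's result. The other is completion.resumeWith: the completion traces back to the continuation that the GlobalScope.launch body passed in when it called add, so the launch body's resumeWith runs again, this time with label 1, and executes the code that was left after the suspension point. With that the coroutine has resumed and has its result.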
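The hand-over from one frame to its completion can be modeled in a few lines. This is a deliberately reduced sketch: FrameSketch, resumeChain, and demoChain are hypothetical names, and suspension, exceptions, and dispatching are all left out so that only the walk along the completion chain remains.

```kotlin
// Hypothetical, heavily simplified stand-in for BaseContinuationImpl.
abstract class FrameSketch(val completion: FrameSketch?) {
    abstract fun invokeSuspend(param: Any?): Any?
}

// Runs a frame, then feeds its outcome to its completion, and so on.
fun resumeChain(start: FrameSketch, value: Any?): Any? {
    var current = start
    var param = value
    while (true) {
        val outcome = current.invokeSuspend(param)
        val next = current.completion ?: return outcome // top-level completion reached
        current = next // unrolling recursion via loop
        param = outcome
    }
}

fun demoChain(): Any? {
    // Plays the launch body resumed at label 1: it only consumes the result.
    val launchBody = object : FrameSketch(null) {
        override fun invokeSuspend(param: Any?): Any? = "add result:$param"
    }
    // Plays the body of add, whose completion is the launch body.
    val addBody = object : FrameSketch(launchBody) {
        override fun invokeSuspend(param: Any?): Any? = (0L..100L).sum()
    }
    return resumeChain(addBody, Unit)
}

fun main() {
    println(demoChain()) // add result:5050
}
```

The loop replaces what would otherwise be a nested resumeWith call per frame, which is the point of the "unrolling recursion via loop" comment in the source.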


Dispatchers.Main

With that flow understood, a quick look at how a coroutine switches to the main thread. Start at MainDispatcherLoader:

internal object MainDispatcherLoader {

    private val FAST_SERVICE_LOADER_ENABLED = systemProp(FAST_SERVICE_LOADER_PROPERTY_NAME, true)

    @JvmField
    val dispatcher: MainCoroutineDispatcher = loadMainDispatcher()

    private fun loadMainDispatcher(): MainCoroutineDispatcher {
        return try {
            val factories = if (FAST_SERVICE_LOADER_ENABLED) {
                FastServiceLoader.loadMainDispatcherFactory()
            } else {
                // ...omitted
        }
    }
}

loadMainDispatcher runs here and continues into FastServiceLoader.loadMainDispatcherFactory:

internal fun loadMainDispatcherFactory(): List<MainDispatcherFactory> {
    val clz = MainDispatcherFactory::class.java
    if (!ANDROID_DETECTED) {
        return load(clz, clz.classLoader)
    }

    return try {
        val result = ArrayList<MainDispatcherFactory>(2)
        createInstanceOf(clz, "kotlinx.coroutines.android.AndroidDispatcherFactory")?.apply { result.add(this) }
        createInstanceOf(clz, "kotlinx.coroutines.test.internal.TestMainDispatcherFactory")?.apply { result.add(this) }
        result
    } catch (e: Throwable) {
        // Fallback to the regular SL in case of any unexpected exception
        load(clz, clz.classLoader)
    }
}

As shown, AndroidDispatcherFactory is instantiated through reflection:

internal class AndroidDispatcherFactory : MainDispatcherFactory {

    override fun createDispatcher(allFactories: List<MainDispatcherFactory>) =
        HandlerContext(Looper.getMainLooper().asHandler(async = true))
    // ...omitted
}

As shown, it creates a HandlerContext from a Handler built on Looper.getMainLooper():

internal class HandlerContext private constructor(
    private val handler: Handler,
    private val name: String?,
    private val invokeImmediately: Boolean
) : HandlerDispatcher(), Delay {
    /**
     * Creates [CoroutineDispatcher] for the given Android [handler].
     *
     * @param handler a handler.
     * @param name an optional name for debugging.
     */
    public constructor(
        handler: Handler,
        name: String? = null
    ) : this(handler, name, false)

    // ...omitted
    override fun dispatch(context: CoroutineContext, block: Runnable) {
        handler.post(block)
    }
}

So the conclusion: coroutines, too, switch to the UI thread through a Handler.
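The same idea can be tried off Android with a plain executor standing in for the Handler. In this sketch ExecutorInterceptor and threadNameOfBody are hypothetical names, and the single thread named "sketch-main" plays the role of the main looper; it uses only the standard library and is not how HandlerContext itself is written.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.Executor
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.Continuation
import kotlin.coroutines.ContinuationInterceptor
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.startCoroutine

// Hypothetical mini dispatcher: every resumption is posted to the executor,
// much like HandlerContext.dispatch posts the block to a Handler.
class ExecutorInterceptor(private val executor: Executor) :
    AbstractCoroutineContextElement(ContinuationInterceptor), ContinuationInterceptor {

    override fun <T> interceptContinuation(continuation: Continuation<T>): Continuation<T> =
        object : Continuation<T> {
            override val context: CoroutineContext get() = continuation.context
            override fun resumeWith(result: Result<T>) {
                executor.execute { continuation.resumeWith(result) }
            }
        }
}

fun threadNameOfBody(): String {
    // One named thread stands in for the Android main thread.
    val executor = Executors.newSingleThreadExecutor { r -> Thread(r, "sketch-main") }
    val done = CompletableFuture<String>()
    val body: suspend () -> String = { Thread.currentThread().name }
    val completion = Continuation<String>(ExecutorInterceptor(executor)) { result ->
        done.complete(result.getOrThrow())
    }
    try {
        body.startCoroutine(completion)
        return done.get(5, TimeUnit.SECONDS)
    } finally {
        executor.shutdown()
    }
}

fun main() {
    println(threadNameOfBody()) // sketch-main
}
```

The coroutine body reports the executor's thread even though it was started from another thread, because the interceptor reroutes the first resumption.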

There is a lot more in here worth studying and thinking about; I will keep digging later. That's all for now.