Kotlin Coroutines (9): Exception Handling


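In Kotlin coroutines, exceptions are usually divided into two categories: cancellation exceptions (CancellationException) and everything else. The split exists because coroutines handle the two differently: of all the exceptions a coroutine can see, CancellationException has to be singled out and treated specially.

When a coroutine is cancelled, a CancellationException is produced inside it. Many beginners run into the same problem: a coroutine that cannot be cancelled. Keep that question in mind for the sections below.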

1. Cancellation needs the coroutine's cooperation

Start with the following example:

import kotlinx.coroutines.*   // required by every snippet in this article

fun main() {
    runBlocking {
        printMsg("start")
        val job = launch(Dispatchers.IO) {
            var i = 0
            while (true) {
                Thread.sleep(500L)   // blocks the thread; this is not a suspension point
                i++
                printMsg("i = $i")
            }
        }
        delay(2000L)
        job.cancel()   // cancel the coroutine from the enclosing scope after 2 seconds
        job.join()
        printMsg("end")
    }
}

// Log
main @coroutine#1 start
DefaultDispatcher-worker-1 @coroutine#2 i = 1
DefaultDispatcher-worker-1 @coroutine#2 i = 2
DefaultDispatcher-worker-1 @coroutine#2 i = 3
// it never stops
......

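Why does the coroutine not exit after 2 seconds? Because coroutine cancellation is cooperative. Calling cancel() only marks the job as cancelling; the coroutine itself has to check that state and respond at a suitable point. The state can be read from the coroutine's isActive property. The code above loops forever on a blocking Thread.sleep and never checks isActive or reaches a suspension point, so it never notices the cancellation request and never exits.

Here is the code reworked: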

fun main() {
    runBlocking {
        printMsg("start")
        val job = launch(Dispatchers.IO) {
            var i = 0
            while (isActive) {      // Approach 1: check the cancellation state explicitly
                Thread.sleep(500L)
                //delay(500L)    // Approach 2: a suspending function that checks the state at its suspension point
                i++
                printMsg("i = $i")
            }
        }
        delay(2000L)
        job.cancel()
        job.join()
        printMsg("end")
    }
}

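With either change the loop stops shortly after cancel() is called, join() returns, and "end" is printed.

Two approaches are shown:

  • Approach 1: check isActive explicitly on every iteration.

  • Approach 2: replace sleep with the suspending function delay. delay is cancellable: it checks the job's state when it suspends and when it resumes, and throws CancellationException instead of resuming a coroutine that has already been cancelled.

In short, if a coroutine cannot be cancelled, check whether its code ever gets a chance to observe the cancellation state.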
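Besides reading isActive, a loop can call ensureActive(), which throws CancellationException as soon as the job has been cancelled. The sketch below is only an illustration and is not part of the original example: countUntilCancelled is a hypothetical helper, and the 50 ms and 220 ms timings are arbitrary assumptions.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper (not from the article): runs a blocking loop in a child
// coroutine, cancels it after about 220 ms, and returns how many iterations ran.
suspend fun countUntilCancelled(): Int = coroutineScope {
    var iterations = 0
    val job = launch(Dispatchers.Default) {
        while (true) {
            ensureActive()      // throws CancellationException once the job is cancelled
            Thread.sleep(50L)   // stand-in for blocking work that cannot suspend
            iterations++
        }
    }
    delay(220L)
    job.cancelAndJoin()         // request cancellation, then wait for the loop to exit
    iterations
}

fun main() = runBlocking {
    println("iterations before cancellation: ${countUntilCancelled()}")
}
```

The loop ends because ensureActive() throws at the top of the next iteration; the CancellationException is treated as normal completion of a cancelled coroutine and does not fail the surrounding scope.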

2. Do not break the parent-child structure of coroutines

Consider the following example:

var startTime: Long = 0
fun main() {
    runBlocking {
        startTime = System.currentTimeMillis()
        printMsg("start")
        var childJob1: Job? = null
        var childJob2: Job? = null
        val parentJob = launch(Dispatchers.IO) {
            childJob1 = launch {        // child inherits the parent's context, including its Job
                printMsg("childJob1 start")
                delay(600L)          // child suspends for 600 ms, then finishes
                printMsg("childJob1 end")
            }

            childJob2 = launch(Job()) {       // child supplies its own Job, which replaces the inherited one
                printMsg("childJob2 start")
                delay(600L)          // child suspends for 600 ms, then finishes
                printMsg("childJob2 end")
            }
        }

        delay(400L)
        parentJob.cancel()    // cancel the parent after 400 ms
        printMsg("childJob1.isActive=${childJob1?.isActive}")   // state of child 1 when main finishes
        printMsg("childJob2.isActive=${childJob2?.isActive}")   // state of child 2 when main finishes
        printMsg("end")       // main finishes here
    }
}

fun printMsg(msg: Any) {
    println("Message: $msg  Elapsed: ${System.currentTimeMillis() - startTime}  Thread: ${Thread.currentThread().name} ")
}

// Log
Message: start  Elapsed: 0  Thread: main @coroutine#1
Message: childJob1 start  Elapsed: 15  Thread: DefaultDispatcher-worker-3 @coroutine#3
Message: childJob2 start  Elapsed: 22  Thread: DefaultDispatcher-worker-2 @coroutine#4
Message: childJob1.isActive=false  Elapsed: 430  Thread: main @coroutine#1
Message: childJob2.isActive=true  Elapsed: 430  Thread: main @coroutine#1    <------ child 2 has not stopped when main finishes: isActive=true
Message: end  Elapsed: 430  Thread: main @coroutine#1
Process finished with exit code 0

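The code is straightforward and the comments cover the details. Because child 2 was launched with its own Job, it is no longer a child of parentJob. When the parent is cancelled, child 2 is not cancelled and its isActive is still true.

So do not break the parent-child structure of coroutines!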
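One way to see the broken link directly is to ask the parent job which children it knows about. The following sketch is an illustration under assumptions rather than the article's own code: countStructuredChildren is a hypothetical helper, and the delays are arbitrary.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper (not from the article): returns how many children
// `parent` can see while both launched coroutines are still running.
fun countStructuredChildren(): Int = runBlocking {
    var detached: Job? = null
    val parent = launch {
        launch { delay(1_000L) }                    // inherits the parent's Job: a real child
        detached = launch(Job()) { delay(1_000L) }  // own Job(): not a child of `parent`
    }
    delay(50L)                             // give both coroutines time to start
    val visible = parent.children.count()  // only the structured child is listed
    parent.cancelAndJoin()                 // cancels the structured child along with the parent
    detached?.cancel()                     // the detached coroutine has to be cancelled by hand
    visible
}

fun main() {
    println("children seen by parent: ${countStructuredChildren()}")
}
```

Only the coroutine that inherited the parent's Job should be counted; the one started with Job() is invisible to the parent, which is why cancelling the parent cannot reach it.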

3. Do not wrap launch or async directly in try-catch

Consider the following example:

fun main() {
    runBlocking {
        printMsg("start")
        try {
            printMsg("try start")
            launch {
                printMsg("launch start")
                delay(200L)
                1 / 0          // throws ArithmeticException after 200 ms
                printMsg("launch end")
            }
            printMsg("try end")
        } catch (exception: Exception) {
            printMsg("catch $exception")
        }
        printMsg("end")
    }
}

// Log
main @coroutine#1 start
main @coroutine#1 try start
main @coroutine#1 try end
main @coroutine#1 end
main @coroutine#2 launch start
Exception in thread "main" java.lang.ArithmeticException: / by zero    <----- the program crashes

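Although the try-catch surrounds the launch call, the program still crashes. launch only starts the child coroutine and returns immediately, so the try block has already finished by the time the exception is thrown 200 ms later. The child runs concurrently with its parent as a separate flow of execution, and its failure is reported through the Job hierarchy rather than through the caller's call stack, so the parent's try-catch never sees it.

Move the try-catch inside the coroutine: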

fun main() {
    runBlocking {
        printMsg("start")
        launch {
            printMsg("launch start")
            try {
                printMsg("try start")
                delay(200L)
                1 / 0
                printMsg("try end")
            } catch (exception: Exception) {
                printMsg("catch $exception")
            }
            printMsg("launch end")
        }
        printMsg("end")
    }
}

// Log
main @coroutine#1 start
main @coroutine#1 end
main @coroutine#2 launch start
main @coroutine#2 try start
main @coroutine#2 catch java.lang.ArithmeticException: / by zero    <------ the exception is caught
main @coroutine#2 launch end
Process finished with exit code 0

With async, should the try-catch wrap the code inside the async block or the call to deferred.await()? Let's write some code and see:

fun main() {
    runBlocking {
        printMsg("start")
        val deferred = async() {
            printMsg("async start")
            delay(200L)
            1 / 0
            printMsg("async end")
        }

        try {
            deferred.await()
        } catch (exception: Exception) {
            printMsg("catch $exception")
        }

        printMsg("end")
    }
}

// Log
main @coroutine#1 start
main @coroutine#2 async start
main @coroutine#1 catch java.lang.ArithmeticException: / by zero     <------ the exception is caught
main @coroutine#1 end
Exception in thread "main" java.lang.ArithmeticException: / by zero   <----- the program still crashes

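The exception was caught at await(), yet the program still crashed: besides delivering the exception through await(), the failed async coroutine also cancels its parent (here runBlocking), which then rethrows it. So in general, put the try-catch around the specific code inside the coroutine.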
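If the try-catch has to stay outside the coroutine, one option is to wrap a coroutineScope block instead of the bare launch, because coroutineScope waits for its children and rethrows a child's failure to its caller. This is a minimal sketch of that idea; catchAroundCoroutineScope is a hypothetical helper and not something the original example uses.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper (not from the article): returns the message of the
// exception caught around coroutineScope, or null if nothing was thrown.
fun catchAroundCoroutineScope(): String? = runBlocking {
    try {
        coroutineScope {
            launch {
                delay(50L)
                throw ArithmeticException("/ by zero")
            }
        }
        null
    } catch (e: ArithmeticException) {
        e.message   // the child's failure surfaces here, at the coroutineScope call
    }
}

fun main() {
    println("caught: ${catchAroundCoroutineScope()}")
}
```

The difference from wrapping launch is that coroutineScope is a suspending call that does not return until its children finish, so the try block is still on the call stack when the failure is reported.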

4. Using SupervisorJob

The async example in section 3 still crashes even with try-catch around deferred.await(). Can it be rescued? Yes, with SupervisorJob(). The code looks like this:

fun main() {
    runBlocking {
        printMsg("start")
        val deferred = async(SupervisorJob()) {      // the only change
            printMsg("async start")
            delay(200L)
            1 / 0
            printMsg("async end")
        }

        try {
            deferred.await()
        } catch (exception: Exception) {
            printMsg("catch $exception")
        }

        printMsg("end")
    }
}

// Log
main @coroutine#1 start
main @coroutine#2 async start
main @coroutine#1 catch java.lang.ArithmeticException: / by zero
main @coroutine#1 end
Process finished with exit code 0

Why does adding SupervisorJob() stop the crash? Look at the source of SupervisorJob():

@Suppress("FunctionName")
public fun SupervisorJob(parent: Job? = null) : CompletableJob = SupervisorJobImpl(parent)

public interface CompletableJob : Job {
    
    public fun complete(): Boolean

    public fun completeExceptionally(exception: Throwable): Boolean
}

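SupervisorJob() is not a constructor; it is an ordinary top-level function, and the object it returns is a subtype of Job. With a regular Job, a failure propagates upward: when a child coroutine throws, its parent is cancelled, and with it all of the child's siblings.

With SupervisorJob we get a job hierarchy in which failures are handled independently: a failing child does not cancel its supervisor parent or its siblings. In the async example, the SupervisorJob passed to async takes the place of the Job inherited from runBlocking, so the failure never reaches runBlocking and the program does not crash. The flip side is that this async is no longer a child of runBlocking, which is the same trade-off as launch(Job()) in section 2.

A SupervisorJob() can also be put in the context of a CoroutineScope, but its supervision does not reach arbitrarily deep. Consider the following example: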

fun main() {
    runBlocking {
        val supervisorJob = SupervisorJob()
        val scope = CoroutineScope(coroutineContext + supervisorJob)    // the scope uses the SupervisorJob
        val job = scope.launch {              // note: a child coroutine launched in the scope
            launch {                   // note: a grandchild, whose parent is an ordinary Job
                printMsg("job1 start")
                delay(200L)
                throw  ArithmeticException("by zero")
            }
            launch {
                printMsg("job2 start")
                delay(300L)
                printMsg("job2 end")      // watch for this log line
            }
        }
        job.join()
        scope.cancel()
    }
}

// Log
main @coroutine#3 job1 start
main @coroutine#4 job2 start
Exception in thread "main @coroutine#4" java.lang.ArithmeticException: by zero
Process finished with exit code 0

The log never shows job2 end, so the failure in job1 affected job2. How can this be fixed?

fun main() {
    runBlocking {
        val supervisorJob = SupervisorJob()
        val scope = CoroutineScope(coroutineContext + supervisorJob)
        scope.apply {               // the change: apply instead of launch, so job1 and job2 are direct children of the scope
            val job1 = launch {
                printMsg("job1 start")
                delay(200L)
                throw  ArithmeticException("by zero")
            }
            val job2 = launch {
                printMsg("job2 start")
                delay(300L)
                printMsg("job2 end")
            }
            job1.join()       // the change: join each job
            job2.join()
        }
        scope.cancel()
    }
}

// Log
main @coroutine#2 job1 start
main @coroutine#3 job2 start
Exception in thread "main @coroutine#2" java.lang.ArithmeticException: by zero
main @coroutine#3 job2 end        <-------- job2 end is printed
Process finished with exit code 0

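As the two runs show, when a SupervisorJob is the Job of a CoroutineScope, it supervises only the coroutines launched directly in that scope, that is, its direct children. Coroutines nested deeper have ordinary Jobs as parents and fail together as usual.

In the source, SupervisorJob achieves this by overriding childCancelled to return false, so a child's failure is not passed up to the parent and, through it, to the other children: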

private class SupervisorJobImpl(parent: Job?) : JobImpl(parent) {
    override fun childCancelled(cause: Throwable): Boolean = false
}
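The effect of that override can be observed from the outside by failing one child under a plain Job() and one under a SupervisorJob() and then checking whether the parent job is still active. The sketch below is illustrative only: parentActiveAfterChildFailure is a hypothetical helper, and the empty CoroutineExceptionHandler is an assumption added just to keep the demo's output quiet.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper (not from the article): launches one failing child under
// `parent` and reports whether `parent` is still active after the child finished.
fun parentActiveAfterChildFailure(parent: CompletableJob): Boolean = runBlocking {
    // Swallow the exception so it is not printed as an uncaught exception.
    val quiet = CoroutineExceptionHandler { _, _ -> }
    launch(parent + quiet) { throw IllegalStateException("boom") }.join()
    val active = parent.isActive
    parent.cancel()   // release the parent job created for this demo
    active
}

fun main() {
    println("Job() still active:           ${parentActiveAfterChildFailure(Job())}")
    println("SupervisorJob() still active: ${parentActiveAfterChildFailure(SupervisorJob())}")
}
```

A plain Job is cancelled by its child's failure, while the SupervisorJob stays active and can keep running other children.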

Kotlin in fact provides a coroutine scope with supervisor behaviour built in: supervisorScope. Its source is as follows:


/**
 * Creates a [CoroutineScope] with [SupervisorJob] and calls the specified suspend block with this scope.
 * The provided scope inherits its [coroutineContext][CoroutineScope.coroutineContext] from the outer scope, but overrides
 * context's [Job] with [SupervisorJob].
 * This function returns as soon as the given block and all its child coroutines are completed.
 *
 * Unlike [coroutineScope], a failure of a child does not cause this scope to fail and does not affect its other children,
 * so a custom policy for handling failures of its children can be implemented. See [SupervisorJob] for additional details.
 * A failure of the scope itself (exception thrown in the [block] or external cancellation) fails the scope with all its children,
 * but does not cancel parent job.
 *
 * The method may throw a [CancellationException] if the current job was cancelled externally,
 * or rethrow an exception thrown by the given [block].
 */
public suspend fun <R> supervisorScope(block: suspend CoroutineScope.() -> R): R {
    contract {
        callsInPlace(block, InvocationKind.EXACTLY_ONCE)
    }
    return suspendCoroutineUninterceptedOrReturn { uCont ->
        val coroutine = SupervisorCoroutine(uCont.context, uCont)      // SupervisorCoroutine
        coroutine.startUndispatchedOrReturn(coroutine, block)
    }
}

private class SupervisorCoroutine<in T>(
    context: CoroutineContext,
    uCont: Continuation<T>
) : ScopeCoroutine<T>(context, uCont) {
    override fun childCancelled(cause: Throwable): Boolean = false    // childCancelled is overridden to return false here as well
}

Let's rework the code above with supervisorScope:

fun main() {
    runBlocking {
        supervisorScope {
            val job1 = launch {
                printMsg("job1 start")
                delay(200L)
                throw  ArithmeticException("by zero")
            }
            val job2 = launch {
                printMsg("job2 start")
                delay(300L)
                printMsg("job2 end")
            }
            job1.join()
            job2.join()
        }
    }
}

// Log
main @coroutine#2 job1 start
main @coroutine#3 job2 start
Exception in thread "main @coroutine#2" java.lang.ArithmeticException: by zero
main @coroutine#3 job2 end        <-------- job2 end is printed
Process finished with exit code 0
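supervisorScope also gives a structured alternative to async(SupervisorJob()) from the start of this section: the async stays a child of the enclosing scope, and try-catch around await() is enough. This is a sketch under assumptions, not the article's code; awaitInsideSupervisorScope is a hypothetical helper.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper (not from the article): awaits a failing async inside
// supervisorScope and returns a description of what happened.
fun awaitInsideSupervisorScope(): String = runBlocking {
    supervisorScope {
        val deferred = async<Int> {
            delay(50L)
            throw ArithmeticException("/ by zero")
        }
        try {
            "result ${deferred.await()}"
        } catch (e: ArithmeticException) {
            "caught ${e.message}"   // the failure is delivered only through await()
        }
    }
}

fun main() {
    println(awaitInsideSupervisorScope())
}
```

Because the supervisor scope does not fail when a child fails, runBlocking is not cancelled and the function returns normally.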

5. CoroutineExceptionHandler

Sometimes coroutines are nested many levels deep and there is no need for every coroutine to handle exceptions itself. That is where CoroutineExceptionHandler comes in:

fun main() {
    runBlocking {
        val coroutineExceptionHandler = CoroutineExceptionHandler { _, throwable ->
            printMsg("CoroutineExceptionHandler $throwable")
        }
        val scope = CoroutineScope(coroutineExceptionHandler)
        val job = scope.launch {
            launch {
                printMsg("job1 start")
                delay(200L)
                throw  ArithmeticException("by zero")
            }
            launch {
                printMsg("job2 start")
                delay(300L)
                printMsg("job2 end")
            }
        }
        job.join()
        scope.cancel()
    }
}

// Log
DefaultDispatcher-worker-2 @coroutine#3 job1 start
DefaultDispatcher-worker-3 @coroutine#4 job2 start
DefaultDispatcher-worker-3 @coroutine#4 CoroutineExceptionHandler java.lang.ArithmeticException: by zero
Process finished with exit code 0

The CoroutineExceptionHandler logged the exception. What happens if the CoroutineExceptionHandler is instead attached to the child coroutine that throws?

fun main() {
    runBlocking {
        val coroutineExceptionHandler = CoroutineExceptionHandler { _, throwable ->
            printMsg("CoroutineExceptionHandler $throwable")
        }
        val scope = CoroutineScope(coroutineContext)       // the change: the scope no longer carries the handler
        val job = scope.launch {
            launch(coroutineExceptionHandler) {       // the change: the handler is attached to the failing child
                printMsg("job1 start")
                delay(200L)
                throw  ArithmeticException("by zero")
            }
            launch {
                printMsg("job2 start")
                delay(300L)
                printMsg("job2 end")
            }
        }
        job.join()
        scope.cancel()
    }
}

// Log
main @coroutine#3 job1 start
main @coroutine#4 job2 start
Exception in thread "main" java.lang.ArithmeticException: by zero      <------- the program crashes
Process finished with exit code 1

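The program crashes and coroutineExceptionHandler never sees the exception, so it had no effect. The reason is that a CoroutineExceptionHandler only takes effect on a top-level coroutine: when a child coroutine fails, the exception is reported up to the top-level parent, and it is that parent's CoroutineExceptionHandler that is invoked.

Neither of the two logs above shows job2 end, so job1's failure affected job2. What if we want coroutineExceptionHandler as a safety net and also want coroutines not to affect each other when one fails? We can try writing it like this: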
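The handler can also be passed to the top-level launch itself rather than to the scope. The sketch below illustrates that placement under assumptions: handledByRootCoroutine is a hypothetical helper, and the standalone CoroutineScope(Job()) is chosen for the demo so that the launched coroutine is a root coroutine.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical helper (not from the article): records what the handler
// installed on the root coroutine receives when a nested child fails.
fun handledByRootCoroutine(): List<String> {
    val handled = CopyOnWriteArrayList<String>()
    val handler = CoroutineExceptionHandler { _, t -> handled += "handled: ${t.message}" }
    val scope = CoroutineScope(Job())   // assumption: a standalone scope, not tied to runBlocking
    runBlocking {
        scope.launch(handler) {         // root coroutine: its handler is the one that counts
            launch {                    // nested child: its failure is reported to the root
                throw ArithmeticException("by zero")
            }
        }.join()
    }
    scope.cancel()
    return handled
}

fun main() {
    println(handledByRootCoroutine())
}
```

The nested child has no handler of its own; its exception travels up to the root coroutine, whose handler is invoked once.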

fun main() {
    runBlocking {
        val supervisorJob = SupervisorJob()        // use SupervisorJob()
        val coroutineExceptionHandler = CoroutineExceptionHandler { _, throwable ->
            printMsg("CoroutineExceptionHandler $throwable")
        }
        val scope = CoroutineScope(coroutineExceptionHandler + supervisorJob)    // both go into the scope's context
        scope.apply {
            val job1 = launch {
                printMsg("job1 start")
                delay(100L)
                throw  NullPointerException("parameters is null")     // exception in a child coroutine
            }

            val job2 = launch {
                printMsg("job2 start")
                delay(200L)
                launch {                 // grandchild coroutine
                    try {
                        1 / 0            // exception in the grandchild
                    } catch (exception: ArithmeticException) {
                        throw  ArithmeticException("by zero")     // rethrow: an exception swallowed here never reaches the handler
                    }
                }
            }

            val job3 = launch {
                printMsg("job3 start")
                delay(300L)
                printMsg("job3 end")
            }

            job1.join()
            job2.join()
            job3.join()
        }
        scope.cancel()
    }
}

// Log
DefaultDispatcher-worker-1 @coroutine#2 job1 start
DefaultDispatcher-worker-2 @coroutine#3 job2 start
DefaultDispatcher-worker-3 @coroutine#4 job3 start
DefaultDispatcher-worker-2 @coroutine#2 CoroutineExceptionHandler java.lang.NullPointerException: parameters is null
DefaultDispatcher-worker-3 @coroutine#5 CoroutineExceptionHandler java.lang.ArithmeticException: by zero
DefaultDispatcher-worker-3 @coroutine#4 job3 end
Process finished with exit code 0

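Parts of this article draw on the following articles, with thanks to their authors:

try-catch居然会不起作用?坑! (try-catch doesn't work? A pitfall!)

Kotlin协程异常机制与优雅封装 (Kotlin coroutine exception mechanism and elegant encapsulation)

Kotlin | 关于协程异常处理,你想知道的都在这里 (Kotlin | Everything you want to know about coroutine exception handling)

These are study notes. Corrections are welcome.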