A Few Misconceptions About Kotlin Coroutine Exceptions


Handling coroutine exceptions

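Normally, when a coroutine in a coroutine scope throws an uncaught exception, the exception flows as follows:

  • The failing coroutine is cancelled
  • The exception is passed to its parent coroutine
  • The parent coroutine is cancelled, which cancels all of its children
  • The exception propagates further up the coroutine tree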

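Sometimes we need the failure of one child coroutine to leave its siblings running, and that is what SupervisorJob is for. A SupervisorJob confines a child's exception to that child and cuts off its path upward. With a SupervisorJob, a child that fails does not affect the other children, and the SupervisorJob does not propagate the exception; the failing coroutine has to handle it itself.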
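To make "the failing coroutine has to handle it itself" concrete, here is a minimal sketch, assuming kotlinx.coroutines on the JVM. The function name handledBySelf and the event strings are made up for illustration. It installs a CoroutineExceptionHandler in the scope's context, so a failing child of a SupervisorJob reports its exception there while its sibling keeps running.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical demo: a child of a SupervisorJob reports its own failure
// through the CoroutineExceptionHandler found in its context.
fun handledBySelf(): List<String> = runBlocking {
    val events = CopyOnWriteArrayList<String>()   // thread-safe, written from worker threads
    val handler = CoroutineExceptionHandler { _, e -> events += "handled: ${e.message}" }
    val scope = CoroutineScope(SupervisorJob() + handler)

    val failing = scope.launch { throw IllegalStateException("boom") }
    val sibling = scope.launch {
        delay(100)
        events += "sibling finished"
    }

    joinAll(failing, sibling)   // wait for both children, not for the SupervisorJob itself
    events.toList()
}

fun main() {
    handledBySelf().forEach(::println)
}
```

Without the handler, the same exception would go to the thread's uncaught exception handler, which is what produces the stack traces in the logs of this article.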

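A SupervisorJob can be passed in as a parameter when creating a CoroutineScope, or you can use supervisorScope to create a custom coroutine scope, so there are only the following two ways to use it:

  • supervisorScope{}
  • CoroutineScope(SupervisorJob())

Note, however, that with either SupervisorJob or Job, an exception raised inside a coroutine is still thrown. The type of job does not make it disappear.

This is the misconception: do not assume that a coroutine can no longer crash once you use SupervisorJob. Whatever Job you use, an uncaught exception still surfaces as a crash. The two differ only in whether the failure affects other coroutines. The examples below show how the exception is handled in each case.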

Plain Job

runBlocking {
    val job = Job()
    val scopeSuper = CoroutineScope(job)

    scopeSuper.launch {
        "start job1 delay".println()
        delay(1000)
        "end job1 delay".println()
    }

    scopeSuper.launch {
        "job2 throw execption".println()
        throw NullPointerException("我异常了")
    }

    scopeSuper.launch {
        delay(2000)
        "start job3 delay".println()
    }

    job.children.forEach { it.join() }

    // scopeSuper.coroutineContext[Job] and job are the same Job, so the call below is equivalent
    // scopeSuper.coroutineContext[Job]?.children?.forEach { it.join() }

    "application end".println()
}

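A common mistake with code like the above is to create a job with Job(), use it as the parent of some coroutines, and then call join on the job itself, as shown below. Such a program never finishes: join waits forever because the job stays active even after all of its children have completed. The job could still be used to start more coroutines, so it is not considered complete. It behaves much like a thread running an endless loop, which never exits on its own unless it is stopped explicitly or fails with an exception. A job created this way only reaches the completed state when complete() is called on it explicitly.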

job.join()

The output is as follows:

2022-08-17 14:55:45.545: [Log] job2 throw execption
2022-08-17 14:55:45.545: [Log] start job1 delay
Exception in thread "DefaultDispatcher-worker-2" java.lang.NullPointerException: 我异常了
	at MainKt$main$1$2.invokeSuspend(Main.kt:17)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:749)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@60ba2a13, Dispatchers.Default]
2022-08-17 14:55:45.563: [Log] application end

Process finished with exit code 0

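From the output above we can see that job3 never ran, and the part of job1 after the delay never ran either. In other words, the exception in job2 caused the parent job to fail, and the parent job then cancelled all of its children, so job1 and job3 were cancelled as well.

One more point: if we change job.children.forEach { it.join() } above to job.join(), the output is the same, because the exception in job2 cancels the parent job and lets job.join() return. If we comment out the exception in job2, job.join() waits forever and "application end" is never printed, unless job.complete() is called explicitly.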
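The behavior of join on a job created with Job() can be checked with a small sketch. The function name joinOnPlainJob and the 200 ms timeouts are arbitrary choices for this illustration. It uses withTimeoutOrNull to give up on a join that would otherwise wait forever, then calls complete() and joins again.

```kotlin
import kotlinx.coroutines.*

// Returns a pair: (did join() return before complete()?, did it return after complete()?)
fun joinOnPlainJob(): Pair<Boolean, Boolean> = runBlocking {
    val job = Job()
    val scope = CoroutineScope(job)
    scope.launch { delay(50) }.join()   // the only child has finished at this point

    // The parent job is still active, so join() keeps waiting; give up after 200 ms.
    val before = withTimeoutOrNull(200) { job.join() } != null

    job.complete()                      // explicitly move the job to the completed state
    val after = withTimeoutOrNull(200) { job.join() } != null

    before to after
}

fun main() {
    println(joinOnPlainJob())   // prints (false, true)
}
```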

SupervisorJob

Next we change Job to SupervisorJob and look at the output:

runBlocking {
    val job = SupervisorJob()
    val scopeSuper = CoroutineScope(job)

    scopeSuper.launch {
        "start job1 delay".println()
        delay(1000)
        "end job1 delay".println()
    }

    scopeSuper.launch {
        "job2 throw execption".println()
        throw NullPointerException("我异常了")
    }

    scopeSuper.launch {
        delay(2000)
        "start job3 delay".println()
    }
    
    job.children.forEach { it.join() }

    // scopeSuper.coroutineContext[Job] and job are the same Job, so the call below is equivalent
    // scopeSuper.coroutineContext[Job]?.children?.forEach { it.join() }

    "application end".println()
}

The output is as follows:

2022-08-17 15:26:56.535: [Log] start job1 delay
2022-08-17 15:26:56.535: [Log] job2 throw execption
Exception in thread "DefaultDispatcher-worker-2" java.lang.NullPointerException: 我异常了
	at MainKt$main$1$2.invokeSuspend(Main.kt:18)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:749)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@734cc8e8, Dispatchers.Default]
2022-08-17 15:26:57.544: [Log] end job1 delay
2022-08-17 15:26:58.494: [Log] start job3 delay
2022-08-17 15:26:58.494: [Log] application end

Process finished with exit code 0

We can see that job1 and job3 both ran to completion, which means the exception in job2 did not affect them.

Next, let's look at the following code:
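The difference between the two parent jobs can also be read from the job states instead of the logs. The sketch below is an illustration only: siblingOutcome is a hypothetical helper, and the empty CoroutineExceptionHandler is there just to keep the stack trace out of stderr. It runs one failing child and one slow sibling under the given parent and reports who ended up cancelled.

```kotlin
import kotlinx.coroutines.*

// Runs a failing child and a slow sibling under `parent`, then reports
// whether the sibling and the parent ended up cancelled.
fun siblingOutcome(parent: CompletableJob): String = runBlocking {
    val quiet = CoroutineExceptionHandler { _, _ -> }   // swallow the report for this demo only
    val scope = CoroutineScope(parent + quiet)

    val sibling = scope.launch { delay(200) }
    val failing = scope.launch { throw IllegalStateException("boom") }

    joinAll(failing, sibling)
    "sibling cancelled=${sibling.isCancelled}, parent cancelled=${parent.isCancelled}"
}

fun main() {
    println("Job:           " + siblingOutcome(Job()))
    // Job:           sibling cancelled=true, parent cancelled=true
    println("SupervisorJob: " + siblingOutcome(SupervisorJob()))
    // SupervisorJob: sibling cancelled=false, parent cancelled=false
}
```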

runBlocking {
    val job = SupervisorJob()
    val scopeSuper = CoroutineScope(job)

    scopeSuper.launch {
        launch {
            "start job1 delay".println()
            delay(1000)
            "end job1 delay".println()
        }

        launch {
            "job2 throw execption".println()
            throw NullPointerException("我异常了")
        }

        launch {
            delay(2000)
            "start job3 delay".println()
        }
    }
    
    scopeSuper.launch {
        delay(2000)
        "start job4 delay".println()
    }

    job.children.forEach { it.join() }

    // scopeSuper.coroutineContext[Job] and job are the same Job, so the call below is equivalent
    // scopeSuper.coroutineContext[Job]?.children?.forEach { it.join() }

    "application end".println()
}

Output:

2022-08-17 15:38:05.775: [Log] start job1 delay
2022-08-17 15:38:05.775: [Log] job2 throw execption
Exception in thread "DefaultDispatcher-worker-1" java.lang.NullPointerException: 我异常了
	at MainKt$main$1$1$2.invokeSuspend(Main.kt:19)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:749)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@742da77c, Dispatchers.Default]
2022-08-17 15:38:07.734: [Log] start job4 delay
2022-08-17 15:38:07.735: [Log] application end

Process finished with exit code 0

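What happened? job1 and job3 did not complete, while job4 completed normally. The reason is that SupervisorJob only takes effect for the direct children of the scope. job1, job2, and job3 are grandchildren: their parent is the outer launch, which has an ordinary Job, so the exception in job2 still propagates up to that parent and cancels its other children. The propagation stops at the SupervisorJob, which is why job4 is unaffected.

What if we modify the code like this?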

runBlocking {
    val job = SupervisorJob()
    val scopeSuper = CoroutineScope(job)

    scopeSuper.launch {
        launch {
            "start job1 delay".println()
            delay(1000)
            "end job1 delay".println()
        }

        launch(job) {
            "job2 throw execption".println()
            throw NullPointerException("我异常了")
        }

        launch {
            delay(2000)
            "start job3 delay".println()
        }
    }

    scopeSuper.launch {
        delay(2000)
        "start job4 delay".println()
    }

    job.children.forEach { it.join() }

    // scopeSuper.coroutineContext[Job] and job are the same Job, so the call below is equivalent
    // scopeSuper.coroutineContext[Job]?.children?.forEach { it.join() }

    "application end".println()
}

Here is the output:

2022-08-17 15:40:27.372: [Log] start job1 delay
2022-08-17 15:40:27.372: [Log] job2 throw execption
Exception in thread "DefaultDispatcher-worker-4" java.lang.NullPointerException: 我异常了
	at MainKt$main$1$1$2.invokeSuspend(Main.kt:19)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:570)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:749)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:677)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@1cbf468b, Dispatchers.Default]
2022-08-17 15:40:28.385: [Log] end job1 delay
2022-08-17 15:40:29.334: [Log] start job4 delay
2022-08-17 15:40:29.334: [Log] start job3 delay
2022-08-17 15:40:29.334: [Log] application end

Process finished with exit code 0
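This time job1 and job3 run to completion. A plausible reading is that launch(job) replaces the inherited Job in the new coroutine's context, so job2 becomes a direct child of the SupervisorJob rather than a child of the outer launch. The sketch below checks that reading by counting the direct children of the job. The name directChildren and the use of CompletableDeferred as a "children have been started" signal are choices made for this illustration.

```kotlin
import kotlinx.coroutines.*

// Counts the direct children of the SupervisorJob while all coroutines are still running.
fun directChildren(): Int = runBlocking {
    val job = SupervisorJob()
    val scope = CoroutineScope(job)
    val started = CompletableDeferred<Unit>()

    scope.launch {
        launch { delay(1_000) }        // child of this outer coroutine
        launch(job) { delay(1_000) }   // explicit parent: direct child of `job`
        started.complete(Unit)
    }

    started.await()
    val count = job.children.count()   // the outer coroutine plus the launch(job) coroutine
    job.cancelAndJoin()                // clean up the still-running coroutines
    count
}

fun main() {
    println(directChildren())   // prints 2
}
```

With a plain launch instead of launch(job), the count would be 1, because both inner coroutines would hang off the outer one.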

supervisorScope

runBlocking {
    supervisorScope {
        launch {
            "start job1 delay".println()
            delay(1000)
            "end job1 delay".println()
        }

        launch {
            "job2 throw execption".println()
            throw NullPointerException("我异常了")
        }

        launch {
            delay(2000)
            "start job3 delay".println()
        }
    }

    "application end".println()
}

Output:

2022-08-17 15:44:15.186: [Log] start job1 delay
2022-08-17 15:44:15.192: [Log] job2 throw execption
Exception in thread "main" java.lang.NullPointerException: 我异常了
	at MainKt$main$1$1$2.invokeSuspend(Main.kt:16)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:284)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at MainKt.main(Main.kt:6)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@d70c109, BlockingEventLoop@17ed40e0]
2022-08-17 15:44:16.195: [Log] end job1 delay
2022-08-17 15:44:17.235: [Log] start job3 delay
2022-08-17 15:44:17.236: [Log] application end

Process finished with exit code 0

As we can see, supervisorScope also stops the exception from propagating.

Now let's look at the following code:
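For contrast, the following sketch puts supervisorScope next to coroutineScope. The helper scopeOutcome and its boolean flag are hypothetical, and the empty handler only silences the report in the supervisor case. With coroutineScope a failing child makes the whole scope fail and the exception is rethrown to the caller; with supervisorScope the scope returns normally.

```kotlin
import kotlinx.coroutines.*

// Runs one failing child inside either supervisorScope or coroutineScope
// and reports how the scope call itself ended.
fun scopeOutcome(useSupervisor: Boolean): String = runBlocking {
    val quiet = CoroutineExceptionHandler { _, _ -> }   // only consulted in the supervisor case
    try {
        if (useSupervisor) {
            supervisorScope { launch(quiet) { throw NullPointerException("boom") } }
        } else {
            coroutineScope { launch(quiet) { throw NullPointerException("boom") } }
        }
        "scope returned normally"
    } catch (e: NullPointerException) {
        "scope rethrew: ${e.message}"
    }
}

fun main() {
    println(scopeOutcome(useSupervisor = true))    // scope returned normally
    println(scopeOutcome(useSupervisor = false))   // scope rethrew: boom
}
```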

runBlocking {
    supervisorScope {
        launch {
            "start job1 delay".println()

            launch {
                "job4 throw execption".println()
                throw NullPointerException("我异常了4")
            }

            launch {
                delay(500)
                "start job5 delay ".println()
            }

            delay(1000)
            "end job1 delay".println()
        }

        launch {
            "job2 throw execption".println()
            throw NullPointerException("我异常了")
        }

        launch {
            delay(2000)
            "start job3 delay".println()
        }
    }

    "application end".println()
}

Here is the output:

2022-08-17 15:52:00.101: [Log] start job1 delay
2022-08-17 15:52:00.110: [Log] job2 throw execption
Exception in thread "main" java.lang.NullPointerException: 我异常了
	at MainKt$main$1$1$2.invokeSuspend(Main.kt:27)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:284)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at MainKt.main(Main.kt:6)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@d70c109, BlockingEventLoop@17ed40e0]
Exception in thread "main" java.lang.NullPointerException: 我异常了4
	at MainKt$main$1$1$1$1.invokeSuspend(Main.kt:13)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:284)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at MainKt.main(Main.kt:6)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@71e7a66b, BlockingEventLoop@17ed40e0]
2022-08-17 15:52:00.137: [Log] job4 throw execption
2022-08-17 15:52:02.151: [Log] start job3 delay
2022-08-17 15:52:02.151: [Log] application end

Process finished with exit code 0

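The output shows that job1 and job5 were both affected by the exception in job4: neither "end job1 delay" nor "start job5 delay" was printed. If we comment out the exception in job4, the output is as follows: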

2022-08-17 15:54:45.445: [Log] start job1 delay
2022-08-17 15:54:45.453: [Log] job2 throw execption
Exception in thread "main" java.lang.NullPointerException: 我异常了
	at MainKt$main$1$1$2.invokeSuspend(Main.kt:27)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:284)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at MainKt.main(Main.kt:6)
	Suppressed: kotlinx.coroutines.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@d70c109, BlockingEventLoop@17ed40e0]
2022-08-17 15:54:45.480: [Log] job4 throw execption
2022-08-17 15:54:45.996: [Log] start job5 delay 
2022-08-17 15:54:46.463: [Log] end job1 delay
2022-08-17 15:54:47.491: [Log] start job3 delay
2022-08-17 15:54:47.491: [Log] application end

Process finished with exit code 0

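So supervisorScope is, in essence, still a SupervisorJob under the hood: it isolates its direct children from each other, but not their descendants.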
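Since the isolation covers direct children only, one option for nested coroutines is to open a supervisorScope at the level where the isolation is needed. This is a sketch under that assumption, not the only possible fix; nestedWithSupervisorScope and the quiet handler are illustrative names.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Wraps the nested launches in supervisorScope so that one failing
// grandchild no longer cancels the others.
fun nestedWithSupervisorScope(): List<String> = runBlocking {
    val finished = CopyOnWriteArrayList<String>()
    val quiet = CoroutineExceptionHandler { _, _ -> }   // keep the stack trace out of stderr
    val scope = CoroutineScope(SupervisorJob() + quiet)

    scope.launch {
        supervisorScope {
            launch { delay(100); finished += "job1" }
            launch { throw NullPointerException("boom") }
            launch { delay(200); finished += "job3" }
        }
    }.join()

    finished.sorted()
}

fun main() {
    println(nestedWithSupervisorScope())   // prints [job1, job3]
}
```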