The pitfall of holding a reentrant lock (ReentrantLock) across suspension in Kotlin coroutines

Background

On a recent task I used Kotlin coroutines together with a read-write lock (ReentrantReadWriteLock) to operate on a database.

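  • The database queries are suspend functions.
  • The wrapper layer uses coroutines and a lock to control access to some of the DAO suspend functions.

The code is roughly as follows: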
private val rwLock = ReentrantReadWriteLock(true)

private val dao: DataDao by lazy {
    DatabaseProvider.getInstance(GlobalContext.application).getRecord().dataDao()
}

suspend fun insert(uuid: Long, json: String, status: Int) {
    val t = System.currentTimeMillis()
    try {
        rwLock.writeLock().lock()
        // insertRecord is a suspend function
        dao.insertRecord(
            Record(
                uuid = uuid,
                data = json,
                status = status,
                createdTime = t,
                updatedTime = t,
            ),
        )
    } finally {
        rwLock.writeLock().unlock()
    }
}

Unsurprisingly, under concurrency the following error (a java.lang.IllegalMonitorStateException) shows up:

	at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(Unknown Source)
	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.release(Unknown Source)
	at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(Unknown Source)
	at com.jwslh.onecamera.ExampleUnitTest.loadData(ExampleUnitTest.kt:61)
	at com.jwslh.onecamera.ExampleUnitTest.access$loadData(ExampleUnitTest.kt:23)
	at com.jwslh.onecamera.ExampleUnitTest$loadData$1.invokeSuspend(ExampleUnitTest.kt)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

Reading the error: initial analysis

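  • Where it fails: when the lock is released, the thread calling unlock() does not hold the write lock (from that thread's point of view its hold count is 0), so the release throws.
  • Line of thought: the release fails ===> inside the coroutine, lock and unlock are not running on matching threads ===> coroutines are a framework on top of threads, and the dispatcher decides which thread runs each coroutine task ===> suspending releases the thread ===> on resumption the task is dispatched to a thread again. Could the thread before and after the suspension be different?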
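
To make the "lock and unlock must happen on the same thread" premise concrete, here is a minimal sketch that uses only the JDK and the Kotlin standard library, with no coroutines involved. The function name unlockFromAnotherThread is made up for this illustration. It takes the write lock on the calling thread and then attempts to release it from a second thread.

```kotlin
import java.util.concurrent.locks.ReentrantReadWriteLock
import kotlin.concurrent.thread

// Hypothetical helper: lock on the calling thread, then try to unlock on another thread.
// Returns the exception thrown by unlock() on the other thread, or null if none was thrown.
fun unlockFromAnotherThread(): Throwable? {
    val rwLock = ReentrantReadWriteLock(true)
    var failure: Throwable? = null

    rwLock.writeLock().lock() // the calling thread becomes the owner
    try {
        val other = thread(name = "other-thread") {
            try {
                rwLock.writeLock().unlock() // this thread is not the owner
            } catch (e: IllegalMonitorStateException) {
                failure = e
            }
        }
        other.join() // join() also makes the write to `failure` visible here
    } finally {
        rwLock.writeLock().unlock() // release from the real owner
    }
    return failure
}
```

Calling unlockFromAnotherThread() should return an IllegalMonitorStateException, the same exception type as in the stack trace above. If a coroutine resumes on a different thread between lock() and unlock(), it ends up in the position of "other-thread" here.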

Reproducing the problem and verifying the hypothesis

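My initial guess was that the flexible thread scheduling of coroutines means the lock and the unlock do not run on the same thread, which causes this problem.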

Setup:

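  • A thread pool with 3 threads
  • A read-write lock
  • A suspend function that takes the write lock and calls another suspend function inside the critical section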
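import java.util.concurrent.Executors
import java.util.concurrent.ThreadFactory
import java.util.concurrent.atomic.AtomicInteger
import java.util.concurrent.locks.ReentrantReadWriteLock
import kotlin.coroutines.CoroutineContext
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import org.junit.Test // assumes JUnit 4; with JUnit 5 use org.junit.jupiter.api.Test

class UnitTestClass {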

    private val executor = Executors.newFixedThreadPool(3, object : ThreadFactory {
        private val count = AtomicInteger(0)
        override fun newThread(r: Runnable): Thread {
            return Thread(r, "thread-${count.getAndIncrement()}")
        }
    })

    private val rwLock = ReentrantReadWriteLock(true)

    @Test
    fun test() {
        AppCoroutineScope.launch(Dispatchers.Default) {
            val count = AtomicInteger(0)
            repeat(3) {
                launch(executor.asCoroutineDispatcher()) {
                    loadData(count.getAndIncrement())
                }
                delay(1000)
            }
        }
        Thread.sleep(30_000L) // wait for the coroutines to finish
    }

    private suspend fun loadData(index: Int) {
        try {
            Thread.sleep(1000) // simulate a time-consuming operation
            println("${System.currentTimeMillis()} ===> loadData($index) start, thread: ${Thread.currentThread().name}")
            rwLock.writeLock().lock()
            println("${System.currentTimeMillis()} ===> loadData($index) locked, thread: ${Thread.currentThread().name}")
            delay(3000) // suspending call
        } finally {
            println("${System.currentTimeMillis()} ===> loadData($index) end, thread: ${Thread.currentThread().name}")
            rwLock.writeLock().unlock()
            println("${System.currentTimeMillis()} ===> loadData($index) unlocked, thread: ${Thread.currentThread().name}")
        }
    }
}

object AppCoroutineScope: CoroutineScope {
    override val coroutineContext: CoroutineContext
        get() = Dispatchers.Default + SupervisorJob() + CoroutineExceptionHandler { _, t ->
            println("======================================> ${t.stackTraceToString()}")
        }
}

Output:

1736174730828 ===> loadData(0) start, thread: thread-0 @coroutine#2
1736174730839 ===> loadData(0) locked, thread: thread-0 @coroutine#2
1736174731844 ===> loadData(1) start, thread: thread-1 @coroutine#3
1736174732854 ===> loadData(2) start, thread: thread-2 @coroutine#4
1736174733846 ===> loadData(0) end, thread: thread-0 @coroutine#2
1736174733846 ===> loadData(1) locked, thread: thread-1 @coroutine#3
1736174733846 ===> loadData(0) unlocked, thread: thread-0 @coroutine#2
1736174736862 ===> loadData(1) end, thread: thread-1 @coroutine#3
1736174736862 ===> loadData(1) unlocked, thread: thread-1 @coroutine#3
1736174736862 ===> loadData(2) locked, thread: thread-2 @coroutine#4  // 1
1736174739871 ===> loadData(2) end, thread: thread-0 @coroutine#4  // 2
======================================> java.lang.IllegalMonitorStateException
	at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(Unknown Source)
	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.release(Unknown Source)
	at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(Unknown Source)
	at com.jwslh.onecamera.ExampleUnitTest.loadData(ExampleUnitTest.kt:61)
	at com.jwslh.onecamera.ExampleUnitTest.access$loadData(ExampleUnitTest.kt:23)
	at com.jwslh.onecamera.ExampleUnitTest$loadData$1.invokeSuspend(ExampleUnitTest.kt)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

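In the output above, the lock is acquired at 1 (on thread-2) and is about to be released at 2 (on thread-0). The thread really did change across the suspension, and that is why the release fails.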

Summary

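A Lock is owned at the thread level, while a coroutine is a finer-grained unit of work that runs on top of threads. Seen that way, thread locks are clearly a poor fit for coroutines, a framework that schedules work across threads. This applies specifically to code that suspends while holding the lock; without a suspension point there is no automatic thread switch.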
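
One possible alternative, shown here only as a sketch and not as the fix used in the original project, is the coroutine-aware Mutex from kotlinx.coroutines. It is not tied to the thread that acquired it, so the critical section may suspend and resume on another thread. The names loadDataWithMutex and runMutexDemo and the holdMillis parameter are made up for this illustration.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

val mutex = Mutex()

// Counterpart of loadData() from the test above, guarded by a coroutine Mutex.
suspend fun loadDataWithMutex(index: Int, holdMillis: Long = 3000) {
    mutex.withLock { // suspends (instead of blocking a thread) while waiting for the lock
        println("loadData($index) locked, thread: ${Thread.currentThread().name}")
        delay(holdMillis) // may resume on a different thread; unlocking still works
        println("loadData($index) about to unlock, thread: ${Thread.currentThread().name}")
    }
}

// Hypothetical driver: three concurrent callers, similar to the test above.
fun runMutexDemo(holdMillis: Long = 3000): Unit = runBlocking {
    repeat(3) { index ->
        launch(Dispatchers.Default) { loadDataWithMutex(index, holdMillis) }
    }
}
```

Calling runMutexDemo() should print three locked/unlock pairs without any exception, even when the thread name changes inside a pair. Two caveats: Mutex is not reentrant, and it offers only exclusive locking, so it is not a drop-in replacement for the read side of ReentrantReadWriteLock.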

Key points:

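  • Coroutines are a framework built on top of threads.
  • A suspending coroutine releases the thread it was running on.
  • The thread on which a suspend function resumes is whichever thread is free in the dispatcher (thread pool); it is not necessarily the original thread.
  • Lock: ownership is tied to the thread that acquired it.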
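
The second point (suspension releases the thread) can be observed with a small sketch, assuming kotlinx.coroutines is on the classpath. The function name suspensionFreesThread is made up for this illustration. Two coroutines share a single-threaded dispatcher, and the second one gets to run while the first is suspended in delay().

```kotlin
import java.util.Collections
import java.util.concurrent.Executors
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical demo: coroutines A and B share ONE thread; B runs while A is suspended.
fun suspensionFreesThread(): List<String> {
    val events = Collections.synchronizedList(mutableListOf<String>())
    val dispatcher = Executors.newSingleThreadExecutor().asCoroutineDispatcher()
    try {
        runBlocking {
            launch(dispatcher) {
                events.add("A start")
                delay(200) // suspends A and hands the thread back to the dispatcher
                events.add("A resumed")
            }
            launch(dispatcher) {
                events.add("B ran") // runs on the same thread while A is suspended
            }
        }
    } finally {
        dispatcher.close() // shut the executor down so the JVM can exit
    }
    return events.toList()
}
```

In the returned list, "B ran" should sit between "A start" and "A resumed". With more than one thread in the pool, the same hand-back is what allows A to resume on a different thread than the one it started on.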

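When time permits, I plan to follow up with a brief analysis of the coroutine source code and of Lock (AQS), which should make the mechanism behind all this clearer.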