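Background
In a recent task I used Kotlin coroutines together with a read-write lock (ReentrantReadWriteLock) to operate on a database.
- The database queries are suspend functions.
- The wrapper layer combines coroutines with the lock to control access to some of the DAO's suspend functions. The code looks roughly like this: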
private val rwLock = ReentrantReadWriteLock(true)
private val dao: DataDao by lazy {
    DatabaseProvider.getInstance(GlobalContext.application).getRecord().dataDao()
}

suspend fun insert(uuid: Long, json: String, status: Int) {
    val t = System.currentTimeMillis()
    try {
        rwLock.writeLock().lock()
        // insertRecord is a suspend function, so the coroutine may suspend here while holding the lock
        dao.insertRecord(
            Record(
                uuid = uuid,
                data = json,
                status = status,
                createdTime = t,
                updatedTime = t,
            ),
        )
    } finally {
        rwLock.writeLock().unlock()
    }
}
Unsurprisingly, under concurrency this fails with an IllegalMonitorStateException and the following stack trace:
at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(Unknown Source)
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.release(Unknown Source)
at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(Unknown Source)
at com.jwslh.onecamera.ExampleUnitTest.loadData(ExampleUnitTest.kt:61)
at com.jwslh.onecamera.ExampleUnitTest.access$loadData(ExampleUnitTest.kt:23)
at com.jwslh.onecamera.ExampleUnitTest$loadData$1.invokeSuspend(ExampleUnitTest.kt)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
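Reading the error: initial analysis
- Failure point: when the lock is released, the releasing thread does not hold the write lock (from its point of view the hold count is not greater than 0), so tryRelease throws.
- Reasoning: the release fails ===> lock and unlock were not paired on the same thread inside the coroutine ===> coroutines are a framework on top of threads, and the dispatcher schedules coroutine tasks as conditions allow ===> suspending frees the thread ===> on resumption the task is dispatched to a thread again, so might the thread differ before and after the suspension?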
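The failure point can be isolated without coroutines. The following is a minimal sketch that uses only the JDK and the Kotlin standard library; the function name `unlockFromOtherThread` and the thread names are made up for illustration. One thread takes the write lock, and a different thread then tries to release it.

```kotlin
import java.util.concurrent.locks.ReentrantReadWriteLock
import kotlin.concurrent.thread

// Hypothetical helper: returns the exception raised by the mismatched unlock, or null.
fun unlockFromOtherThread(): Throwable? {
    val rwLock = ReentrantReadWriteLock(true)
    var failure: Throwable? = null

    // Thread "locker" acquires the write lock and exits without releasing it.
    thread(name = "locker") { rwLock.writeLock().lock() }.join()

    // Thread "unlocker" does not own the write lock, so unlock() is rejected.
    thread(name = "unlocker") {
        try {
            rwLock.writeLock().unlock()
        } catch (e: IllegalMonitorStateException) {
            failure = e
        }
    }.join()

    return failure
}

fun main() {
    println(unlockFromOtherThread() is IllegalMonitorStateException)
}
```

Calling `unlock()` from a thread that does not own the write lock throws `IllegalMonitorStateException`, the same exception type that appears in the coroutine case.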
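Reproducing the problem and verifying the hypothesis
My initial guess was that the coroutine's flexible thread scheduling means the lock and unlock calls do not run on the same thread, which causes the error.
Setup:
- A thread pool with 3 threads
- A read-write lock
- A suspend function that takes the write lock and calls another suspend function inside the critical section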
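import java.util.concurrent.Executors
import java.util.concurrent.ThreadFactory
import java.util.concurrent.atomic.AtomicInteger
import java.util.concurrent.locks.ReentrantReadWriteLock
import kotlin.coroutines.CoroutineContext
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import org.junit.Test // assumption: JUnit 4; adjust if the project uses JUnit 5

class UnitTestClass {
    // Fixed pool of 3 named threads so the log shows which thread runs each step
    private val executor = Executors.newFixedThreadPool(3, object : ThreadFactory {
        private val count = AtomicInteger(0)
        override fun newThread(r: Runnable): Thread {
            return Thread(r, "thread-${count.getAndIncrement()}")
        }
    })
    private val rwLock = ReentrantReadWriteLock(true)

    @Test
    fun test() {
        AppCoroutineScope.launch(Dispatchers.Default) {
            val count = AtomicInteger(0)
            repeat(3) {
                launch(executor.asCoroutineDispatcher()) {
                    loadData(count.getAndIncrement())
                }
                delay(1000)
            }
        }
        Thread.sleep(30_000L) // wait for the coroutines to finish
    }

    private suspend fun loadData(index: Int) {
        try {
            Thread.sleep(1000) // simulate a time-consuming operation
            println("${System.currentTimeMillis()} ===> loadData($index) start, thread: ${Thread.currentThread().name}")
            rwLock.writeLock().lock()
            println("${System.currentTimeMillis()} ===> loadData($index) locked, thread: ${Thread.currentThread().name}")
            delay(3000) // suspend function: the coroutine may resume on a different thread
        } finally {
            println("${System.currentTimeMillis()} ===> loadData($index) end, thread: ${Thread.currentThread().name}")
            rwLock.writeLock().unlock()
            println("${System.currentTimeMillis()} ===> loadData($index) unlocked, thread: ${Thread.currentThread().name}")
        }
    }
}

object AppCoroutineScope : CoroutineScope {
    override val coroutineContext: CoroutineContext
        get() = Dispatchers.Default + SupervisorJob() + CoroutineExceptionHandler { _, t ->
            println("======================================> ${t.stackTraceToString()}")
        }
}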
1736174730828 ===> loadData(0) start, thread: thread-0 @coroutine#2
1736174730839 ===> loadData(0) locked, thread: thread-0 @coroutine#2
1736174731844 ===> loadData(1) start, thread: thread-1 @coroutine#3
1736174732854 ===> loadData(2) start, thread: thread-2 @coroutine#4
1736174733846 ===> loadData(0) end, thread: thread-0 @coroutine#2
1736174733846 ===> loadData(1) locked, thread: thread-1 @coroutine#3
1736174733846 ===> loadData(0) unlocked, thread: thread-0 @coroutine#2
1736174736862 ===> loadData(1) end, thread: thread-1 @coroutine#3
1736174736862 ===> loadData(1) unlocked, thread: thread-1 @coroutine#3
1736174736862 ===> loadData(2) locked, thread: thread-2 @coroutine#4 // 1
1736174739871 ===> loadData(2) end, thread: thread-0 @coroutine#4 // 2
======================================> java.lang.IllegalMonitorStateException
at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(Unknown Source)
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.release(Unknown Source)
at java.base/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(Unknown Source)
at com.jwslh.onecamera.ExampleUnitTest.loadData(ExampleUnitTest.kt:61)
at com.jwslh.onecamera.ExampleUnitTest.access$loadData(ExampleUnitTest.kt:23)
at com.jwslh.onecamera.ExampleUnitTest$loadData$1.invokeSuspend(ExampleUnitTest.kt)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
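In the output above, the lock is acquired at 1 and is about to be released at 2. The thread has indeed changed between the two (thread-2 at 1, thread-0 at 2), which is why the release fails.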
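Summary
A Lock is owned by a thread, while a coroutine is a finer-grained unit of work that runs on top of threads. From that angle, thread-based locks are clearly a poor fit for a thread-scheduling framework like coroutines. This applies specifically to code that suspends while holding the lock; without a suspension point there is no automatic thread switch.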
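One coroutine-aware alternative is `Mutex` from `kotlinx.coroutines.sync`, whose ownership is not tied to a thread. The following is a sketch of the general idea rather than the fix used in the original project; `runGuardedIncrements` and its counter are hypothetical. Note that `Mutex` is not reentrant and offers only exclusive locking, so it is not a drop-in replacement for a read-write lock.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// Hypothetical helper: runs `times` coroutines that each increment a shared counter
// while suspending inside the critical section, then returns the final count.
fun runGuardedIncrements(times: Int): Int = runBlocking {
    val mutex = Mutex()
    var counter = 0

    val jobs = List(times) {
        launch(Dispatchers.Default) {
            // withLock acquires the mutex, runs the block, and releases it in a finally.
            mutex.withLock {
                val snapshot = counter
                delay(10) // suspension point: the coroutine may resume on another thread
                counter = snapshot + 1
            }
        }
    }
    jobs.forEach { it.join() }
    counter
}

fun main() {
    println(runGuardedIncrements(20))
}
```

Because the mutex serializes the increments, `runGuardedIncrements(20)` returns 20 even though each coroutine suspends while holding the lock and may resume on a different thread.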
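Key points:
- Coroutines are a scheduling framework built on top of threads
- A suspending coroutine releases the thread it was running on
- The thread a suspend function resumes on is chosen by the dispatcher (thread pool) from whichever threads are free, so it is not necessarily the original one
- Lock: ownership is per thread
When I have time, I plan to follow up with a brief analysis of the coroutine source code and of Lock (AQS), which should make the underlying mechanism clearer.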