
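Preface

Picking up where the last installment left off: last time we covered Coil's memory cache (MemoryCache), so today let's finish the job with the disk cache (DiskCache)!

Usage

As a recap of the cache usage described in the previous article, here is the configuration once more.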

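imageView.load(
    url,
    // Coil provides an extension property on Context that returns the global singleton ImageLoader.
    context.imageLoader
    // newBuilder returns a builder based on this ImageLoader
    .newBuilder()
    // Configure the memory cache
    .memoryCache {
        MemoryCache.Builder(context)
        // Cap the size at 25% of the currently available memory
        .maxSizePercent(0.25)
        // Enable or disable weak references to cached resources
        .weakReferencesEnabled(true)
        // Enable or disable strong references to cached resources
        .strongReferencesEnabled(true)
        // Cap the size at 1 MB
        .maxSizeBytes(1024 * 1024)
        // Build the memory cache
        .build()
    }
    // Configure the disk cache
    .diskCache {
        DiskCache.Builder()
        // Directory of the disk cache. There is no default, so this is required.
        .directory(context.cacheDir.resolve("coil_cache"))
        // Cap the size at 2% of the currently available disk space
        .maxSizePercent(0.02)
        // Cap the size at 10 MB
        .maxSizeBytes(1024 * 1024 * 10)
        // Coroutine dispatcher on which the cleanup logic runs
        .cleanupDispatcher(Dispatchers.IO)
        // Upper bound for the disk cache size. Ignored if maxSizeBytes is set.
        .maximumMaxSizeBytes(...)
        // Lower bound for the disk cache size. Ignored if maxSizeBytes is set.
        .minimumMaxSizeBytes(...)
        // The Okio FileSystem to use
        .fileSystem(...)
        // Build the disk cache
        .build()
    }.build() // Build the ImageLoader
)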

DiskCache

Given Coil's overall design style, it is not hard to guess that DiskCache is an interface as well. Without further ado, here is the class diagram!

classDiagram
class DiskCache{
	<<interface>>
	+Long size
	+Long maxSize
	+Path directory
	+FileSystem fileSystem
	+get(key:String)
	+edit(key:String)
	+remove(key:String)
	+clear()
}

class RealDiskCache

DiskCache <|.. RealDiskCache

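The most important methods are:

get: returns the Snapshot for a given key.
edit: returns the Editor for the data stored under a given key.

Snapshot and Editor are both interfaces defined inside DiskCache; we will come back to them later.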

RealDiskCache

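The disk cache has no complicated hierarchy: RealDiskCache is the only class that implements the DiskCache interface. RealDiskCache itself is merely a proxy for the class that does the real work, the core of the whole disk cache: DiskLruCache.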

DiskLruCache

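DiskLruCache is an off-the-shelf design too. Although it was never folded into the Android SDK itself, it is reportedly an approach officially endorsed and recommended by Google.

The overall idea is no different from the ordinary DiskLruCache, except that the I/O is reimplemented on top of Okio. If you already know DiskLruCache, feel free to close this article now, because there is little new here. People read technical articles to learn something new, after all, so I won't hold it against you.

This article analyzes the class from three angles.

First, where the data lives.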

private val lruEntries = LinkedHashMap<String, Entry>(0, 0.75f, true)

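As the code shows, the in-memory carrier is a LinkedHashMap: the String is the key of a piece of data, and the Entry points to that data. The third constructor argument (true) puts the map in access order, so iteration runs from the least recently used entry to the most recently used one. Entry is an inner class of DiskLruCache whose job is to hold the paths (Okio Path objects) of the dirty and clean files of a cached item.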
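To make the role of Entry more concrete, here is a minimal sketch of what such a class could look like. It is not Coil's actual code: the name EntrySketch, the use of java.nio.file.Path in place of Okio's Path, and the "key.index" / "key.index.tmp" file names are all assumptions made for illustration.

```kotlin
import java.nio.file.Path

// Hypothetical stand-in for DiskLruCache.Entry; the file naming is an assumption.
class EntrySketch(val key: String, directory: Path, valueCount: Int) {
    // Length in bytes of each clean file, filled in once a write is committed.
    val lengths = LongArray(valueCount)

    // Files that readers are allowed to see.
    val cleanFiles: List<Path> = List(valueCount) { i -> directory.resolve("$key.$i") }

    // Files that an in-progress edit writes to.
    val dirtyFiles: List<Path> = List(valueCount) { i -> directory.resolve("$key.$i.tmp") }
}
```

The point of the sketch is only that an entry stores paths and lengths, never the image bytes themselves.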

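But since this is a disk cache, what is the carrier at the disk level? Anything stored on disk is necessarily a file, and since Coil is an image loading library, those files are image files. True enough, but the heart of DiskLruCache is that, besides the cached image files themselves, it also maintains a journal file. The journal records every operation, somewhat like transaction management in a database. The operations on a cache file are:

DIRTY: dirty data; the file is being written (created or modified).
CLEAN: clean data; the file has been fully written (it is now cached).
REMOVE: a removal; the file was removed (either a clean entry was deleted or a dirty write failed).
READ: a read.

Note that a DIRTY operation either succeeds or fails, so the next record for the same key can only be CLEAN (success) or REMOVE (failure). If nothing follows a DIRTY record, the write was interrupted midway (by a crash, for example), and the data is invalid as well.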
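The rule that a DIRTY record must be resolved by CLEAN or REMOVE can be sketched as a small replay function. This is a simplified model, not Coil's parser: readableKeysAfterReplay and the "OP key" line format used here are assumptions for illustration.

```kotlin
// Hypothetical, simplified replay of journal records; not Coil's actual parser.
fun readableKeysAfterReplay(journal: List<String>): Set<String> {
    val readable = LinkedHashSet<String>() // keys whose last completed write succeeded
    val beingWritten = HashSet<String>()   // keys with an unresolved DIRTY record
    for (line in journal) {
        val parts = line.split(' ')
        val key = parts[1]
        when (parts[0]) {
            "DIRTY" -> beingWritten += key
            "CLEAN" -> { beingWritten -= key; readable += key }
            "REMOVE" -> { beingWritten -= key; readable -= key }
            "READ" -> Unit // reads do not change validity
        }
    }
    // A key that is still DIRTY was interrupted mid-write, so it is discarded.
    return readable - beingWritten
}
```

For a journal such as "DIRTY a", "CLEAN a 120", "DIRTY b", "REMOVE b", "DIRTY c", only a survives: b was removed and c never finished its write.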

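Now that the in-memory carrier and the on-disk carrier are both clear, the next step is to see how the two are tied together, which is the initialization of the disk cache. Here is DiskLruCache's initialization code: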

fun initialize() {
    if (initialized) return

    // omitted...

    if (fileSystem.exists(journalFile)) {
        try {
            readJournal()
            processJournal()
            initialized = true
            return
        } catch (_: IOException) {
            // The journal is corrupt; fall through and delete the cache.
        }

        try {
            delete()
        } finally {
            closed = false
        }
    }

    writeJournal()
    initialized = true
}

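First it checks whether initialization has already happened. DiskLruCache is initialized lazily, only when a cache operation is actually needed, so every cache operation calls initialize first.

Next, if the journal file exists, it is read (readJournal) and processed (processJournal).

Let's start with readJournal. Its core is a loop that reads the journal line by line (readJournalLine):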

private fun readJournalLine(line: String) {
    val firstSpace = line.indexOf(' ')
    if (firstSpace == -1) throw IOException("unexpected journal line: $line")

    val keyBegin = firstSpace + 1
    val secondSpace = line.indexOf(' ', keyBegin)
    val key: String
    if (secondSpace == -1) {
        key = line.substring(keyBegin)
        if (firstSpace == REMOVE.length && line.startsWith(REMOVE)) {
            // REMOVE record
            lruEntries.remove(key)
            return
        }
    } else {
        key = line.substring(keyBegin, secondSpace)
    }

    val entry = lruEntries.getOrPut(key) { Entry(key) }
    when {
        secondSpace != -1 && firstSpace == CLEAN.length && line.startsWith(CLEAN) -> {
            // CLEAN record
            val parts = line.substring(secondSpace + 1).split(' ')
            entry.readable = true
            entry.currentEditor = null
            entry.setLengths(parts)
        }
        secondSpace == -1 && firstSpace == DIRTY.length && line.startsWith(DIRTY) -> {
            // DIRTY record
            entry.currentEditor = Editor(entry)
        }
        secondSpace == -1 && firstSpace == READ.length && line.startsWith(READ) -> {
            // READ record
        }
        else -> throw IOException("unexpected journal line: $line")
    }
}

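As the commented positions in the code show, each kind of record is handled differently:

REMOVE: remove the Entry for that key from the map.
CLEAN: build and fill in the Entry.
DIRTY: create an Editor for the Entry.
READ: no explicit handling. (The getOrPut lookup before the when has already touched the entry, which in an access-ordered map moves it to the most recently used end.)

Next is processJournal: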

private fun processJournal() {
    var size = 0L
    val iterator = lruEntries.values.iterator()
    while (iterator.hasNext()) {
        val entry = iterator.next()
        if (entry.currentEditor == null) {
            // Clean data
            for (i in 0 until valueCount) {
                size += entry.lengths[i]
            }
        } else {
            // Dirty data
            entry.currentEditor = null
            for (i in 0 until valueCount) {
                fileSystem.delete(entry.cleanFiles[i])
                fileSystem.delete(entry.dirtyFiles[i])
            }
            iterator.remove()
        }
    }
    this.size = size
}

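As you can see, processJournal walks the map and handles clean and dirty data separately:

Clean data: add its length to the running total size.
Dirty data: delete all of the entry's cache files and remove the Entry.

Looking back at initialize: once readJournal and processJournal have finished, initialization is complete and the method returns.

If reading fails along the way, the journal file is probably corrupt. The IOException is swallowed and execution falls through to delete; any error thrown by delete itself is allowed to propagate.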

When the journal file does not exist, or has just been deleted because it was corrupt, writeJournal runs to create a fresh one:

private fun writeJournal() {
    journalWriter?.close()

    // Write the new journal to a temporary file first
    fileSystem.write(journalFileTmp) {
        // Journal header; not important here
        writeUtf8(MAGIC).writeByte('\n'.code)
        writeUtf8(VERSION).writeByte('\n'.code)
        writeDecimalLong(appVersion.toLong()).writeByte('\n'.code)
        writeDecimalLong(valueCount.toLong()).writeByte('\n'.code)
        writeByte('\n'.code)

        // Iterate over the entries in the map
        for (entry in lruEntries.values) {
            if (entry.currentEditor != null) {
                // Dirty data
                writeUtf8(DIRTY)
                writeByte(' '.code)
                writeUtf8(entry.key)
                writeByte('\n'.code)
            } else {
                // Clean data
                writeUtf8(CLEAN)
                writeByte(' '.code)
                writeUtf8(entry.key)
                entry.writeLengths(this)
                writeByte('\n'.code)
            }
        }
    }

    if (fileSystem.exists(journalFile)) {
        // An old journal exists: swap in the new one and delete the old one
        fileSystem.atomicMove(journalFile, journalFileBackup)
        fileSystem.atomicMove(journalFileTmp, journalFile)
        fileSystem.delete(journalFileBackup)
    } else {
        // No old journal: move the new journal straight into place
        fileSystem.atomicMove(journalFileTmp, journalFile)
    }

    // Create a new writer and reset some state
    journalWriter = newJournalWriter()
    operationsSinceRewrite = 0
    hasJournalErrors = false
    mostRecentRebuildFailed = false
}

That wraps up initialization. Here is a flowchart summary:

graph TD
initialize --> |journal exists| a(readJournalLine)
subgraph readJournal
a --> |iterate| a
end
a --> processJournal --> End
initialize --> |journal missing or corrupt| writeJournal --> End

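Next, the cache operations.

Operating on data boils down to create, read, update and delete. In DiskLruCache both create and update go through edit, so only three methods operate on the cache: edit, get and remove.

Let's start with get.

As the name suggests, get reads data from the cache.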

operator fun get(key: String): Snapshot? {
    // Check that the cache is not closed and the key is valid, then make sure the cache is initialized.
    checkNotClosed()
    validateKey(key)
    initialize()

    // Build the Snapshot
    val snapshot = lruEntries[key]?.snapshot() ?: return null

    // Operation count +1
    operationsSinceRewrite++
    // Append a record to the journal
    journalWriter!!.apply {
        writeUtf8(READ)
        writeByte(' '.code)
        writeUtf8(key)
        writeByte('\n'.code)
    }

    if (journalRewriteRequired()) {
        // Run the cleanup
        launchCleanup()
    }

    // Return the Snapshot
    return snapshot
}

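The code is simple: the key points are building the Snapshot and writing the operation to the journal. A Snapshot is not the data itself; it too only carries the paths of the cache files. Coil wraps it in a SourceResult and performs the actual file read in the decode step.

Next, remove.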

fun remove(key: String): Boolean {
    // Same checks as in get
    checkNotClosed()
    validateKey(key)
    initialize()

    // Look up the Entry to delete
    val entry = lruEntries[key] ?: return false
    // Delete it
    val removed = removeEntry(entry)
    if (removed && size <= maxSize) mostRecentTrimFailed = false
    return removed
}

private fun removeEntry(entry: Entry): Boolean {

    if (entry.lockingSnapshotCount > 0) {
        // The Entry still has unreleased Snapshots, i.e. it is in use: write a DIRTY record
        journalWriter?.apply {
            writeUtf8(DIRTY)
            writeByte(' '.code)
            writeUtf8(entry.key)
            writeByte('\n'.code)
            flush()
        }
    }
    if (entry.lockingSnapshotCount > 0 || entry.currentEditor != null) {
        // Mark it as a zombie and return.
        entry.zombie = true
        return true
    }

    // Detach the Entry's current Editor
    entry.currentEditor?.detach()

    for (i in 0 until valueCount) {
        // Delete the cache file and update size
        fileSystem.delete(entry.cleanFiles[i])
        size -= entry.lengths[i]
        entry.lengths[i] = 0
    }

    // Operation count +1
    operationsSinceRewrite++
    // Append a REMOVE record to the journal
    journalWriter?.apply {
        writeUtf8(REMOVE)
        writeByte(' '.code)
        writeUtf8(entry.key)
        writeByte('\n'.code)
    }
    // Remove the Entry from the map
    lruEntries.remove(entry.key)

    if (journalRewriteRequired()) {
        // Same cleanup as in get
        launchCleanup()
    }

    return true
}

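The remove code is not hard either; only one spot needs explaining. When the Entry still has unreleased Snapshots or its Editor is non-null (it is being edited), the removal is not actually carried out; instead the Entry's zombie flag is set to true. The flag means that the Entry should be deleted, just not right now while it is in use; once it is no longer in use, it gets deleted. I have to say, zombie is a fun and fitting variable name.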
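The deferred deletion behind the zombie flag can be shown in miniature. The sketch below is a toy model under stated assumptions: TinyCache, open and close are hypothetical names, it tracks only open readers (not editors), and it keeps no files or journal.

```kotlin
// Hypothetical miniature of the zombie idea; names are illustrative, not Coil's.
class TinyCache {
    class Entry(val key: String) {
        var openSnapshots = 0 // readers currently holding this entry
        var zombie = false    // true = delete as soon as the last reader leaves
    }

    private val entries = LinkedHashMap<String, Entry>()

    fun put(key: String) { entries[key] = Entry(key) }

    fun contains(key: String): Boolean = entries.containsKey(key)

    // Opens a "snapshot"; a zombie entry can no longer be opened.
    fun open(key: String): Entry? =
        entries[key]?.takeIf { !it.zombie }?.also { it.openSnapshots++ }

    // Called when a reader is done; performs the deferred removal if needed.
    fun close(entry: Entry) {
        entry.openSnapshots--
        if (entry.openSnapshots == 0 && entry.zombie) entries.remove(entry.key)
    }

    // Removal is deferred while the entry is still in use.
    fun remove(key: String): Boolean {
        val entry = entries[key] ?: return false
        if (entry.openSnapshots > 0) {
            entry.zombie = true
            return true
        }
        entries.remove(key)
        return true
    }
}
```

As in the real removeEntry, remove reports success even when the deletion is only scheduled; the entry actually disappears when the last reader closes it.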

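Finally, the more complex one: edit.

DiskLruCache's edit method itself is not complicated, so I won't expand on it here: it simply builds an Editor and returns it, and all the real work is done through the Editor. Let's look at the Editor code first: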

inner class Editor(val entry: Entry) {

    private var closed = false

    val written = BooleanArray(valueCount)

    fun file(index: Int): Path {
        synchronized(this@DiskLruCache) {
            check(!closed) { "editor is closed" }
            written[index] = true
            return entry.dirtyFiles[index].also(fileSystem::createFile)
        }
    }

    fun detach() {
        if (entry.currentEditor == this) {
            entry.zombie = true
        }
    }

    fun commit() = complete(true)

    fun commitAndGet(): Snapshot? {
        synchronized(this@DiskLruCache) {
            commit()
            return get(entry.key)
        }
    }

    fun abort() = complete(false)

    private fun complete(success: Boolean) {
        synchronized(this@DiskLruCache) {
            check(!closed) { "editor is closed" }
            if (entry.currentEditor == this) {
                completeEdit(this, success)
            }
            closed = true
        }
    }
}

First, let's get the flow for modifying a cache file straight:

graph LR
a(Editor.file) --> b(edit the file) --> c(Editor.commit)

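The file method returns the file's path (again an Okio Path), then you edit the file (create or modify it), and finally you call commit to submit the change.

The key is commit, so let's analyze it in detail. Its call chain is: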

graph LR
Editor.commit --> Editor.complete --> DiskLruCache.completeEdit

Let's go straight to the code of DiskLruCache's completeEdit method:

private fun completeEdit(editor: Editor, success: Boolean) {
    val entry = editor.entry
    check(entry.currentEditor == editor)

    if (success && !entry.zombie) {
        // The edit succeeded and the Entry is not marked for deletion
        for (i in 0 until valueCount) {
            if (editor.written[i] && !fileSystem.exists(entry.dirtyFiles[i])) {
                // Check that each written file really exists; if not, treat the edit as failed and abort.
                editor.abort()
                return
            }
        }

        for (i in 0 until valueCount) {
            val dirty = entry.dirtyFiles[i]
            val clean = entry.cleanFiles[i]
            if (fileSystem.exists(dirty)) {
                // Promote the edited dirty file to a clean file
                fileSystem.atomicMove(dirty, clean)
            } else {
                fileSystem.createFile(entry.cleanFiles[i])
            }
            // Update size
            val oldLength = entry.lengths[i]
            val newLength = fileSystem.metadata(clean).size ?: 0
            entry.lengths[i] = newLength
            size = size - oldLength + newLength
        }
    } else {
        // The edit failed or the Entry is a zombie
        for (i in 0 until valueCount) {
            // Delete all edited dirty files
            fileSystem.delete(entry.dirtyFiles[i])
        }
    }

    entry.currentEditor = null
    if (entry.zombie) {
        // The Entry is a zombie, so remove it
        removeEntry(entry)
        return
    }

    // Operation count +1
    operationsSinceRewrite++
    // Write to the journal
    journalWriter!!.apply {
        if (success || entry.readable) {
            entry.readable = true
            writeUtf8(CLEAN)
            writeByte(' '.code)
            writeUtf8(entry.key)
            entry.writeLengths(this)
            writeByte('\n'.code)
        } else {
            lruEntries.remove(entry.key)
            writeUtf8(REMOVE)
            writeByte(' '.code)
            writeUtf8(entry.key)
            writeByte('\n'.code)
        }
        flush()
    }

    if (size > maxSize || journalRewriteRequired()) {
        // Run the cleanup
        launchCleanup()
    }
}

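The logic itself is not hard, but we need to be clear once more about what dirty data and clean data are.

Clean data has been "washed" by DiskLruCache and can be read correctly from outside; logically, it is read-only to the outside world.

Dirty data, by contrast, can be both read and written.

That is why, although both access cache files, DiskLruCache's get reads from clean while Editor.file hands out dirty. The benefit is that even a botched write (say, the app crashes halfway) cannot pollute anyone else's reads of that cache entry, because the bad write has not been "washed clean" yet, which means it was never actually committed to the cache.

When dirty data is committed to DiskLruCache, it gets washed clean and stored as clean; a subsequent get then returns the edited data.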
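The dirty/clean split can be reproduced with two plain files per key. The following is a minimal sketch rather than Coil's implementation: TwoPhaseStore and its methods are hypothetical, it uses java.nio.file instead of Okio, and it has no journal or locking.

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption.ATOMIC_MOVE

// Hypothetical two-file scheme: writers touch "<key>.tmp", readers only see "<key>".
class TwoPhaseStore(private val dir: Path) {
    private fun clean(key: String): Path = dir.resolve(key)
    private fun dirty(key: String): Path = dir.resolve("$key.tmp")

    // Readers only ever look at the clean file.
    fun read(key: String): String? =
        clean(key).takeIf { Files.exists(it) }?.let { String(Files.readAllBytes(it)) }

    // Writers put new content into the dirty file; nothing is visible to readers yet.
    fun write(key: String, content: String) {
        Files.write(dirty(key), content.toByteArray())
    }

    // Commit publishes the dirty file by renaming it to the clean name.
    // Note: whether ATOMIC_MOVE replaces an existing target is platform specific.
    fun commit(key: String) {
        Files.move(dirty(key), clean(key), ATOMIC_MOVE)
    }

    // Abort discards the dirty file; the clean file is untouched.
    fun abort(key: String) {
        Files.deleteIfExists(dirty(key))
    }
}
```

A reader that calls read between write and commit still gets the previous content (or null for a new key), which is exactly the isolation described above.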

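Finally, cleanup.

Each of the methods above may end by calling launchCleanup. Since only edit can increase the cache's footprint, the maxSize check appears only in completeEdit. That check is easy to understand: once the configured maxSize is exceeded, the cache needs trimming. launchCleanup has a second trigger too, which depends on the return value of journalRewriteRequired.

As the name suggests, journalRewriteRequired decides whether the journal file needs to be rewritten. The journal records every single operation without exception, so without a self-cleaning mechanism it would keep growing.

Here is the code of journalRewriteRequired: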

private fun journalRewriteRequired() = operationsSinceRewrite >= 2000

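Once the operation count reaches 2000, the journal needs to be rewritten. Attentive readers will have noticed that each of the cache operations discussed above contains an operation count +1 line: edit, get and remove all count. The rewrite itself is done by writeJournal, analyzed above, which rebuilds the journal purely from the in-memory map.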
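Why a rewrite shrinks the journal can be seen in a small compaction sketch. It is a simplification under stated assumptions: compactJournal is a hypothetical function, it works on a list of "OP key" lines rather than a file, and unlike writeJournal it drops in-progress DIRTY records instead of preserving them.

```kotlin
// Hypothetical compaction: collapse an append-only log to one record per live key.
fun compactJournal(log: List<String>): List<String> {
    val live = LinkedHashMap<String, String>() // key -> latest CLEAN record
    for (line in log) {
        val parts = line.split(' ')
        val key = parts[1]
        when (parts[0]) {
            "CLEAN" -> live[key] = line
            "REMOVE" -> live.remove(key)
            // DIRTY and READ records carry no state worth keeping in this sketch.
        }
    }
    return live.values.toList()
}
```

However many READ records pile up, the compacted journal holds at most one line per entry that is still cached.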

Now let's look at launchCleanup:

private fun launchCleanup() {
    // Runs in a coroutine
    cleanupScope.launch {
        synchronized(this@DiskLruCache) {
            if (!initialized || closed) return@launch
            try {
                // Shrink the cache to fit maxSize
                trimToSize()
            } catch (_: IOException) {
                mostRecentTrimFailed = true
            }
            try {
                if (journalRewriteRequired()) {
                    // Rewrite the journal
                    writeJournal()
                }
            } catch (_: IOException) {
                mostRecentRebuildFailed = true
                journalWriter = blackholeSink().buffer()
            }
        }
    }
}

private fun trimToSize() {
    while (size > maxSize) {
        // Keep deleting the least recently used Entry until size no longer exceeds maxSize
        if (!removeOldestEntry()) return
    }
    mostRecentTrimFailed = false
}

private fun removeOldestEntry(): Boolean {
    for (toEvict in lruEntries.values) {
        // Remove the first Entry that is not a zombie
        if (!toEvict.zombie) {
            removeEntry(toEvict)
            return true
        }
    }
    return false
}

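This code is simple too, and again only one spot needs a little explanation.

Why does simply iterating and deleting entries amount to removeOldest? This is where the journal file plays its most important role. Journal records are inherently ordered in time: the later a record is written, the newer it is. The journal is also read from front to back, so the entries in the map built during initialization are already ordered from oldest to newest. At runtime, the access-ordered LinkedHashMap keeps that order up to date by moving an entry to the end whenever it is looked up. Iteration starts from the head of the map, so each deletion removes the "oldest" Entry in the map.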
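The eviction order is easy to verify with a small size-bounded map. This is a sketch under assumptions, not Coil's code: SizedLru is a hypothetical class that stores only byte counts, with no files, journal or zombie handling.

```kotlin
// Hypothetical size-bounded LRU: evicts from the head of an access-ordered map.
class SizedLru(private val maxSize: Long) {
    // accessOrder = true, the same setting as lruEntries in DiskLruCache.
    private val sizes = LinkedHashMap<String, Long>(0, 0.75f, true)

    var size = 0L
        private set

    val keys: List<String> get() = sizes.keys.toList()

    fun put(key: String, bytes: Long) {
        // Replace any previous value and adjust the running total.
        size += bytes - (sizes.put(key, bytes) ?: 0L)
        trimToSize()
    }

    // A lookup counts as a use and moves the key to the most recent end.
    fun get(key: String): Long? = sizes[key]

    private fun trimToSize() {
        val iterator = sizes.entries.iterator()
        // The first entry in iteration order is always the least recently used.
        while (size > maxSize && iterator.hasNext()) {
            size -= iterator.next().value
            iterator.remove()
        }
    }
}
```

With a 30-byte limit, putting a, b and c at 10 bytes each, reading a, and then putting d evicts b: it is the entry that has gone unused the longest, even though a was inserted first.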

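Summary

That wraps up Coil's caching. Only after digging in did I realize that, for both the memory cache and the disk cache, the strategies used are actually very common ones. Analyzing the source of Coil's cache implementation filled in a few gaps in my knowledge, and I learned a lot from it.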

Coil Source Code Analysis (4): Disk Cache
