Okio Source Code Analysis (Part 1)


Okio Source Code Analysis

Sink
expect interface Sink : Closeable, Flushable {
  /** Removes byteCount bytes from source and appends them to this sink. */
  @Throws(IOException::class)
  fun write(source: Buffer, byteCount: Long)

  /** Pushes all buffered bytes to their final destination. */
  @Throws(IOException::class)
  fun flush()

  /** Returns the timeout for this sink. */
  fun timeout(): Timeout

  /**
   * Pushes all buffered bytes to their final destination and releases the resources held by
   * this sink.
   */
  @Throws(IOException::class)
  override fun close()
}

Source
interface Source : Closeable {
  /**
   * Removes up to byteCount bytes from this source and appends them to sink. Returns the number
   * of bytes read, or -1 if this source is exhausted.
   */
  @Throws(IOException::class)
  fun read(sink: Buffer, byteCount: Long): Long

  /** Returns the timeout for this source. */
  fun timeout(): Timeout

  /** Closes this source and releases any resources it holds. */
  @Throws(IOException::class)
  override fun close()
}

These are the two most fundamental interfaces in Okio. They correspond to Java's InputStream and OutputStream, i.e. the input and output streams: Source is the input stream and Sink is the output stream.
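
To make the correspondence concrete, here is a minimal sketch that copies a file by pulling bytes from a Source and pushing them into a Sink. It assumes the Kotlin extension functions source(), sink(), and buffer() from the okio package (available since Okio 2.x); copyFile is just an illustrative name:

import okio.buffer
import okio.sink
import okio.source
import java.io.File

fun copyFile(from: File, to: File) {
  from.source().use { source ->        // Source: the input side
    to.sink().buffer().use { sink ->   // Sink: the output side, wrapped in a buffer
      sink.writeAll(source)            // drain the source into the sink
    }
  }
}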

BufferedSink
expect sealed interface BufferedSink : Sink {
  /** This sink's internal buffer. */
  val buffer: Buffer

  fun write(byteString: ByteString): BufferedSink

  fun write(byteString: ByteString, offset: Int, byteCount: Int): BufferedSink

  /** Writes a byte array to this sink. */
  fun write(source: ByteArray): BufferedSink

  fun write(source: ByteArray, offset: Int, byteCount: Int): BufferedSink
  /**
   * Removes all bytes from source and appends them to this sink. Returns the number of bytes
   * read, or 0 if source is exhausted.
   */
  fun writeAll(source: Source): Long
  fun write(source: Source, byteCount: Long): BufferedSink
  fun writeUtf8(string: String): BufferedSink
  fun writeUtf8(string: String, beginIndex: Int, endIndex: Int): BufferedSink
  fun writeUtf8CodePoint(codePoint: Int): BufferedSink
  fun writeByte(b: Int): BufferedSink
  fun writeShort(s: Int): BufferedSink
  fun writeShortLe(s: Int): BufferedSink
  fun writeInt(i: Int): BufferedSink
  fun writeIntLe(i: Int): BufferedSink
  fun writeLong(v: Long): BufferedSink
  fun writeLongLe(v: Long): BufferedSink
  fun writeDecimalLong(v: Long): BufferedSink
  fun writeHexadecimalUnsignedLong(v: Long): BufferedSink
  override fun flush()
  fun emit(): BufferedSink
  fun emitCompleteSegments(): BufferedSink
}

BufferedSink extends Sink but adds a large number of write methods, covering almost every data type, so nearly anything can be written with a single call.
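
For example, a quick sketch of chained writes (the file parameter and the values written are made up for illustration):

import okio.buffer
import okio.sink
import java.io.File

fun writeDemo(file: File) {
  file.sink().buffer().use { sink ->
    sink.writeUtf8("status=")          // UTF-8 text
      .writeDecimalLong(200L)          // "200" as decimal characters
      .writeByte('\n'.code)            // a single byte
    sink.writeInt(0xCAFEBABE.toInt())  // 4 bytes, big-endian
    sink.writeIntLe(1)                 // 4 bytes, little-endian
  }
}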

BufferedSource
expect sealed interface BufferedSource : Source {
  val buffer: Buffer
  fun exhausted(): Boolean
  fun require(byteCount: Long)
  fun request(byteCount: Long): Boolean
  fun readByte(): Byte
  fun readShort(): Short
  fun readShortLe(): Short
  fun readInt(): Int
  fun readIntLe(): Int
  fun readLong(): Long
  fun readLongLe(): Long
  fun readDecimalLong(): Long
  fun readHexadecimalUnsignedLong(): Long
  fun skip(byteCount: Long)
  fun readByteString(): ByteString
  fun readByteString(byteCount: Long): ByteString
  fun select(options: Options): Int
  fun readByteArray(): ByteArray
  fun readByteArray(byteCount: Long): ByteArray
  fun read(sink: ByteArray): Int
  fun readFully(sink: ByteArray)
  fun read(sink: ByteArray, offset: Int, byteCount: Int): Int
  fun readFully(sink: Buffer, byteCount: Long)
  fun readAll(sink: Sink): Long
  fun readUtf8(): String
  fun readUtf8(byteCount: Long): String
  fun readUtf8Line(): String?
  fun readUtf8LineStrict(): String
  fun readUtf8LineStrict(limit: Long): String
  fun readUtf8CodePoint(): Int
  fun indexOf(b: Byte): Long
  fun indexOf(b: Byte, fromIndex: Long): Long
  fun indexOf(b: Byte, fromIndex: Long, toIndex: Long): Long
  fun indexOf(bytes: ByteString): Long
  fun indexOf(bytes: ByteString, fromIndex: Long): Long
  fun indexOfElement(targetBytes: ByteString): Long
  fun indexOfElement(targetBytes: ByteString, fromIndex: Long): Long
  fun rangeEquals(offset: Long, bytes: ByteString): Boolean
  fun rangeEquals(offset: Long, bytes: ByteString, bytesOffset: Int, byteCount: Int): Boolean
  fun peek(): BufferedSource
}

As you can see, you can read almost any data type straight from the input stream without doing the conversion yourself, which is both powerful and convenient. Besides the read methods there are also helpers such as skip(), indexOf(), select() and peek(), so most needs are covered.
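
As a small illustrative sketch, reading a file line by line only needs exhausted() and readUtf8Line() (readDemo is a made-up name):

import okio.buffer
import okio.source
import java.io.File

fun readDemo(file: File) {
  file.source().buffer().use { source ->
    while (!source.exhausted()) {                // any bytes left to read?
      val line = source.readUtf8Line() ?: break  // one UTF-8 line, or null at EOF
      println(line)
    }
  }
}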

RealBufferedSource & RealBufferedSink

After obtaining Source and Sink objects via Okio.source() and Okio.sink(), we generally don't use them directly. Instead we call Okio.buffer() once more to get objects that implement the BufferedSource and BufferedSink interfaces. These two interfaces extend Source and Sink respectively and expand on them with the rich read/write methods shown above, covering almost every basic data type.
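
The buffering pattern itself is easy to picture. Below is a simplified sketch (not the actual RealBufferedSink source; SketchBufferedSink is a made-up class) of the core idea: writes first land in an in-memory Buffer, and only complete 8 KiB segments are emitted to the wrapped Sink:

import okio.Buffer
import okio.Sink

class SketchBufferedSink(private val sink: Sink) {
  val buffer = Buffer()

  fun writeUtf8(string: String): SketchBufferedSink {
    buffer.writeUtf8(string)  // stage the bytes in memory first
    return emitCompleteSegments()
  }

  // Push only full segments downstream; the partial tail segment stays buffered.
  fun emitCompleteSegments(): SketchBufferedSink {
    val byteCount = buffer.completeSegmentByteCount()
    if (byteCount > 0L) sink.write(buffer, byteCount)
    return this
  }
}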

Segment

internal class Segment {
  @JvmField val data: ByteArray // The bytes stored in this Segment.

  /** The position where the next read begins. */
  @JvmField var pos: Int = 0

  // The position where the next write begins.
  @JvmField var limit: Int = 0

  /** Whether this Segment's data is shared with another Segment. */
  @JvmField var shared: Boolean = false

  /** Whether data is owned exclusively by this Segment and not shared. */
  @JvmField var owner: Boolean = false

  /** The next node in the linked list. */
  @JvmField var next: Segment? = null

  /** The previous node in the linked list. */
  @JvmField var prev: Segment? = null

  constructor() {
    this.data = ByteArray(SIZE)
    this.owner = true
    this.shared = false
  }

  constructor(data: ByteArray, pos: Int, limit: Int, shared: Boolean, owner: Boolean) {
    this.data = data
    this.pos = pos
    this.limit = limit
    this.shared = shared
    this.owner = owner
  }

  /** Returns a new Segment that shares this Segment's byte array, marking both as shared. */
  fun sharedCopy(): Segment {
    shared = true
    return Segment(data, pos, limit, true, false)
  }

  /** Returns a copy of this Segment backed by its own private copy of the byte array. */
  fun unsharedCopy() = Segment(data.copyOf(), pos, limit, false, true)

  /**
   * Removes this Segment from the list and returns its successor, or null if the list is now empty.
   */
  fun pop(): Segment? {
    val result = if (next !== this) next else null
    prev!!.next = next
    next!!.prev = prev
    next = null
    prev = null
    return result
  }

  /**
   * Inserts segment after this node in the list and returns it.
   */
  fun push(segment: Segment): Segment {
    segment.prev = this
    segment.next = next
    next!!.prev = segment
    next = segment
    return segment
  }

  /**
   * Splits this Segment into two nodes: the first covers the data range [pos..pos+byteCount),
   * the second covers [pos+byteCount..limit). Returns the new first node.
   */
  fun split(byteCount: Int): Segment {
    require(byteCount > 0 && byteCount <= limit - pos) { "byteCount out of range" }
    val prefix: Segment
    
    // If at least SHARE_MINIMUM bytes are split off, make the prefix a shared node instead of copying.
    if (byteCount >= SHARE_MINIMUM) {
      prefix = sharedCopy()
    } else {
      prefix = SegmentPool.take()
      data.copyInto(prefix.data, startIndex = pos, endIndex = pos + byteCount)
    }

    prefix.limit = prefix.pos + byteCount
    pos += byteCount
    prev!!.push(prefix)
    return prefix
  }

  /**
   * Merges this Segment into its prev predecessor: the bytes are written into prev, then this
   * node is removed from the doubly linked list and recycled into SegmentPool for reuse. The
   * precondition is that this Segment's bytes fit into prev's writable space, so the combined
   * data stays within one 8 KiB Segment. Merging may shift prev's pos and limit.
   */
  fun compact() {
    check(prev !== this) { "cannot compact" }
    if (!prev!!.owner) return // Cannot compact: prev isn't writable.
    val byteCount = limit - pos
    val availableByteCount = SIZE - prev!!.limit + if (prev!!.shared) 0 else prev!!.pos
    if (byteCount > availableByteCount) return // Cannot compact: not enough writable space.
    writeTo(prev!!, byteCount)
    pop()
    SegmentPool.recycle(this)
  }

  /** Moves byteCount bytes from this Segment to sink. */
  fun writeTo(sink: Segment, byteCount: Int) {
    check(sink.owner) { "only owner can write" }
    if (sink.limit + byteCount > SIZE) {
      // We can't fit byteCount bytes at the sink's current position. Shift sink first.
      if (sink.shared) throw IllegalArgumentException()
      if (sink.limit + byteCount - sink.pos > SIZE) throw IllegalArgumentException()
      sink.data.copyInto(sink.data, startIndex = sink.pos, endIndex = sink.limit)
      sink.limit -= sink.pos
      sink.pos = 0
    }

    data.copyInto(
      sink.data, destinationOffset = sink.limit, startIndex = pos,
      endIndex = pos + byteCount
    )
    sink.limit += byteCount
    pos += byteCount
  }

  companion object {
    /** The capacity of a Segment: at most 8 KiB. */
    const val SIZE = 8192

    /**
     * When a Segment holds at least SHARE_MINIMUM bytes (a large Segment), it may be shared on
     * split; shared Segments cannot be returned to the SegmentPool.
     */
    const val SHARE_MINIMUM = 1024
  }
}
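
Since Buffer chains Segments into a doubly linked list, the 8 KiB segmentation can be observed from the outside. A small illustrative snippet (not part of the Okio source):

import okio.Buffer

fun main() {
  val buffer = Buffer()
  buffer.write(ByteArray(10_000))            // 10,000 bytes span two 8 KiB segments
  println(buffer.completeSegmentByteCount()) // 8192: exactly one full segment could be emitted
  buffer.skip(10_000L)                       // reading advances pos; drained segments return to the pool
}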

SegmentPool

internal actual object SegmentPool {
  // The maximum capacity of the SegmentPool (per hash bucket).
  actual val MAX_SIZE = 64 * 1024 // 64 KiB.

  // A sentinel Segment that marks a list as currently being modified.
  private val LOCK = Segment(ByteArray(0), pos = 0, limit = 0, shared = false, owner = false)


  private val HASH_BUCKET_COUNT =
    Integer.highestOneBit(Runtime.getRuntime().availableProcessors() * 2 - 1)
  // Each hash bucket holds a singly linked list of Segments. The index/key is a hash of the
  // thread ID, because that reduces contention and improves locality. [ThreadLocal] is not used
  // because the number of threads in the host process is unknown, and memory should not be
  // leaked for the lifetime of a thread.
  private val hashBuckets: Array<AtomicReference<Segment?>> = Array(HASH_BUCKET_COUNT) {
    AtomicReference<Segment?>() // null value implies an empty bucket
  }
  // The total byte count currently in the pool (for the current thread's bucket).
  actual val byteCount: Int
    get() {
      val first = firstRef().get() ?: return 0
      return first.limit
    }
  
  // Takes a Segment from the pool, allocating a fresh one if the pool is empty or locked.
  @JvmStatic
  actual fun take(): Segment {
    val firstRef = firstRef()

    val first = firstRef.getAndSet(LOCK)
    when {
      first === LOCK -> {
        return Segment()
      }
      first == null -> {
        firstRef.set(null)
        return Segment()
      }
      else -> {
        firstRef.set(first.next)
        first.next = null
        first.limit = 0
        return first
      }
    }
  }
  // Resets the Segment's state and puts it back into the pool.
  @JvmStatic
  actual fun recycle(segment: Segment) {
    require(segment.next == null && segment.prev == null)
    if (segment.shared) return // This segment cannot be recycled.

    val firstRef = firstRef()

    val first = firstRef.get()
    if (first === LOCK) return // A take() is currently in progress.
    val firstLimit = first?.limit ?: 0
    if (firstLimit >= MAX_SIZE) return // Pool is full.

    segment.next = first
    segment.pos = 0
    segment.limit = firstLimit + Segment.SIZE

    if (!firstRef.compareAndSet(first, segment)) {
      segment.next = null // Don't leak a reference in the pool either!
    }
  }

  private fun firstRef(): AtomicReference<Segment?> {
    val hashBucket = (Thread.currentThread().id and (HASH_BUCKET_COUNT - 1L)).toInt()
    return hashBuckets[hashBucket]
  }
}

SegmentPool can be thought of as a cache of Segments. It exposes just two methods, take() and recycle(). Internally it maintains singly linked lists of Segments (one per hash bucket, keyed by thread ID), each capped at MAX_SIZE = 64 * 1024 bytes, i.e. 64 KiB, the length of 8 Segments.

take() pops the Segment at the head of the linked list, detaches it from the list, and advances the list one node. If the list is empty (for example on the very first call) or another take() is in progress, a new Segment is allocated and returned directly; Segments created this way are unshared.

recycle() returns a Segment to the pool: the recycled Segment is reset and inserted at the head of the singly linked list so it can be reused later. As the source shows, shared Segments are not recycled. On the first call to recycle() the list grows from empty to a single node; every later call inserts at the head, until the pool reaches its maximum capacity.
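
To isolate the lock-free pattern that take() and recycle() rely on, here is a hypothetical, stripped-down sketch (TinyPool and Node are made-up names): getAndSet(LOCK) briefly "locks" the list head during take(), while recycle() races in with a single compareAndSet and simply gives up on failure:

import java.util.concurrent.atomic.AtomicReference

class Node {
  var next: Node? = null
}

object TinyPool {
  private val LOCK = Node()
  private val first = AtomicReference<Node?>()

  fun take(): Node {
    val head = first.getAndSet(LOCK)
    return when {
      head === LOCK -> Node()  // another take() holds the lock: just allocate
      head == null -> {
        first.set(null)        // unlock the (still empty) list
        Node()
      }
      else -> {
        first.set(head.next)   // unlock with the rest of the list
        head.next = null
        head
      }
    }
  }

  fun recycle(node: Node) {
    val head = first.get()
    if (head === LOCK) return  // a take() is in progress: drop the node
    node.next = head
    if (!first.compareAndSet(head, node)) {
      node.next = null         // lost the race; don't keep a stale link
    }
  }
}

Losing a node in either race is harmless: the pool is only an optimization, and dropping a Segment just means allocating a fresh one later.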