A Brief Source Analysis of NSCache, the iOS Cache

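Summary

NSCache is a commonly used caching mechanism on iOS.

In the implementation examined here, its internal data structure is a "hash table + doubly linked list".

When it needs to free space, it evicts the entries with the lowest cost first.

This article walks through the implementation in swift-corelibs-foundation/NSCache.swift; the GNUstep implementation is covered in an appendix at the end.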

Analysis

The core variables are shown below; the "hash table + doubly linked list" structure is visible in them.

open class NSCache<KeyType : AnyObject, ObjectType : AnyObject> : NSObject {
    // note(jd): hash table
    private var _entries = Dictionary<NSCacheKey, NSCacheEntry<KeyType, ObjectType>>()

    private let _lock = NSLock() // note(jd): lock for thread safety

    private var _head: NSCacheEntry<KeyType, ObjectType>? // note(jd): head node of the doubly linked list

    // note(jd): the cache's two limits
    open var totalCostLimit: Int = 0 // limits are imprecise/not strict
    open var countLimit: Int = 0 // limits are imprecise/not strict
    ...
}

The doubly linked list node:

Its cost is set by the caller and is typically the size of the cached value.

private class NSCacheEntry<KeyType : AnyObject, ObjectType : AnyObject> {
    var key: KeyType
    var value: ObjectType
    var cost: Int
    var prevByCost: NSCacheEntry?
    var nextByCost: NSCacheEntry?
    init(key: KeyType, value: ObjectType, cost: Int) {
        self.key = key
        self.value = value
        self.cost = cost
    }
}
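
To make the caller-supplied cost concrete, here is a minimal usage sketch. The key string, the 4 KB payload, and the 10 MB limit are made up for illustration; only the NSCache calls themselves are real API.

```swift
import Foundation

// Hypothetical cache of downloaded blobs, keyed by a URL string.
let cache = NSCache<NSString, NSData>()
cache.totalCostLimit = 10 * 1024 * 1024  // about 10 MB; the limit is not strict

let key = NSString(string: "https://example.com/a.png")  // illustrative key
let payload = NSData(data: Data(count: 4096))             // stand-in for real content

// The caller chooses the cost; here it is the byte size of the value.
cache.setObject(payload, forKey: key, cost: payload.length)

if let hit = cache.object(forKey: key) {
    print("cached bytes:", hit.length)
}
```

Using the byte size as the cost is only a convention; the cache never inspects the value to check it.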

A get simply looks the entry up in _entries.

The interesting part is what happens on set.

open func setObject(_ obj: ObjectType, forKey key: KeyType, cost g: Int) {
    let g = max(g, 0)
    let keyRef = NSCacheKey(key)

    _lock.lock()

    let costDiff: Int

    if let entry = _entries[keyRef] { // note(jd): the key already exists; update its cost and value
        costDiff = g - entry.cost
        entry.cost = g

        entry.value = obj

        // note(jd): why remove and re-insert when the cost changes?
        // Because the list must stay sorted by cost.
        if costDiff != 0 {
            remove(entry)
            insert(entry)
        }
    } else { // note(jd): the key does not exist; insert a new entry
        let entry = NSCacheEntry(key: key, value: obj, cost: g)
        _entries[keyRef] = entry
        insert(entry)

        costDiff = g
    }

    _totalCost += costDiff
    // note(jd): handle exceeding the cost limit
    var purgeAmount = (totalCostLimit > 0) ? (_totalCost - totalCostLimit) : 0
    while purgeAmount > 0 {
        if let entry = _head { // note(jd): evict the head node
            delegate?.cache(unsafeDowncast(self, to:NSCache<AnyObject, AnyObject>.self), willEvictObject: entry.value)

            _totalCost -= entry.cost
            purgeAmount -= entry.cost

            remove(entry) // _head will be changed to next entry in remove(_:)
            _entries[NSCacheKey(entry.key)] = nil
        } else {
            break
        }
    }

    // note(jd): handle exceeding the count limit

    var purgeCount = (countLimit > 0) ? (_entries.count - countLimit) : 0
    while purgeCount > 0 {
        // similar to the loop above (elided)
    }

    _lock.unlock()
}
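
To see what the cost-based purge loop does to a concrete list, here is a simplified model of it. It is a sketch, not the library code: a plain array of costs stands in for the linked list, and the function name `purge` is my own.

```swift
// Simplified model of the cost-based purge loop. `costs` stands in for the
// linked list and is assumed to be sorted ascending, so index 0 plays _head.
func purge(costs: [Int], totalCostLimit: Int) -> (evicted: [Int], kept: [Int]) {
    var kept = costs
    var evicted: [Int] = []
    let totalCost = costs.reduce(0, +)
    // A limit of 0 means "no limit", matching the condition in setObject.
    var purgeAmount = totalCostLimit > 0 ? totalCost - totalCostLimit : 0
    while purgeAmount > 0, let head = kept.first {
        purgeAmount -= head
        evicted.append(head)
        kept.removeFirst()
    }
    return (evicted, kept)
}

let result = purge(costs: [1, 2, 5, 8], totalCostLimit: 12)
print(result.evicted, result.kept)  // [1, 2, 5] [8]
```

The total cost is 16 against a limit of 12, so 4 units must go. Because eviction always starts from the cheapest entry, three entries are removed here, even though removing the cost-5 entry alone would have been enough.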

When a node is inserted, the list is kept sorted by cost in ascending order.

private func insert(_ entry: NSCacheEntry<KeyType, ObjectType>) {
    guard var currentElement = _head else {
        // The cache is empty
        entry.prevByCost = nil
        entry.nextByCost = nil

        _head = entry
        return
    }
    // note(jd): the code below keeps the nodes sorted by cost in ascending order

    guard entry.cost > currentElement.cost else {
        // Insert entry at the head
        entry.prevByCost = nil
        entry.nextByCost = currentElement
        currentElement.prevByCost = entry

        _head = entry
        return
    }

    // note(jd): find the right insertion position
    while let nextByCost = currentElement.nextByCost, nextByCost.cost < entry.cost {
        currentElement = nextByCost
    }

    // Insert entry between currentElement and nextElement
    let nextElement = currentElement.nextByCost

    currentElement.nextByCost = entry
    entry.prevByCost = currentElement

    entry.nextByCost = nextElement
    nextElement?.prevByCost = entry
}
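
The comparisons in insert(_:) can be easier to follow on an array. The sketch below is my own model (the name `insertionIndex` is hypothetical), and it assumes the list is already sorted ascending. It returns the position where a new cost would land.

```swift
// Array model of insert(_:): the index at which a new cost lands in an
// ascending list, following the same comparisons as the linked-list code.
func insertionIndex(in costs: [Int], for newCost: Int) -> Int {
    // The new entry goes before the first element whose cost is >= newCost;
    // if there is none (or the list is empty) it goes at the end.
    return costs.firstIndex(where: { $0 >= newCost }) ?? costs.count
}

var costs = [1, 3, 3, 5]
let index = insertionIndex(in: costs, for: 3)
costs.insert(3, at: index)
print(index, costs)  // 1 [1, 3, 3, 3, 5]
```

One consequence worth noticing: among entries with equal cost, the newest one lands closest to the head, so it is the first of them to be evicted.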

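Recap

Unlike LRU or LFU, this NSCache implementation frees space by evicting the nodes with the smallest cost.

The algorithm can be summarized as follows:

On insertion, the list is kept sorted by cost in ascending order.
When space needs to be freed, nodes are removed one by one from the head of the list.
Space needs to be freed when the entry count exceeds countLimit or the total cost exceeds totalCostLimit (each limit applies only when it is greater than 0).

If you are interested in how LRU or LFU are implemented, see my post "Algorithm Practice: LRU and LFU Caches".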

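PS:

At first I mistakenly assumed that the entries with the largest cost were evicted first, since that would free more "space".

A commenter corrected me: it is exactly the opposite.

Why? My personal guess is that the designers wanted whatever stays in the cache to be as "valuable" as possible.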

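The GNUstep implementation

A commenter tested NSCache on iOS and found that it behaves more like LRU. I compared the API in "NSCache | Apple Developer Documentation" with the implementation above and found differences. So I went looking again and found GNUstep, whose implementation is closer to what the official documentation describes.

GNUstep is one of the GNU Project's projects. It reimplements, as free software, the Cocoa (formerly NeXT's OpenStep) Objective-C libraries, the widget toolkits, and the applications built on them. It runs on Unix-like operating systems as well as on Microsoft Windows.

Let's take a quick look at the code.

A comment notes that accessed objects are tracked in LRU order: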

/** LRU ordering of all potentially-evictable objects in this cache. */
// NSMutableArray *_accesses

Fetching an object also moves it to the end of the array, so the most recently accessed object is always last:

- (id) objectForKey: (id)key
{
  _GSCachedObject *obj = [_objects objectForKey: key];

  if (nil == obj)
    {
      return nil;
    }
  if (obj->isEvictable)
    {
      // Move the object to the end of the access list.
      [_accesses removeObjectIdenticalTo: obj];
      [_accesses addObject: obj];
    }
  obj->accessCount++;
  _totalAccesses++;
  return obj->object;
}
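
The move-to-end bookkeeping above can be sketched in a few lines of Swift. This is a simplified model of my own, not GNUstep code: the type `AccessList` is hypothetical, it uses string keys instead of cached-object wrappers, and it ignores the isEvictable flag.

```swift
// Minimal model of the _accesses list: front = least recently used.
struct AccessList {
    private(set) var keys: [String] = []

    // Record an access by moving the key to the end, like
    // removeObjectIdenticalTo: followed by addObject:.
    mutating func touch(_ key: String) {
        if let i = keys.firstIndex(of: key) {
            keys.remove(at: i)
        }
        keys.append(key)
    }
}

var accesses = AccessList()
for key in ["a", "b", "c"] {
    accesses.touch(key)
}
accesses.touch("a")
print(accesses.keys)  // ["b", "c", "a"]
```

Scanning the list from the front therefore visits the least recently used entries first, which is what the eviction method relies on.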

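The core eviction method is described in its comment as a "simple LRU/LFU hybrid". The LRU part is easy to see, since the access list above is maintained in exactly that order. The LFU part comes from averageAccesses, a threshold computed as 20% of the mean access count plus 1: only objects accessed fewer times than this are evicted, so low-frequency data goes first.

An excerpt of the code: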

/**
 * This method is the one that handles the eviction policy.  This
 * implementation uses a relatively simple LRU/LFU hybrid.  The NSCache
 * documentation from Apple makes it clear that the policy may change, so we
 * could in future have a class cluster with pluggable policies for different
 * caches or some other mechanism.
 */
- (void)_evictObjectsToMakeSpaceForObjectWithCost: (NSUInteger)cost
{
    // compute the space needed, spaceNeeded ...

  // Only evict if we need the space.
  if (count > 0 && (spaceNeeded > 0 || count >= _countLimit))
    {
      NSMutableArray *evictedKeys = nil;
      // Round up slightly.
      NSUInteger averageAccesses = ((_totalAccesses / (double)count) * 0.2) + 1;
      // when eviction is needed, iterate over _accesses
      NSEnumerator *e = [_accesses objectEnumerator];
      _GSCachedObject *obj;

      ...

      while (nil != (obj = [e nextObject]))
	{
        // key point: the condition `obj->accessCount < averageAccesses` evicts low-frequency data first
	  // Don't evict frequently accessed objects.
	  if (obj->accessCount < averageAccesses && obj->isEvictable)
	    {
	      [obj->object discardContentIfPossible];
	      if ([obj->object isContentDiscarded])
		{
		  NSUInteger cost = obj->cost;

		  // Evicted objects have no cost.
		  obj->cost = 0;
		  // Don't try evicting this again in future; it's gone already.
		  obj->isEvictable = NO;
		  // Remove this object as well as its contents if required
		  if (_evictsObjectsWithDiscardedContent)
		    {
		      [evictedKeys addObject: obj->key];
		    }
		  _totalCost -= cost;
		  // If we've freed enough space, give up
		  if (cost > spaceNeeded)
		    {
		      break;
		    }
		  spaceNeeded -= cost;
		}
	    }
	}
      // remove the objects for the keys in evictedKeys ...
    }
}
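
To get a feel for the threshold, here is a small worked sketch in Swift. It is a model under assumptions, not a port: `Entry`, `accessThreshold`, and `evictionCandidates` are names I made up, the total access count is taken as the sum of the per-entry counts, and the discardable-content checks and the spaceNeeded cutoff are left out.

```swift
struct Entry {
    let key: String
    let accessCount: Int
}

// Mirrors `((_totalAccesses / (double)count) * 0.2) + 1`, truncated to an integer.
func accessThreshold(totalAccesses: Int, count: Int) -> Int {
    return Int(Double(totalAccesses) / Double(count) * 0.2 + 1)
}

// `lruOrdered` is least recently used first, like _accesses.
// Returns the keys that pass the "not frequently accessed" test, in LRU order.
func evictionCandidates(lruOrdered: [Entry]) -> [String] {
    guard !lruOrdered.isEmpty else { return [] }
    // Assumption: total accesses equals the sum of the per-entry counts.
    let total = lruOrdered.reduce(0) { $0 + $1.accessCount }
    let threshold = accessThreshold(totalAccesses: total, count: lruOrdered.count)
    return lruOrdered.filter { $0.accessCount < threshold }.map { $0.key }
}

let entries = [
    Entry(key: "a", accessCount: 0),
    Entry(key: "b", accessCount: 9),
    Entry(key: "c", accessCount: 1),
    Entry(key: "d", accessCount: 20),
]
print(evictionCandidates(lruOrdered: entries))  // ["a", "c"]
```

With 30 accesses over 4 entries the mean is 7.5, so the threshold is 7.5 * 0.2 + 1 = 2.5, truncated to 2. Only "a" and "c" fall below it, and they are considered in LRU order. With no accesses at all the threshold is 1, so only objects that were never read qualify.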

To sum up: in everyday use, it is reasonable to treat NSCache as roughly an LRU cache.