前言
我们知道,实例方法是在bits里面存储的,那么在调用方法的时候,就是从bits中读取的,如果方法少的话,还可以,如果方法多的话,每次都读取会很消耗时间的,那么该怎么解决呢,答案是用缓存。 这样的话,在调用方法的时候,我直接从缓存中取,就很快了,那么方法的缓存是怎样的结构呢, 这篇文章就讨论这个问题。
cache_t的代码结构
objc_class里面有isa,superclass,bits,还设有一个cache,这个是存储什么的? 先看下代码
struct cache_t {
struct bucket_t *_buckets;
mask_t _mask;
mask_t _occupied;
public:
struct bucket_t *buckets();
mask_t mask();
mask_t occupied();
void incrementOccupied();
void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
void initializeToEmpty();
mask_t capacity();
bool isConstantEmptyCache();
bool canBeFreed();
static size_t bytesForCapacity(uint32_t cap);
static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap);
void expand();
void reallocate(mask_t oldCapacity, mask_t newCapacity);
struct bucket_t * find(cache_key_t key, id receiver);
static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn));
};
struct bucket_t {
private:
// IMP-first is better for arm64e ptrauth and no worse for arm64.
// SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
MethodCacheIMP _imp;
cache_key_t _key;
#else
cache_key_t _key;
MethodCacheIMP _imp;
#endif
public:
inline cache_key_t key() const { return _key; }
inline IMP imp() const { return (IMP)_imp; }
inline void setKey(cache_key_t newKey) { _key = newKey; }
inline void setImp(IMP newImp) { _imp = newImp; }
void set(cache_key_t newKey, IMP newImp);
};
可以看出,cache_t里面主要包含了一个结构体数组指针 _buckets,而这个结构体里面包含了方法实现的指针,和缓存key
cache_t的数据结构
如果_buckets也是一个简单的数组的话,那效率还是低,那用什么数据结构效率高呢,答案是hash表,我们从代码也能看出
bucket_t * cache_t::find(cache_key_t k, id receiver)
{
assert(k != 0);
bucket_t *b = buckets();
mask_t m = mask();
// 通过cache_hash函数【begin = k & m】计算出key值 k 对应的 index值 begin,用来记录查询起始索引
mask_t begin = cache_hash(k, m);
// begin 赋值给 i,用于切换索引
mask_t i = begin;
do {
if (b[i].key() == 0 || b[i].key() == k) {
//用这个i从散列表取值,如果取出来的bucket_t的 key = k,则查询成功,返回该bucket_t,
//如果key = 0,说明在索引i的位置上还没有缓存过方法,同样需要返回该bucket_t,用于中止缓存查询。
return &b[i];
}
} while ((i = cache_next(i, m)) != begin);
// 这一步其实相当于 i = i-1,回到上面do循环里面,相当于查找散列表上一个单元格里面的元素,再次进行key值 k的比较,
//当i=0时,也就i指向散列表最首个元素索引的时候重新将mask赋值给i,使其指向散列表最后一个元素,重新开始反向遍历散列表,
//其实就相当于绕圈,把散列表头尾连起来,不就是一个圈嘛,从begin值开始,递减索引值,当走过一圈之后,必然会重新回到begin值,
//如果此时还没有找到key对应的bucket_t,或者是空的bucket_t,则循环结束,说明查找失败,调用bad_cache方法。
// hack
Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
cache_t::bad_cache(receiver, (SEL)k, cls);
}
我们从这个方法中的源码 cache_hash(k, m);,推断出cache_t是采用的hash表来存储的,而当存储的时候,数组的大小是怎么控制呢?
cache_t的数组动态扩容
同样,我们从代码分析入手
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
cacheUpdateLock.assertLocked();
// Never cache before +initialize is done
if (!cls->isInitialized()) return;
// Make sure the entry wasn't added to the cache by some other thread
// before we grabbed the cacheUpdateLock.
if (cache_getImp(cls, sel)) return;
cache_t *cache = getCache(cls);
cache_key_t key = getKey(sel);
// Use the cache as-is if it is less than 3/4 full
mask_t newOccupied = cache->occupied() + 1;
mask_t capacity = cache->capacity();
if (cache->isConstantEmptyCache()) {
// Cache is read-only. Replace it.
cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
}
else if (newOccupied <= capacity / 4 * 3) {
// Cache is less than 3/4 full. Use it as-is.
}
else {
// Cache is too full. Expand it.
cache->expand();
}
// Scan for the first unused slot and insert there.
// There is guaranteed to be an empty slot because the
// minimum size is 4 and we resized at 3/4 full.
bucket_t *bucket = cache->find(key, receiver);
if (bucket->key() == 0) cache->incrementOccupied();
bucket->set(key, imp);
}
void cache_t::expand()
{
cacheUpdateLock.assertLocked();
uint32_t oldCapacity = capacity();
uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;
if ((uint32_t)(mask_t)newCapacity != newCapacity) {
// mask overflow - can't grow further
// fixme this wastes one bit of mask
newCapacity = oldCapacity;
}
reallocate(oldCapacity, newCapacity);
}
从代码中,可以看出,初始化的时候,cache的容量是4,当当前使用的数量>=容量的3/4的时候,进行扩容,容量是原来的2倍,并且释放元类的缓存。