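A class is essentially an objc_class struct that stores isa, superclass, cache, and bits. Of these, cache is the one left to explore: what does it cache, and how does the caching work?
Structure and contents of cache
cache structure
Open the source and find cache_t: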
struct cache_t {
explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
union {
struct {
explicit_atomic<mask_t> _maybeMask;
#if __LP64__
uint16_t _flags;
#endif
uint16_t _occupied; // number of methods currently cached
};
explicit_atomic<preopt_cache_t *> _originalPreoptCache;
};
...
// Whether this is the read-only empty cache; checked before the first insert
bool isConstantEmptyCache() const;
bool canBeFreed() const;
// Mask of the bucket_t list, i.e. its length minus 1
mask_t mask() const;
// Increment the count of methods cached in the bucket_t list
void incrementOccupied();
// Set buckets and mask
void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
// Allocate a new bucket_t list
void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
// Reclaim oldBuckets, sized by oldCapacity
void collect_free(bucket_t *oldBuckets, mask_t oldCapacity);
// Total number of slots in the current bucket_t list
unsigned capacity() const;
// Get the buckets
struct bucket_t *buckets() const;
// Get the class
Class cls() const;
// Number of methods currently cached in the bucket_t list
mask_t occupied() const;
// Insert the called method into the memory that buckets points to
void insert(SEL sel, IMP imp, id receiver);
...
};
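Structurally, _bucketsAndMaybeMask takes 8 bytes on a 64-bit system (its underlying type, uintptr_t, is an unsigned long).
Likewise, _maybeMask takes 4 bytes, _flags 2 bytes, and _occupied 2 bytes, so the inner struct takes 8 bytes in total.
_originalPreoptCache is a pointer and takes 8 bytes, so the whole union takes 8 bytes.
Therefore cache takes 16 bytes.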
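The size arithmetic above can be checked with a small stand-alone sketch. The types below (cache_layout, mask_fields, class_layout, preopt_cache_stub) are hypothetical stand-ins that only mirror the field widths described in the text; they are not the runtime's own definitions, and the static_asserts assume a 64-bit (LP64) target.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins mirroring the field widths discussed above (LP64 assumed).
using mask_t = std::uint32_t;
struct preopt_cache_stub;            // opaque placeholder for preopt_cache_t

struct mask_fields {
    mask_t        maybeMask;         // 4 bytes
    std::uint16_t flags;             // 2 bytes
    std::uint16_t occupied;          // 2 bytes
};

struct cache_layout {
    std::uintptr_t bucketsAndMaybeMask;          // 8 bytes
    union {
        mask_fields        fields;               // 8 bytes
        preopt_cache_stub *originalPreoptCache;  // 8 bytes, shares storage with fields
    };
};

// isa and superclass come first, so cache starts 16 bytes into the class.
struct class_layout {
    void        *isa;
    void        *superclass;
    cache_layout cache;
};

static_assert(sizeof(mask_fields) == 8, "inner struct is 8 bytes");
static_assert(sizeof(cache_layout) == 16, "cache is 16 bytes");
static_assert(offsetof(class_layout, cache) == 16, "cache sits at offset 0x10");
```

If the file compiles on a 64-bit target, the 8 + 8 = 16 byte layout and the 0x10 offset both hold for this model.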
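Printing the cache structure
Because isa and superclass each take 8 bytes, adding 16 bytes (0x10) to the address of the class object gives the address of cache.
Analyzing what cache stores
The printed layout matches the source, but the stored values are not readable on their own. Since the member variables alone do not reveal what is stored, look at the methods the struct provides. The source contains this one: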
void insert(SEL sel, IMP imp, id receiver);
This is very likely where data is written, so look at its implementation:
void cache_t::insert(SEL sel, IMP imp, id receiver)
{
runtimeLock.assertLocked();
// Never cache before +initialize is done
if (slowpath(!cls()->isInitialized())) {
return;
}
if (isConstantOptimizedCache()) {
_objc_fatal("cache_t::insert() called with a preoptimized cache for %s",
cls()->nameForLogging());
}
#if DEBUG_TASK_THREADS
return _collecting_in_critical();
#else
#if CONFIG_USE_CACHE_LOCK
mutex_locker_t lock(cacheUpdateLock);
#endif
ASSERT(sel != 0 && cls()->isInitialized());
// Use the cache as-is if until we exceed our expected fill ratio.
mask_t newOccupied = occupied() + 1;
unsigned oldCapacity = capacity(), capacity = oldCapacity;
if (slowpath(isConstantEmptyCache())) {
// Cache is read-only. Replace it.
if (!capacity) capacity = INIT_CACHE_SIZE;
reallocate(oldCapacity, capacity, /* freeOld */false);
}
else if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity))) {
// Cache is less than 3/4 or 7/8 full. Use it as-is.
}
#if CACHE_ALLOW_FULL_UTILIZATION
else if (capacity <= FULL_UTILIZATION_CACHE_SIZE && newOccupied + CACHE_END_MARKER <= capacity) {
// Allow 100% cache utilization for small buckets. Use it as-is.
}
#endif
else {
capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
if (capacity > MAX_CACHE_SIZE) {
capacity = MAX_CACHE_SIZE;
}
reallocate(oldCapacity, capacity, true);
}
bucket_t *b = buckets();
mask_t m = capacity - 1;
mask_t begin = cache_hash(sel, m);
mask_t i = begin;
// Scan for the first unused slot and insert there.
// There is guaranteed to be an empty slot.
do {
if (fastpath(b[i].sel() == 0)) {
incrementOccupied();
b[i].set<Atomic, Encoded>(b, sel, imp, cls());
return;
}
if (b[i].sel() == sel) {
// The entry was added to the cache by some other thread
// before we grabbed the cacheUpdateLock.
return;
}
} while (fastpath((i = cache_next(i, m)) != begin));
bad_cache(receiver, (SEL)sel);
#endif // !DEBUG_TASK_THREADS
}
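Judging by its parameters, this function inserts a method. The key names in the code above are occupied, the number of slots in use; capacity, the capacity of buckets; and buckets, where the data is stored. Each is examined in turn below.
Analyzing occupied, capacity, and buckets
occupied and capacity
occupied is simply the _occupied member of cache: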
mask_t cache_t::occupied() const
{
return _occupied;
}
capacity is derived from mask(), which in the variant shown here reads the _maybeMask member of cache. capacity equals _maybeMask + 1; if _maybeMask is 0, capacity is 0:
unsigned cache_t::capacity() const
{
return mask() ? mask()+1 : 0;
}
mask_t cache_t::mask() const
{
return _maybeMask.load(memory_order_relaxed);
}
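As a quick illustration of the mask-to-capacity rule, here is a minimal sketch. capacity_from_mask is a hypothetical helper name that restates the body of capacity() outside the runtime; it is not part of objc4.

```cpp
#include <cstdint>

using mask_t = std::uint32_t;

// Restates cache_t::capacity(): the mask is (length - 1),
// and a zero mask is treated as "no buckets allocated".
constexpr unsigned capacity_from_mask(mask_t mask) {
    return mask ? mask + 1 : 0;
}
```

With this rule a mask of 1 means 2 slots, 3 means 4 slots, and 7 means 8 slots, which is why the capacity is always a power of two.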
buckets
First, see how buckets is obtained. It comes from the _bucketsAndMaybeMask member of cache, masked with bucketsMask, whose value differs by architecture:
struct bucket_t *cache_t::buckets() const
{
uintptr_t addr = _bucketsAndMaybeMask.load(memory_order_relaxed);
return (bucket_t *)(addr & bucketsMask);
}
#if defined(__arm64__) && __LP64__
// arm64 with 64-bit long and pointer (LP64), i.e. a 64-bit system
#if TARGET_OS_OSX || TARGET_OS_SIMULATOR
// macOS or simulator
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
#else
// device
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16
#endif
#elif defined(__arm64__) && !__LP64__
// arm64 with 32-bit long and pointer
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_LOW_4
#else
// other architectures, e.g. x86_64
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_OUTLINED
#endif
#define CACHE_MASK_STORAGE_OUTLINED 1
#define CACHE_MASK_STORAGE_HIGH_16 2
#define CACHE_MASK_STORAGE_LOW_4 3
#define CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS 4
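The HIGH_16 storage modes suggest that the mask shares one 64-bit word with the bucket pointer, which is why buckets() has to mask the loaded value. The sketch below illustrates that idea only. kMaskShift, kBucketsMask, pack, unpack_buckets, and unpack_mask are hypothetical names, and the shift of 48 is an assumption chosen for illustration; the runtime's exact constants may differ per storage mode.

```cpp
#include <cstdint>

static_assert(sizeof(std::uintptr_t) == 8, "sketch assumes 64-bit pointers");

// Assumed split for illustration: low 48 bits hold the pointer, high 16 bits the mask.
constexpr unsigned       kMaskShift   = 48;
constexpr std::uintptr_t kBucketsMask = (std::uintptr_t{1} << kMaskShift) - 1;

// Combine a bucket-list address and a mask into one word.
constexpr std::uintptr_t pack(std::uintptr_t bucketsAddr, std::uint16_t mask) {
    return (std::uintptr_t{mask} << kMaskShift) | (bucketsAddr & kBucketsMask);
}

// Same shape as buckets(): keep only the pointer bits.
constexpr std::uintptr_t unpack_buckets(std::uintptr_t word) {
    return word & kBucketsMask;
}

// Recover the mask from the high bits.
constexpr std::uint16_t unpack_mask(std::uintptr_t word) {
    return static_cast<std::uint16_t>(word >> kMaskShift);
}
```

Packing both values into one word lets a reader load the pointer and the mask with a single atomic read, at the cost of limiting how many bits each part can use.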
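The bucket_t source shows that each bucket stores the method selector sel and imp, the address of the method implementation, and that it provides a set template method for storing sel and imp: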
struct bucket_t {
private:
// IMP-first is better for arm64e ptrauth and no worse for arm64.
// SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
explicit_atomic<uintptr_t> _imp;
explicit_atomic<SEL> _sel;
#else
explicit_atomic<SEL> _sel;
explicit_atomic<uintptr_t> _imp;
#endif
...
template <Atomicity, IMPEncoding>
void set(bucket_t *base, SEL newSel, IMP newImp, Class cls);
};
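In short, buckets() returns the bucket_t list, which is the hash table that holds the cached entries.
cache structure overview
The analysis above shows what each member of cache is for. _originalPreoptCache lives in the union, so it shares storage with the inner struct and only one of the two is in use at a time. As its type preopt_cache_t suggests, it relates to preoptimized caches, and it can be ignored here.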
struct cache_t {
explicit_atomic<uintptr_t> _bucketsAndMaybeMask; // address of the bucket list
union {
struct {
explicit_atomic<mask_t> _maybeMask; // mask of the bucket list (capacity - 1)
#if __LP64__
uint16_t _flags;
#endif
uint16_t _occupied; // number of methods currently cached
};
explicit_atomic<preopt_cache_t *> _originalPreoptCache;
};
};
Analyzing how cache stores data
Initialization
If the cache has not been initialized yet, that is, occupied == 0 and buckets() is empty, it must be initialized first:
if (slowpath(isConstantEmptyCache())) {
// Cache is read-only. Replace it.
if (!capacity) capacity = INIT_CACHE_SIZE;
reallocate(oldCapacity, capacity, /* freeOld */false);
}
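First, pin down the values of these variables:
- newOccupied: newOccupied = occupied() + 1, and occupied() returns the _occupied member of cache_t, the number of entries currently cached. On the first insert nothing is cached, so _occupied is 0 and newOccupied is 1.
- oldCapacity: oldCapacity comes from capacity(), which calls mask(). mask() returns the mask of the bucket_t list, its length minus 1 (indices start at 0). If the mask is nonzero, capacity() returns mask + 1; otherwise it returns 0.
- capacity: capacity starts out equal to oldCapacity and represents the length of the bucket_t list, the size of the container.
Next, determine the value of INIT_CACHE_SIZE: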
/* Initial cache bucket count. INIT_CACHE_SIZE must be a power of two. */
enum {
#if CACHE_END_MARKER || (__arm64__ && !__LP64__)
// CACHE_END_MARKER is 0 on arm64 and 1 on x86_64.
// This branch is the x86_64 case.
// When we have a cache end marker it fills a bucket slot, so having a
// initial cache size of 2 buckets would not be efficient when one of the
// slots is always filled with the end marker. So start with a cache size
// 4 buckets.
INIT_CACHE_SIZE_LOG2 = 2,
#else
// arm64 case
// Allow an initial bucket size of 2 buckets, since a large number of
// classes, especially metaclasses, have very few imps, and we support
// the ability to fill 100% of the cache before resizing.
INIT_CACHE_SIZE_LOG2 = 1,
#endif
INIT_CACHE_SIZE = (1 << INIT_CACHE_SIZE_LOG2),
MAX_CACHE_SIZE_LOG2 = 16,
MAX_CACHE_SIZE = (1 << MAX_CACHE_SIZE_LOG2),
FULL_UTILIZATION_CACHE_SIZE_LOG2 = 3,
FULL_UTILIZATION_CACHE_SIZE = (1 << FULL_UTILIZATION_CACHE_SIZE_LOG2),
};
#if __arm__ || __x86_64__ || __i386__
// x86_64
// objc_msgSend has few registers available.
// Cache scan increments and wraps at special end-marking bucket.
#define CACHE_END_MARKER 1
// Historical fill ratio of 75% (since the new objc runtime was introduced).
static inline mask_t cache_fill_ratio(mask_t capacity) {
return capacity * 3 / 4;
}
#elif __arm64__ && !__LP64__
// objc_msgSend has lots of registers available.
// Cache scan decrements. No end marker needed.
#define CACHE_END_MARKER 0
// Historical fill ratio of 75% (since the new objc runtime was introduced).
static inline mask_t cache_fill_ratio(mask_t capacity) {
return capacity * 3 / 4;
}
#elif __arm64__ && __LP64__
// arm64
// objc_msgSend has lots of registers available.
// Cache scan decrements. No end marker needed.
#define CACHE_END_MARKER 0
// Allow 87.5% fill ratio in the fast path for all cache sizes.
// Increasing the cache fill ratio reduces the fragmentation and wasted space
// in imp-caches at the cost of potentially increasing the average lookup of
// a selector in imp-caches by increasing collision chains. Another potential
// change is that cache table resizes / resets happen at different moments.
static inline mask_t cache_fill_ratio(mask_t capacity) {
return capacity * 7 / 8;
}
#endif
Taking arm64 as the example, INIT_CACHE_SIZE = 1 << 1 = 2, so the initial cache capacity on arm64 is 2.
reallocate
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
bucket_t *oldBuckets = buckets();
bucket_t *newBuckets = allocateBuckets(newCapacity);
// Cache's old contents are not propagated.
// This is thought to save cache memory at the cost of extra cache fills.
// fixme re-measure this
ASSERT(newCapacity > 0);
ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);
setBucketsAndMask(newBuckets, newCapacity - 1);
if (freeOld) {
collect_free(oldBuckets, oldCapacity);
}
}
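In summary:
- First get the address of the old bucket_t list.
- Then allocate the new bucket_t list and get its address.
- setBucketsAndMask stores the new bucket list and mask into the cache_t members and resets _occupied to 0.
- freeOld decides whether the old bucket_t list is released.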
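The steps above can be modeled with a toy cache. This is a minimal sketch, not the runtime's code: ToyBucket and ToyCache are hypothetical types, a std::vector stands in for the allocated bucket list, and the old list is freed immediately, whereas the runtime hands it to collect_free.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for bucket_t: a zero sel marks an empty slot.
struct ToyBucket {
    std::uintptr_t sel;
    std::uintptr_t imp;
};

struct ToyCache {
    std::vector<ToyBucket> buckets;  // stands in for the allocated bucket list
    std::uint32_t mask = 0;          // capacity - 1
    std::uint16_t occupied = 0;      // number of cached entries

    // Models reallocate(): the new list starts empty and old entries are NOT copied.
    void reallocate(std::uint32_t newCapacity) {
        assert(newCapacity > 0);
        std::vector<ToyBucket> fresh(newCapacity);  // value-initialized: every slot is zero
        buckets.swap(fresh);                        // old list is released when fresh is destroyed
        mask = newCapacity - 1;                     // what setBucketsAndMask records
        occupied = 0;                               // the count restarts from zero
    }
};
```

After a call such as cache.reallocate(4), the toy cache has four empty slots, a mask of 3, and an occupied count of 0, regardless of what it held before.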
Cache is not empty: decide whether to grow
First, note the constants involved:
CACHE_END_MARKER is 0 on arm64 and 1 on x86_64,
cache_fill_ratio is 7/8 on arm64 and 3/4 on x86_64,
FULL_UTILIZATION_CACHE_SIZE is 8,
CACHE_ALLOW_FULL_UTILIZATION is 1 on arm64
else if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity))) {
// Cache is less than 3/4 or 7/8 full. Use it as-is.
}
#if CACHE_ALLOW_FULL_UTILIZATION
else if (capacity <= FULL_UTILIZATION_CACHE_SIZE && newOccupied + CACHE_END_MARKER <= capacity) {
// Allow 100% cache utilization for small buckets. Use it as-is.
}
#endif
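This leads to the following conclusions:
- On arm64, a cache whose capacity is at most 8 does not grow until it is completely full. With a capacity above 8, it does not grow as long as the entry count after the insert is at most 7/8 of the capacity.
- On x86_64, the cache does not grow as long as the entry count after the insert, plus 1 for the end marker, is at most 3/4 of the capacity.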
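The two conclusions can be expressed as one predicate. The sketch below is an illustration under the constants listed above. Policy, kArm64, kX86_64, and fits_without_growing are hypothetical names, and the empty-cache branch of insert() is deliberately left out.

```cpp
#include <cstdint>

// Per-architecture constants taken from the text above.
struct Policy {
    std::uint32_t endMarker;  // CACHE_END_MARKER
    std::uint32_t num;        // fill ratio numerator
    std::uint32_t den;        // fill ratio denominator
    bool allowFull;           // CACHE_ALLOW_FULL_UTILIZATION
};

constexpr Policy kArm64{0, 7, 8, true};
constexpr Policy kX86_64{1, 3, 4, false};
constexpr std::uint32_t kFullUtilizationSize = 8;  // FULL_UTILIZATION_CACHE_SIZE

// True when one more entry fits into a non-empty cache without growing it.
constexpr bool fits_without_growing(std::uint32_t occupied,
                                    std::uint32_t capacity,
                                    Policy p) {
    return (occupied + 1 + p.endMarker <= capacity * p.num / p.den) ||
           (p.allowFull && capacity <= kFullUtilizationSize &&
            occupied + 1 + p.endMarker <= capacity);
}
```

For example, on arm64 a cache of capacity 8 holding 7 entries still accepts an eighth, while a cache of capacity 16 holding 14 entries does not accept a fifteenth.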
Growing the bucket_t list
else {
capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
if (capacity > MAX_CACHE_SIZE) {
capacity = MAX_CACHE_SIZE;
}
reallocate(oldCapacity, capacity, true);
}
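- When capacity is 0, it is initialized to INIT_CACHE_SIZE.
- Otherwise the capacity doubles, capped at MAX_CACHE_SIZE.
- freeOld is true, so growing releases the old bucket_t list, which means the previously cached methods are discarded.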
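The capacity update in the else branch reduces to a small pure function. In this sketch next_capacity is a hypothetical name, and kInitCacheSize uses the arm64 value of 2 discussed earlier; on x86_64 the initial size would be 4.

```cpp
#include <cstdint>

constexpr std::uint32_t kInitCacheSize = 2;         // INIT_CACHE_SIZE on arm64
constexpr std::uint32_t kMaxCacheSize  = 1u << 16;  // MAX_CACHE_SIZE

// Doubles the capacity, starting from the initial size and never exceeding the maximum.
constexpr std::uint32_t next_capacity(std::uint32_t capacity) {
    return capacity == 0
        ? kInitCacheSize
        : (capacity * 2 > kMaxCacheSize ? kMaxCacheSize : capacity * 2);
}
```

Starting from 0 this yields the sequence 2, 4, 8, 16, and so on, until it saturates at 65536.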
Storing into the bucket_t list
bucket_t *b = buckets();
mask_t m = capacity - 1;
mask_t begin = cache_hash(sel, m);
mask_t i = begin;
// Scan for the first unused slot and insert there.
// There is guaranteed to be an empty slot.
do {
if (fastpath(b[i].sel() == 0)) {
incrementOccupied();
b[i].set<Atomic, Encoded>(b, sel, imp, cls());
return;
}
if (b[i].sel() == sel) {
// The entry was added to the cache by some other thread
// before we grabbed the cacheUpdateLock.
return;
}
} while (fastpath((i = cache_next(i, m)) != begin));
static inline mask_t cache_hash(SEL sel, mask_t mask)
{
uintptr_t value = (uintptr_t)sel;
#if CONFIG_USE_PREOPT_CACHES
value ^= value >> 7;
#endif
return (mask_t)(value & mask);
}
#if CACHE_END_MARKER
// x86_64
static inline mask_t cache_next(mask_t i, mask_t mask) {
return (i+1) & mask;
}
#elif __arm64__
static inline mask_t cache_next(mask_t i, mask_t mask) {
return i ? i-1 : mask;
}
#endif
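- First get the bucket_t list and store it in b.
- m is the length of the bucket_t list (after any growth) minus 1.
- cache_hash computes the slot for sel in the bucket_t list. The value derived from sel is a large number, and masking it with mask keeps the resulting index no greater than mask, i.e. capacity - 1.
- If b[i].sel() == 0, the slot is empty and the entry is inserted: incrementOccupied() is essentially _occupied++, and set then stores the method.
- If b[i].sel() == sel, the method is already cached, so return immediately.
- Otherwise there is a hash collision, which cache_next resolves. On x86_64 it returns the next index, wrapping around at the end; on arm64 it returns the previous index, or the last index when the current index is 0.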
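The probing loop described in the list above can be tried out on a toy table. This is a sketch under simplifying assumptions: selectors are plain integers, toy_hash omits the value >> 7 mixing step, toy_next follows the arm64 (decrementing) variant, and ToyBucket, toy_hash, toy_next, and toy_insert are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using mask_t = std::uint32_t;

// Hypothetical stand-in for bucket_t: a zero sel marks an empty slot.
struct ToyBucket {
    std::uintptr_t sel;
    std::uintptr_t imp;
};

// Masking keeps the index in [0, mask].
inline mask_t toy_hash(std::uintptr_t sel, mask_t mask) {
    return static_cast<mask_t>(sel & mask);
}

// arm64-style probing: step backwards, wrapping from 0 to the last index.
inline mask_t toy_next(mask_t i, mask_t mask) {
    return i ? i - 1 : mask;
}

// Returns the slot where sel was stored or already found, or -1 if every slot is taken.
inline int toy_insert(std::vector<ToyBucket> &b, std::uintptr_t sel, std::uintptr_t imp) {
    assert(!b.empty() && (b.size() & (b.size() - 1)) == 0);  // power-of-two size
    const mask_t m = static_cast<mask_t>(b.size() - 1);
    const mask_t begin = toy_hash(sel, m);
    mask_t i = begin;
    do {
        if (b[i].sel == 0) {           // empty slot: store the entry here
            b[i].sel = sel;
            b[i].imp = imp;
            return static_cast<int>(i);
        }
        if (b[i].sel == sel) {         // already cached
            return static_cast<int>(i);
        }
    } while ((i = toy_next(i, m)) != begin);
    return -1;                         // scanned the whole table without finding room
}
```

With a four-slot table, selectors 0x1001, 0x2005, and 0x3009 all hash to slot 1, so they end up in slots 1, 0, and 3 in that order as the probe steps backwards and wraps around.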
Storing the data
The set method writes one bucket_t into the bucket_t list. Taking arm64 as the example:
#if __arm64__
template<Atomicity atomicity, IMPEncoding impEncoding>
void bucket_t::set(bucket_t *base, SEL newSel, IMP newImp, Class cls)
{
ASSERT(_sel.load(memory_order_relaxed) == 0 ||
_sel.load(memory_order_relaxed) == newSel);
static_assert(offsetof(bucket_t,_imp) == 0 &&
offsetof(bucket_t,_sel) == sizeof(void *),
"bucket_t layout doesn't match arm64 bucket_t::set()");
uintptr_t encodedImp = (impEncoding == Encoded
? encodeImp(base, newImp, newSel, cls)
: (uintptr_t)newImp);
// LDP/STP guarantees that all observers get
// either imp/sel or newImp/newSel
stp(encodedImp, (uintptr_t)newSel, this);
}
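This is a template function. encodeImp encodes the IMP before it is stored (for example by signing it where pointer authentication is in use), and encodedImp holds the resulting value.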
Verifying with lldb
Taking arm64 as the example, add a print statement to insert():
void cache_t::insert(SEL sel, IMP imp, id receiver)
{
printf("%s", sel_getName(sel));
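Before method1 runs, _maybeMask and _occupied are both 0.
After method1 runs, _occupied is 1 while _maybeMask still reads 0. On arm64 the mask is packed into the high bits of _bucketsAndMaybeMask (the CACHE_MASK_STORAGE_HIGH_16 variants) rather than stored in _maybeMask, so read it through mask() instead.
Running method3 triggers a growth, and the cache was completely full before it grew.
By the reasoning above, the cache is full again after method6, and running method7 triggers another growth.
After method14 the cache is full once more, and running method15 grows it again.
The capacity is now above 8, so the next growth happens when the entry count would exceed 7/8 of the capacity, that is, when the 15th method is inserted into this table, which is method29.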
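The growth points observed above can be reproduced with a small simulation of the arm64 policy. This is a sketch under stated assumptions: every insert is a new selector, nothing else lands in the cache, and growth_points is a hypothetical helper, not a runtime function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the 1-based insert numbers at which an existing bucket list is replaced
// by a larger one (the very first allocation is not counted).
inline std::vector<int> growth_points(int inserts) {
    assert(inserts >= 0);
    const std::uint32_t kInit = 2;         // INIT_CACHE_SIZE on arm64
    const std::uint32_t kFull = 8;         // FULL_UTILIZATION_CACHE_SIZE
    const std::uint32_t kMax  = 1u << 16;  // MAX_CACHE_SIZE
    std::uint32_t capacity = 0, occupied = 0;
    std::vector<int> points;
    for (int n = 1; n <= inserts; ++n) {
        const std::uint32_t newOccupied = occupied + 1;
        if (capacity == 0) {
            capacity = kInit;              // first allocation
            occupied = 0;
        } else if (newOccupied <= capacity * 7 / 8) {
            // below the 7/8 fill ratio: use as-is
        } else if (capacity <= kFull && newOccupied <= capacity) {
            // small cache: allowed to fill completely
        } else {
            capacity = capacity * 2 > kMax ? kMax : capacity * 2;
            occupied = 0;                  // old entries are discarded
            points.push_back(n);
        }
        ++occupied;                        // the new entry is stored
    }
    return points;
}
```

Calling growth_points(29) returns 3, 7, 15, and 29, matching the methods at which the lldb session shows the cache growing.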
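Questions:
- Inserting into the bucket_t list is simply inserting into a hash table. Why does Apple use a hash table for the method cache? It trades space for time: a hash table lookup is O(1) on average.
- When the cache grows, why is the old bucket_t list destroyed, so that the previously cached methods disappear?
  - A hash table maps keys to slots, and the slot index depends on the capacity. Whenever the table grows, the total capacity changes, so the hashed indices change and every existing mapping becomes invalid.
  - Re-inserting all previously cached methods would cost too much.
  - The capacity grows exponentially. Clearing the old entries promptly leaves room to cache more methods and reduces the number of growths, which improves efficiency.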
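To see why old slot indices cannot simply be reused after growth, consider how the index depends on the capacity. In this sketch slot_for is a hypothetical helper that applies only the masking step of cache_hash, and the selector values are made up for illustration.

```cpp
#include <cstdint>

// Index of a selector in a table of the given power-of-two capacity.
constexpr std::uint32_t slot_for(std::uintptr_t sel, std::uint32_t capacity) {
    return static_cast<std::uint32_t>(sel & (capacity - 1));
}

// With capacity 4 both selectors map to slot 2, so one of them would be probed elsewhere.
static_assert(slot_for(0x1006, 4) == 2, "capacity 4");
static_assert(slot_for(0x1002, 4) == 2, "capacity 4");

// With capacity 8 the first selector moves to slot 6 while the second stays at slot 2.
static_assert(slot_for(0x1006, 8) == 6, "capacity 8");
static_assert(slot_for(0x1002, 8) == 2, "capacity 8");
```

Because the mask gains one bit with each doubling, an entry's home slot can change, so keeping old entries would require rehashing each one rather than copying the list as-is.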