006 - cache_t Analysis

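What this article covers

What is cache_t?
Partial source analysis of cache_t
Analysis of the key function insert
Why are the oldBuckets discarded instead of growing the space and appending the new entries?
reallocate analysis
cache_fill_ratio analysis
setBucketsAndMask analysis
Verifying the cache_t structure with LLDB (there is one open question here; pointers from experts are welcome)
Mirroring the runtime source and printing the buckets cached in cache_t with NSLog
cache_t access flow chart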

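What is cache_t?

When a method is called on a class, the runtime uses the SEL (method selector) to look up the IMP (pointer to the implementation). To make dispatch faster and avoid walking the method lists on every call, each class carries a cache_t structure. cache_t stores the SEL and IMP of methods that have already been called as bucket_t entries inside the class, so later lookups of the same method can be served from the cache. (insert also takes a receiver argument, but in the source below it is only passed to bad_cache; it is not stored in the bucket.)

Rough diagram: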

cache_t流程.jpeg

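Partial source analysis of cache_t

struct cache_t

_bucketsAndMaybeMask: a single word whose bits hold the buckets pointer and, on some architectures, the mask as well, similar to how isa packs different data into different bits.
_maybeMask: the mask of the current cache, i.e. capacity - 1. After the first allocation it is 3.
_occupied: the number of buckets currently in use, 0 by default.
incrementOccupied(): executes _occupied++. It runs every time a method is inserted, so the occupied count goes up by 1.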

struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask; // 8
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask; // 4
#if __LP64__
            uint16_t                   _flags;  // 2
#endif
            uint16_t                   _occupied; // 2
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache; // 8
    };

    // true while nothing has been cached yet (the first-insert check)
    bool isConstantEmptyCache() const;
    bool canBeFreed() const;
    mask_t mask() const;
    // _occupied++
    void incrementOccupied();
    // store the buckets pointer and the mask
    void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
    // allocate new buckets
    void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
    // hand oldBuckets over to the garbage list
    void collect_free(bucket_t *oldBuckets, mask_t oldCapacity);
public:
    unsigned capacity() const;
    // pointer to the current buckets
    struct bucket_t *buckets() const;
    Class cls() const;
    // current value of _occupied (0 by default)
    mask_t occupied() const;
    // insert a called method into the cache
    void insert(SEL sel, IMP imp, id receiver);
};
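The idea that one word can carry both the buckets pointer and the mask can be sketched as follows. This is only an illustration under an assumed layout (mask in the top 16 bits, pointer in the low 48 bits); kMaskShift, kBucketsMask, packBucketsAndMask, unpackBuckets and unpackMask are hypothetical names, not runtime API. In the x86_64 branch of setBucketsAndMask quoted later in this article, the pointer is stored alone and the mask goes into _maybeMask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: mask in the top 16 bits, buckets address in the low 48.
constexpr unsigned kMaskShift = 48;
constexpr std::uint64_t kBucketsMask = (std::uint64_t{1} << kMaskShift) - 1;

// Pack a buckets address and a mask into one 64-bit word.
inline std::uint64_t packBucketsAndMask(std::uint64_t bucketsAddr, std::uint16_t mask) {
    assert((bucketsAddr & ~kBucketsMask) == 0); // address must fit in 48 bits
    return (std::uint64_t{mask} << kMaskShift) | bucketsAddr;
}

// Recover the buckets address by masking off the top bits.
inline std::uint64_t unpackBuckets(std::uint64_t word) {
    return word & kBucketsMask;
}

// Recover the mask by shifting the top bits down.
inline std::uint16_t unpackMask(std::uint64_t word) {
    return static_cast<std::uint16_t>(word >> kMaskShift);
}
```

With such a layout a reader gets both values from a single load, which is one plausible reason to pack them together.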

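Key function insert:

First, newOccupied is initialized to occupied() + 1, i.e. the occupied count the cache would have after this insert.
isConstantEmptyCache checks whether this is the first method to be cached. If it is, space is allocated with INIT_CACHE_SIZE = (1 << INIT_CACHE_SIZE_LOG2). With INIT_CACHE_SIZE_LOG2 = 2 this is 1 << 2 = 4, so the initial capacity is 4 buckets.
If it is not the first time, cache_fill_ratio(capacity) is used to check whether the cache would go past 3/4 of its capacity. If newOccupied + CACHE_END_MARKER is still within that limit, nothing is done and execution continues.
If there is not enough room, the else branch runs capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE; which doubles the capacity, here to 8. After reallocate, however, the mask recorded in _maybeMask is 7 (capacity - 1).
Both the first-fill branch and the growth branch call reallocate, which calls setBucketsAndMask(newBuckets, newCapacity - 1). The stored mask is therefore newCapacity - 1: 3 the first time, 7 the second time, and so on.
Inside setBucketsAndMask there is the line _bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_release);. This is where the buckets pointer ends up in the _bucketsAndMaybeMask member of cache_t, which explains the name of that member.
Once the space is ready, insert fetches the buckets, computes the mask m = capacity - 1, hashes the selector with cache_hash(sel, m) to get begin, and uses a do-while loop to find the first unused slot and insert the method there.
When a free slot is found, incrementOccupied runs (_occupied + 1) and the bucket is written. At that point the method is cached.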

void cache_t::insert(SEL sel, IMP imp, id receiver)
{
    runtimeLock.assertLocked();

    
    // Use the cache as-is if until we exceed our expected fill ratio.
    // occupied count after this insert, e.g. 1 + 1
    mask_t newOccupied = occupied() + 1;
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // Cache is read-only. Replace it.
        if (!capacity) capacity = INIT_CACHE_SIZE;//4
        reallocate(oldCapacity, capacity, /* freeOld */false);
    }
    else if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity))) {
        // Cache is less than 3/4 or 7/8 full. Use it as-is.
    }
#if CACHE_ALLOW_FULL_UTILIZATION
    else if (capacity <= FULL_UTILIZATION_CACHE_SIZE && newOccupied + CACHE_END_MARKER <= capacity) {
        // Allow 100% cache utilization for small buckets. Use it as-is.
    }
#endif
    else {// 4*2 = 8
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true);
    }

    bucket_t *b = buckets();
    mask_t m = capacity - 1; // 4-1=3
    mask_t begin = cache_hash(sel, m);
    mask_t i = begin;

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot.
    do {
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(b, sel, imp, cls());
            return;
        }
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));

    bad_cache(receiver, (SEL)sel);
}
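The slot search at the end of insert is open addressing with linear probing. The sketch below shows that technique on a plain array. It is a simplified illustration, not the runtime's code: sketchHash and sketchNext are hypothetical stand-ins for cache_hash and cache_next (which are not quoted in this article and differ between architectures), and Slot is a stand-in for bucket_t.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Slot {
    std::uintptr_t sel = 0; // 0 means the slot is empty
    std::uintptr_t imp = 0;
};

// Hypothetical stand-in for cache_hash: keep the low bits of the selector.
inline std::uint32_t sketchHash(std::uintptr_t sel, std::uint32_t mask) {
    return static_cast<std::uint32_t>(sel) & mask;
}

// Hypothetical stand-in for cache_next: step forward and wrap around.
inline std::uint32_t sketchNext(std::uint32_t i, std::uint32_t mask) {
    return (i + 1) & mask;
}

// Returns the index that holds sel afterwards, or -1 if no slot is free.
inline int probeInsert(std::vector<Slot>& table, std::uintptr_t sel, std::uintptr_t imp) {
    assert(!table.empty() && (table.size() & (table.size() - 1)) == 0); // power of two
    const std::uint32_t mask = static_cast<std::uint32_t>(table.size() - 1);
    const std::uint32_t begin = sketchHash(sel, mask);
    std::uint32_t i = begin;
    do {
        if (table[i].sel == 0) {   // first unused slot: store here
            table[i].sel = sel;
            table[i].imp = imp;
            return static_cast<int>(i);
        }
        if (table[i].sel == sel) { // already cached, nothing to do
            return static_cast<int>(i);
        }
    } while ((i = sketchNext(i, mask)) != begin);
    return -1;                     // walked the whole table without success
}
```

In the real insert the fill-ratio check runs before the probe, so the loop is expected to find a free slot; the -1 case here corresponds to the bad_cache call.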

cache_fill_ratio

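This is the fill-level check. The formula is capacity * 3 / 4: the cache is used as-is until it would be more than 3/4 full, a load factor that is common in hash-table implementations. (The comment in insert mentions "3/4 or 7/8", so the ratio can differ between architectures.)

// Historical fill ratio of 75% (since the new objc runtime was introduced).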
static inline mask_t cache_fill_ratio(mask_t capacity) {
    return capacity * 3 / 4;
}
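As a worked example of this arithmetic, the sketch below computes the threshold for a few capacities and how many methods fit before the next insert triggers growth. fillRatio and maxBeforeGrowth are hypothetical helper names; the endMarker parameter models CACHE_END_MARKER, whose value is assumed to be 1 when a slot is reserved for the end marker and 0 otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as cache_fill_ratio above: integer 3/4 of the capacity.
constexpr std::uint32_t fillRatio(std::uint32_t capacity) {
    return capacity * 3 / 4;
}

// Largest occupied count that still passes
//   newOccupied + endMarker <= fillRatio(capacity)
// endMarker is 1 if one slot is reserved for the end marker, else 0 (assumption).
constexpr std::uint32_t maxBeforeGrowth(std::uint32_t capacity, std::uint32_t endMarker) {
    return fillRatio(capacity) - endMarker;
}

static_assert(fillRatio(4) == 3, "4 * 3 / 4");
static_assert(fillRatio(8) == 6, "8 * 3 / 4");
static_assert(maxBeforeGrowth(4, 1) == 2, "the third insert into capacity 4 grows");
static_assert(maxBeforeGrowth(8, 1) == 5, "the sixth insert into capacity 8 grows");
```

So with an end marker, a capacity of 4 holds two methods and the third one causes the cache to grow to 8.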

reallocate

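Allocates new space. The bool freeOld argument tells whether this is a growth: false for the first allocation, true for a growth. In the growth case the cache_t already holds the methods cached before the growth, and once the new buckets are installed that old buffer is stale memory. collect_free(oldBuckets, oldCapacity); is then called to hand the old buckets over for reclamation.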

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    ASSERT(newCapacity > 0);
    ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        collect_free(oldBuckets, oldCapacity);
    }
}

allocateBuckets

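Creates the new buckets. The key point is the endMarker: whether this is the first allocation or a growth, the last slot of the new buckets is always reserved for a marker whose sel is 1 and whose imp is the address of newBuckets (newBuckets - 1 on __arm__). After the first allocation that is slot 4, after one growth it is slot 8, and the capacity follows 1 << (2 + n). The mask stored by setBucketsAndMask(newBuckets, newCapacity - 1); is newCapacity - 1, and the marker is not counted in _occupied; insert accounts for it with CACHE_END_MARKER in the fill check.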

bucket_t *cache_t::allocateBuckets(mask_t newCapacity)
{
    // Allocate one extra bucket to mark the end of the list.
    // This can't overflow mask_t because newCapacity is a power of 2.
    bucket_t *newBuckets = (bucket_t *)calloc(bytesForCapacity(newCapacity), 1);

    bucket_t *end = endMarker(newBuckets, newCapacity);

#if __arm__
    // End marker's sel is 1 and imp points BEFORE the first bucket.
    // This saves an instruction in objc_msgSend.
    end->set<NotAtomic, Raw>(newBuckets, (SEL)(uintptr_t)1, (IMP)(newBuckets - 1), nil);
#else
    // End marker's sel is 1 and imp points to the first bucket.
    end->set<NotAtomic, Raw>(newBuckets, (SEL)(uintptr_t)1, (IMP)newBuckets, nil);
#endif
    
    if (PrintCaches) recordNewCache(newCapacity);

    return newBuckets;
}
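The end-marker layout can be sketched on a plain array as follows. This is a minimal illustration under the assumption that the marker sits in the last of the capacity slots (endMarker and bytesForCapacity are not quoted in this article); SketchBucket and allocateSketchBuckets are hypothetical names, and only the non-__arm__ variant is shown.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

struct SketchBucket {
    std::uintptr_t sel;
    std::uintptr_t imp;
};

// Returns a zero-filled table of `capacity` buckets whose last slot is an
// end marker: sel == 1, imp == address of the first bucket.
// The caller releases the table with std::free.
inline SketchBucket* allocateSketchBuckets(std::uint32_t capacity) {
    assert(capacity >= 2 && (capacity & (capacity - 1)) == 0); // power of two
    auto* buckets = static_cast<SketchBucket*>(
        std::calloc(capacity, sizeof(SketchBucket)));
    if (buckets == nullptr) return nullptr;

    SketchBucket* end = buckets + (capacity - 1); // last slot
    end->sel = 1;
    end->imp = reinterpret_cast<std::uintptr_t>(buckets);
    return buckets;
}
```

Because the marker's sel is non-zero, a probe that lands on it treats the slot as taken and moves on, which is why the fill check has to leave room for it.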

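Supplement

Why are the oldBuckets discarded instead of growing the space and appending the new entries?

Answer: an allocated block cannot simply be extended in place, so the growth here is not a real extension. A new block is created and replaces the old one. There are two reasons for doing it this way. First, moving every cached entry from the old buckets into the new ones (each entry would have to be rehashed against the new mask) costs time and memory. Second, the cache favors recent calls: every growth drops the previous oldBuckets. For example, if method A was called once and never again, there is little value in keeping it cached. If A is called again after the growth, it is simply cached again and stays there until the next growth. The comment in reallocate says the same thing: the old contents are not propagated, which is thought to save cache memory at the cost of extra cache fills.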

setBucketsAndMask

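setBucketsAndMask does three things:

First, it stores the newly created buckets in _bucketsAndMaybeMask.
Second, it stores newMask, i.e. capacity - 1, in _maybeMask.
Third, it sets _occupied = 0, because no method has been cached in the new buckets yet.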

void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
#ifdef __arm__
    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();

    _bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_relaxed);

    // ensure other threads see new buckets before new mask
    mega_barrier();

    _maybeMask.store(newMask, memory_order_relaxed);
    _occupied = 0;
#elif __x86_64__ || i386
    // ensure other threads see buckets contents before buckets pointer
    _bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_release);

    // ensure other threads see new buckets before new mask
    _maybeMask.store(newMask, memory_order_release);
    _occupied = 0;
#else
#error Don't know how to do setBucketsAndMask on this architecture.
#endif
}

collect_free

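The garbage-collection helper. It records the old buckets in garbage_refs, adds their size to garbage_byte_size, and calls collectNolock so that the memory can be reclaimed.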

void cache_t::collect_free(bucket_t *data, mask_t capacity)
{
#if CONFIG_USE_CACHE_LOCK
    cacheUpdateLock.assertLocked();
#else
    runtimeLock.assertLocked();
#endif

    if (PrintCaches) recordDeadCache(capacity);

    _garbage_make_room ();
    garbage_byte_size += cache_t::bytesForCapacity(capacity);
    garbage_refs[garbage_count++] = data;
    cache_t::collectNolock(false);
}
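The pattern of queuing a retired buffer and freeing it later can be sketched as follows. This is a hypothetical illustration, not the runtime's data structure: GarbageList and its members are invented names, and the condition under which the real collectNolock actually frees memory is not shown in this article, so it is modelled here as a plain noReadersActive flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical deferred-free list for retired malloc'ed blocks.
class GarbageList {
public:
    // Queue a retired block instead of freeing it right away.
    void collect(void* block, std::size_t bytes) {
        assert(block != nullptr);
        refs_.push_back(block);
        bytes_ += bytes;
    }

    // Free everything queued, but only when the caller knows that no reader
    // can still be looking at the old blocks. Returns the number freed.
    std::size_t flush(bool noReadersActive) {
        if (!noReadersActive) return 0;
        const std::size_t freed = refs_.size();
        for (void* p : refs_) std::free(p);
        refs_.clear();
        bytes_ = 0;
        return freed;
    }

    std::size_t pendingBytes() const { return bytes_; }
    std::size_t pendingCount() const { return refs_.size(); }

private:
    std::vector<void*> refs_;
    std::size_t bytes_ = 0;
};
```

Deferring the free matters when readers run without locks: a reader that picked up the old pointer just before the swap must not see it released underneath it.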

void bucket_t::set

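Stores sel and imp in the bucket. The imp is written first. The condition _sel.load(memory_order_relaxed) != newSel checks whether the slot already holds newSel: if it does, the sel store is skipped; if not, sel is stored after the imp.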

template<Atomicity atomicity, IMPEncoding impEncoding>
void bucket_t::set(bucket_t *base, SEL newSel, IMP newImp, Class cls)
{
    ASSERT(_sel.load(memory_order_relaxed) == 0 ||
           _sel.load(memory_order_relaxed) == newSel);

    // objc_msgSend uses sel and imp with no locks.
    // It is safe for objc_msgSend to see new imp but NULL sel
    // (It will get a cache miss but not dispatch to the wrong place.)
    // It is unsafe for objc_msgSend to see old imp and new sel.
    // Therefore we write new imp, wait a lot, then write new sel.
    
    uintptr_t newIMP = (impEncoding == Encoded
                        ? encodeImp(base, newImp, newSel, cls)
                        : (uintptr_t)newImp);

    if (atomicity == Atomic) {
        _imp.store(newIMP, memory_order_relaxed);
        
        if (_sel.load(memory_order_relaxed) != newSel) {
#ifdef __arm__
            mega_barrier();
            _sel.store(newSel, memory_order_relaxed);
#elif __x86_64__ || __i386__
            _sel.store(newSel, memory_order_release);
#else
#error Don't know how to do bucket_t::set on this architecture.
#endif
        }
    } else {
        _imp.store(newIMP, memory_order_relaxed);
        _sel.store(newSel, memory_order_relaxed);
    }
}
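The "write imp first, then sel" rule from the comment in bucket_t::set can be sketched with standard C++ atomics. This is a simplified illustration of the ordering idea using a release store and an acquire load; AtomicBucket is a hypothetical type, there is no IMP encoding, and the reader here is plain C++, not what objc_msgSend does in assembly.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct AtomicBucket {
    std::atomic<std::uintptr_t> sel{0}; // 0 means empty
    std::atomic<std::uintptr_t> imp{0};

    // Writer: store imp first, then publish sel with release ordering,
    // so a reader that sees the new sel also sees the new imp.
    void set(std::uintptr_t newSel, std::uintptr_t newImp) {
        assert(newSel != 0);
        imp.store(newImp, std::memory_order_relaxed);
        sel.store(newSel, std::memory_order_release);
    }

    // Reader: returns the imp on a hit, 0 on a miss.
    std::uintptr_t lookup(std::uintptr_t wanted) const {
        if (sel.load(std::memory_order_acquire) != wanted) return 0;
        return imp.load(std::memory_order_relaxed);
    }
};
```

Seeing the new imp with an empty sel only causes a cache miss, while seeing the new sel with an old imp would dispatch to the wrong place, which is the asymmetry the source comment describes.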

Verifying the cache_t structure with LLDB

Sample code

FFPerson

@interface FFPerson : NSObject

- (void)likeGirls;
- (void)likeFoods;
- (void)enjoyLife;

@end

@implementation FFPerson

- (void)likeGirls {
    NSLog(@"%s",__func__);
}
- (void)likeFoods{
    NSLog(@"%s",__func__);
}
- (void)enjoyLife{
    NSLog(@"%s",__func__);
}

@end

main

int main(int argc, const char * argv[]) {
    @autoreleasepool {

        FFPerson *p  = [FFPerson alloc];
        Class pClass = [FFPerson class];
        NSLog(@"%@",pClass);
        
//        lgKindofDemo();

    }
    return 0;
}

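LLDB verification

Steps

p/x pClass prints the class object in hex and gives its address.

Offset the start address of the class object by 16 bytes (8 bytes for isa, 8 bytes for superclass), i.e. class address + 0x10, to get a pointer to the cache_t.

Dereference it to see the cache_t itself: p *<pointer>.

Look at _maybeMask and _occupied in the printed cache_t. _maybeMask is the mask of the cache (capacity - 1) and _occupied is the number of methods actually cached. No method has been called yet, so both are 0.

Call methods from lldb.

Print the cache_t of the class object again. _maybeMask and _occupied now have values. When the methods are called from lldb, _maybeMask is 7. I tested this: even a single method called from lldb gives 7, and I do not know why. When one method is called on the object from code, _maybeMask is 3, which is what I expected. As for the 7, advice from experts is welcome.

p $n.buckets()[0] through p $n.buckets()[6] print the 7 bucket_t structs in the cache one by one. p $n.sel() and p $n.imp(nil,pClass) then print the sel and the imp, which shows whether an entry is a system method or a custom one.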
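The 0x10 offset used in the steps above can be checked with a small layout sketch. It assumes a 64-bit target; ClassLayoutSketch and cacheAddress are hypothetical names that only mirror the field order described in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

static_assert(sizeof(void*) == 8, "this sketch assumes a 64-bit target");

// Mirror of the first fields of a class object (assumed layout).
struct ClassLayoutSketch {
    void*         isa;                 // 8 bytes
    void*         superclass;          // 8 bytes
    std::uint64_t bucketsAndMaybeMask; // cache_t starts here
    std::uint32_t maybeMask;
    std::uint16_t flags;
    std::uint16_t occupied;
};

// Address of the cache, given the address of the class object.
inline std::uintptr_t cacheAddress(std::uintptr_t classAddress) {
    return classAddress + offsetof(ClassLayoutSketch, bucketsAndMaybeMask);
}
```

For a class object at 0x100008000, for instance, this gives 0x100008010, which is the address to cast to cache_t * in lldb.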

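I ran the LLDB session three times:

Run 1: no method called in code; 3 custom methods called with lldb commands during the session. Result: 3 custom methods and 2 system methods cached, with 7 cache slots (_maybeMask = 7). As expected. lldb验证cache_t.png

Run 2: no method called in code; 1 custom method called with lldb commands during the session. Result: 1 custom method and 0 system methods cached, with 7 cache slots. Not as expected. lldb-执行一个方法.png

Run 3: 1 method called in code; nothing called from lldb during the session. Result: 1 custom method and 0 system methods cached, with 3 cache slots. As expected.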

代码调用-执行一个方法.png

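Debugging cache_t by mirroring the source

Source code link

Build steps

Mirror struct ff_objc_class after the runtime source.

ff_objc_class needs cache_t and class_data_bits_t, so mirror those as well, as ff_cache_t and ff_class_data_bits_t.

ff_cache_t needs the mask_t type, so add typedef uint32_t mask_t.

According to the source, sel and imp live in struct bucket_t, so mirror that as ff_bucket_t. The mirror structs are now complete.

Call alloc on FFPerson to allocate the object.

Declare a pointer of the mirror type, struct ff_objc_class *pClass, and assign the class to it.

Print the number of cached methods and the mask from cache.

Extract buckets from _bucketsAndMaybeMask.

Loop over the buckets and print each cached sel and imp.

Partial source: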

typedef uint32_t mask_t;  // x86_64 & arm64 asm are less efficient with 16-bits

// bucketsMask: mask used to extract buckets from _bucketsAndMaybeMask.
// All bits are set, which assumes the word holds only the pointer.
static uintptr_t bucketsMask = ~0ul;

// mirror of bucket_t
struct ff_bucket_t {
   SEL _sel;
   IMP _imp;
};

// mirror of class_data_bits_t
struct ff_class_data_bits_t {
   uintptr_t bits;
};

// mirror of cache_t
struct ff_cache_t {
   uintptr_t _bucketsAndMaybeMask; // 8
   mask_t    _maybeMask; // 4
   uint16_t  _flags;  // 2
   uint16_t  _occupied; // 2
};

// mirror of objc_class
struct ff_objc_class {
   Class isa;
   Class superclass;
   struct ff_cache_t cache;             // formerly cache pointer and vtable
   struct ff_class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
};


int main(int argc, const char * argv[]) {
   @autoreleasepool {

       // allocate person
       FFPerson *person = [FFPerson alloc];
       // call the methods (FFPerson has to declare all six in this project)
       [person likeGirls];
       [person likeFoods];
       [person likeflower];
       [person likeStudy];
       [person enjoyLift];
       [person lnspireCreativity];

       // view person's class through the mirror type ff_objc_class for the steps below
       struct ff_objc_class *pClass = (__bridge struct ff_objc_class *)(person.class);

       // print the number of cached methods and the mask
       NSLog(@"%u-%u",pClass->cache._occupied,pClass->cache._maybeMask);

       // extract buckets from _bucketsAndMaybeMask (explicit integer-to-pointer cast)
       struct ff_bucket_t *bucketptr = (struct ff_bucket_t *)(pClass->cache._bucketsAndMaybeMask & bucketsMask);

       // loop over the slots and print each cached sel and imp
       for (mask_t i = 0; i<pClass->cache._maybeMask ; i++) {
           struct ff_bucket_t b = *(bucketptr + i);
           NSLog(@"%@-%p",NSStringFromSelector(b._sel),b._imp);
       }
   }
   return 0;
}

Result of the mirror-struct run:

2021-06-24 16:45:40.017677+0800 001-caceh源码还原调试[5948:581940] -[FFPerson likeGirls]
2021-06-24 16:45:40.018117+0800 001-caceh源码还原调试[5948:581940] -[FFPerson likeFoods]
2021-06-24 16:45:40.018283+0800 001-caceh源码还原调试[5948:581940] -[FFPerson likeflower]
2021-06-24 16:45:40.018323+0800 001-caceh源码还原调试[5948:581940] -[FFPerson likeStudy]
2021-06-24 16:45:40.018547+0800 001-caceh源码还原调试[5948:581940] -[FFPerson enjoyLift]
2021-06-24 16:45:40.018598+0800 001-caceh源码还原调试[5948:581940] -[FFPerson lnspireCreativity]
2021-06-24 16:45:40.018641+0800 001-caceh源码还原调试[5948:581940] 4-7
2021-06-24 16:45:40.018731+0800 001-caceh源码还原调试[5948:581940] likeStudy-0xbdd8
2021-06-24 16:45:40.018771+0800 001-caceh源码还原调试[5948:581940] (null)-0x0
2021-06-24 16:45:40.018821+0800 001-caceh源码还原调试[5948:581940] enjoyLift-0xbd88
2021-06-24 16:45:40.018941+0800 001-caceh源码还原调试[5948:581940] likeflower-0xba78
2021-06-24 16:45:40.019072+0800 001-caceh源码还原调试[5948:581940] lnspireCreativity-0xbdb8
2021-06-24 16:45:40.019112+0800 001-caceh源码还原调试[5948:581940] (null)-0x0
2021-06-24 16:45:40.019142+0800 001-caceh源码还原调试[5948:581940] (null)-0x0
Program ended with exit code: 0
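The "4-7" line in this log (4 cached methods, mask 7) after six calls can be reproduced with the growth policy alone. The sketch below is a simplified model, assuming INIT_CACHE_SIZE == 4, a 3/4 fill ratio, one slot reserved for the end marker, and that only the six instance methods are inserted. CacheState, simulateInsert and simulateCalls are hypothetical names; hashing and MAX_CACHE_SIZE are left out.

```cpp
#include <cassert>
#include <cstdint>

struct CacheState {
    std::uint32_t capacity = 0; // 0 means nothing allocated yet
    std::uint32_t occupied = 0;
    std::uint32_t mask() const { return capacity ? capacity - 1 : 0; }
};

// Apply the growth policy of insert for one selector that is not cached yet.
inline void simulateInsert(CacheState& c) {
    const std::uint32_t newOccupied = c.occupied + 1;
    if (c.capacity == 0) {
        c.capacity = 4;          // first fill
        c.occupied = 0;
    } else if (newOccupied + 1 > c.capacity * 3 / 4) {
        c.capacity *= 2;         // grow; the old entries are dropped
        c.occupied = 0;
    }
    c.occupied += 1;             // the new method takes one slot
}

// State of the cache after calling `distinctSelectors` different methods.
inline CacheState simulateCalls(int distinctSelectors) {
    assert(distinctSelectors >= 0);
    CacheState c;
    for (int i = 0; i < distinctSelectors; ++i) simulateInsert(c);
    return c;
}
```

In this model likeGirls and likeFoods fill the first cache of capacity 4, likeflower triggers the growth to 8 and is the first entry of the new cache, and the remaining three calls bring the count to 4, which matches the four selectors printed in the log.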

cache_t flow chart

cache_t结构流程图.png

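Additional notes on the code:

_bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_release)

Simply stores the pointer to the new, still empty buckets in _bucketsAndMaybeMask (in some or all of its bits, depending on the architecture).

_maybeMask.store(newMask, memory_order_release)

Stores the mask of the newly allocated cache (capacity - 1) in _maybeMask. The first time it is 3.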