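Preface
This article looks at how the class cache, cache_t, is stored in memory on the new systems (Catalina and iOS).
Everything here is guesswork within my limited ability; there is no source code to back it up.
It also assumes you already know Class's cache_t, so the explanations move fast.
"New systems" in the title means macOS Catalina and iOS 13; x86_64 also covers iOS 13 in the simulator.
I don't know assembly, so the analysis below is all guessing. Corrections are welcome.
Recap of the old cache_t structure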
struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;
};
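First, the definition of cache_t: the buckets array pointer takes 8 bytes, mask takes 4 bytes, and occupied takes 4 bytes.
Then make up a quick example: create any class and dump it with x.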
(lldb) x a.class
0x100001138: 11 11 00 00 01 80 1d 00 40 71 fb 9e ff 7f 00 00 ........@q......
0x100001148: e0 c7 7b 00 01 00 00 00 07 00 00 00 01 00 00 00 ..{.............
e0 c7 7b 00 01 00 00 00 | 07 00 00 00 | 01 00 00 00
It reads off very clearly: mask is 7 and occupied is 1. Simple, and consistent with the cache_t struct definition.
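To make the byte reading above reproducible, here is a minimal C sketch that decodes the 16 bytes at class + 16 according to the old layout. The type old_cache_view and the helpers read_le and decode_old_cache are names I made up for illustration; they are not part of the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the pre-Catalina cache_t, used only to decode a dump. */
typedef struct {
    uint64_t buckets;   /* struct bucket_t *_buckets, 8 bytes */
    uint32_t mask;      /* mask_t _mask, 4 bytes */
    uint32_t occupied;  /* mask_t _occupied, 4 bytes */
} old_cache_view;

/* Read n little-endian bytes starting at p (n must be at most 8). */
static uint64_t read_le(const uint8_t *p, size_t n) {
    uint64_t v = 0;
    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Decode the 16 bytes that lldb shows at class + 16. */
static old_cache_view decode_old_cache(const uint8_t bytes[16]) {
    old_cache_view c;
    c.buckets  = read_le(bytes, 8);
    c.mask     = (uint32_t)read_le(bytes + 8, 4);
    c.occupied = (uint32_t)read_le(bytes + 12, 4);
    return c;
}
```

Feeding it the second row of the dump (e0 c7 7b 00 01 00 00 00 07 00 00 00 01 00 00 00) gives buckets 0x1007bc7e0, mask 7 and occupied 1, the same values read by hand.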
Spotting the problem
With cache_t this clear, of course I tried it again. I created a class on Catalina too and dumped it with x.
(lldb) x p.class
0x100001250: 28 12 00 00 01 00 00 00 18 b1 9b 92 ff 7f 00 00 (...............
0x100001260: a0 3d 08 02 01 00 00 00 03 00 00 00 10 80 02 00 .=..............
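a0 3d 08 02 01 00 00 00 | 03 00 00 00 | 10 80 02 00: very clearly, mask is 3 and occupied is 0x00028010. Simple, SO EA..........
............occupied is 163856 ???? My computer must be broken, this is Apple's plot to make me buy the 16-inch!!!
Refusing to believe it, I ran the same experiment on my 6s Plus with iOS 13, dumped with x, and read the data.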
(lldb) x p1.class
0x101001540: 18 15 00 01 01 00 00 00 58 b2 5d f2 01 00 00 00 ........X.].....
0x101001550: 00 83 ba 81 02 00 03 00 00 00 00 00 10 80 02 00 ................
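After all that effort, the result: mask is 0 and occupied is 163856. Excuse me?? mask can be 0 now???? Even more baffling than macOS??
Apple's source is only open up to 10.14.5, and there is no 10.15 source. What now?
The new cache_t on x86_64
No source, so the only option is to reverse it.
Find libobjc.A.dylib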
(lldb) image list
[ 3] 17241F77-6A7A-39D7-8836-63E2725AA3C9 0x00007fff67948000 /usr/lib/libobjc.A.dylib
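Go into /usr/lib and the file is there.
Copy it out, open IDA, drag it in, and search for cache. From the resulting list, going by what I know of the old cache_t, I settled on cache_fill and pressed F5 to generate pseudocode (the other functions didn't look right either).
cache_fill
Pseudocode for cache_fill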
signed __int64 __fastcall cache_fill(signed __int64 a1, signed __int64 a2, __int64 a3, void *a4)
{
__int64 v4; // r15
signed __int64 v5; // rdx
signed __int64 result; // rax
int v7; // er14
unsigned int v8; // er12
int v9; // er14
__int64 v10; // r8
unsigned int v11; // ebx
signed __int64 v12; // rdx
signed __int64 *v13; // rcx
signed __int64 v14; // r13
unsigned int v15; // er14
__int64 v16; // rax
__int64 v17; // rax
unsigned int v18; // er14
__int64 v19; // ST00_8
__int64 v20; // rax
void *ptr; // [rsp+10h] [rbp-30h]
v4 = a3;
v5 = a1;
if ( !(*(_BYTE *)(a1 + 28) & 1) )
v5 = *(_QWORD *)a1 & 0x7FFFFFFFFFF8LL;
result = *(_QWORD *)(v5 + 32) & 0x7FFFFFFFFFF8LL;
if ( *((_BYTE *)&_objc_empty_vtable_0.magic + result + 3) & 0x20 )
{
ptr = a4;
v7 = *(unsigned __int16 *)(a1 + 30);
if ( *(_DWORD *)(a1 + 24) )
v8 = *(_DWORD *)(a1 + 24) + 1;
else
v8 = 0;
if ( (unsigned __int8)cache_t::isConstantEmptyCache((cache_t *)(a1 + 16)) )
{
v15 = 4;
if ( v8 )
v15 = v8;
v16 = *(_QWORD *)(a1 + 16);
v17 = allocateBuckets(v15);
v9 = v15 - 1;
*(_QWORD *)(a1 + 16) = v17;
*(_DWORD *)(a1 + 24) = v9;
*(_WORD *)(a1 + 30) = 0;
}
else if ( v7 + 2 > 3 * (v8 >> 2) )
{
v18 = 4;
if ( v8 )
v18 = 2 * v8;
if ( v18 >= 0x10000 )
v18 = 0x10000;
v19 = *(_QWORD *)(a1 + 16);
v20 = allocateBuckets(v18);
v9 = v18 - 1;
*(_QWORD *)(a1 + 16) = v20;
*(_DWORD *)(a1 + 24) = v9;
*(_WORD *)(a1 + 30) = 0;
cache_collect_free(v19, v8);
}
else
{
v9 = v8 - 1;
}
v10 = *(_QWORD *)(a1 + 16);
v11 = v9 & a2;
while ( 1 )
{
v12 = 16LL * v11;
v13 = (signed __int64 *)(v10 + v12);
if ( !*(_QWORD *)(v10 + v12) )
break;
result = *v13;
if ( *v13 == a2 )
return result;
v11 = v9 & (v11 + 1);
if ( v11 == (v9 & (unsigned int)a2) )
cache_t::bad_cache(ptr);
}
++*(_WORD *)(a1 + 30);
v14 = v4 ^ a1;
if ( !v4 )
v14 = 0LL;
*(_QWORD *)(v10 + v12 + 8) = v14;
result = *v13;
if ( *v13 != a2 )
*v13 = a2;
}
return result;
}
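Apple has changed the structure, but the shape of the code shouldn't have moved much. Putting this pseudocode next to the 756 source, the two really can be compared. The first allocateBuckets branch (the isConstantEmptyCache case) is the realloc method, and the second (the else-if case) is the expand method. In the old code both of these allocate new cache space and then set buckets and mask (the setBucketsAndMask method).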
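The capacity decision in those two branches can be written out as a small C sketch. This is only a transliteration of the decompiled conditions, not Apple's source; the function next_capacity and the constants INIT_CAPACITY and MAX_CAPACITY are names I invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values 4 and 0x10000 come from the pseudocode. */
static const uint32_t INIT_CAPACITY = 4;
static const uint32_t MAX_CAPACITY  = 0x10000;

/* Return the capacity cache_fill would use for the next insert. */
static uint32_t next_capacity(uint32_t mask, uint16_t occupied, int is_empty_cache) {
    /* Assumption: mask is always 2^n - 1, so capacity is mask + 1 (or 0). */
    assert((mask & (mask + 1)) == 0);
    uint32_t capacity = mask ? mask + 1 : 0;

    if (is_empty_cache) {
        /* First branch: allocate, keeping the current capacity if nonzero. */
        return capacity ? capacity : INIT_CAPACITY;
    }
    if ((uint32_t)occupied + 2 > 3 * (capacity >> 2)) {
        /* Second branch: double the capacity, capped at 0x10000. */
        uint32_t doubled = capacity ? 2 * capacity : INIT_CAPACITY;
        return doubled >= MAX_CAPACITY ? MAX_CAPACITY : doubled;
    }
    /* Otherwise there is still room and the existing buckets are kept. */
    return capacity;
}
```

For example, a cache with mask 3 keeps capacity 4 while occupied is 1, and grows to 8 once occupied reaches 2.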
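Guessing the new structure
Time to put my guessing skills to work!! Comparing against the old code, here is my guess at the stores in the second branch (the v18/v20 block; the first branch does the same three stores):
v20 = allocateBuckets(v18); // bucket_t *newBuckets = allocateBuckets(newCapacity);
v9 = v18 - 1; // mask = newCapacity - 1
*(_QWORD *)(a1 + 16) = v20; // v20 is newBuckets, a1 is the class address; newBuckets is stored at a1 + 16 (a QWORD is 8 bytes)
*(_DWORD *)(a1 + 24) = v9; // mask is stored at a1 + 24 (a DWORD is 4 bytes)
*(_WORD *)(a1 + 30) = 0; // 0 is stored at a1 + 30 (a WORD is 2 bytes)
From these stores, a guess: the 8 bytes at offset 16 are the buckets array pointer, the 4 bytes at offset 24 are mask, and the 2 bytes at offset 30 are occupied.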
Checking against the data
(lldb) x p.class
0x100001250: 28 12 00 00 01 00 00 00 18 b1 9b 92 ff 7f 00 00 (...............
0x100001260: a0 3d 08 02 01 00 00 00 03 00 00 00 10 80 02 00 .=..............
Take the data from when the problem first showed up, compare, and read it off: mask is 3, occupied is 2. Mm, that looks normal now.
a0 3d 08 02 01 00 00 00 | 03 00 00 00 | 10 80 | 02 00
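The same split can be done in code. This is a minimal C sketch under the guessed Catalina layout (8-byte buckets, 4-byte mask, 2 unknown bytes, 2-byte occupied); catalina_cache_view, read_le and decode_catalina_cache are illustrative names of my own, and the field I call flags is just the two bytes not yet explained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the guessed Catalina x86_64 cache_t. */
typedef struct {
    uint64_t buckets;   /* class + 16, 8 bytes */
    uint32_t mask;      /* class + 24, 4 bytes */
    uint16_t flags;     /* class + 28, 2 bytes, meaning still unknown */
    uint16_t occupied;  /* class + 30, 2 bytes */
} catalina_cache_view;

/* Read n little-endian bytes starting at p (n must be at most 8). */
static uint64_t read_le(const uint8_t *p, size_t n) {
    uint64_t v = 0;
    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Decode the 16 bytes that lldb shows at class + 16. */
static catalina_cache_view decode_catalina_cache(const uint8_t bytes[16]) {
    catalina_cache_view c;
    c.buckets  = read_le(bytes, 8);
    c.mask     = (uint32_t)read_le(bytes + 8, 4);
    c.flags    = (uint16_t)read_le(bytes + 12, 2);
    c.occupied = (uint16_t)read_le(bytes + 14, 2);
    return c;
}
```

On the p.class dump this yields buckets 0x102083da0, mask 3, flags 0x8010 and occupied 2.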
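The two bytes in the middle
In the reading above, what is that 0x8010 in the middle? I had no idea where to start, but... it's byte 28, so I hit Ctrl+F in the pseudocode... and there it actually was.....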
if ( !(*(_BYTE *)(a1 + 28) & 1) )
v5 = *(_QWORD *)a1 & 0x7FFFFFFFFFF8LL;
result = *(_QWORD *)(v5 + 32) & 0x7FFFFFFFFFF8LL;
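0x7FFFFFFFFFF8LL looks familiar! Isn't that ISA_MASK or FAST_DATA_MASK? One extracts isa, the other extracts rw.
Extract isa? Extract rw?? Compared with the old code, isn't this just the line if (!cls->isInitialized()) return; ...? (isInitialized() goes through getMeta(), which returns the class itself if it is a metaclass and its isa otherwise, and then tests a flag in data().)
So guessing that !(*(_BYTE *)(a1 + 28) & 1) checks whether this is a metaclass shouldn't be too much of a stretch....
0x10 & 1 = 0, false, not a metaclass? p.class really is not a metaclass!!!
How about trying it with the metaclass?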
0x0000000100001228 & 0x00007ffffffffff8ULL = 0x100001228
(lldb) x 0x100001228
0x100001228: f0 b0 9b 92 ff 7f 00 00 f0 b0 9b 92 ff 7f 00 00 ................
0x100001238: 60 3d 08 02 01 00 00 00 03 00 00 00 31 e0 01 00 `=..........1...
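0x31 & 1 = 1, true, it is a metaclass!!!
So does this spot hold part of rw's flag information? (Not verifying that here (or rather, I don't know how to).)
ISA??
Wait, the mask above that extracts isa.... it looks like the result after masking is the same as before????
Try a few more times!!!
The first 8 bytes of classes and metaclasses apparently now store a plain isa pointer directly.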
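The start of the x86_64 cache_fill pseudocode can be mimicked with a few lines of C, as a sketch of the guess above. ISA_MASK_GUESS, low_bit_set and meta_for are hypothetical names, and treating bit 0 of the byte at offset 28 as "is a metaclass" is my assumption, not something confirmed by source.

```c
#include <assert.h>
#include <stdint.h>

/* The constant seen in the pseudocode; same value as the old x86_64 ISA_MASK. */
#define ISA_MASK_GUESS 0x00007ffffffffff8ULL

/* Guess: bit 0 of the byte at class + 28 marks a metaclass. */
static int low_bit_set(uint8_t byte_at_28) {
    return (byte_at_28 & 1) != 0;
}

/* Mirrors: v5 = a1; if (!(byte28 & 1)) v5 = *(_QWORD *)a1 & mask; */
static uint64_t meta_for(uint64_t cls, uint64_t isa_raw, uint8_t byte_at_28) {
    assert(cls != 0);
    return low_bit_set(byte_at_28) ? cls : (isa_raw & ISA_MASK_GUESS);
}
```

With the values from the dumps, p.class (byte 0x10, isa 0x100001228) resolves to its metaclass at 0x100001228, and the metaclass itself (byte 0x31) resolves to itself. Masking the raw isa leaves it unchanged, which is what prompted the observation about plain isa pointers.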
Conclusion
struct cache_t {
    struct bucket_t *_buckets;
    uint32_t _mask;
    uint16_t _flags; // just calling it flags for now, the name doesn't matter
    uint16_t _occupied;
};
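As a sanity check on the guessed layout, the offsets used in the pseudocode (16, 24, 28, 30) can be verified at compile time. This is a sketch with made-up type names (guessed_cache_t, guessed_class_prefix); pointers are modeled as uint64_t so the check does not depend on the host's pointer size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the guessed Catalina cache_t. */
typedef struct {
    uint64_t buckets;   /* struct bucket_t *_buckets */
    uint32_t mask;      /* uint32_t _mask */
    uint16_t flags;     /* uint16_t _flags (name is a placeholder) */
    uint16_t occupied;  /* uint16_t _occupied */
} guessed_cache_t;

/* First fields of a class object: isa, superclass, then the cache. */
typedef struct {
    uint64_t isa;
    uint64_t superclass;
    guessed_cache_t cache;
} guessed_class_prefix;

/* Offset of a cache field measured from the start of the class. */
#define CLASS_OFFSET(field) \
    (offsetof(guessed_class_prefix, cache) + offsetof(guessed_cache_t, field))

static_assert(CLASS_OFFSET(buckets)  == 16, "buckets at a1 + 16");
static_assert(CLASS_OFFSET(mask)     == 24, "mask at a1 + 24");
static_assert(CLASS_OFFSET(flags)    == 28, "flags byte tested at a1 + 28");
static_assert(CLASS_OFFSET(occupied) == 30, "occupied at a1 + 30");
```

If the guessed field sizes were wrong, one of the static assertions would fail to compile, so this is a cheap way to keep the struct and the observed offsets in agreement.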
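The new cache_t on arm64
Find libobjc.A.dylib
This needs a jailbroken iOS 13 device, so I upgraded and jailbroke my entertainment iPad.
I confidently went to where the dylib lives, happily started to cop..... where's the file??!!!!!!! Only a symlink is left ???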
iPad:/usr/lib root# ls -al | grep objc
-rwxr-xr-x 1 root wheel 50080 Jul 14 15:37 libobjc-trampolines.dylib*
lrwxr-xr-x 1 root wheel 15 Jul 14 15:37 libobjc.dylib -> libobjc.A.dylib
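Brute-forcing the assembly
No library file, so the only option is to grind through the assembly in Xcode...
Add a symbolic breakpoint on cache_fill and hit it.
Assembly listing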
libobjc.A.dylib`cache_fill:
0x1aa7000b8 <+0>: stp x26, x25, [sp, #-0x50]!
0x1aa7000bc <+4>: stp x24, x23, [sp, #0x10]
0x1aa7000c0 <+8>: stp x22, x21, [sp, #0x20]
0x1aa7000c4 <+12>: stp x20, x19, [sp, #0x30]
0x1aa7000c8 <+16>: stp x29, x30, [sp, #0x40]
0x1aa7000cc <+20>: add x29, sp, #0x40 ; =0x40
0x1aa7000d0 <+24>: mov x22, x3
0x1aa7000d4 <+28>: mov x21, x2
0x1aa7000d8 <+32>: mov x19, x1
0x1aa7000dc <+36>: mov x20, x0
0x1aa7000e0 <+40>: ldrb w9, [x0, #0x1c]
0x1aa7000e4 <+44>: mov x8, x0
0x1aa7000e8 <+48>: tbnz w9, #0x2, 0x1aa7000f4 ; <+60>
0x1aa7000ec <+52>: ldr x8, [x20]
0x1aa7000f0 <+56>: and x8, x8, #0xffffffff8
0x1aa7000f4 <+60>: ldr x8, [x8, #0x20]
0x1aa7000f8 <+64>: and x8, x8, #0x7ffffffffff8
0x1aa7000fc <+68>: ldrb w8, [x8, #0x3]
0x1aa700100 <+72>: tbz w8, #0x5, 0x1aa7001c4 ; <+268>
0x1aa700104 <+76>: add x23, x20, #0x10 ; =0x10
0x1aa700108 <+80>: ldrh w25, [x20, #0x1e]
0x1aa70010c <+84>: ldr x8, [x20, #0x10]
0x1aa700110 <+88>: lsr x8, x8, #48
0x1aa700114 <+92>: cbnz x8, 0x1aa700120 ; <+104>
0x1aa700118 <+96>: mov w24, #0x0
0x1aa70011c <+100>: b 0x1aa70012c ; <+116>
0x1aa700120 <+104>: ldr x8, [x23]
0x1aa700124 <+108>: lsr x8, x8, #48
0x1aa700128 <+112>: add w24, w8, #0x1 ; =0x1
0x1aa70012c <+116>: mov x0, x23
-> 0x1aa700130 <+120>: bl 0x1aa70002c ; cache_t::isConstantEmptyCache()
0x1aa700134 <+124>: cbnz w0, 0x1aa7001dc ; <+292>
0x1aa700138 <+128>: lsr w8, w24, #2
0x1aa70013c <+132>: lsl w8, w8, #1
0x1aa700140 <+136>: add w8, w8, w24, lsr #2
0x1aa700144 <+140>: cmp w8, w25
0x1aa700148 <+144>: b.ls 0x1aa700208 ; <+336>
0x1aa70014c <+148>: ldr x8, [x23]
0x1aa700150 <+152>: and x8, x8, #0xfffffffffff
0x1aa700154 <+156>: sub w9, w24, #0x1 ; =0x1
0x1aa700158 <+160>: and w10, w9, w19
0x1aa70015c <+164>: mov x11, x10
0x1aa700160 <+168>: mov w12, w11
0x1aa700164 <+172>: add x13, x8, w11, uxtw #4
0x1aa700168 <+176>: add x11, x13, #0x8 ; =0x8
0x1aa70016c <+180>: ldr x13, [x13, #0x8]
0x1aa700170 <+184>: cbz x13, 0x1aa7001a4 ; <+236>
0x1aa700174 <+188>: ldr x11, [x11]
0x1aa700178 <+192>: cmp x11, x19
0x1aa70017c <+196>: b.eq 0x1aa7001c4 ; <+268>
0x1aa700180 <+200>: sub w11, w12, #0x1 ; =0x1
0x1aa700184 <+204>: cmp w12, #0x0 ; =0x0
0x1aa700188 <+208>: csel w11, w9, w11, eq
0x1aa70018c <+212>: cmp w11, w10
0x1aa700190 <+216>: b.ne 0x1aa700160 ; <+168>
0x1aa700194 <+220>: mov x0, x22
0x1aa700198 <+224>: mov x1, x19
0x1aa70019c <+228>: mov x2, x20
0x1aa7001a0 <+232>: bl 0x1aa71e098 ; cache_t::bad_cache(objc_object*, objc_selector*, objc_class*)
0x1aa7001a4 <+236>: add x8, x8, x12, lsl #4
0x1aa7001a8 <+240>: ldrh w9, [x20, #0x1e]
0x1aa7001ac <+244>: add w9, w9, #0x1 ; =0x1
0x1aa7001b0 <+248>: strh w9, [x20, #0x1e]
0x1aa7001b4 <+252>: eor x9, x21, x20
0x1aa7001b8 <+256>: cmp x21, #0x0 ; =0x0
0x1aa7001bc <+260>: csel x9, xzr, x9, eq
0x1aa7001c0 <+264>: stp x9, x19, [x8]
0x1aa7001c4 <+268>: ldp x29, x30, [sp, #0x40]
0x1aa7001c8 <+272>: ldp x20, x19, [sp, #0x30]
0x1aa7001cc <+276>: ldp x22, x21, [sp, #0x20]
0x1aa7001d0 <+280>: ldp x24, x23, [sp, #0x10]
0x1aa7001d4 <+284>: ldp x26, x25, [sp], #0x50
0x1aa7001d8 <+288>: ret
0x1aa7001dc <+292>: cmp w24, #0x0 ; =0x0
0x1aa7001e0 <+296>: orr w8, wzr, #0x4
0x1aa7001e4 <+300>: csel w24, w8, w24, eq
0x1aa7001e8 <+304>: ldr xzr, [x20, #0x10]
0x1aa7001ec <+308>: mov x0, x24
0x1aa7001f0 <+312>: bl 0x1aa6fffb8 ; allocateBuckets(unsigned int)
0x1aa7001f4 <+316>: sub w8, w24, #0x1 ; =0x1
0x1aa7001f8 <+320>: orr x8, x0, x8, lsl #48
0x1aa7001fc <+324>: str x8, [x20, #0x10]
0x1aa700200 <+328>: strh wzr, [x20, #0x1e]
0x1aa700204 <+332>: b 0x1aa70014c ; <+148>
0x1aa700208 <+336>: lsl w8, w24, #1
0x1aa70020c <+340>: cmp w24, #0x0 ; =0x0
0x1aa700210 <+344>: orr w9, wzr, #0x4
0x1aa700214 <+348>: csel w8, w9, w8, eq
0x1aa700218 <+352>: cmp w8, #0x10, lsl #12 ; =0x10000
0x1aa70021c <+356>: orr w9, wzr, #0x10000
0x1aa700220 <+360>: csel w25, w8, w9, lo
0x1aa700224 <+364>: ldr x26, [x20, #0x10]
0x1aa700228 <+368>: mov x0, x25
0x1aa70022c <+372>: bl 0x1aa6fffb8 ; allocateBuckets(unsigned int)
0x1aa700230 <+376>: sub w8, w25, #0x1 ; =0x1
0x1aa700234 <+380>: orr x8, x0, x8, lsl #48
0x1aa700238 <+384>: str x8, [x20, #0x10]
0x1aa70023c <+388>: strh wzr, [x20, #0x1e]
0x1aa700240 <+392>: and x0, x26, #0xfffffffffff
0x1aa700244 <+396>: mov x1, x24
0x1aa700248 <+400>: bl 0x1aa700254 ; cache_collect_free(bucket_t*, unsigned int)
0x1aa70024c <+404>: mov x24, x25
0x1aa700250 <+408>: b 0x1aa70014c ; <+148>
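Key information
I'll skip how I located this. For someone who doesn't know assembly, it took over an hour just to find roughly the right spot...
This is all guesswork on my part, corrections welcome.
The guesses are written inline with the code to make them easier to explain.
// First read x20; it is used below for comparison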
(lldb) register read x20
x20 = 0x0000000104acd540 (void *)0x0000000104acd518: LGPerson
(lldb) x 0x0000000104acd540
0x104acd540: 18 d5 ac 04 01 00 00 00 58 b2 5d f2 01 00 00 00 ........X.].....
0x104acd550: 90 fa 71 aa 01 00 00 00 00 00 00 00 10 80 00 00 ..q.............
// register read shows x24 is 4; it is moved into x0 as the argument passed to allocateBuckets
0x1aa7001ec <+308>: mov x0, x24
// Allocates the space; the result in x0 is 0x00000002802ab280 (address of the buckets array)
0x1aa7001f0 <+312>: bl 0x1aa6fffb8 ; allocateBuckets(unsigned int)
// w24 is 4, w8 = w24 - 1
// Guess: mask = capacity - 1
0x1aa7001f4 <+316>: sub w8, w24, #0x1 ; =0x1
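// x8 is shifted left by 48 bits, then ORed with x0 (orr is a bitwise OR), and the result goes into x8
// Result: x8 = 0x00030002802ab280
// So mask has been placed in the top 2 bytes (bits 48-63) of the buckets word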
0x1aa7001f8 <+320>: orr x8, x0, x8, lsl #48
// x8 is stored at offset 16 from the class
// Result:
// (lldb) x 0x0000000104acd540
// 0x104acd540: 18 d5 ac 04 01 00 00 00 58 b2 5d f2 01 00 00 00 ........X.].....
// 0x104acd550: 80 b2 2a 80 02 00 03 00 00 00 00 00 10 80 00 00 ..*.............
// Compared with the earlier dump at x20, buckets & mask have been stored at offset 16
0x1aa7001fc <+324>: str x8, [x20, #0x10]
// wzr (zero) is stored at offset 30
// Result: the memory at x20 is unchanged (those two bytes were already 0)
0x1aa700200 <+328>: strh wzr, [x20, #0x1e]
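The key line here is: 0x1aa7001f8 <+320>: orr x8, x0, x8, lsl #48
The 2-byte mask is placed in the top two bytes (bits 48-63) of the 8-byte buckets word, which in a little-endian memory dump show up as its last two bytes. occupied is still the 2 bytes at offset 30.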
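The effect of that orr instruction can be sketched in C. pack_mask_and_buckets is a name I made up; the only thing taken from the disassembly is the shift by 48 and the bitwise OR.

```c
#include <assert.h>
#include <stdint.h>

/* Equivalent of: orr x8, x0, x8, lsl #48
 * x0 holds the buckets address, x8 holds the mask (capacity - 1). */
static uint64_t pack_mask_and_buckets(uint64_t buckets, uint16_t mask) {
    /* Assumption: the address never uses the top 16 bits, so OR is lossless. */
    assert((buckets >> 48) == 0);
    return buckets | ((uint64_t)mask << 48);
}
```

With the values observed in the debugger (buckets 0x00000002802ab280, mask 3) this produces 0x00030002802ab280, matching the contents of x8 after the instruction.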
Checking against the data
Take the problem output from the very beginning and analyze it..
(lldb) x p1.class
0x101001540: 18 15 00 01 01 00 00 00 58 b2 5d f2 01 00 00 00 ........X.].....
0x101001550: 00 83 ba 81 02 00 03 00 00 00 00 00 10 80 02 00 ................
00 83 ba 81 02 00 | 03 00 | 00 00 00 00 10 80 | 02 00
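Hmm...... going by all that guessing above, this one also.... seems to read correctly now???
Six-byte buckets?
By the reasoning above, buckets only has 6 bytes left. How can an address fit in 6 bytes?
Looking at the assembly above, after both allocations execution jumps to 0x1aa70014c. Compared with the old code, my guess is that this is the implementation of bucket_t *bucket = cache->find(key, receiver);
This is from a different run, so the class information has changed since last time!
// Dump the class information for reference; the class is currently in x20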
(lldb) x/4gx 0x0000000100385540
0x100385540: 0x0000000100385518 0x00000001f25db258
0x100385550: 0x000700028381d080 0x0000801000000000
// Load the value that x23 points at into x8
// Dumping what x23 points at gives 0x000700028381d080, exactly the buckets part of the class
0x1aa70014c <+148>: ldr x8, [x23]
// buckets is ANDed with 0xfffffffffff!!!!!
0x1aa700150 <+152>: and x8, x8, #0xfffffffffff
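0xfffffffffff: when buckets is read, the mask keeps the low 44 bits and zeroes the rest, which is how the mask part gets separated out.
mask occupies the top 16 bits, and the 4 bits in between appear to be unused.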
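The 16 + 4 + 44 bit split can be expressed as a small C sketch. BUCKETS_MASK_GUESS, packed_cache_word and split_word are illustrative names of my own; the 44-bit constant and the 48-bit shift are the only parts taken from the disassembly, and calling the middle 4 bits unused is still a guess.

```c
#include <assert.h>
#include <stdint.h>

/* Low 44 bits, the constant from: and x8, x8, #0xfffffffffff */
#define BUCKETS_MASK_GUESS 0x00000fffffffffffULL

/* Hypothetical decoded form of the 8-byte word at class + 16 on arm64. */
typedef struct {
    uint64_t buckets;  /* bits 0-43 */
    unsigned middle;   /* bits 44-47, apparently unused */
    uint16_t mask;     /* bits 48-63, as read by: lsr x8, x8, #48 */
} packed_cache_word;

static packed_cache_word split_word(uint64_t word) {
    packed_cache_word w;
    w.buckets = word & BUCKETS_MASK_GUESS;
    w.middle  = (unsigned)((word >> 44) & 0xF);
    w.mask    = (uint16_t)(word >> 48);
    /* The three fields together cover all 64 bits exactly once. */
    assert((w.buckets | ((uint64_t)w.middle << 44) | ((uint64_t)w.mask << 48)) == word);
    return w;
}
```

For 0x000700028381d080 this gives buckets 0x28381d080, mask 7 and zero in the middle bits; the p1.class word 0x0003000281ba8300 gives buckets 0x281ba8300 and mask 3.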
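Conclusion
In cache_t, buckets and mask now share 8 bytes (a union?): mask takes 2 bytes and buckets actually uses 44 bits.
occupied is still 2 bytes at offset 30 (in bytes).
Summary
Take this with a grain of salt and don't trust it blindly, because I guessed all of it.
On macOS Catalina:
struct cache_t {
    struct bucket_t *_buckets;
    uint32_t _mask;
    uint16_t _flags; // just calling it flags for now, the name doesn't matter
    uint16_t _occupied;
};
On iOS 13:
buckets and mask together take 8 bytes, of which mask takes 2 bytes.
occupied still takes 2 bytes.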