iOS Internals: Method Lookup in _objc_msgSend


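In the article iOS底层原理之cache_t分析 we saw that once a method has been called, it is stored in cache_t, and later calls do not store it again. So how is an already cached method found in cache_t? That is the question this article explores.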

Analyzing a method call with clang

First, the code:

#import <Foundation/Foundation.h>
#import <objc/message.h>

@interface IFPerson : NSObject

-(void)canEatFood;
-(void)canRun;
@end

@implementation IFPerson

-(void)canEatFood {
    NSLog(@"canEatFood");
}
-(void)canRun {
    NSLog(@"canRun");
}

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
                
        IFPerson *person = [IFPerson alloc];

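        [person canEatFood];
        // With strict objc_msgSend checking enabled (the Xcode default),
        // objc_msgSend must be cast to the method's actual signature.
        ((void (*)(id, SEL))objc_msgSend)(person, sel_registerName("canEatFood"));
        [person canRun];
        ((void (*)(id, SEL))objc_msgSend)(person, sel_registerName("canRun"));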
        
        NSLog(@"Hello, World!");
    }
    return 0;
}

Output:

2020-09-21 16:57:18.619950+0800 001-运行时感受[76272:7079686] canEatFood
2020-09-21 16:57:18.620360+0800 001-运行时感受[76272:7079686] canEatFood
2020-09-21 16:57:18.620401+0800 001-运行时感受[76272:7079686] canRun
2020-09-21 16:57:18.620423+0800 001-运行时感受[76272:7079686] canRun
2020-09-21 16:57:18.620441+0800 001-运行时感受[76272:7079686] Hello, World!
Program ended with exit code: 0

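From this output we can see that [person canEatFood] is equivalent to objc_msgSend(person, sel_registerName("canEatFood")). For a given method name, sel_registerName(), @selector() and NSSelectorFromString() all yield the same SEL. Next, we convert this Objective-C code to C++ to see how the call is implemented.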
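The lookup code later in this article compares selectors with a plain pointer comparison, which only works because every registration of the same name yields the same pointer. The following is a minimal C sketch of that idea, not the runtime's implementation: toy_sel, toy_table and toy_register_name are hypothetical names, and the linear search stands in for whatever table the runtime really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for SEL: a pointer to a uniqued C string. */
typedef const char *toy_sel;

#define TOY_MAX_SELECTORS 64

static const char *toy_table[TOY_MAX_SELECTORS];
static size_t toy_count = 0;

/* Returns the single shared pointer for `name`, registering it on first use.
 * Toy simplification: the table keeps the caller's pointer instead of a copy,
 * so the first string registered under a name must outlive the table. */
toy_sel toy_register_name(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < toy_count; i++) {
        if (strcmp(toy_table[i], name) == 0) {
            return toy_table[i];          /* already registered: same pointer */
        }
    }
    assert(toy_count < TOY_MAX_SELECTORS);
    toy_table[toy_count] = name;
    return toy_table[toy_count++];
}
```

Once two selectors come from the same table, checking whether they name the same method is a single pointer comparison rather than a string comparison.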

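Inspecting the cpp file

First, in a terminal, change to the directory containing the main.m file to be compiled and run clang -rewrite-objc main.m -o main.cpp. Open main.cpp and search for int main(int argc, const char * argv[]) to find the entry point. In the C++ version of main we can see that whether person calls canEatFood or canRun, the message is ultimately sent through objc_msgSend, which matches what we confirmed above. objc_msgSend itself is implemented in assembly, which has two advantages:

  • Speed: it runs on every message send, so a hand-written fast path keeps the overhead low
  • Unknown types: it works for any argument and return types, because it leaves the argument registers untouched and jumps to the IMP, something a C function cannot express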

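A first look at the objc_msgSend assembly implementation

Searching the runtime source for objc_msgSend leads to the file objc-msg-arm64.s (we study the real-device environment). In that file, search for _objc_msgSend to find the following entry point: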

ENTRY _objc_msgSend
	UNWIND _objc_msgSend, NoFrame

	cmp	p0, #0			// nil check and tagged pointer check
#if SUPPORT_TAGGED_POINTERS
	b.le	LNilOrTagged		//  (MSB tagged pointer looks negative)
#else
	b.eq	LReturnZero
#endif
	ldr	p13, [x0]    	// p13 = isa
	GetClassFromIsa_p16 p13		// p16 = class
LGetIsaDone:
	// calls imp or objc_msgSend_uncached
	CacheLookup NORMAL, _objc_msgSend

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
	b.eq	LReturnZero		// nil check

	// tagged
	adrp	x10, _objc_debug_taggedpointer_classes@PAGE
	add	x10, x10, _objc_debug_taggedpointer_classes@PAGEOFF
	ubfx	x11, x0, #60, #4
	ldr	x16, [x10, x11, LSL #3]
	adrp	x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGE
	add	x10, x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGEOFF
	cmp	x10, x16
	b.ne	LGetIsaDone

	// ext tagged
	adrp	x10, _objc_debug_taggedpointer_ext_classes@PAGE
	add	x10, x10, _objc_debug_taggedpointer_ext_classes@PAGEOFF
	ubfx	x11, x0, #52, #8
	ldr	x16, [x10, x11, LSL #3]
	b	LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif

LReturnZero:
	// x0 is already zero
	mov	x1, #0
	movi	d0, #0
	movi	d1, #0
	movi	d2, #0
	movi	d3, #0
	ret

	END_ENTRY _objc_msgSend

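Partial walkthrough of the _objc_msgSend assembly

After the nil check and the tagged-pointer check, the code loads isa from the receiver, derives the class into p16, and finally looks up the method cache through CacheLookup.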
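The receiver checks at the top of _objc_msgSend can be modelled in C. This is a sketch under the assumptions visible in the listing above (a tagged pointer has its most significant bit set, the basic tag index is the top 4 bits, the extended tag index is the 8 bits starting at bit 52); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models `cmp p0, #0` + `b.le LNilOrTagged`: a signed "<= 0" test is true
 * for nil (all zero) and for any value whose most significant bit is set. */
int toy_is_nil_or_tagged(uint64_t receiver) {
    return receiver == 0 || (receiver >> 63) != 0;
}

/* Models `ubfx x11, x0, #60, #4`: extract 4 bits starting at bit 60. */
unsigned toy_tagged_index(uint64_t receiver) {
    assert((receiver >> 63) != 0);        /* only meaningful for tagged pointers */
    return (unsigned)((receiver >> 60) & 0xF);
}

/* Models `ubfx x11, x0, #52, #8`: extract 8 bits starting at bit 52. */
unsigned toy_ext_tagged_index(uint64_t receiver) {
    assert((receiver >> 63) != 0);
    return (unsigned)((receiver >> 52) & 0xFF);
}
```

In the assembly, each index is then scaled by 8 (LSL #3) and used to load a class pointer from the corresponding class table, after which execution rejoins the normal path at LGetIsaDone.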

CacheLookup analysis

The CacheLookup code:

.macro CacheLookup
	//
	// Restart protocol:
	//
	//   As soon as we're past the LLookupStart$1 label we may have loaded
	//   an invalid cache pointer or mask.
	//
	//   When task_restartable_ranges_synchronize() is called,
	//   (or when a signal hits us) before we're past LLookupEnd$1,
	//   then our PC will be reset to LLookupRecover$1 which forcefully
	//   jumps to the cache-miss codepath which have the following
	//   requirements:
	//
	//   GETIMP:
	//     The cache-miss is just returning NULL (setting x0 to 0)
	//
	//   NORMAL and LOOKUP:
	//   - x0 contains the receiver
	//   - x1 contains the selector
	//   - x16 contains the isa
	//   - other registers are set as per calling conventions
	//
LLookupStart$1:

	// p1 = SEL, p16 = isa
	ldr	p11, [x16, #CACHE]				// p11 = mask|buckets

#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
	and	p10, p11, #0x0000ffffffffffff	// p10 = buckets
	and	p12, p1, p11, LSR #48		// x12 = _cmd & mask
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	and	p10, p11, #~0xf			// p10 = buckets
	and	p11, p11, #0xf			// p11 = maskShift
	mov	p12, #0xffff
	lsr	p11, p12, p11				// p11 = mask = 0xffff >> p11
	and	p12, p1, p11				// x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif


	add	p12, p10, p12, LSL #(1+PTRSHIFT)
		             // p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT))

	ldp	p17, p9, [x12]		// {imp, sel} = *bucket
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
	b.ne	2f			//     scan more
	CacheHit $0			// call or return imp
	
2:	// not hit: p12 = not-hit bucket
	CheckMiss $0			// miss if bucket->sel == 0
	cmp	p12, p10		// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b			// loop

3:	// wrap: p12 = first bucket, w11 = mask
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
	add	p12, p12, p11, LSR #(48 - (1+PTRSHIFT))
					// p12 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	add	p12, p12, p11, LSL #(1+PTRSHIFT)
					// p12 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif

	// Clone scanning loop to miss instead of hang when cache is corrupt.
	// The slow path may detect any corruption and halt later.

	ldp	p17, p9, [x12]		// {imp, sel} = *bucket
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
	b.ne	2f			//     scan more
	CacheHit $0			// call or return imp
	
2:	// not hit: p12 = not-hit bucket
	CheckMiss $0			// miss if bucket->sel == 0
	cmp	p12, p10		// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b			// loop

LLookupEnd$1:

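Let's walk through the lookup part of the assembly:

  • LLookupStart$1: marks the start of the lookup range
  • LLookupEnd$1: likewise, marks the end of the lookup range
  • ldr p11, [x16, #CACHE]: from the code above, p16 already holds the class. This instruction loads the 8-byte value at offset 16 from p16 into p11. Since isa and superclass each occupy 8 bytes, offset 16 is where cache_t begins, and its first field packs the mask and the buckets pointer together, so p11 = mask|buckets
  • and p10, p11, #0x0000ffffffffffff: p11 & 0x0000ffffffffffff keeps the low 48 bits and stores them in p10, so p10 = buckets
  • and p12, p1, p11, LSR #48: p11 shifted right logically by 48 bits is the mask; ANDing it with p1 gives the index into the cache array, so p12 = _cmd & mask
  • add p12, p10, p12, LSL #(1+PTRSHIFT): p10 is the address of the first bucket. The index in p12 is shifted left by 1+PTRSHIFT = 4 bits, i.e. multiplied by 16, the size of one bucket, and added to p10, so p12 now holds the address of the target bucket
  • ldp p17, p9, [x12]: loads the two fields of that bucket, imp into p17 and sel into p9, i.e. {imp, sel} = *bucket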
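Label 1

  • cmp p9, p1: compares the sel loaded from the bucket (p9) with the _cmd that was passed in (p1)
  • b.ne 2f: if they differ, jump to label 2
  • CacheHit $0: if they are equal, the cache hit; call the imp (NORMAL) or return it (GETIMP and LOOKUP)

Label 2

  • CheckMiss $0: if the sel in the current bucket is 0, the slot is empty, so the method is not in the cache and the miss path is taken
  • cmp p12, p10: checks whether the current bucket is the first bucket
  • b.eq 3f: if it is the first bucket, go to label 3
  • ldp p17, p9, [x12, #-BUCKET_SIZE]!: if it is not the first bucket, step back one bucket (p12 is decremented by BUCKET_SIZE) and load that bucket's imp into p17 and its sel into p9
  • b 1b: jump back to label 1

Label 3

  • add p12, p12, p11, LSR #(48 - (1+PTRSHIFT)): p11 still carries the mask in its top 16 bits, so shifting it right by 48-(1+3)=44 bits yields mask << 4, the byte offset of the last bucket (assuming the top bits of the buckets field are zero). At this point p12 is the first bucket, so after the addition p12 points to the last bucket of the cache. This is a wrap-around, not a second hash: the backward scan simply continues from the end of the array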
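Putting labels 1 to 3 together, the whole scan can be modelled as a short C function. This is a sketch of the control flow only, under the assumption (from the objc4 source) that a second wrap-around ends in a miss; toy_bucket and toy_cache_lookup are hypothetical names, and a return value of 0 stands for "go to the uncached path".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t imp;
    uintptr_t sel;       /* 0 marks an empty slot */
} toy_bucket;            /* field order follows the {imp, sel} load above */

/* Returns the cached imp for `sel`, or 0 on a miss. `mask` is capacity - 1. */
uintptr_t toy_cache_lookup(const toy_bucket *buckets, size_t mask, uintptr_t sel) {
    assert(((mask + 1) & mask) == 0);              /* capacity is a power of two */
    const toy_bucket *b = buckets + (sel & mask);  /* start at the hashed index */

    for (int pass = 0; pass < 2; pass++) {
        for (;;) {
            if (b->sel == sel) return b->imp;      /* label 1: CacheHit */
            if (b->sel == 0) return 0;             /* label 2: CheckMiss */
            if (b == buckets) break;               /* reached first bucket: wrap */
            b--;                                   /* *--bucket */
        }
        b = buckets + mask;                        /* label 3: last bucket */
    }
    return 0;                                      /* wrapped twice: miss */
}
```

The second pass exists so that a full or corrupt cache produces a miss instead of an endless loop, which is what the "Clone scanning loop" comment in the assembly describes.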

CheckMiss analysis

.macro CheckMiss
	// miss if bucket->sel == 0
.if $0 == GETIMP
	cbz	p9, LGetImpMiss
.elseif $0 == NORMAL
	cbz	p9, __objc_msgSend_uncached
.elseif $0 == LOOKUP
	cbz	p9, __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

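Execution logic: cbz branches only when p9 (the bucket's sel) is 0, and the branch target depends on the mode passed to the macro

  • GETIMP: branch to LGetImpMiss
  • NORMAL: branch to __objc_msgSend_uncached
  • LOOKUP: branch to __objc_msgLookup_uncached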

JumpMiss analysis

.macro JumpMiss
.if $0 == GETIMP
	b	LGetImpMiss
.elseif $0 == NORMAL
	b	__objc_msgSend_uncached
.elseif $0 == LOOKUP
	b	__objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

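JumpMiss uses the same branch targets as CheckMiss, but it branches unconditionally (b instead of cbz), without testing the bucket's sel.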
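The difference between the two macros can be sketched in C. The enum and function names below are hypothetical, and each returned string is only the name of the label the assembly would branch to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TOY_GETIMP, TOY_NORMAL, TOY_LOOKUP } toy_mode;

/* JumpMiss: always selects the miss target for the given mode. */
const char *toy_jump_miss(toy_mode mode) {
    switch (mode) {
    case TOY_GETIMP: return "LGetImpMiss";
    case TOY_NORMAL: return "__objc_msgSend_uncached";
    case TOY_LOOKUP: return "__objc_msgLookup_uncached";
    }
    assert(!"unknown mode");              /* mirrors `.abort oops` */
    return NULL;
}

/* CheckMiss: selects the miss target only when the bucket is empty
 * (sel == 0); otherwise returns NULL, meaning "keep scanning". */
const char *toy_check_miss(toy_mode mode, uintptr_t bucket_sel) {
    return bucket_sel == 0 ? toy_jump_miss(mode) : NULL;
}
```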

_objc_msgSend flowchart

The flowchart here is borrowed from the summary in the article OC底层原理之-objc_msgSend方法查找(上.