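In the previous article we explored what an object really is: under the hood an object is an objc_object struct, and the first member of objc_object is isa. In this article we start our exploration from isa.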
Where isa points
As before, we first define a class JSPerson that inherits from NSObject and instantiate it in main.
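// JSPerson.h
#import <Foundation/Foundation.h>
@interface JSPerson : NSObject
@end
// JSPerson.m
#import "JSPerson.h"
@implementation JSPerson
@end
// main.m
#import "JSPerson.h"
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // ISA_MASK on x86_64: 0x00007ffffffffff8
        JSPerson *p = [JSPerson alloc];
        NSLog(@"%@", p); // set a breakpoint here
    }
    return 0;
}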
We set a breakpoint on the NSLog line and use lldb to dump the memory of the object:
(lldb) x/4gx p
0x10045e6e0: 0x001d8001000083a9 0x0000000000000000
0x10045e6f0: 0x6c6f6f54534e5b2d 0x7370616e53726162
(lldb) p/x 0x001d8001000083a9 & 0x00007ffffffffff8 // isa & mask gives the address isa points to
(long) $1 = 0x00000001000083a8
(lldb) po 0x00000001000083a8 // print what lives at the address isa points to
JSPerson
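The masking step in the session above can be reproduced outside the debugger. The following is a minimal sketch in plain C, assuming the x86_64 mask value used in the session; the function names are made up for illustration, and treating the lowest bit as the "packed isa" marker is my reading of the runtime rather than something the session itself shows.

```c
#include <stdint.h>

/* Assumption: the x86_64 mask typed into lldb above. */
#define ISA_MASK_X86_64 0x00007ffffffffff8ULL

/* Returns the class address stored inside a raw isa word. */
static uint64_t isa_class_address(uint64_t raw_isa) {
    return raw_isa & ISA_MASK_X86_64;
}

/* Returns 1 when the lowest bit is set, which (as an assumption) marks
   an isa that packs extra bits around the class address. */
static int isa_is_nonpointer(uint64_t raw_isa) {
    return (int)(raw_isa & 1ULL);
}
```

Feeding it the first word of the instance, 0x001d8001000083a9, yields 0x00000001000083a8, the address that po prints as JSPerson.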
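From the lldb commands above we can see that isa points to the JSPerson class; in other words, an object's isa points to its class. In the previous article we saw that a class is an objc_class under the hood, and objc_class inherits from objc_object, which means a class must have an isa pointer too. Where does the isa of a class point? With that question in mind we keep exploring: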
(lldb) x/4gx 0x00000001000083a8 // address of the class object
0x1000083a8: 0x0000000100008380 0x00007fff8e92c118
0x1000083b8: 0x000000010055b1a0 0x0004801000000007
(lldb) p/x 0x0000000100008380 & 0x00007ffffffffff8 // isa & mask gives the address isa points to
(long) $6 = 0x0000000100008380
(lldb) po 0x0000000100008380 // print what lives at the address isa points to
JSPerson
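The address that the class object's isa points to also prints as JSPerson, yet it is not the address that the instance's isa points to (0x100008380 versus 0x1000083a8). Why would one class show up at two different addresses? Can class objects, like instances, be created more than once? Let's write some code to check: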
void jsTestClassNum(void){
Class class1 = [JSPerson class];
Class class2 = [JSPerson alloc].class;
Class class3 = object_getClass([JSPerson alloc]);
Class class4 = [JSPerson alloc].class;
NSLog(@"\n%p-\n%p-\n%p-\n%p",class1,class2,class3,class4);
}
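This function obtains the class object four times (through +class, through -class on two separate instances, and through object_getClass) and prints the resulting addresses: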
0x1000083a8-
0x1000083a8-
0x1000083a8-
0x1000083a8
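All four calls print the same address, so there is only one class object, and its address matches the one the instance's isa points to. What the class object's isa points to is something new: the metaclass. Next we open the compiled binary with MachOView to verify that the metaclass really exists: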
Searching the symbol table for the keyword class, we find _OBJC_METACLASS_$_JSPerson, which shows that the compiler did generate a metaclass object at compile time.
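Now that we know the metaclass object is created for us at compile time, does the metaclass object have an isa pointer as well? We continue exploring with lldb.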
(lldb) x/4gx JSPerson.class
0x1000083a8: 0x0000000100008380 0x00007fff8e92c118
0x1000083b8: 0x00007fff671dc140 0x0000801000000000
(lldb) p/x 0x0000000100008380 & 0x00007ffffffffff8
(long) $1 = 0x0000000100008380
(lldb) po 0x0000000100008380 // this is the metaclass address
JSPerson
(lldb) x/4gx 0x0000000100008380
0x100008380: 0x00007fff8e92c0f0 0x00007fff8e92c0f0
0x100008390: 0x00000001006058e0 0x0001e03100000007
(lldb) p/x 0x00007fff8e92c0f0 & 0x00007ffffffffff8
(long) $3 = 0x00007fff8e92c0f0
(lldb) po 0x00007fff8e92c0f0 // address the metaclass's isa points to
NSObject
(lldb) p/x NSObject.class
(Class) $5 = 0x00007fff8e92c118 NSObject // differs from the address the metaclass's isa points to
(lldb) x/4gx 0x00007fff8e92c0f0
0x7fff8e92c0f0: 0x00007fff8e92c0f0 0x00007fff8e92c118
0x7fff8e92c100: 0x0000000100605960 0x0005e03100000007
(lldb) p/x 0x00007fff8e92c0f0 & 0x00007ffffffffff8
(long) $6 = 0x00007fff8e92c0f0
(lldb) po 0x00007fff8e92c0f0 // the root metaclass's isa points to itself
NSObject
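This exploration shows that the metaclass's isa points to NSObject's metaclass, that is, the root metaclass, and that the root metaclass's isa points to the root metaclass itself. The path isa follows is now fairly clear, and it matches the well-known diagram from the official documentation: every instance points to its class, every class to its metaclass, every metaclass to the root metaclass, and the root metaclass to itself.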
Besides the isa path, the diagram also shows the superclass path. Let's verify that with code and print the results:
// JSStudent.h
@interface JSStudent : JSPerson
@end
// JSStudent.m
@implementation JSStudent
@end
// main.m (object_getClass and class_getSuperclass come from <objc/runtime.h>)
#import <objc/runtime.h>
void JSTestNSObject(void){
    // an NSObject instance
    NSObject *object1 = [NSObject alloc];
    // the NSObject class
    Class class = object_getClass(object1);
    // NSObject's metaclass, which is also the root metaclass
    Class metaClass = object_getClass(class);
    // isa of the root metaclass: the root metaclass itself
    Class rootMetaClass = object_getClass(metaClass);
    // following isa once more still yields the root metaclass
    Class rootRootMetaClass = object_getClass(rootMetaClass);
    NSLog(@"\n%p instance\n%p class\n%p metaclass\n%p root metaclass\n%p isa of root metaclass",object1,class,metaClass,rootMetaClass,rootRootMetaClass);
    // superclass of JSPerson's metaclass
    Class pMetaClass = object_getClass(JSPerson.class);
    Class psuperClass = class_getSuperclass(pMetaClass);
    NSLog(@"%@ - %p",psuperClass,psuperClass);
    // JSStudent -> JSPerson -> NSObject
    // metaclasses form an inheritance chain of their own
    Class tMetaClass = object_getClass(JSStudent.class);
    Class tsuperClass = class_getSuperclass(tMetaClass);
    NSLog(@"%@ - %p",tsuperClass,tsuperClass);
    // special case: the root class NSObject
    Class nsuperClass = class_getSuperclass(NSObject.class);
    NSLog(@"%@ - %p",nsuperClass,nsuperClass);
    // superclass of the root metaclass is the NSObject class
    Class rnsuperClass = class_getSuperclass(metaClass);
    NSLog(@"%@ - %p",rnsuperClass,rnsuperClass);
}
Output:
0x1006055e0 instance
0x7fff8e92c118 class
0x7fff8e92c0f0 metaclass
0x7fff8e92c0f0 root metaclass
0x7fff8e92c0f0 isa of root metaclass
NSObject - 0x7fff8e92c0f0
JSPerson - 0x100008418
(null) - 0x0 // NSObject has no superclass
NSObject - 0x7fff8e92c118
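The output speaks for itself: the superclass of JSPerson's metaclass is the root metaclass, the superclass of JSStudent's metaclass is JSPerson's metaclass, NSObject has no superclass, and the superclass of the root metaclass is the NSObject class. That completes our look at the isa path and the inheritance chain; taken together they are exactly what the official diagram shows.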
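To make the two chains easier to hold in mind, here is a small model in plain C. It is only a sketch: mock_class is a hypothetical stand-in that keeps nothing but the isa and superclass pointers, and the six objects are wired by hand to mirror the relationships printed above rather than taken from the runtime.

```c
#include <stddef.h>

/* Hypothetical stand-in for a class object: only isa and superclass. */
struct mock_class {
    const struct mock_class *isa;
    const struct mock_class *superclass; /* NULL for the root class */
    const char *name;
};

/* Tentative definitions so the initializers below can reference each other. */
static const struct mock_class root_meta, person_meta, student_meta;

static const struct mock_class root_class    = { &root_meta,    NULL,          "NSObject" };
static const struct mock_class person_class  = { &person_meta,  &root_class,   "JSPerson" };
static const struct mock_class student_class = { &student_meta, &person_class, "JSStudent" };

/* Every metaclass's isa is the root metaclass, including the root metaclass's own. */
static const struct mock_class root_meta    = { &root_meta, &root_class,  "NSObject (meta)" };
static const struct mock_class person_meta  = { &root_meta, &root_meta,   "JSPerson (meta)" };
static const struct mock_class student_meta = { &root_meta, &person_meta, "JSStudent (meta)" };

/* Follows isa until it reaches the class whose isa is itself. */
static const struct mock_class *follow_isa_to_fixed_point(const struct mock_class *cls) {
    while (cls->isa != cls) {
        cls = cls->isa;
    }
    return cls;
}

/* Counts superclass hops until the chain ends at NULL. */
static int superclass_hops(const struct mock_class *cls) {
    int hops = 0;
    while (cls->superclass != NULL) {
        cls = cls->superclass;
        hops++;
    }
    return hops;
}
```

Starting from any of the six objects, follow_isa_to_fixed_point ends at root_meta, while superclass_hops(&student_meta) walks one hop further than superclass_hops(&student_class) because the root metaclass inherits from the root class.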
Memory offsetting
Before examining the structure of a class, we introduce one concept: memory offsetting. We define an array named array and a pointer pArray that points to it:
int array[4] = {1,2,3,4};
int *pArray = array;
NSLog(@"%p - %p - %p - %p",&array,&array[0],&array[1],&array[2]);
NSLog(@"%p - %p - %p",pArray,pArray+1,pArray+2);
// Output:
0x7ffeefbff440 - 0x7ffeefbff440 - 0x7ffeefbff444 - 0x7ffeefbff448
0x7ffeefbff440 - 0x7ffeefbff444 - 0x7ffeefbff448
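array and array[0] have the same address, which is easy to understand: an array begins at the address of its first element. We also see that pArray, pArray+1 and pArray+2 point to array[0], array[1] and array[2] respectively; adding 1 to an int pointer advances the address by sizeof(int), which is 4 bytes here. That is memory offsetting at work, and we can use it to read the element at any position in the array: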
for (int i = 0; i<4; i++) {
int value = *(pArray+i);
NSLog(@"%d",value);
}
// Output:
1
2
3
4
With memory offsetting understood, we return to exploring the class.
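The same idea carries over from array elements to struct members, which is how it is used in the rest of this article. The sketch below is plain C with made-up names (element_at, byte_distance, read_at_byte_offset and struct pair are all illustrative): it shows that stepping a typed pointer moves by whole elements, and that a member can be reached from the base address plus a byte offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads element i by shifting the base pointer, as *(pArray + i) does. */
static int element_at(const int *base, size_t i) {
    return *(base + i);
}

/* Distance in bytes between base and base + i. */
static size_t byte_distance(const int *base, size_t i) {
    return (size_t)((const char *)(base + i) - (const char *)base);
}

/* Hypothetical two-field record used to show offsetting by a byte count. */
struct pair {
    int64_t first;
    int64_t second;
};

/* Reads an int64_t located `offset` bytes past `base`. */
static int64_t read_at_byte_offset(const void *base, size_t offset) {
    int64_t value;
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}
```

Calling read_at_byte_offset(&p, offsetof(struct pair, second)) on a struct pair returns its second field, which is the same move lldb makes later when it adds 0x20 to a class address.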
The memory layout of a class
We open the objc source, search for objc_class, and find the definition of the class structure:
struct objc_class : objc_object {
objc_class(const objc_class&) = delete;
objc_class(objc_class&&) = delete;
void operator=(const objc_class&) = delete;
void operator=(objc_class&&) = delete;
// Class ISA;
Class superclass;
cache_t cache; // formerly cache pointer and vtable
class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags
/// remaining code omitted
};
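The size of a struct is determined by its member variables (methods live in the code segment, not inside the struct), so we omit the method declarations that follow. The class struct has four members: isa (inherited from objc_object), superclass, cache and bits. We have already examined isa, superclass clearly points to the parent class, and cache will get its own analysis later, so we look at bits first. From the memory offsetting section we know that the address of bits is the address of the class plus the combined size of the three members before it. isa and superclass are both pointers to a class and take 8 bytes each; the open question is how many bytes cache takes. Let's look at the definition of cache_t: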
struct cache_t {
private:
explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
union {
struct {
explicit_atomic<mask_t> _maybeMask; // 4 bytes
#if __LP64__
uint16_t _flags; // 2 bytes
#endif
uint16_t _occupied; // 2 bytes
};
explicit_atomic<preopt_cache_t *> _originalPreoptCache; // a pointer: 8 bytes
};
/// static variables and methods omitted
};
typedef unsigned long uintptr_t;
typedef uint32_t mask_t; // 4 bytes
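cache_t contains a lot of code and is easy to get lost in, but there is a pattern: everything after line 352 of the source is static variables and methods. Static variables live in static storage and methods live in the code segment, so neither takes up space in the struct. The size of cache_t therefore depends only on _bucketsAndMaybeMask and the union.
_bucketsAndMaybeMask has the size of uintptr_t, which is 8 bytes. As we saw earlier, a union is as large as its largest member. This union contains a struct and _originalPreoptCache. The struct takes 4 + 2 + 2 = 8 bytes. _originalPreoptCache is a pointer to preopt_cache_t, so it takes 8 bytes on a 64-bit platform no matter how large preopt_cache_t itself is. For reference, this is preopt_cache_t: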
struct preopt_cache_t {
    int32_t fallback_class_offset; // 4 bytes
    union {
        struct {
            uint16_t shift : 5;
            uint16_t mask : 11;
        };
        uint16_t hash_params;
    }; // 2 bytes
    uint16_t occupied : 14;    // 14 + 1 + 1 = 16 bits:
    uint16_t has_inlines : 1;  // these three bit-fields
    uint16_t bit_one : 1;      // share 2 bytes
    preopt_cache_entry_t entries[]; // flexible array member, adds no size
    inline int capacity() const {
        return mask + 1;
    }
};
typedef unsigned short uint16_t; // 2 bytes
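Because _originalPreoptCache is a pointer, its size is also 8 bytes, so the union is 8 bytes and cache is 8 + 8 = 16 bytes in total. The bits member therefore sits 8 (isa) + 8 (superclass) + 16 (cache) = 32 bytes (0x20 in hexadecimal) past the start of the class.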
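The size arithmetic above can be double-checked by letting a compiler do it. The structs below are hypothetical mocks written in plain C, not the runtime's own definitions: they copy only the field widths discussed in this section, and the expected numbers assume a 64-bit (LP64) target with the bit-field layout used by GCC and Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock of the data members of cache_t (field widths only). */
struct mock_cache {
    uintptr_t buckets_and_maybe_mask;   /* 8 bytes on LP64 */
    union {
        struct {
            uint32_t maybe_mask;        /* 4 bytes */
            uint16_t flags;             /* 2 bytes */
            uint16_t occupied;          /* 2 bytes */
        };
        void *original_preopt_cache;    /* a pointer: 8 bytes on LP64 */
    };
};

/* Mock of preopt_cache_t without the trailing flexible array. */
struct mock_preopt_cache {
    int32_t fallback_class_offset;      /* 4 bytes */
    union {
        struct {
            uint16_t shift : 5;
            uint16_t mask : 11;
        };
        uint16_t hash_params;
    };                                  /* 2 bytes */
    uint16_t occupied : 14;             /* 14 + 1 + 1 bits share 2 bytes */
    uint16_t has_inlines : 1;
    uint16_t bit_one : 1;
};

/* Mock of the four data members of objc_class. */
struct mock_objc_class {
    void *isa;
    void *superclass;
    struct mock_cache cache;
    uintptr_t bits;
};

static size_t cache_size(void)  { return sizeof(struct mock_cache); }
static size_t preopt_size(void) { return sizeof(struct mock_preopt_cache); }
static size_t bits_offset(void) { return offsetof(struct mock_objc_class, bits); }

/* Address of bits for a class object located at class_address. */
static uintptr_t bits_address(uintptr_t class_address) {
    return class_address + bits_offset();
}
```

On such a target cache_size() is 16 and bits_offset() is 0x20, so a class at 0x100008530 has its bits at 0x100008550. Note that preopt_size() plays no part in that result, since cache_t only stores a pointer to the structure.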
bits
With that groundwork in place, we can look at what the bits member contains. First, the definition of class_data_bits_t:
struct class_data_bits_t {
friend objc_class;
// Values are the FAST_ flags above.
uintptr_t bits;
private:
bool getBit(uintptr_t bit) const
{
return bits & bit;
}
// Atomically set the bits in `set` and clear the bits in `clear`.
// set and clear must not overlap.
void setAndClearBits(uintptr_t set, uintptr_t clear)
{
ASSERT((set & clear) == 0);
uintptr_t newBits, oldBits = LoadExclusive(&bits);
do {
newBits = (oldBits | set) & ~clear;
} while (slowpath(!StoreReleaseExclusive(&bits, &oldBits, newBits)));
}
void setBits(uintptr_t set) {
__c11_atomic_fetch_or((_Atomic(uintptr_t) *)&bits, set, __ATOMIC_RELAXED);
}
void clearBits(uintptr_t clear) {
__c11_atomic_fetch_and((_Atomic(uintptr_t) *)&bits, ~clear, __ATOMIC_RELAXED);
}
public:
class_rw_t* data() const {
return (class_rw_t *)(bits & FAST_DATA_MASK);
}
void setData(class_rw_t *newData)
{
ASSERT(!data() || (newData->flags & (RW_REALIZING | RW_FUTURE)));
// Set during realization or construction only. No locking needed.
// Use a store-release fence because there may be concurrent
// readers of data and data's contents.
uintptr_t newBits = (bits & ~FAST_DATA_MASK) | (uintptr_t)newData;
atomic_thread_fence(memory_order_release);
bits = newBits;
}
// Get the class's ro data, even in the presence of concurrent realization.
// fixme this isn't really safe without a compiler barrier at least
// and probably a memory barrier when realizeClass changes the data field
const class_ro_t *safe_ro() const {
class_rw_t *maybe_rw = data();
if (maybe_rw->flags & RW_REALIZED) {
// maybe_rw is rw
return maybe_rw->ro();
} else {
// maybe_rw is actually ro
return (class_ro_t *)maybe_rw;
}
}
/// remaining code omitted
};
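The data() and setData() methods above boil down to two mask operations on a single word. The following plain-C sketch restates them; the mask value is what I understand FAST_DATA_MASK to be on 64-bit targets (an assumption, since the listing does not show it), and the function names are illustrative only.

```c
#include <stdint.h>

/* Assumption: the value of FAST_DATA_MASK on 64-bit targets. */
#define FAST_DATA_MASK_LP64 0x00007ffffffffff8ULL

/* Mirrors data(): keep only the pointer bits. */
static uint64_t bits_data(uint64_t bits) {
    return bits & FAST_DATA_MASK_LP64;
}

/* The complement: whatever flag bits sit outside the pointer field. */
static uint64_t bits_flags(uint64_t bits) {
    return bits & ~FAST_DATA_MASK_LP64;
}

/* Mirrors the arithmetic in setData(): replace the pointer, keep the flags. */
static uint64_t bits_with_data(uint64_t bits, uint64_t new_data) {
    return (bits & ~FAST_DATA_MASK_LP64) | new_data;
}
```

For a hypothetical bits word of 0x000000010102db54, bits_data returns 0x000000010102db50 and bits_flags returns 0x4, which is why a pointer and a few flags can share one uintptr_t.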
class_data_bits_t exposes two public methods that return data, data() and safe_ro(). We start with data(), which returns a class_rw_t *. Here is the definition of class_rw_t:
struct class_rw_t {
// Be warned that Symbolication knows the layout of this structure.
uint32_t flags;
uint16_t witness;
#if SUPPORT_INDEXED_ISA
uint16_t index;
#endif
explicit_atomic<uintptr_t> ro_or_rw_ext;
Class firstSubclass;
Class nextSiblingClass;
/// remaining code omitted
const method_array_t methods() const {
auto v = get_ro_or_rwe();
if (v.is<class_rw_ext_t *>()) {
return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
} else {
return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
}
}
const property_array_t properties() const {
auto v = get_ro_or_rwe();
if (v.is<class_rw_ext_t *>()) {
return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->properties;
} else {
return property_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseProperties};
}
}
const protocol_array_t protocols() const {
auto v = get_ro_or_rwe();
if (v.is<class_rw_ext_t *>()) {
return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->protocols;
} else {
return protocol_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseProtocols};
}
}
};
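All three accessors above branch on whether ro_or_rw_ext currently holds a class_rw_ext_t * or a class_ro_t *. My understanding, stated here as an assumption rather than something the listing shows, is that the runtime's pointer-union type records this choice in the lowest bit of the stored word. A minimal sketch in plain C, with illustrative function names:

```c
#include <stdint.h>

/* Assumption: a set low bit means the word holds a class_rw_ext_t *,
   a clear low bit means it holds a const class_ro_t *. */
static int holds_rw_ext(uint64_t ro_or_rw_ext) {
    return (int)(ro_or_rw_ext & 1ULL);
}

/* Strips the tag bit to recover the pointer value. */
static uint64_t untagged_pointer(uint64_t ro_or_rw_ext) {
    return ro_or_rw_ext & ~1ULL;
}
```

The lldb sessions later in this article are consistent with that reading: the JSPerson class prints ro_or_rw_ext as 4295000224, which is 0x1000080a0 with a clear low bit and equals the address safe_ro() returns, while the metaclass prints 4312049361, whose low bit is set.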
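At the bottom of the class_rw_t definition we see three methods, methods(), properties() and protocols(), so it looks as though a class's methods, properties and protocols are stored here. To verify this, we add an instance variable, two properties and two methods to JSPerson: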
// JSPerson.h
@interface JSPerson : NSObject
{
NSString *nickName;
}
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSString *hobby;
- (void)sayNB;
+ (void)saySomething;
@end
// JSPerson.m
#import "JSPerson.h"
@implementation JSPerson
- (void)sayNB{
}
+ (void)saySomething{
}
@end
// main.m
JSPerson *p1 = [[JSPerson alloc] init];
NSLog(@"%@",p1);
Properties
We set a breakpoint in main and debug with lldb:
(lldb) p/x JSPerson.class
(Class) $0 = 0x0000000100008530 JSPerson
(lldb) p/x 0x0000000100008530+0x20
(long) $1 = 0x0000000100008550 // address of bits
(lldb) p (class_data_bits_t *)0x0000000100008550
(class_data_bits_t *) $2 = 0x0000000100008550
(lldb) p $2->data() // bits.data()
(class_rw_t *) $3 = 0x000000010102db50
(lldb) p *$3
(class_rw_t) $4 = {
flags = 2148007936
witness = 1
ro_or_rw_ext = {
std::__1::atomic<unsigned long> = {
Value = 4295000224
}
}
firstSubclass = nil
nextSiblingClass = NSUUID
}
(lldb) p $3.properties() // get the property list
(const property_array_t) $5 = {
list_array_tt<property_t, property_list_t, RawPtr> = {
= {
list = {
ptr = 0x00000001000081d0
}
arrayAndFlag = 4295000528
}
}
}
Fix-it applied, fixed expression was:
$3->properties()
(lldb) p $5.list
(const RawPtr<property_list_t>) $6 = {
ptr = 0x00000001000081d0
}
(lldb) p $6.ptr
(property_list_t *const) $7 = 0x00000001000081d0
(lldb) p *$7
(property_list_t) $8 = {
entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 2)
}// count = 2: there are only two properties
(lldb) p $8.get(0)
(property_t) $9 = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
(lldb) p $8.get(1)
(property_t) $10 = (name = "hobby", attributes = "T@\"NSString\",C,N,V_hobby")
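From the session above we can see that the list returned by properties() on bits.data() holds the class's declared properties, but the instance variable nickName is not in it.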
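The attributes strings printed above pack several facts into a comma-separated list: T introduces the type encoding, C means copy, N means nonatomic, and V introduces the name of the backing instance variable. The plain-C sketch below pulls two of those facts out; it is a naive parser with made-up function names, adequate for strings like the two shown here but not for type encodings that themselves contain commas.

```c
#include <string.h>

/* Returns 1 if `code` appears as a one-letter item of its own
   (for example 'C' for copy or 'N' for nonatomic). */
static int property_has_flag(const char *attrs, char code) {
    const char *p = attrs;
    while (p != NULL && *p != '\0') {
        if (p[0] == code && (p[1] == ',' || p[1] == '\0')) {
            return 1;
        }
        p = strchr(p, ',');   /* jump to the next item, if any */
        if (p != NULL) {
            p++;
        }
    }
    return 0;
}

/* Copies the backing ivar name (the text after ",V") into out.
   Returns 1 on success, 0 if there is no V item or out is too small. */
static int property_ivar_name(const char *attrs, char *out, size_t out_size) {
    const char *v = strstr(attrs, ",V");
    size_t len;
    if (v == NULL) {
        return 0;
    }
    v += 2;
    len = strlen(v);
    if (len >= out_size) {
        return 0;
    }
    memcpy(out, v, len + 1);
    return 1;
}
```

For the first property above, property_ivar_name yields _name, the compiler-generated instance variable behind the name property.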
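Instance variables
So where are instance variables stored? class_rw_t has no member or method with ivar in its name, so we go back to class_data_bits_t and find the safe_ro() method. Here is the definition of its return type, class_ro_t: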
struct class_ro_t {
uint32_t flags;
uint32_t instanceStart;
uint32_t instanceSize;
#ifdef __LP64__
uint32_t reserved;
#endif
union {
const uint8_t * ivarLayout;
Class nonMetaclass;
};
explicit_atomic<const char *> name;
// With ptrauth, this is signed if it points to a small list, but
// may be unsigned if it points to a big list.
void *baseMethodList;
protocol_list_t * baseProtocols;
const ivar_list_t * ivars;
const uint8_t * weakIvarLayout;
property_list_t *baseProperties;
/// remaining code omitted
};
There is a member named ivars. Is this where instance variables are stored? We continue exploring with lldb:
(lldb) p/x JSPerson.class
(Class) $0 = 0x0000000100008530 JSPerson
(lldb) p/x 0x0000000100008530+0x20
(long) $1 = 0x0000000100008550
(lldb) p (class_data_bits_t *)0x0000000100008550
(class_data_bits_t *) $2 = 0x0000000100008550
(lldb) p $2.safe_ro()
(const class_ro_t *) $3 = 0x00000001000080a0
Fix-it applied, fixed expression was:
$2->safe_ro()
(lldb) p $3->ivars // get ivars
(const ivar_list_t *const) $4 = 0x0000000100008168
(lldb) p *$4
(const ivar_list_t) $5 = {
entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 3)
}
(lldb) p $5.get(0)
(ivar_t) $6 = {
offset = 0x00000001000084d8
name = 0x0000000100003f18 "nickName"
type = 0x0000000100003f79 "@\"NSString\""
alignment_raw = 3
size = 8
}
(lldb) p $5.get(1)
(ivar_t) $7 = {
offset = 0x00000001000084e0
name = 0x0000000100003f21 "_name"
type = 0x0000000100003f79 "@\"NSString\""
alignment_raw = 3
size = 8
}
(lldb) p $5.get(2)
(ivar_t) $8 = {
offset = 0x00000001000084e8
name = 0x0000000100003f27 "_hobby"
type = 0x0000000100003f79 "@\"NSString\""
alignment_raw = 3
size = 8
}
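Indeed, instance variables are in the ivars member of the class_ro_t returned by safe_ro(), and we can see that the compiler generated an underscore-prefixed instance variable for each property. class_ro_t also contains baseMethodList, baseProtocols and baseProperties, which we will explore later.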
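The numbers in the ivar dump fit together if each entry is laid out as the five fields lldb prints. The mock below is a plain-C sketch under that assumption; mock_ivar and the helper names are hypothetical, and reading alignment_raw as a power-of-two exponent is my understanding of the runtime rather than something visible in the dump.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock of one ivar entry, following the fields printed by lldb. */
struct mock_ivar {
    int32_t *offset;        /* points at the slot holding the ivar's offset */
    const char *name;
    const char *type;
    uint32_t alignment_raw; /* assumption: log2 of the alignment */
    uint32_t size;
};

/* Size of one entry; compare with entsizeAndFlags in the dump. */
static size_t ivar_entry_size(void) {
    return sizeof(struct mock_ivar);
}

/* Alignment in bytes encoded by alignment_raw. */
static uint32_t ivar_alignment(uint32_t alignment_raw) {
    return 1u << alignment_raw;
}
```

On a 64-bit target ivar_entry_size() is 32, matching entsizeAndFlags = 32, and ivar_alignment(3) is 8, matching the 8-byte size of an NSString * instance variable.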
Instance methods
Next we look at the method list:
(lldb) p/x JSPerson.class
(Class) $0 = 0x0000000100008530 JSPerson
(lldb) p/x 0x0000000100008530+0x20
(long) $1 = 0x0000000100008550
(lldb) p (class_data_bits_t *)$1
(class_data_bits_t *) $2 = 0x0000000100008550
(lldb) p $2.data()
(class_rw_t *) $3 = 0x000000010092c330
Fix-it applied, fixed expression was:
$2->data()
(lldb) p *$3
(class_rw_t) $4 = {
flags = 2148007936
witness = 1
ro_or_rw_ext = {
std::__1::atomic<unsigned long> = {
Value = 4295000224
}
}
firstSubclass = nil
nextSiblingClass = NSUUID
}
(lldb) p $4.methods()
(const method_array_t) $5 = {
list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
= {
list = {
ptr = 0x00000001000080e8
}
arrayAndFlag = 4295000296
}
}
}
(lldb) p $5.list
(const method_list_t_authed_ptr<method_list_t>) $6 = {
ptr = 0x00000001000080e8
}
(lldb) p $6.ptr
(method_list_t *const) $7 = 0x00000001000080e8
(lldb) p *$7
(method_list_t) $8 = {
entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 5)
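}// 5 methods in total
(lldb) p $8.get(0)
(method_t) $9 = {} // unlike property_t, get(0) alone prints nothing useful; see the method_t struct, which exposes its fields through big()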
(lldb) p $8.get(0).big()
(method_t::big) $10 = {
name = "sayNB"
types = 0x0000000100003f71 "v16@0:8"
imp = 0x0000000100003b50 (KCObjcBuild`-[JSPerson sayNB])
}
(lldb) p $8.get(1).big()
(method_t::big) $11 = {
name = "hobby"
types = 0x0000000100003f85 "@16@0:8"
imp = 0x0000000100003bc0 (KCObjcBuild`-[JSPerson hobby])
}
(lldb) p $8.get(2).big()
(method_t::big) $12 = {
name = "setHobby:"
types = 0x0000000100003f8d "v24@0:8@16"
imp = 0x0000000100003bf0 (KCObjcBuild`-[JSPerson setHobby:])
}
(lldb) p $8.get(3).big()
(method_t::big) $13 = {
name = "name"
types = 0x0000000100003f85 "@16@0:8"
imp = 0x0000000100003b60 (KCObjcBuild`-[JSPerson name])
}
(lldb) p $8.get(4).big()
(method_t::big) $14 = {
name = "setName:"
types = 0x0000000100003f8d "v24@0:8@16"
imp = 0x0000000100003b90 (KCObjcBuild`-[JSPerson setName:])
}
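The method list contains the getter and setter of each of the two properties plus the instance method sayNB that we defined, but not the class method saySomething. So where are class methods stored?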
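One detail in the dump above deserves a second look: the method list reports entsizeAndFlags = 27, which cannot be an entry size on its own. The third template argument printed by lldb, 4294901763, is 0xffff0003, and treating it as a flag mask separates the two parts. The plain-C sketch below does that split; the function names are illustrative, and using the template argument as the mask is my reading of the runtime.

```c
#include <stdint.h>

/* 4294901763 in the lldb output, written in hexadecimal. */
#define METHOD_LIST_FLAG_MASK 0xffff0003u

/* Entry size: the bits outside the flag mask. */
static uint32_t list_entsize(uint32_t entsize_and_flags, uint32_t flag_mask) {
    return entsize_and_flags & ~flag_mask;
}

/* Flags: the bits inside the flag mask. */
static uint32_t list_flags(uint32_t entsize_and_flags, uint32_t flag_mask) {
    return entsize_and_flags & flag_mask;
}
```

With that mask, 27 splits into an entry size of 24 (three 8-byte fields: name, types, imp) and flags of 3. The property and ivar lists print a mask of 0, so their values of 16 and 32 are plain entry sizes.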
Class methods
The metaclass is the natural place to look, so we inspect the metaclass's method list in the same way:
(lldb) x/4gx JSPerson.class
0x100008530: 0x0000000100008508 0x0000000100357140
0x100008540: 0x000000010076a5b0 0x0001802800000003
(lldb) po 0x0000000100008508 // this is the metaclass address
JSPerson
(lldb) p/x 0x0000000100008508+0x20
(long) $2 = 0x0000000100008528
(lldb) p (class_data_bits_t *)0x0000000100008528
(class_data_bits_t *) $3 = 0x0000000100008528
(lldb) p $3.data()
(class_rw_t *) $4 = 0x000000010076a550
Fix-it applied, fixed expression was:
$3->data()
(lldb) p *$4
(class_rw_t) $5 = {
flags = 2684878849
witness = 1
ro_or_rw_ext = {
std::__1::atomic<unsigned long> = {
Value = 4312049361
}
}
firstSubclass = nil
nextSiblingClass = 0x00007fff861ddcd8
}
(lldb) p $5.methods()
(const method_array_t) $6 = {
list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
= {
list = {
ptr = 0x0000000100008080
}
arrayAndFlag = 4295000192
}
}
}
(lldb) p $6.list
(const method_list_t_authed_ptr<method_list_t>) $7 = {
ptr = 0x0000000100008080
}
(lldb) p $7.ptr
(method_list_t *const) $8 = 0x0000000100008080
(lldb) p *$8
(method_list_t) $9 = {
entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 1)
}
(lldb) p $9.get(0).big()
(method_t::big) $10 = {
name = "saySomething"
types = 0x0000000100003f71 "v16@0:8"
imp = 0x0000000100003b40 (KCObjcBuild`+[JSPerson saySomething])
}
So a class's class methods are stored in the method list of its metaclass.
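The types strings in these dumps, such as "v16@0:8", follow a compact pattern: the return type, the total size of the argument frame, then each argument's type followed by its byte offset. Here v is void, @ is an object (self) and : is a selector (_cmd). The plain-C sketch below reads the two numbers that matter most; it is a simplified reader with hypothetical function names, and it only handles simple encodings like the ones in this article, not arrays or structs whose encodings contain digits of their own.

```c
#include <ctype.h>

/* Returns the first number in the encoding: the argument frame size. */
static int encoding_frame_size(const char *types) {
    const char *p = types;
    int n = 0;
    while (*p != '\0' && !isdigit((unsigned char)*p)) {
        p++;                                   /* skip the return type */
    }
    while (isdigit((unsigned char)*p)) {
        n = n * 10 + (*p - '0');
        p++;
    }
    return n;
}

/* Counts arguments, including the implicit self and _cmd. */
static int encoding_arg_count(const char *types) {
    const char *p = types;
    int groups = 0;
    while (*p != '\0') {
        while (*p != '\0' && !isdigit((unsigned char)*p)) {
            p++;                               /* type characters */
        }
        if (*p == '\0') {
            break;
        }
        while (isdigit((unsigned char)*p)) {
            p++;                               /* frame size or offset */
        }
        groups++;
    }
    /* The first group is the return type plus frame size, not an argument. */
    return groups > 0 ? groups - 1 : 0;
}
```

Both sayNB and saySomething have the encoding "v16@0:8", that is, two implicit arguments in a 16-byte frame, while a setter such as setName: has "v24@0:8@16" with a third argument at offset 16. The encoding alone does not distinguish an instance method from a class method; only the list it is stored in does.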
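Summary
In this article we explored the structure of a class, the path that isa pointers follow, and where a class stores its properties, methods and instance variables.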
- A class is itself an object.
- Instance methods are stored in the class.
- Class methods are stored in the metaclass.
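A class exposes its properties, methods and protocols through class_rw_t, while class_ro_t stores the instance variables along with baseMethodList, baseProtocols and baseProperties. What, then, is the difference between class_ro_t and class_rw_t? We will explore that in the next article.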