Exploring iOS Internals: alloc

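Let's work through two questions to see how iOS produces an object:

What is the difference between alloc and init?
What does the alloc method do?

The difference between alloc and init

As the names suggest, alloc allocates memory and init initializes data. The code below verifies this: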

NSObject *obj1 = [NSObject alloc];
NSObject *obj2 = [obj1 init];
NSObject *obj3 = [obj1 init];
NSObject *obj4 = [NSObject alloc];
NSLog(@"obj1: %@, %p, %p", obj1, obj1, &obj1);
NSLog(@"obj2: %@, %p, %p", obj2, obj2, &obj2);
NSLog(@"obj3: %@, %p, %p", obj3, obj3, &obj3);
NSLog(@"obj4: %@, %p, %p", obj4, obj4, &obj4);

obj1: <NSObject: 0x6000000fc580>, 0x6000000fc580, 0x7ffee64db358
obj2: <NSObject: 0x6000000fc580>, 0x6000000fc580, 0x7ffee64db350
obj3: <NSObject: 0x6000000fc580>, 0x6000000fc580, 0x7ffee64db348
obj4: <NSObject: 0x6000000fc6a0>, 0x6000000fc6a0, 0x7ffee64db340

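Analysis of the NSObject output:

obj1, obj2 and obj3 all point to the same address, 0x6000000fc580, while obj4 points to 0x6000000fc6a0. This shows that init does not allocate memory; the heap address is assigned only when alloc is called.
The addresses of the variables obj1, obj2, obj3 and obj4 themselves are all different, contiguous, and decreasing, because the pointer variables live on the stack, and their stack slots are laid out contiguously here.
Diagram of stack and heap allocation:

Summary:

Only alloc allocates memory; init initializes the data.
The pointer variables are allocated on the stack. In this run they were placed contiguously in declaration order, from high addresses to low; the exact placement is up to the compiler.
The contents of an NSObject generally live on the heap, where addresses tend to grow from low to high. Because a heap allocation takes any free block that is large enough for the request, a later allocation can still end up at a lower address.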

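What alloc does

From stepping through the call stack and the implementation of alloc, I reached the following conclusions:

It allocates the memory the object needs and aligns the size.
It binds the object to its class through the isa field.

Preparation

Download a buildable copy of the objc4 source; it can be used directly without configuration. If breakpoints do not take effect, what worked for me was moving the file containing the breakpoint to the top of the list under target -> Build Phases -> Compile Sources.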

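The alloc call chain

NSObject calls alloc
objc_alloc is called
callAlloc(cls, true, false)
NSObject calls +alloc through objc_msgSend
_objc_rootAlloc
callAlloc(cls, false, true)
_objc_rootAllocWithZone
_class_createInstanceFromZone(): allocates the memory and binds the class
1. instanceSize(): computes the memory the object needs and aligns it
2. calloc(): allocates the memory, yielding the object
3. initInstanceIsa(): binds the class
Flow chart of the alloc call chain: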

Allocating memory with alignment

1. instanceSize() has two ways to compute the size: the first branch goes through hasFastInstanceSize(), the second through alignedInstanceSize().

inline size_t instanceSize(size_t extraBytes) const {
    if (fastpath(cache.hasFastInstanceSize(extraBytes))) {
        return cache.fastInstanceSize(extraBytes);
    }
    size_t size = alignedInstanceSize() + extraBytes;
    // CF requires all objects be at least 16 bytes.
    if (size < 16) size = 16;
    return size;
}

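2. Check whether the instance size can be computed on the fast path. __builtin_constant_p() is a compiler builtin that evaluates to 1 when the compiler can prove its argument is a compile-time constant, and to 0 otherwise. _class_createInstanceFromZone(cls, 0, nil, OBJECT_CONSTRUCT_CALL_BADALLOC) passes 0 for extra, so I expected the if branch to be taken and _flags & FAST_CACHE_ALLOC_MASK16 to be evaluated. When stepping through, however, execution went to _flags & FAST_CACHE_ALLOC_MASK, and po __builtin_constant_p(extra) == 0 printed true. A likely reason is that the builtin is resolved at compile time and depends on inlining and optimization, so in the build I was debugging extra was not visible to the compiler as a constant at that point; I did not dig further. Either way the result is YES, so the next step calls fastInstanceSize().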

bool hasFastInstanceSize(size_t extra) const
{
    if (__builtin_constant_p(extra) && extra == 0) {
        return _flags & FAST_CACHE_ALLOC_MASK16;
    }
    return _flags & FAST_CACHE_ALLOC_MASK;
}

3. fastInstanceSize() is where the alignment actually happens. Since __builtin_constant_p(extra) evaluated to 0, the else branch runs and calls align16() to align the size.

size_t fastInstanceSize(size_t extra) const
{
    ASSERT(hasFastInstanceSize(extra));
    if (__builtin_constant_p(extra) && extra == 0) {
        return _flags & FAST_CACHE_ALLOC_MASK16;
    } else {
        size_t size = _flags & FAST_CACHE_ALLOC_MASK;
        // remove the FAST_CACHE_ALLOC_DELTA16 that was added
        // by setFastInstanceSize
        return align16(size + extra - FAST_CACHE_ALLOC_DELTA16);
    }
}

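4. align16() computes (x + size_t(15)) & ~size_t(15), which rounds x up to the next multiple of 16. In binary, size_t(15) is 01111; inverting it gives 10000 with every higher bit also set to 1, so the & clears the low four bits. For example, with x = 18, x + 15 = 33, which is 100001 in binary; clearing the low four bits gives 100000, that is 32.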

static inline size_t align16(size_t x) {
    return (x + size_t(15)) & ~size_t(15);
}

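5. That is the call chain as it actually ran in objc4. In short: alloc allocates the memory and aligns the size to 16 bytes. This is consistent with the object addresses in the output above, each of which ends in hexadecimal 0.

6. The fallback path of instanceSize() goes through alignedInstanceSize(), which ends up calling word_align(). By the same analysis as in step 4, it aligns to 8 bytes.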

uint32_t alignedInstanceSize() const {
    return word_align(unalignedInstanceSize());
}

static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

#   define WORD_MASK 7UL // on 64-bit

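Summary: alloc ends up in _class_createInstanceFromZone(), which calls instanceSize() to compute the memory the object needs, aligned to 16 bytes on 64-bit, and then allocates it with calloc().

Binding the class

Finally, _class_createInstanceFromZone() calls initInstanceIsa() to bind the class.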

inline void objc_object::initInstanceIsa(Class cls, bool hasCxxDtor)
{
    ASSERT(!cls->instancesRequireRawIsa());
    ASSERT(hasCxxDtor == cls->hasCxxDtor());

    initIsa(cls, true, hasCxxDtor);
}

It then calls objc_object::initIsa(). On 64-bit machines the isa is optimized (nonpointer == 1), so the else branch runs and setClass() binds the object to its Class.

inline void objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{
    ASSERT(!isTaggedPointer());
    isa_t newisa(0);

    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());

        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.setClass(cls, this);
        newisa.extra_rc = 1;
    }

    // This write must be performed in a single store in some cases
    // (for example when realizing a class because other threads
    // may simultaneously try to use the class).
    // fixme use atomics here to guarantee single-store and to
    // guarantee memory order w.r.t. the class index table
    // ...but not too atomic because we don't want to hurt instantiation
    isa = newisa;
}

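Summary

From the above, alloc does three things: it allocates the object's memory, aligns the size, and binds the object to its class.

Memory alignment in practice

On Apple's 64-bit platforms, object sizes are aligned to 16 bytes. The structs in the examples below are aligned to 8 bytes, the alignment of their largest member.
When memory is laid out, each property or member variable must start at an address that is a multiple of its type's size.

Verifying 16-byte alignment on 64-bit

Allocation ends up in _class_createInstanceFromZone() in objc-runtime-new.h.
The call order is: _class_createInstanceFromZone() -> instanceSize() -> cache.fastInstanceSize() -> align16()
The last call is align16(), which aligns the requested size x using the rule (x + size_t(15)) & ~size_t(15):
1. ~size_t(15): size_t(15) is 01111; inverted it is 10000, with all higher bits set to 1 as well
2. (x + size_t(15)): adding 15 guarantees the result is never smaller than what is needed; any x that is not already a multiple of 16 is carried past the next multiple of 16
3. (x + size_t(15)) & ~size_t(15): clears the remainder from the result of step 2, leaving a multiple of 16
4. For example, 13 + 15 = 28, which gives 16: 28 is 11100 in binary, and clearing the low four bits leaves 10000, that is 16
5. 18 + 15 = 33, which gives 32: 33 is 100001 in binary, and clearing the low four bits leaves 100000, that is 32

Figure: binary calculation for align16 (1_alloc_二进制计算.png)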

Object memory analysis

@interface LKXObjectDemo1 : NSObject {
    // isa // 8
    int age; // 4
    double hegiht; // 8
    char chr; // 1
    double weight; // 8
}
@end

@interface LKXObjectDemo2 : NSObject {
    // isa // 8
    char chr; // 1
    int age; // 4
    double weight; // 8
    double hegiht; // 8
}
@end

@interface LKXObjectDemo3 : NSObject {
    @public
    // isa // 8
    char chr; // 1
    int age; // 4
    int idx; // 4
    double weight; // 8
    double hegiht; // 8
}
@end

LKXObjectDemo1 is allocated 48 bytes and uses 40. Suppose it starts at 0x10020000:
1. isa takes 8 bytes, from 0x10020000 to 0x10020007
2. int age takes 4 bytes, from 0x10020008 to 0x1002000B
3. double hegiht takes 8 bytes and must start at a multiple of 8, so it runs from 0x10020010 to 0x10020017
4. char chr takes 1 byte, at 0x10020018
5. double weight takes 8 bytes and must start at a multiple of 8, so it runs from 0x10020020 to 0x10020027
6. The last byte is at offset 0x27 (39), so 40 bytes are used; because object sizes are aligned to 16, 48 bytes are allocated
LKXObjectDemo2 is allocated 32 bytes and uses 32. Suppose it starts at 0x10020000:
1. isa takes 8 bytes, from 0x10020000 to 0x10020007
2. char chr takes 1 byte, at 0x10020008
3. int age takes 4 bytes and must start at a multiple of 4, so it runs from 0x1002000C to 0x1002000F
4. double weight takes 8 bytes, from 0x10020010 to 0x10020017
5. double hegiht takes 8 bytes, from 0x10020018 to 0x1002001F
6. The last byte is at offset 0x1F (31), so 32 bytes are used, which is already a multiple of 16
LKXObjectDemo3 is allocated 48 bytes and uses 40. Suppose it starts at 0x10020000:
1. isa takes 8 bytes, from 0x10020000 to 0x10020007
2. char chr takes 1 byte, at 0x10020008
3. int age takes 4 bytes and must start at a multiple of 4, so it runs from 0x1002000C to 0x1002000F
4. int idx takes 4 bytes, from 0x10020010 to 0x10020013
5. double weight takes 8 bytes and must start at a multiple of 8, so it runs from 0x10020018 to 0x1002001F
6. double hegiht takes 8 bytes, from 0x10020020 to 0x10020027
7. The last byte is at offset 0x27 (39), so 40 bytes are used; because object sizes are aligned to 16, 48 bytes are allocated
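Analysis of the demo3 member variables. From the output we can see:
1. The address of demo3 (0x101b0b840) and that of chr (0x101b0b848) are 8 bytes apart; those 8 bytes hold the isa. The first word of demo3 is 0x011d8001000085f9 and the address of [LKXObjectDemo3 class] is 0x00000001000085f8. The low-order digits match except for the lowest bit, because this is a nonpointer isa that packs the class address together with flag bits. So the isa points to the class.
2. chr (0x101b0b848) and chr2 (0x101b0b849), a second char member that shows up in this output, are 1 byte apart. In the word 0x0000000a00003363 stored there, 0x33 is the ASCII code of '3' and 0x63 is the ASCII code of 'c'.
3. chr (0x101b0b848), age (0x101b0b84c) and idx (0x101b0b850) are adjacent and 4 bytes apart, which shows that a member must start at a multiple of its type's size, since int is 4 bytes. char is 1 byte, so it imposes no constraint.
4. weight (0x101b0b858) and height (0x101b0b860) take 8 bytes each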

demo3->chr = 'c';
demo3->age = 10;
demo3->idx = 1;
demo3->weight = 120;
demo3->hegiht = 170;
NSLog(@"chr: %p, age: %p, idx: %p, weight: %p, height: %p", &(demo3->chr), &(demo3->age), &(demo3->idx),
&(demo3->weight), &(demo3->hegiht));

demo3: 0x101b0b840 
chr: 0x101b0b848, chr2: 0x101b0b849, 
age: 0x101b0b84c, idx: 0x101b0b850, 
weight: 0x101b0b858, height: 0x101b0b860

0x101b0b840: 0x011d8001000085f9 0x0000000a00003363
0x101b0b850: 0x0000000000000001 0x405e000000000000
0x101b0b860: 0x4065400000000000 0x0000000000000000
0x101b0b870: 0x0000000000000000 0x0000000000000000

p [LKXObjectDemo3 class]
(Class) $1 = 0x00000001000085f8

struct memory analysis

struct StructDemo1 {
    char ch; // 1
    double height; // 8
    float weight; // 4
    char *name; // 8
    int age; // 4
} StructDemo1;

struct StructDemo2 {
    char ch; // 1
    int age; // 4
    char *name; // 8
    double height; // 8
    float weight; // 4
} StructDemo2;

struct StructDemo3 {
    struct StructDemo1 s1; // 40
    struct StructDemo2 s2; // 32
    float weight; // 4
    char chr; // 1
    int index; // 4
    double height; // 8
} StructDemo3;

StructDemo1 is 40 bytes, because every member must start at a multiple of its type's size. Suppose it starts at 0x10020000:
1. char ch takes 1 byte, at 0x10020000
2. double height takes 8 bytes and must start at a multiple of 8, so it runs from 0x10020008 to 0x1002000F
3. float weight takes 4 bytes, from 0x10020010 to 0x10020013
4. char *name takes 8 bytes, from 0x10020018 to 0x1002001F
5. int age takes 4 bytes, from 0x10020020 to 0x10020023
6. The last byte is at offset 0x23 (35), so 36 bytes are used; because the struct is aligned to 8 bytes, its size is 40
StructDemo2 is 32 bytes. Suppose it starts at 0x10020000:
1. char ch takes 1 byte, at 0x10020000
2. int age takes 4 bytes and must start at a multiple of 4, so it runs from 0x10020004 to 0x10020007
3. char *name takes 8 bytes, from 0x10020008 to 0x1002000F
4. double height takes 8 bytes, from 0x10020010 to 0x10020017
5. float weight takes 4 bytes, from 0x10020018 to 0x1002001B
6. The last byte is at offset 0x1B (27), so 28 bytes are used; because the struct is aligned to 8 bytes, its size is 32
StructDemo3 is 96 bytes. Suppose it starts at 0x10020000:
1. struct StructDemo1 s1 takes 40 bytes, from 0x10020000 to 0x10020027
2. struct StructDemo2 s2 takes 32 bytes, from 0x10020028 to 0x10020047
3. float weight takes 4 bytes, from 0x10020048 to 0x1002004B
4. char chr takes 1 byte, at 0x1002004C
5. int index takes 4 bytes and must start at a multiple of 4, so it runs from 0x10020050 to 0x10020053
6. double height takes 8 bytes, from 0x10020058 to 0x1002005F
7. The last byte is at offset 0x5F (95), so exactly 96 bytes are used

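Addendum

Why align memory?

Portability: hardware platforms have their own rules for memory access, and not every platform can access data at arbitrary addresses.
Performance: data structures (the stack in particular) should be aligned on natural boundaries wherever possible. To access unaligned memory the processor may need two memory accesses, whereas an aligned access needs only one.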
