Exploring the Objective-C Class


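In the previous article, The Nature of OC Objects, we learned that an object is an objc_object struct under the hood. So what is the underlying structure of Class?

The underlying structure of Class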

typedef struct objc_class *Class;

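Looking at the source, Class is a pointer to objc_class, and objc_class inherits from objc_object, so it gets the Class isa member from its parent. It then adds superclass, cache, and bits. The names make their purpose fairly clear: the superclass, a cache, and the class's data. Because objc_class inherits from objc_object, a class is itself an object. In The Nature of OC Objects we saw how an object's members are laid out in memory, so starting from an object's address we can offset the pointer to reach whichever member we want.

Run the source and debug with lldb to inspect the class's information in memory.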


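Printing the pointer p in hexadecimal gives 0x0000000103817d30, which is clearly a plain pointer (not a nonpointer isa). x/4gx prints the memory that the pointer refers to as four 8-byte words in hexadecimal. What we get here is the memory of the LGPerson object. We know an object is an objc_object struct under the hood: the first member is the Class isa pointer, and the remaining members follow in order. So 0x011d800100008365 is the isa, and it is not a plain pointer but a nonpointer isa. As covered in The Nature of OC Objects, an object is associated with its class through isa when it is created, so ANDing the isa value 0x011d800100008365 with #define ISA_MASK 0x00007ffffffffff8ULL extracts the class information (shiftcls) stored in it.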
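The masking step can be checked with plain integer arithmetic. The sketch below is only an illustration: it reuses the ISA_MASK value and the isa value quoted above, and the helper name classFromIsa is hypothetical, not a runtime API. The checks run at compile time, so compiling the snippet is enough to verify them.

```cpp
#include <cstdint>

// x86_64 mask quoted in the text above; other architectures use other values.
constexpr std::uint64_t ISA_MASK = 0x00007ffffffffff8ULL;

// Hypothetical helper: keep only the class-pointer bits (shiftcls) of a raw isa.
constexpr std::uint64_t classFromIsa(std::uint64_t rawIsa) {
    return rawIsa & ISA_MASK;
}

// The nonpointer isa read from the LGPerson instance: the high flag bits
// and the low three bits are cleared, leaving an 8-byte-aligned address.
static_assert(classFromIsa(0x011d800100008365ULL) == 0x0000000100008360ULL,
              "masking leaves only the class address");

// A value that is already a plain, aligned pointer passes through unchanged.
static_assert(classFromIsa(0x0000000100008338ULL) == 0x0000000100008338ULL,
              "a plain pointer is unaffected by the mask");
```

This is the same result that p/x 0x011d800100008365 & 0x00007ffffffffff8 would print in lldb.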

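The lldb output shows that the object's class is LGPerson.

Now for today's main topic: let's print the memory of the Class itself. Class is also a struct pointer (objc_class) that inherits from objc_object, so a Class is an object too, and its first 8 bytes in memory are the isa member.

Here 0x0000000100008338 is the class's isa. It is a plain pointer, so ANDing it with ISA_MASK still yields 0x0000000100008338. Let's print 0x0000000100008338.

Surprisingly, it is also LGPerson. (An unexpected twist: why do both print as LGPerson when the addresses differ?)

Let's try creating several LGPerson instances. Since a Class is also an object, could multiple class objects have been created?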

// MARK: - Check how many class objects exist in memory
void lgTestClassNum(void){
    Class class1 = [LGPerson class];
    Class class2 = [LGPerson alloc].class;
    Class class3 = object_getClass([LGPerson alloc]);
    Class class4 = [LGPerson alloc].class;
    NSLog(@"\n%p-\n%p-\n%p-\n%p",class1,class2,class3,class4);
}


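The call shows that every class address is the same, so there is only one LGPerson class object. The class that 0x0000000100008338 points to is therefore not LGPerson itself; it is LGPerson's metaclass. The objc source contains no separate struct definition for a metaclass, so we can guess that it is generated at compile time. To check, open the compiled Mach-O file with MachOView: if a metaclass exists, its symbol should appear in the symbol table.

The symbol table shows that these two addresses are exactly the LGPerson and metaclass pointers we printed above, which confirms our guess: 0x0000000100008338 is _OBJC_METACLASS_$_LGPerson.

Where does the metaclass's isa point?

Next, print the memory at the metaclass pointer 0x0000000100008338.

In the objc_class struct the first member is isa and the second is superclass. In the memory at 0x0000000100008338, isa and superclass hold the same value, and printing the isa gives NSObject.

Is it a metaclass? Let's verify that too.

Printing the memory of NSObject.class shows that its isa matches the value printed above, which proves that the isa of LGPerson's metaclass points to NSObject's metaclass. One more detail: NSObject's superclass is 0x0, that is, nil.

Continue by printing the memory of NSObject's metaclass.

We can see that the isa of NSObject's metaclass points to itself, and the superclass of NSObject's metaclass points to NSObject (0x00007fff8893fcc8).

From this we can derive the isa chain and the superclass chain, both of which are written out in the summary at the end of this article.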
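The relationships found so far can also be written down as a small model. This is only a sketch: MiniClass and the four objects below are hypothetical stand-ins for the real objc_class data, wired up by hand to match the lldb observations above, and rootMetaclass is an illustrative helper rather than a runtime function.

```cpp
// Toy stand-in for objc_class that keeps only the two pointers discussed so far.
struct MiniClass {
    const MiniClass *isa;
    const MiniClass *superclass;
};

// Forward declarations so the objects can point at each other (and at themselves).
extern const MiniClass NSObjectMeta;
extern const MiniClass NSObjectCls;
extern const MiniClass LGPersonMeta;
extern const MiniClass LGPersonCls;

constexpr MiniClass NSObjectMeta{&NSObjectMeta, &NSObjectCls};   // isa -> itself, superclass -> NSObject
constexpr MiniClass NSObjectCls{&NSObjectMeta, nullptr};         // superclass -> nil
constexpr MiniClass LGPersonMeta{&NSObjectMeta, &NSObjectMeta};  // isa and superclass hold the same value
constexpr MiniClass LGPersonCls{&LGPersonMeta, &NSObjectCls};

// Following isa twice from a class always ends at the root metaclass in this model.
constexpr const MiniClass *rootMetaclass(const MiniClass *cls) {
    return cls->isa->isa;
}

static_assert(LGPersonCls.isa == &LGPersonMeta, "class isa -> metaclass");
static_assert(LGPersonMeta.isa == LGPersonMeta.superclass, "matches the metaclass memory dump");
static_assert(NSObjectMeta.isa == &NSObjectMeta, "root metaclass isa -> itself");
static_assert(NSObjectMeta.superclass == &NSObjectCls, "root metaclass superclass -> NSObject");
static_assert(NSObjectCls.superclass == nullptr, "NSObject superclass -> nil");
static_assert(rootMetaclass(&LGPersonCls) == &NSObjectMeta, "isa chain ends at the root metaclass");
```

LGPerson inherits directly from NSObject in this model, which is why the isa and superclass of its metaclass coincide; with an intermediate parent class the two would differ.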

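At the start we noted that objc_class has four members: isa, superclass, cache, and bits. isa and superclass have been analyzed, which leaves cache and bits.

cache is the method cache, which stores recently used selector-to-implementation mappings (not explored in detail for now).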

struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
            uint16_t                   _flags;
#endif
            uint16_t                   _occupied;
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache;
    };
    ......
}

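Applying the memory alignment rules, we can work out how much memory cache_t occupies.

explicit_atomic<uintptr_t> _bucketsAndMaybeMask;

typedef unsigned long uintptr_t; so uintptr_t is 8 bytes.

union {
    struct {
        explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
        uint16_t                   _flags;
#endif
        uint16_t                   _occupied;
    };
    explicit_atomic<preopt_cache_t *> _originalPreoptCache;
};

typedef uint32_t mask_t; so _maybeMask is 4 bytes.

_flags and _occupied are 2 bytes each.

explicit_atomic<preopt_cache_t *> _originalPreoptCache is a pointer, so it is 8 bytes. The size of a union is determined by its largest member. Here the struct member is 4 + 2 + 2 = 8 bytes and the pointer is also 8 bytes, so the union is 8 bytes and cache_t occupies 8 + 8 = 16 bytes.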
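The size calculation can be double-checked by letting the compiler do it. The struct below is a simplified stand-in, not the real cache_t: explicit_atomic<T> is replaced by plain T on the assumption that the wrapper adds no storage, the anonymous members are given names (fields, u) to stay within standard C++, and a 64-bit target is assumed.

```cpp
#include <cstdint>

// Simplified mock of cache_t; field names follow the excerpt above.
struct MockCache {
    std::uintptr_t bucketsAndMaybeMask;  // 8 bytes on a 64-bit target
    union {
        struct {
            std::uint32_t maybeMask;     // mask_t, 4 bytes
            std::uint16_t flags;         // 2 bytes (present only under __LP64__ in the original)
            std::uint16_t occupied;      // 2 bytes
        } fields;                        // 4 + 2 + 2 = 8 bytes
        void *originalPreoptCache;       // pointer, 8 bytes
    } u;                                 // union size = size of its largest member
};

static_assert(sizeof(std::uintptr_t) == 8, "this sketch assumes a 64-bit target");
static_assert(sizeof(MockCache::u) == 8, "both union members are 8 bytes");
static_assert(sizeof(MockCache) == 16, "8 (first word) + 8 (union) = 16");
```

If the snippet compiles on a 64-bit target, the three checks hold, which agrees with the 16-byte figure derived by hand.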

Finally, bits

bits stores the class's information: the method list, property list, protocol list, and so on. Let's explore it against the source.

struct class_data_bits_t {

friend objc_class;

    // Values are the FAST_ flags above.
    uintptr_t bits;
private:
    bool getBit(uintptr_t bit) const
    {
        return bits & bit;
    }

    // Atomically set the bits in `set` and clear the bits in `clear`.
    // set and clear must not overlap.
    void setAndClearBits(uintptr_t set, uintptr_t clear)
    {
        ASSERT((set & clear) == 0);
        uintptr_t newBits, oldBits = LoadExclusive(&bits);
        do {
            newBits = (oldBits | set) & ~clear;
        } while (slowpath(!StoreReleaseExclusive(&bits, &oldBits, newBits)));
    }

    void setBits(uintptr_t set) {
        __c11_atomic_fetch_or((_Atomic(uintptr_t) *)&bits, set, __ATOMIC_RELAXED);
    }

    void clearBits(uintptr_t clear) {
        __c11_atomic_fetch_and((_Atomic(uintptr_t) *)&bits, ~clear, __ATOMIC_RELAXED);
    }

public:

    class_rw_t* data() const {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }
    // Get the class's ro data, even in the presence of concurrent realization.
    // fixme this isn't really safe without a compiler barrier at least
    // and probably a memory barrier when realizeClass changes the data field
    const class_ro_t *safe_ro() const {
        class_rw_t *maybe_rw = data();
        if (maybe_rw->flags & RW_REALIZED) {
            // maybe_rw is rw
            return maybe_rw->ro();
        } else {
            // maybe_rw is actually ro
            return (class_ro_t *)maybe_rw;
        }
    }
    ......
}

bits is a struct of type class_data_bits_t, and its class_rw_t* data() member function returns a class_rw_t pointer.

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint16_t witness;
#if SUPPORT_INDEXED_ISA
    uint16_t index;
#endif

    explicit_atomic<uintptr_t> ro_or_rw_ext;

    Class firstSubclass;
    Class nextSiblingClass;
    ....
    const method_array_t methods() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
        } else {
            return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
        }
    }

    const property_array_t properties() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->properties;
        } else {
            return property_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseProperties};
        }
    }

    const protocol_array_t protocols() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->protocols;
        } else {
            return protocol_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseProtocols};
        }
    }

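In class_rw_t we can see member functions that return the methods, properties, and protocols.

Let's debug with lldb to see whether they actually return what we want.

LGPerson defines two properties (name and age), one instance variable (hobby), one instance method (say), and one class method (play).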

(lldb) x/5gx LGPerson.class
0x100008248: 0x0000000100008220 0x000000010036a140
0x100008258: 0x0000000100362380 0x0000802c00000000
0x100008268: 0x0000000100912574
(lldb) p (class_data_bits_t *)0x100008268
(class_data_bits_t *) $1 = 0x0000000100008268
(lldb) p $1->data()
(class_rw_t *) $2 = 0x0000000100912570
(lldb) p $2->properties()
(const property_array_t) $3 = {
  list_array_tt<property_t, property_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x00000001000081d8
      }
      arrayAndFlag = 4295000536
    }
  }
}
(lldb) p $3.list
(const RawPtr<property_list_t>) $4 = {
  ptr = 0x00000001000081d8
}
(lldb) p $4.ptr
(property_list_t *const) $5 = 0x00000001000081d8
(lldb) p *$5
(property_list_t) $6 = {
  entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 2)
}
(lldb) p $6.get(0)
(property_t) $7 = (name = "name", attributes = "T@\"NSString\",&,N,V_name")
(lldb) p $6.get(1)
(property_t) $8 = (name = "age", attributes = "Ti,N,V_age")

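x/5gx LGPerson.class prints the memory of the LGPerson class. We know the first 8 bytes are isa, the next 8 bytes are superclass, and the third member is cache, so the location of bits can be computed by offsetting the pointer: base address + 8 (isa) + 8 (superclass) + 16 (cache). 0x100008248 + 0x20 = 0x100008268, so 0x100008268 is the address of bits. Since bits is of type class_data_bits_t, we cast the address with p (class_data_bits_t *)0x100008268 to get a struct pointer, through which we can call the struct's member functions. That lets us obtain the class_rw_t pointer with p $1->data(). In the same way, through the class_rw_t pointer we can call properties(), methods(), and protocols() to get the properties, methods, and protocols.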
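The offset arithmetic and the masking inside data() can be reproduced at compile time. The sketch below assumes a 64-bit target; MockObjcClass and bitsAddress are hypothetical names that only mirror the member sizes discussed above, and the FAST_DATA_MASK value is an assumption about 64-bit macOS builds of objc4 (it is not quoted in this article, though it is consistent with the lldb output).

```cpp
#include <cstddef>
#include <cstdint>

// Placeholder layout of objc_class; only the member sizes matter here.
struct MockObjcClass {
    std::uintptr_t isa;         // 8 bytes
    std::uintptr_t superclass;  // 8 bytes
    unsigned char cache[16];    // cache_t, 16 bytes
    std::uintptr_t bits;        // class_data_bits_t
};

static_assert(offsetof(MockObjcClass, bits) == 0x20, "bits sits 32 bytes into the class");

// Hypothetical helper: address of bits, given the address of the class.
constexpr std::uint64_t bitsAddress(std::uint64_t classAddress) {
    return classAddress + offsetof(MockObjcClass, bits);
}
static_assert(bitsAddress(0x100008248ULL) == 0x100008268ULL, "matches the lldb session");

// Assumed mask value; data() returns bits & FAST_DATA_MASK.
constexpr std::uint64_t FAST_DATA_MASK = 0x00007ffffffffff8ULL;
static_assert((0x0000000100912574ULL & FAST_DATA_MASK) == 0x0000000100912570ULL,
              "raw bits word -> the class_rw_t pointer printed as $2");
```

The last check explains why the raw word at 0x100008268 ends in 574 while $1->data() prints an address ending in 570: the low bits are flags, not part of the pointer.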

p $2->properties() returns the property list, a property_array_t.

class property_array_t : 
    public list_array_tt<property_t, property_list_t, RawPtr>
{
    typedef list_array_tt<property_t, property_list_t, RawPtr> Super;

 public:
    property_array_t() : Super() { }
    property_array_t(property_list_t *l) : Super(l) { }
};
class list_array_tt {
    struct array_t {
        uint32_t count;
        Ptr<List> lists[0];

        static size_t byteSize(uint32_t count) {
            return sizeof(array_t) + count*sizeof(lists[0]);
        }
        size_t byteSize() {
            return byteSize(count);
        }
    };
    .....
  }

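property_array_t inherits from list_array_tt, a class template that stores all of the class's properties internally. property_t is the struct that describes a single property.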

struct property_t {
    const char *name;
    const char *attributes;
};

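The output above, (property_list_t) $6 = { entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 2) }, shows that property_list_t contains only two properties, name and age. The instance variable hobby is not there.

The method list is obtained in the same way.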

class method_array_t : 
    public list_array_tt<method_t, method_list_t, method_list_t_authed_ptr>
{
    typedef list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> Super;

 public:
    method_array_t() : Super() { }
    method_array_t(method_list_t *l) : Super(l) { }

    const method_list_t_authed_ptr<method_list_t> *beginCategoryMethodLists() const {
        return beginLists();
    }
    
    const method_list_t_authed_ptr<method_list_t> *endCategoryMethodLists(Class cls) const;
};
struct method_t {
    static const uint32_t smallMethodListFlag = 0x80000000;

    method_t(const method_t &other) = delete;

    // The representation of a "big" method. This is the traditional
    // representation of three pointers storing the selector, types
    // and implementation.
    struct big {
        SEL name;
        const char *types;
        MethodListIMP imp;
    };

private:
    bool isSmall() const {
        return ((uintptr_t)this & 1) == 1;
    }

    // The representation of a "small" method. This stores three
    // relative offsets to the name, types, and implementation.
    struct small {
        // The name field either refers to a selector (in the shared
        // cache) or a selref (everywhere else).
        RelativePointer<const void *> name;
        RelativePointer<const char *> types;
        RelativePointer<IMP> imp;

        bool inSharedCache() const {
            return (CONFIG_SHARED_CACHE_RELATIVE_DIRECT_SELECTORS &&
                    objc::inSharedCache((uintptr_t)this));
        }
    };

    small &small() const {
        ASSERT(isSmall());
        return *(struct small *)((uintptr_t)this & ~(uintptr_t)1);
    }
    .....

Printing the method list in LLDB

(lldb) p $2->methods()
(const method_array_t) $9 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x00000001000080d8
      }
      arrayAndFlag = 4295000280
    }
  }
}
(lldb) p $9.list
(const method_list_t_authed_ptr<method_list_t>) $10 = {
  ptr = 0x00000001000080d8
}
(lldb) p $10.ptr
(method_list_t *const) $11 = 0x00000001000080d8
(lldb) p *$11
(method_list_t) $12 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 6)
}
(lldb) p $12.get(0).big()
(method_t::big) $13 = {
  name = "say"
  types = 0x0000000100003f62 "v16@0:8"
  imp = 0x0000000100003d90 (KCObjcBuild`-[LGPerson say])
}
(lldb) p $12.get(1).big()
(method_t::big) $14 = {
  name = "name"
  types = 0x0000000100003f78 "@16@0:8"
  imp = 0x0000000100003da0 (KCObjcBuild`-[LGPerson name])
}
(lldb) p $12.get(2).big()
(method_t::big) $15 = {
  name = ".cxx_destruct"
  types = 0x0000000100003f62 "v16@0:8"
  imp = 0x0000000100003e30 (KCObjcBuild`-[LGPerson .cxx_destruct])
}
(lldb) p $12.get(3).big()
(method_t::big) $16 = {
  name = "setName:"
  types = 0x0000000100003f80 "v24@0:8@16"
  imp = 0x0000000100003dc0 (KCObjcBuild`-[LGPerson setName:])
}
(lldb) p $12.get(4).big()
(method_t::big) $17 = {
  name = "age"
  types = 0x0000000100003f8b "i16@0:8"
  imp = 0x0000000100003df0 (KCObjcBuild`-[LGPerson age])
}
(lldb) p $12.get(5).big()
(method_t::big) $18 = {
  name = "setAge:"
  types = 0x0000000100003f93 "v20@0:8i16"
  imp = 0x0000000100003e10 (KCObjcBuild`-[LGPerson setAge:])
}

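As with properties, we go through methods(). The one difference is that method_t is not defined like property_t, whose fields can be read directly. Its contents have to be fetched through big().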

// The representation of a "big" method. This is the traditional
    // representation of three pointers storing the selector, types
    // and implementation.
    struct big {
        SEL name;
        const char *types;
        MethodListIMP imp;
    };
The comment explains what big represents: the traditional layout of three pointers storing the selector, the types, and the implementation.

As the output (method_list_t) $12 = { entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 6) } shows, there are only six methods: say, name, .cxx_destruct, setName:, age, and setAge:. The class method play is not among them.
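The entsizeAndFlags values in these dumps pack two things into one field. The sketch below shows one way to decode them, assuming that entsize_list_tt separates the entry size from the flags using its FlagMask template argument (the third template argument in the lldb type names). The helper names entsizeOf and flagsOf are hypothetical.

```cpp
#include <cstdint>

// Entry size: the bits of entsizeAndFlags that are outside the flag mask.
constexpr std::uint32_t entsizeOf(std::uint32_t entsizeAndFlags, std::uint32_t flagMask) {
    return entsizeAndFlags & ~flagMask;
}

// Flags: the bits of entsizeAndFlags that are inside the flag mask.
constexpr std::uint32_t flagsOf(std::uint32_t entsizeAndFlags, std::uint32_t flagMask) {
    return entsizeAndFlags & flagMask;
}

// 4294901763 is 0xFFFF0003, taken from the method_list_t type printed by lldb.
constexpr std::uint32_t kMethodFlagMask = 4294901763u;
static_assert(kMethodFlagMask == 0xFFFF0003u, "decimal and hex forms agree");

// method_list_t: entsizeAndFlags = 27
static_assert(entsizeOf(27, kMethodFlagMask) == 24, "three 8-byte pointers per big method_t");
static_assert(flagsOf(27, kMethodFlagMask) == 3, "the low two bits carry flags");

// property_list_t: entsizeAndFlags = 16 with a flag mask of 0
static_assert(entsizeOf(16, 0) == 16, "two 8-byte pointers per property_t");
static_assert(flagsOf(16, 0) == 0, "property lists carry no flags");
```

Read this way, 27 is not an odd entry size: it is a 24-byte entry (SEL, types, imp) plus flag bits, which fits the struct big layout quoted above.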

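After printing the properties and methods, we still have not found the instance variable hobby or the class method play. hobby is not in the property list, but while looking through method_list_t and property_list_t in the source we come across an ivar_list_t.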

struct ivar_list_t : entsize_list_tt<ivar_t, ivar_list_t, 0> {
    bool containsIvar(Ivar ivar) const {
        return (ivar >= (Ivar)&*begin()  &&  ivar < (Ivar)&*end());
    }
};

struct ivar_t {
#if __x86_64__
    // *offset was originally 64-bit on some x86_64 platforms.
    // We read and write only 32 bits of it.
    // Some metadata provides all 64 bits. This is harmless for unsigned 
    // little-endian values.
    // Some code uses all 64 bits. class_addIvar() over-allocates the 
    // offset for their benefit.
#endif
    int32_t *offset;
    const char *name;
    const char *type;
    // alignment is sometimes -1; use alignment() instead
    uint32_t alignment_raw;
    uint32_t size;

    uint32_t alignment() const {
        if (alignment_raw == ~(uint32_t)0) return 1U << WORD_SHIFT;
        return 1 << alignment_raw;
    }
};

Searching the source shows that it is declared as a member of class_ro_t:

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    union {
        const uint8_t * ivarLayout;
        Class nonMetaclass;
    };

    explicit_atomic<const char *> name;
    // With ptrauth, this is signed if it points to a small list, but
    // may be unsigned if it points to a big list.
    void *baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;

    // This field exists only when RO_HAS_SWIFT_INITIALIZER is set.
    _objc_swiftMetadataInitializer __ptrauth_objc_method_list_imp _swiftMetadataInitializer_NEVER_USE[0];

    _objc_swiftMetadataInitializer swiftMetadataInitializer() const {
        if (flags & RO_HAS_SWIFT_INITIALIZER) {
            return _swiftMetadataInitializer_NEVER_USE[0];
        } else {
            return nil;
        }
    }

    const char *getName() const {
        return name.load(std::memory_order_acquire);
    }
    .....
  }

Once we have the class_ro_t, we can get the ivar_list_t and check whether it contains hobby. class_data_bits_t defines const class_ro_t *safe_ro(), which returns a class_ro_t pointer. As before, we fetch the class_ro_t in LLDB.

(lldb) p $1->safe_ro()
(const class_ro_t *) $19 = 0x0000000100008090
(lldb) p $19->ivars
(const ivar_list_t *const) $20 = 0x0000000100008170
(lldb) p *$20
(const ivar_list_t) $21 = {
  entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 3)
}
(lldb) p $21.get(0)
(ivar_t) $22 = {
  offset = 0x0000000100008208
  name = 0x0000000100003f25 "hobby"
  type = 0x0000000100003f6a "@\"NSString\""
  alignment_raw = 3
  size = 8
}
(lldb) p $21.get(1)
(ivar_t) $23 = {
  offset = 0x0000000100008210
  name = 0x0000000100003f2b "_age"
  type = 0x0000000100003f76 "i"
  alignment_raw = 2
  size = 4
}
(lldb) p $21.get(2)
(ivar_t) $24 = {
  offset = 0x0000000100008218
  name = 0x0000000100003f30 "_name"
  type = 0x0000000100003f6a "@\"NSString\""
  alignment_raw = 3
  size = 8
}

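The steps are the same as for properties. The ivars list has three entries, and hobby is one of them. This shows that instance variables live only in the ivars list, and that the underscore-prefixed instance variables generated for properties are added to ivars as well.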
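The alignment_raw values in the dump are exponents rather than byte counts. The sketch below restates the logic of ivar_t::alignment() from the excerpt above as a free function. The name alignmentOf is hypothetical, and WORD_SHIFT is assumed to be 3 (an 8-byte word on a 64-bit target).

```cpp
#include <cstdint>

// Assumption: 64-bit target, so one word is 1 << 3 = 8 bytes.
constexpr std::uint32_t WORD_SHIFT = 3;

// Same rule as ivar_t::alignment(): all-ones means "use word alignment",
// anything else is a power-of-two exponent.
constexpr std::uint32_t alignmentOf(std::uint32_t alignmentRaw) {
    return alignmentRaw == ~static_cast<std::uint32_t>(0)
               ? (1u << WORD_SHIFT)
               : (1u << alignmentRaw);
}

static_assert(alignmentOf(3) == 8, "hobby and _name: NSString pointers, 8-byte aligned");
static_assert(alignmentOf(2) == 4, "_age: int, 4-byte aligned");
static_assert(alignmentOf(~0u) == 8, "the sentinel falls back to word alignment");
```

These values line up with the size fields printed by lldb: 8 for the two NSString ivars and 4 for _age.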

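Last comes the class method. The class's own method list contains only instance methods, so it is natural to suspect that the class method lives in the metaclass. Let's check directly in LLDB.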

(lldb) x/5gx LGPerson.class
0x100008248: 0x0000000100008220 0x000000010036a140
0x100008258: 0x0000000100362380 0x0000802c00000000
0x100008268: 0x0000000100912574
(lldb) p/x 0x0000000100008220 & 0x00007ffffffffff8
(long) $26 = 0x0000000100008220
(lldb) x/5gx 0x0000000100008220
0x100008220: 0x000000010036a0f0 0x000000010036a0f0
0x100008230: 0x00000001007d6e80 0x0002e03500000003
0x100008240: 0x0000000100912594
(lldb) p (class_data_bits_t *)0x100008240
(class_data_bits_t *) $27 = 0x0000000100008240
(lldb) p $27->data()
(class_rw_t *) $28 = 0x0000000100912590
(lldb) p $28->methods()
(const method_array_t) $29 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x0000000100008070
      }
      arrayAndFlag = 4295000176
    }
  }
}
(lldb) p $29.list
(const method_list_t_authed_ptr<method_list_t>) $30 = {
  ptr = 0x0000000100008070
}
(lldb) p $30.ptr
(method_list_t *const) $31 = 0x0000000100008070
(lldb) p *$31
(method_list_t) $32 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 1)
}
(lldb) p $32.get(0).big()
(method_t::big) $33 = {
  name = "play"
  types = 0x0000000100003f62 "v16@0:8"
  imp = 0x0000000100003d80 (KCObjcBuild`+[LGPerson play])
}

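The procedure is the same as getting methods from the class: read the class's memory, AND the class's isa with #define ISA_MASK 0x00007ffffffffff8ULL to get the metaclass pointer, then print step by step as before, and the play method turns up.

Summary

1. A class is essentially an objc_class struct that inherits from objc_object, so it is also an object. objc_class has four members: isa, superclass, cache, and bits.

Analyzing the class's isa gives the isa chain:

object's isa -> class, class's isa -> metaclass, metaclass's isa -> NSObject's metaclass, NSObject's metaclass's isa -> itself.

Analyzing the class's superclass gives: subclass's superclass -> parent class, parent class's superclass -> NSObject, NSObject's superclass -> nil.

For metaclasses: subclass's metaclass's superclass -> parent class's metaclass, parent class's metaclass's superclass -> NSObject's metaclass, NSObject's metaclass's superclass -> NSObject.

2. A class's properties are in that class's property_array_t list and its instance variables are in ivars. Instance methods are stored in the class, and class methods are stored in the class's metaclass.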