iOS Internals (8): Class Loading

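1. Introduction

When an iOS app launches, dyld loads the dynamic libraries it needs and performs linking and related work. Once that is done, execution reaches _objc_init in libobjc.A.dylib, which calls _dyld_objc_notify_register(&map_images, load_images, unmap_image); to start loading classes.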

2. The _objc_init function

2.1 _objc_init

First, let's look at the implementation of _objc_init:

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

It first calls environ_init(), tls_init(), static_init(), lock_init(), and exception_init() in turn, and finally calls _dyld_objc_notify_register to load classes.
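
The shape of _objc_init (a static flag that turns repeated calls into no-ops, followed by a fixed sequence of steps with registration last) can be sketched as follows. This is only an illustration: bootstrap_once, step, and executed_steps are hypothetical names, and the steps merely record their own names instead of doing real work.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of executed steps, for illustration only.
static std::vector<std::string> g_steps;

static void step(const char *name) { g_steps.push_back(name); }

// Same shape as _objc_init: the static flag makes every call after the
// first one return immediately, and registration runs last.
void bootstrap_once() {
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    step("environ_init");
    step("tls_init");
    step("static_init");
    step("lock_init");
    step("exception_init");
    step("register_with_dyld");
}

const std::vector<std::string> &executed_steps() { return g_steps; }
```

Calling bootstrap_once() twice still leaves exactly six recorded steps, which is the point of the guard.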

2.2 environ_init()

Inside environ_init we can see the initialization of many environment variables, along with this block:

    // Print OBJC_HELP and OBJC_PRINT_OPTIONS output.
    if (PrintHelp  ||  PrintOptions) {
        if (PrintHelp) {
            _objc_inform("Objective-C runtime debugging. Set variable=YES to enable.");
            _objc_inform("OBJC_HELP: describe available environment variables");
            if (PrintOptions) {
                _objc_inform("OBJC_HELP is set");
            }
            _objc_inform("OBJC_PRINT_OPTIONS: list which options are set");
        }
        if (PrintOptions) {
            _objc_inform("OBJC_PRINT_OPTIONS is set");
        }

        for (size_t i = 0; i < sizeof(Settings)/sizeof(Settings[0]); i++) {
            const option_t *opt = &Settings[i];            
            if (PrintHelp) _objc_inform("%s: %s", opt->env, opt->help);
            if (PrintOptions && *opt->var) _objc_inform("%s is set", opt->env);
        }
    }

According to the comment, this block prints information about the environment variables. Let's modify the for loop slightly:

    for (size_t i = 0; i < sizeof(Settings)/sizeof(Settings[0]); i++) {
        const option_t *opt = &Settings[i];
        _objc_inform("%s: %s", opt->env, opt->help);
        if (PrintOptions && *opt->var) _objc_inform("%s is set", opt->env);
    }

Move it in front of the initial check and see what gets printed:

We can then change these environment variables under Arguments in the Xcode scheme to help with debugging.
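
To make the table-driven idea concrete, here is a minimal sketch of an option table scanned against the process environment. It assumes a simplified record with the three fields used in the loop above (var, env, help); the Option struct, the DEMO_* variable names, and load_options are hypothetical and are not the runtime's real definitions.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical option record, loosely modeled on opt->var, opt->env, opt->help.
struct Option {
    bool *var;         // flag switched on when the variable is set
    const char *env;   // environment variable name
    const char *help;  // one-line description
};

static bool DemoPrintHelp = false;
static bool DemoPrintOptions = false;

static const Option kOptions[] = {
    {&DemoPrintHelp,    "DEMO_HELP",          "describe available environment variables"},
    {&DemoPrintOptions, "DEMO_PRINT_OPTIONS", "list which options are set"},
};

// Enable every flag whose variable equals "YES" and return the enabled names.
std::vector<std::string> load_options() {
    std::vector<std::string> enabled;
    for (const Option &opt : kOptions) {
        const char *value = std::getenv(opt.env);
        *opt.var = (value != nullptr && std::strcmp(value, "YES") == 0);
        if (*opt.var) enabled.push_back(opt.env);
    }
    return enabled;
}
```

Adding a new debugging switch in this style only means adding one row to the table; the scanning loop stays the same.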

2.3 tls_init()

tls_init simply binds the keys used for per-thread data:

void tls_init(void)
{
#if SUPPORT_DIRECT_THREAD_KEYS
    _objc_pthread_key = TLS_DIRECT_KEY;
    pthread_key_init_np(TLS_DIRECT_KEY, &_objc_pthread_destroyspecific);
#else
    _objc_pthread_key = tls_create(&_objc_pthread_destroyspecific);
#endif
}
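
For readers unfamiliar with thread-specific keys, the following sketch shows the general POSIX pattern that the #else branch resembles: create a key with a destructor, store a per-thread value under it, and let the destructor run when the thread exits. It assumes a POSIX threads environment; run_tls_demo, worker, and destroy_thread_data are hypothetical names, and this is not how tls_create is actually implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <pthread.h>

static pthread_key_t g_key;
static int g_destroyed = 0;  // counts destructor calls (demo only)

// Called automatically at thread exit for a thread holding a non-null value.
static void destroy_thread_data(void *value) {
    std::free(value);
    ++g_destroyed;
}

static void *worker(void *) {
    int *data = static_cast<int *>(std::malloc(sizeof(int)));
    *data = 42;
    pthread_setspecific(g_key, data);  // bind this thread's data to the key
    int *seen = static_cast<int *>(pthread_getspecific(g_key));
    return reinterpret_cast<void *>(static_cast<std::intptr_t>(*seen));
}

// Create the key, run one thread, and return the value that thread read back.
std::intptr_t run_tls_demo() {
    pthread_key_create(&g_key, &destroy_thread_data);
    pthread_t thread;
    pthread_create(&thread, nullptr, &worker, nullptr);
    void *result = nullptr;
    pthread_join(thread, &result);
    return reinterpret_cast<std::intptr_t>(result);
}

int destroyed_count() { return g_destroyed; }
```

After run_tls_demo() returns, the worker thread has exited, so the destructor has already released its data.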

2.4 static_init

/***********************************************************************
* static_init
* Run C++ static constructor functions.
* libc calls _objc_init() before dyld would call our static constructors, 
* so we have to do it ourselves.
**********************************************************************/
static void static_init()
{
    size_t count;
    auto inits = getLibobjcInitializers(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        inits[i]();
    }
}

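According to the comment, this runs libobjc's own C++ static constructor functions. Because libc calls _objc_init() before dyld would run them, libobjc has to call them itself.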
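
The loop in static_init is just a walk over a table of function pointers. A minimal sketch of that pattern, with get_initializers as a hypothetical stand-in for getLibobjcInitializers and two dummy initializers:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Initializer = void (*)();

static std::vector<int> g_order;  // records call order (demo only)

static void init_a() { g_order.push_back(1); }
static void init_b() { g_order.push_back(2); }

// Hypothetical stand-in for getLibobjcInitializers(): returns a table of
// function pointers and writes its length to *count.
static const Initializer *get_initializers(std::size_t *count) {
    static const Initializer table[] = {&init_a, &init_b};
    *count = sizeof(table) / sizeof(table[0]);
    return table;
}

// Same loop shape as static_init(): call each initializer in table order.
void run_initializers() {
    std::size_t count = 0;
    const Initializer *inits = get_initializers(&count);
    for (std::size_t i = 0; i < count; i++) {
        inits[i]();
    }
}

const std::vector<int> &call_order() { return g_order; }
```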

2.5 lock_init()

Stepping into it shows an empty function. My guess is that the implementation has not been open-sourced.

2.6 exception_init

/***********************************************************************
* exception_init
* Initialize libobjc's exception handling system.
* Called by map_images().
**********************************************************************/
void exception_init(void)
{
    old_terminate = std::set_terminate(&_objc_terminate);
}

This initializes libobjc's exception handling. For example, when an unimplemented method is called, the crash shows that execution ends up in _objc_terminate.

2.7 _dyld_objc_notify_register
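
std::set_terminate is a standard C++ call: it installs a new terminate handler and returns the previous one, which is why exception_init keeps the result in old_terminate. A small sketch, where demo_terminate and the helper functions are hypothetical names:

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>

static std::terminate_handler g_old_terminate = nullptr;

// Demo handler. A terminate handler must not return, so this one aborts.
static void demo_terminate() {
    std::abort();
}

// Install the handler and remember the previous one, as exception_init() does.
void install_terminate_handler() {
    g_old_terminate = std::set_terminate(&demo_terminate);
}

// True while demo_terminate is the active terminate handler.
bool handler_is_installed() {
    return std::get_terminate() == &demo_terminate;
}

std::terminate_handler previous_handler() { return g_old_terminate; }
```

Keeping the old handler around lets the new one chain to it, or lets the caller restore it later.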

//
// Note: only for use by objc runtime
// Register handlers to be called when objc images are mapped, unmapped, and initialized.
// Dyld will call back the "mapped" function with an array of images that contain an objc-image-info section.
// Those images that are dylibs will have the ref-counts automatically bumped, so objc will no longer need to
// call dlopen() on them to keep them from being unloaded.  During the call to _dyld_objc_notify_register(),
// dyld will call the "mapped" function with already loaded objc images.  During any later dlopen() call,
// dyld will also call the "mapped" function.  Dyld will call the "init" function when dyld would be called
// initializers in that image.  This is when objc calls any +load methods in that image.
//
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped);

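From this we can tell:

  • This function is intended only for the Objective-C runtime
  • It registers handlers that are called when an image is mapped, initialized, and unmapped

Searching dyld for _dyld_objc_notify_register shows that it simply forwards its arguments to registerObjCNotifiers. Here is registerObjCNotifiers: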

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped   = mapped;
    sNotifyObjCInit     = init;
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    // <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}

Mapping, initializing, and unmapping an image each depend on one of these three functions being called. Now search for sNotifyObjCMapped:

(*sNotifyObjCMapped)(objcImageCount, paths, mhs);

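This is where it is invoked: when dyld reaches this point, it calls back into the mapped function that was passed to _dyld_objc_notify_register.

3. map_images: loading images

When dyld invokes the mapped callback, execution arrives at map_images. map_images mostly just returns the result of map_images_nolock. Most of map_images_nolock is logging through _objc_inform plus some bookkeeping on the count variable; eventually it reaches the key function, _read_images.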
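
The register-then-replay behavior described above can be sketched with a toy loader. It stores three callbacks, immediately reports the images that were already loaded, and notifies on every later load or unload. ImageLoaderSketch and its callback types are hypothetical simplifications; the real dyld code also tracks image state and replays init only for images that are already initialized.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback types; the real ones are declared by dyld.
using MappedFn   = void (*)(const std::vector<std::string> &paths);
using InitFn     = void (*)(const std::string &path);
using UnmappedFn = void (*)(const std::string &path);

class ImageLoaderSketch {
public:
    // Record the callbacks, then report every image mapped so far.
    void register_notifiers(MappedFn mapped, InitFn init, UnmappedFn unmapped) {
        mapped_ = mapped;
        init_ = init;
        unmapped_ = unmapped;
        if (mapped_ && !images_.empty()) mapped_(images_);
    }

    // A later load notifies "mapped" first and "init" second.
    void load(const std::string &path) {
        images_.push_back(path);
        if (mapped_) mapped_(std::vector<std::string>{path});
        if (init_) init_(path);
    }

    void unload(const std::string &path) {
        images_.erase(std::remove(images_.begin(), images_.end(), path),
                      images_.end());
        if (unmapped_) unmapped_(path);
    }

private:
    std::vector<std::string> images_;
    MappedFn mapped_ = nullptr;
    InitFn init_ = nullptr;
    UnmappedFn unmapped_ = nullptr;
};
```

An image loaded before registration is reported in the replay batch; one loaded afterwards triggers the callbacks directly.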

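3.1 _read_images

Near the top of this function there is an if (!doneOnce) { block, which means its body runs only the first time the function is entered.

At the end of this if block is the following code: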

int namedClassesSize = 
    (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
gdb_objc_realized_classes =
    NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);
    
allocatedClasses = NXCreateHashTable(NXPtrPrototype, 0, nil);

This creates one MapTable and one HashTable.

  • HashTable:
/***********************************************************************
* allocatedClasses
* A table of all classes (and metaclasses) which have been allocated
* with objc_allocateClassPair.
**********************************************************************/
static NXHashTable *allocatedClasses = nil;

In other words, the HashTable stores every class that has been allocated, whether it is a class or a metaclass.

  • MapTable:
// This is a misnomer: gdb_objc_realized_classes is actually a list of 
// named classes not in the dyld shared cache, whether realized or not.
NXMapTable *gdb_objc_realized_classes;  // exported for debuggers in objc-gdb.h

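Every class that is not in the shared cache is stored here, whether realized or not. So the MapTable may contain everything that is in the HashTable.

Moving on, the comments indicate that the rest of the function performs these steps: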
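
As a rough mental model of the two tables, the sketch below uses a name-keyed map and a pointer set from the C++ standard library. ClassSketch and ClassTables are hypothetical stand-ins, not the NXMapTable and NXHashTable implementations; only the 4/3 sizing factor is taken from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for objc_class.
struct ClassSketch {
    std::string name;
};

class ClassTables {
public:
    explicit ClassTables(std::size_t totalClasses) {
        // Same 4/3 headroom as namedClassesSize.
        named_.reserve(totalClasses * 4 / 3);
    }

    // Name-keyed table: every named class, realized or not.
    void add_named(ClassSketch *cls) { named_[cls->name] = cls; }

    // Pointer set: classes that have been allocated.
    void add_allocated(ClassSketch *cls) { allocated_.insert(cls); }

    ClassSketch *lookup(const std::string &name) const {
        auto it = named_.find(name);
        return it == named_.end() ? nullptr : it->second;
    }

    bool is_allocated(ClassSketch *cls) const {
        return allocated_.count(cls) != 0;
    }

private:
    std::unordered_map<std::string, ClassSketch *> named_;
    std::unordered_set<ClassSketch *> allocated_;
};
```

The map answers "which class has this name", while the set answers "is this pointer a known class".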

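  • //Discover classes: discovers classes and loads them all into the gdb_objc_realized_classes table
  • //remapped classes: fixes up references to remapped classes
  • //@selector references: selector references; registers every SEL
  • //old objc_msgSend_fixup call sites: fixes up legacy function pointers
  • //Discover protocols: discovers protocols and loads them all into a table
  • //@protocol references: protocol references; registers the protocols
  • //Realize non-lazy classes: realizes all non-lazy classes
  • //Realize newly-resolved future classes: walks the newly resolved future classes and realizes them
  • //Discover categories: discovers and registers categories
  • realizeAllClasses(): realizes every class that has not been realized yet

As listed above, _read_images loads classes into the tables: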

// Discover classes. Fix up unresolved future classes. Mark bundle classes.

for (EACH_HEADER) {
    classref_t *classlist = _getObjc2ClassList(hi, &count);
    
    if (! mustReadClasses(hi)) {
        // Image is sufficiently optimized that we need not call readClass()
        continue;
    }

    bool headerIsBundle = hi->isBundle();
    bool headerIsPreoptimized = hi->isPreoptimized();

    for (i = 0; i < count; i++) {
        Class cls = (Class)classlist[i];
        Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

        if (newCls != cls  &&  newCls) {
            // Class was moved but not deleted. Currently this occurs 
            // only when the new class resolved a future class.
            // Non-lazily realize the class below.
            resolvedFutureClasses = (Class *)
                realloc(resolvedFutureClasses, 
                        (resolvedFutureClassCount+1) * sizeof(Class));
            resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
        }
    }
}

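The loop first gets classlist from the compiled image's header info (hi) and iterates over it. Each cls is an address read from the section, so at this point we cannot yet tell which class it is. readClass then processes cls and returns newCls.

Inside readClass():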

Class replacing = nil;
if (Class newCls = popFutureNamedClass(mangledName)) {
    // This name was previously allocated as a future class.
    // Copy objc_class to future class's struct.
    // Preserve future's rw data block.
    
    if (newCls->isAnySwift()) {
        _objc_fatal("Can't complete future class request for '%s' "
                    "because the real class is too big.", 
                    cls->nameForLogging());
    }
    
    class_rw_t *rw = newCls->data();
    const class_ro_t *old_ro = rw->ro;
    memcpy(newCls, cls, sizeof(objc_class));
    rw->ro = (class_ro_t *)newCls->data();
    newCls->setData(rw);
    freeIfMutable((char *)old_ro->name);
    free((void *)old_ro);
    
    addRemappedClass(cls, newCls);
    
    replacing = cls;
    cls = newCls;
}

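readClass() contains a branch that manipulates ro and rw, but breakpoint debugging shows that the program runs to completion without ever entering it. According to the comments, only a class that was previously allocated as a future class takes this branch.

Further down there are two lines: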

addNamedClass(cls, mangledName, replacing);
addClassTableEntry(cls);

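These insert the class into the overall table gdb_objc_realized_classes and into the allocatedClasses table.

Back in the loop, the next check compares cls with the returned newCls. As readClass() showed, in the normal case newCls is just cls unchanged, so this branch does not run. For the same reason, the //Realize newly-resolved future classes step later on is normally not triggered either.

3.2 Class loading

Continuing from above, there is an if (!noClassesRemapped()) check. It fixes up remapped classes and is entered only in special cases. Moving on: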

{
    mutex_locker_t lock(selLock);
    for (EACH_HEADER) {
        if (hi->isPreoptimized()) continue;
        
        bool isBundle = hi->isBundle();
        SEL *sels = _getObjc2SelectorRefs(hi, &count);
        UnfixedSelectors += count;
        for (i = 0; i < count; i++) {
            const char *name = sel_cname(sels[i]);
            sels[i] = sel_registerNameNoLock(name, isBundle);
        }
    }
}

Inside _getObjc2SelectorRefs:

GETSECT(_getObjc2SelectorRefs,        SEL,             "__objc_selrefs"); 

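This reads the SELs from the __objc_selrefs section of the current Mach-O. Each SEL is then passed to sel_registerNameNoLock(name, isBundle), which registers its name in memory. The next few for (EACH_HEADER) loops fix up special cases, so we skip them.

Now we reach the class-related part: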
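
One way to picture selector registration is string interning: each distinct name is stored once, and every reference to that name is rewritten to the same pointer, so selectors can later be compared by address. The sketch below assumes that model; SelectorTable and fixup_selector_refs are hypothetical and do not reproduce the runtime's actual selector table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical selector table: each distinct name is stored once, and every
// registration of that name returns the same pointer.
class SelectorTable {
public:
    const char *register_name(const char *name) {
        // unordered_set never relocates its elements, so the returned
        // pointer stays valid for the lifetime of the table.
        auto result = names_.insert(std::string(name));
        return result.first->c_str();
    }

private:
    std::unordered_set<std::string> names_;
};

// Rewrite each reference in place to the unique pointer, in the same spirit
// as the loop over sels[] above.
void fixup_selector_refs(SelectorTable &table, const char **refs,
                         std::size_t count) {
    for (std::size_t i = 0; i < count; i++) {
        refs[i] = table.register_name(refs[i]);
    }
}
```

Two references that spell the same name start out as different pointers and end up identical after the fixup.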

    // Realize non-lazy classes (for +load methods and static instances)
    for (EACH_HEADER) {
        ...
        addClassTableEntry(cls);
        ...
        realizeClassWithoutSwift(cls);
    }

A class cls is read from the mapped image and some setup is done, then addClassTableEntry(cls) adds the class to the table. Next comes realizeClassWithoutSwift(cls):

...
// Normal class. Allocate writeable class data.
rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1);
rw->ro = ro;
rw->flags = RW_REALIZED|RW_REALIZING;
cls->setData(rw);
...

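After obtaining ro, the code takes the branch for a normal class: it allocates rw, stores ro in it, and sets it as the class's data. At this point rw has only been initialized; nothing else has been filled in yet. Moving on: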

// Realize superclass and metaclass, if they aren't already.
// This needs to be done after RW_REALIZED is set above, for root classes.
// This needs to be done after class index is chosen, for root metaclasses.
// This assumes that none of those classes have Swift contents,
//   or that Swift's initializers have already been called.
//   fixme that assumption will be wrong if we add support
//   for ObjC subclasses of Swift classes.
supercls = realizeClassWithoutSwift(remapClass(cls->superclass));
metacls = realizeClassWithoutSwift(remapClass(cls->ISA()));
...
// Update superclass and metaclass in case of remapping
cls->superclass = supercls;
cls->initClassIsa(metacls);
...
if (supercls) {
    addSubclass(supercls, cls);
} else {
    addRootClass(cls);
}
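
The recursion in the snippet above terminates because the realized flag is set before the superclass and metaclass are visited, which matters for a root metaclass whose isa points back to itself. A minimal sketch of that idea, using a hypothetical ClassNode struct instead of objc_class:

```cpp
#include <cassert>

// Hypothetical, simplified class record.
struct ClassNode {
    const char *name;
    ClassNode *superclass;  // null for a root class
    ClassNode *metaclass;   // stand-in for the class reached through ISA()
    bool realized;
    int subclassCount;
};

// Realize a class together with its superclass chain and metaclass.
ClassNode *realize(ClassNode *cls) {
    if (cls == nullptr) return nullptr;
    if (cls->realized) return cls;

    // Mark first, so that cycles (root metaclass -> itself) stop here.
    cls->realized = true;

    ClassNode *supercls = realize(cls->superclass);
    ClassNode *metacls = realize(cls->metaclass);

    // Store the results back, then link into the superclass.
    cls->superclass = supercls;
    cls->metaclass = metacls;
    if (supercls) supercls->subclassCount++;
    return cls;
}
```

Realizing a single leaf class is enough to realize its root class, its metaclass, and the root metaclass as well.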

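We know a class is laid out as isa, superclass, cache_t, and bits. So while realizing a class, the runtime recurses to realize its superclass and metaclass in the same way, then sets the class's superclass and metaclass. At the bottom we reach methodizeClass(cls):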

    bool isMeta = cls->isMetaClass();
    auto rw = cls->data();
    auto ro = rw->ro;

    // Methodizing for the first time
    if (PrintConnecting) {
        _objc_inform("CLASS: methodizing class '%s' %s", 
                     cls->nameForLogging(), isMeta ? "(meta)" : "");
    }

    // Install methods and properties that the class implements itself.
    method_list_t *list = ro->baseMethods();
    if (list) {
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls));
        rw->methods.attachLists(&list, 1);
    }

    property_list_t *proplist = ro->baseProperties;
    if (proplist) {
        rw->properties.attachLists(&proplist, 1);
    }

    protocol_list_t *protolist = ro->baseProtocols;
    if (protolist) {
        rw->protocols.attachLists(&protolist, 1);
    }

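ro and rw are taken from the class. The method list, property list, and protocol list are taken from ro and attached to rw. This is why, in the class structure, rw contains what ro has. ro is effectively the original version: dynamic additions all operate on rw, which keeps ro unpolluted.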
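
The split between ro and rw can be illustrated with two small structs: a read-only record holding the compile-time lists, and a read-write record that starts as a copy of them and absorbs later additions. ReadOnlyData, ReadWriteData, methodize, and add_method_at_runtime are hypothetical simplifications; list ordering and the real attachLists layout are not modeled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for class_ro_t and class_rw_t.
struct ReadOnlyData {
    std::vector<std::string> baseMethods;
};

struct ReadWriteData {
    const ReadOnlyData *ro;            // compile-time data, never modified
    std::vector<std::string> methods;  // runtime view: base methods plus additions
};

// Build rw from ro by attaching the base method list.
ReadWriteData methodize(const ReadOnlyData *ro) {
    ReadWriteData rw{ro, {}};
    rw.methods.insert(rw.methods.end(),
                      ro->baseMethods.begin(), ro->baseMethods.end());
    return rw;
}

// Runtime additions touch only rw (ordering is not modeled in this sketch).
void add_method_at_runtime(ReadWriteData &rw, const std::string &name) {
    rw.methods.push_back(name);
}
```

After a runtime addition, rw reports three methods while ro still holds the original two.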