load and initialize

We all know that the entry point of every app is the main function. So what runs before main is called?

dyld

All system frameworks used on iOS are dynamically linked. At run time we can use the following command in lldb

image list -o -f

to list the dynamic libraries the project links against.

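Most of these libs are in dylib format. Dynamic linking gives the system several advantages:

  • Code sharing: many programs link against these libs dynamically, yet only one copy exists in memory and on disk
  • Easy maintenance: a dependent lib is linked only when the program runs, so these libs are easy to update. For example, libSystem.dylib is a stand-in for libSystem.B.dylib; to upgrade one day, the system can switch to libSystem.C.dylib and then repoint the stand-in
  • Smaller executables: unlike static linking, dynamically linked libs are not packed into the binary at build time, so the executable is much smaller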

(Excerpted from Sun Yuan (孙源).)

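In iOS reverse engineering, if we want to run our own cracked app on a non-jailbroken device, what it comes down to is injecting a dynamic library we wrote into the app and then re-signing the app.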

ImageLoader: loading image files

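Once the dynamic libraries are loaded, it is time to load the binary compiled from our own code. This is the job of the ImageLoaderXXXXXX family of methods. These images contain the symbols, code, and so on that we wrote ourselves.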

_objc_init

Runtime initialization

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    _dyld_objc_notify_register(&map_2_images, load_images, unmap_image);
}

load

Let's look at load_images first.

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        rwlock_writer_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}

This function calls prepare_load_methods:

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertWriting();

    classref_t *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        schedule_class_load(remapClass(classlist[i])); // add every class that implements +load to the static array loadable_classes
    }

    category_t **categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        realizeClass(cls);
        assert(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat); // add every category that implements +load to the static array loadable_categories
    }
}

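_getObjc2NonlazyCategoryList returns every category that implements +load (the non-lazy categories). For each one, the runtime checks whether the category's class is nil: if it is, the category is skipped; otherwise the class is realized and the category is added to a static array.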

We also saw a call to schedule_class_load:

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    assert(cls->isRealized());  // _read_images should realize

    if (cls->data()->flags & RW_LOADED) return;

    // Ensure superclass-first ordering
    schedule_class_load(cls->superclass); // schedule the superclass ahead of the class itself

    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED); 
}

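From this function we can see that it adds each class that implements +load to a static array, with the superclass scheduled ahead of the subclass.
Next, call_load_methods is called. Judging by its name, it is what actually invokes the +load methods.
Here is a question: when a class implements +load and one of its categories also implements +load, which one does the system call?
Let's go straight to the code.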

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) { // class +load methods are called first
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads(); // then the category +load methods

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

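This is the implementation of call_load_methods. The code shows that the system calls both the category's +load and the class's own +load, and that the class's +load runs before the category's.
What if several categories implement +load? Which one runs first? That depends on the build: the category whose source file is compiled earlier has its +load called earlier.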
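To make the ordering concrete, here is a small C++ model of what schedule_class_load and call_load_methods do together. It is only a sketch under stated assumptions: LoadModelClass, scheduleClassLoad, loadOrder, and demoLoadOrder are hypothetical names invented for this illustration, not runtime APIs, and the model ignores locking, re-entrancy, and images loaded while +load is running.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a class in this model (not the runtime's objc_class).
struct LoadModelClass {
    std::string name;
    LoadModelClass *superclass;
    bool implementsLoad;
    bool scheduled;  // plays the role of the RW_LOADED flag
};

// Mirrors schedule_class_load: recurse into the superclass before queuing the class.
// Only classes that implement +load end up in the queue.
void scheduleClassLoad(LoadModelClass *cls, std::vector<std::string> &loadableClasses) {
    if (!cls || cls->scheduled) return;
    scheduleClassLoad(cls->superclass, loadableClasses);
    if (cls->implementsLoad) loadableClasses.push_back(cls->name);
    cls->scheduled = true;
}

// Mirrors the order used by call_load_methods: all queued classes, then categories.
std::vector<std::string> loadOrder(const std::vector<LoadModelClass *> &nonlazyClasses,
                                   const std::vector<std::string> &nonlazyCategories) {
    std::vector<std::string> loadableClasses;
    for (LoadModelClass *cls : nonlazyClasses) {
        scheduleClassLoad(cls, loadableClasses);
    }

    std::vector<std::string> calls;
    for (const std::string &name : loadableClasses) {
        calls.push_back("+[" + name + " load]");
    }
    for (const std::string &name : nonlazyCategories) {
        calls.push_back("+[" + name + " load]");
    }
    return calls;
}

// Child is listed before Parent, yet Parent's +load still comes out first.
std::vector<std::string> demoLoadOrder() {
    LoadModelClass parent{"Parent", nullptr, true, false};
    LoadModelClass child{"Child", &parent, true, false};
    std::vector<LoadModelClass *> classes;
    classes.push_back(&child);
    classes.push_back(&parent);
    std::vector<std::string> categories;
    categories.push_back("Child(Extras)");
    categories.push_back("Parent(Extras)");
    return loadOrder(classes, categories);
}
```

In this model demoLoadOrder() yields Parent, Child, then the two categories in the order they were listed, which matches the rules read off the runtime source: superclass before subclass, classes before categories, categories in list order.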

Now let's look at map_2_images.

void
map_2_images(unsigned count, const char * const paths[],
             const struct mach_header * const mhdrs[])
{
    recursive_mutex_locker_t lock(loadMethodLock);
    map_images_nolock(count, paths, mhdrs);
}

It turns out to call map_images_nolock:

void 
map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    static bool firstTime = YES;
    header_info *hList[mhCount];
    uint32_t hCount;
    size_t selrefCount = 0;
   ....
   if (hCount > 0) {
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
    }
}

Part of map_images_nolock is omitted above. The omitted code includes:

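header_info *hList[mhCount]; ==> the information read from each image is collected into an array of header_info pointers
preopt_init ==> initializes the shared-cache optimizations
sel_init ==> initializes the selector table
arr_init ==> initializes the autorelease pool and the side tables (hash tables)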

map_images_nolock then calls _read_images:

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    header_info *hi;
    uint32_t hIndex;
    size_t count;
    size_t i;
    Class *resolvedFutureClasses = nil;
    size_t resolvedFutureClassCount = 0;
    static bool doneOnce;
    TimeLogger ts(PrintImageTimes);

    runtimeLock.assertWriting();
  for (EACH_HEADER) {
        classref_t *classlist = 
            _getObjc2NonlazyClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;
            // hack for class __ARCLite__, which didn't get this above
#if TARGET_OS_SIMULATOR
            if (cls->cache._buckets == (void*)&_objc_empty_cache  &&  
                (cls->cache._mask  ||  cls->cache._occupied)) 
            {
                cls->cache._mask = 0;
                cls->cache._occupied = 0;
            }
            if (cls->ISA()->cache._buckets == (void*)&_objc_empty_cache  &&  
                (cls->ISA()->cache._mask  ||  cls->ISA()->cache._occupied)) 
            {
                cls->ISA()->cache._mask = 0;
                cls->ISA()->cache._occupied = 0;
            }
#endif

            realizeClass(cls);
			objc_class *yty_cls = (objc_class*)cls;
			class_rw_t *yty_rw_t =  yty_cls->data();
			const class_ro_t *yty_ro_t =  yty_rw_t->ro;
			printf("Class: %s \n",yty_ro_t->name);
        }
    }

 // Discover categories. 
    for (EACH_HEADER) {
        category_t **catlist = 
            _getObjc2CategoryList(hi, &count);
        bool hasClassProperties = hi->info()->hasCategoryClassProperties();

        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);
			printf("Category: %s \n",cat->name); //yty fax
            if (!cls) {
                // Category's target class is missing (probably weak-linked).
                // Disavow any knowledge of this category.
                catlist[i] = nil;
                if (PrintConnecting) {
                    _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                 "missing weak-linked target class", 
                                 cat->name, cat);
                }
                continue;
			}

            // Process this category. 
            // First, register the category with its target class. 
            // Then, rebuild the class's method lists (etc) if 
            // the class is realized. 
            bool classExists = NO;
            if (cat->instanceMethods ||  cat->protocols  
                ||  cat->instanceProperties) 
            {
                addUnattachedCategoryForClass(cat, cls, hi);
                if (cls->isRealized()) {
                    remethodizeClass(cls);
                    classExists = YES;
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category -%s(%s) %s", 
                                 cls->nameForLogging(), cat->name, 
                                 classExists ? "on existing class" : "");
                }
            }

            if (cat->classMethods  ||  cat->protocols  
                ||  (hasClassProperties && cat->_classProperties)) 
            {
                addUnattachedCategoryForClass(cat, cls->ISA(), hi);
                if (cls->ISA()->isRealized()) {
                    remethodizeClass(cls->ISA());
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category +%s(%s)", 
                                 cls->nameForLogging(), cat->name);
                }
            }
        }
    }
}

_read_images is very long, so only part of it is shown here.

GETSECT(_getObjc2ClassList,           classref_t,      "__objc_classlist"); // all classes registered in the image
GETSECT(_getObjc2NonlazyClassList,    classref_t,      "__objc_nlclslist"); // all non-lazy classes (those that implement +load)
GETSECT(_getObjc2CategoryList,        category_t *,    "__objc_catlist");   // all categories registered in the image
GETSECT(_getObjc2NonlazyCategoryList, category_t *,    "__objc_nlcatlist"); // all non-lazy categories (those that implement +load)

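As shown above, non-lazy classes and categories (those that implement +load) are realized in memory at this point, whereas +initialize is called only when the class is first used.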

initialize

As mentioned above, initialize is called only the first time the class receives a message.

Here is the call stack at the moment initialize is invoked:

0 +[XXObject initialize]
1 _class_initialize
2 lookUpImpOrForward
3 _class_lookupMethodAndLoadCache3
4 objc_msgSend

Let's look at the implementation of lookUpImpOrForward (excerpt):

IMP lookUpImpOrForward(Class cls, SEL sel, id inst, 
                       bool initialize, bool cache, bool resolver)
{
    Class curClass;
    IMP imp = nil;
    Method meth;
    bool triedResolver = NO;

    runtimeLock.assertUnlocked();

    // Optimistic cache lookup
    if (cache) {
        imp = cache_getImp(cls, sel);
        if (imp) return imp;
    }

    if (!cls->isRealized()) {
        rwlock_writer_t lock(runtimeLock);
        realizeClass(cls);
    }

    if (initialize  &&  !cls->isInitialized()) {
        _class_initialize (_class_getNonMetaClass(cls, inst));
        // If sel == initialize, _class_initialize will send +initialize and 
        // then the messenger will send +initialize again after this 
        // procedure finishes. Of course, if this is not being called 
        // from the messenger then it won't happen. 2778172
    }
}

We can see that _class_initialize is called when initialize && !cls->isInitialized() holds. The initialize argument passed in here is true.

    bool isInitialized() {
        return getMeta()->data()->flags & RW_INITIALIZED;
    }

isInitialized() checks whether the class has already been initialized; the flag is stored on the metaclass.
Next, let's look at _class_initialize:

void _class_initialize(Class cls)
{
    assert(!cls->isMetaClass());

    Class supercls;
    bool reallyInitialize = NO;

    // Make sure super is done initializing BEFORE beginning to initialize cls.
    // See note about deadlock above.
    supercls = cls->superclass;
    // supercls->isInitialized() checks whether +initialize has already run:
    // getMeta()->data()->flags & RW_INITIALIZED, stored in the flags of the metaclass's class_rw_t
    // 1. Always initialize the superclass first
    if (supercls  &&  !supercls->isInitialized()) {
        _class_initialize(supercls);
    }
    
    // Try to atomically set CLS_INITIALIZING.
    {
        monitor_locker_t lock(classInitLock);
        // 2. If the class is neither initialized nor currently initializing, set the flag
        if (!cls->isInitialized() && !cls->isInitializing()) {
            cls->setInitializing();
            reallyInitialize = YES;
        }
    }
    
    if (reallyInitialize) {
        // 3. Once the flag is set, send +initialize to the class
        if (PrintInitializing) {
            _objc_inform("INITIALIZE: calling +[%s initialize]",
                         cls->nameForLogging());
        }
        @try {
            callInitialize(cls);

            if (PrintInitializing) {
                _objc_inform("INITIALIZE: finished +[%s initialize]",
                             cls->nameForLogging());
            }
        }
        @catch (...) {
            if (PrintInitializing) {
                _objc_inform("INITIALIZE: +[%s initialize] threw an exception",
                             cls->nameForLogging());
            }
            @throw;
        }
        @finally {

            // 4. Finish initialization: if the superclass is already initialized, set RW_INITIALIZED;
            //    otherwise set it after the superclass finishes initializing.
            if (!supercls  ||  supercls->isInitialized()) {
                _finishInitializing(cls, supercls);
            } else {
                _finishInitializingAfter(cls, supercls);
            }
        }
        return;
    }
    
    else if (cls->isInitializing()) {
        // 5. If this thread is the one initializing the class, return at once;
        //    otherwise wait for the other thread to finish, then return
        if (_thisThreadIsInitializingClass(cls)) {
            return;
        } else {
            waitForInitializeToComplete(cls);
            return;
        }
    }
    
    else if (cls->isInitialized()) {
        // 6. Already initialized: return
        return;
    }
    
    else {
        // We shouldn't be here. 
        _objc_fatal("thread-safe class init in objc runtime is buggy!");
    }
}

1. First, the superclass is always initialized before the class itself

    if (supercls  &&  !supercls->isInitialized()) {
        _class_initialize(supercls);
    }

2. If the class is neither initialized nor currently initializing, the flag is set

        monitor_locker_t lock(classInitLock);
        if (!cls->isInitialized() && !cls->isInitializing()) {
            cls->setInitializing();
            reallyInitialize = YES;
        }

3. Once the flag is set, +initialize is sent to the class

 callInitialize(cls);
 void callInitialize(Class cls)
{
    ((void(*)(Class, SEL))objc_msgSend)(cls, SEL_initialize);
    asm("");
}

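Here we can see that SEL_initialize is sent through objc_msgSend, so if a category also implements initialize, only the category's initialize is called.

4. Finish initialization: if the superclass is already initialized, set the RW_INITIALIZED flag; otherwise set it after the superclass finishes initializing.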

      if (!supercls  ||  supercls->isInitialized()) {
                _finishInitializing(cls, supercls);
            } else {
                _finishInitializingAfter(cls, supercls);
            }

5. If the current thread is the one initializing this class, return at once; otherwise wait for the other thread to finish, then return

  if (_thisThreadIsInitializingClass(cls)) {
            return;
        } else {
            waitForInitializeToComplete(cls);
            return;
        }

6. If the class is already initialized, return

if (cls->isInitialized()) {
   return;
  }
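The six steps can be condensed into a single-threaded C++ model. This is a sketch, not the runtime's code: InitModelClass, classInitialize, sendMessage, and demoInitialize are hypothetical names made up for this illustration, the lock and the cross-thread wait are left out, and every class is assumed to implement its own +initialize.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a class and its two state flags.
struct InitModelClass {
    std::string name;
    InitModelClass *superclass;
    bool initializing;
    bool initialized;

    InitModelClass(const std::string &n, InitModelClass *s)
        : name(n), superclass(s), initializing(false), initialized(false) {}
};

// Single-threaded model of _class_initialize.
void classInitialize(InitModelClass *cls, std::vector<std::string> &log) {
    // Step 1: the superclass is initialized first.
    if (cls->superclass && !cls->superclass->initialized) {
        classInitialize(cls->superclass, log);
    }
    // Steps 5 and 6: already running or already done, so there is nothing to do.
    if (cls->initialized || cls->initializing) return;
    // Step 2: claim the class.
    cls->initializing = true;
    // Step 3: "send" +initialize.
    log.push_back("+[" + cls->name + " initialize]");
    // Step 4: mark the class as initialized.
    cls->initializing = false;
    cls->initialized = true;
}

// Model of the check in lookUpImpOrForward: it runs on the message-send path.
void sendMessage(InitModelClass *cls, const std::string &selector,
                 std::vector<std::string> &log) {
    if (!cls->initialized) classInitialize(cls, log);
    log.push_back("+[" + cls->name + " " + selector + "]");
}

// Two sends to Child: +initialize runs once, superclass first.
std::vector<std::string> demoInitialize() {
    InitModelClass parent("Parent", nullptr);
    InitModelClass child("Child", &parent);
    std::vector<std::string> log;
    sendMessage(&child, "new", log);
    sendMessage(&child, "new", log);
    return log;
}
```

In this model the first message to Child triggers Parent's initialize, then Child's, and the second message triggers neither.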

Summary

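1. initialize is called lazily: it runs the first time a message is sent to the class. load is called through load_images, which _objc_init registers during runtime initialization
2. Whether initialize is called has nothing to do with whether the class has been loaded. If a class implements both +load and +initialize, +load is called before main, and initialize is called the first time the class is used
3. If a category and its class both implement initialize, only the category's is called; if both implement load, both are called
4. Unlike load, by the time initialize is called, all classes have already been loaded into memory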
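Point 3 comes down to how the two methods are invoked. initialize goes through objc_msgSend, so ordinary method lookup applies and the first matching entry in the method list wins. load is called directly through each implementation's function pointer, so nothing is shadowed. The C++ sketch below models only the lookup side. It assumes that a category's methods are placed ahead of the class's own when the category is attached. LookupMethod, LookupMetaClass, attachCategory, lookUp, and demoCategoryInitialize are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical method entry: the selector plus a label saying who implemented it.
struct LookupMethod {
    std::string selector;
    std::string owner;
};

// Hypothetical metaclass holding class methods in lookup order.
struct LookupMetaClass {
    std::deque<LookupMethod> methods;
};

// Assumption of this model: category methods go in front of the class's own methods.
void attachCategory(LookupMetaClass &cls, const std::deque<LookupMethod> &categoryMethods) {
    cls.methods.insert(cls.methods.begin(), categoryMethods.begin(), categoryMethods.end());
}

// Model of a message-send lookup: the first entry with a matching selector wins.
std::string lookUp(const LookupMetaClass &cls, const std::string &selector) {
    for (const LookupMethod &m : cls.methods) {
        if (m.selector == selector) return m.owner;
    }
    return "";
}

// Person and its category both implement initialize; lookup finds the category's.
std::string demoCategoryInitialize() {
    LookupMetaClass person;
    person.methods.push_back(LookupMethod{"initialize", "Person"});

    std::deque<LookupMethod> categoryMethods;
    categoryMethods.push_back(LookupMethod{"initialize", "Person(Extras)"});
    attachCategory(person, categoryMethods);

    return lookUp(person, "initialize");
}
```

Here demoCategoryInitialize() returns the category's label, because the class's own entry is still in the list but is never reached by the lookup.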

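References:
你真的了解 load 方法么? (Do you really understand the load method?)
懒惰的 initialize 方法 (The lazy initialize method)
iOS 程序 main 函数之前发生了什么 (What happens before the main function of an iOS app)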