How Category, +load, and +initialize Are Loaded: An Analysis

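Let's start with three questions:

1. Why can't a Category add properties directly?

2. Can a Category have a +load method? When is +load called? Can +load be inherited?

3. What is the difference between +load and +initialize, and in what order are they called when a category overrides them?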

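To answer these questions we need to look at how the runtime initializes classes. Open the runtime source and find _objc_init, which is where the runtime itself is initialized.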

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

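What is an image?

1. Executable: the app's main binary (the linked Mach-O executable, not an individual .o object file)

2. Dylib: a dynamic library (also known as a DSO or DLL)

3. Bundle: a special kind of dylib that cannot be linked against and can only be loaded at runtime with dlopen()

map_images: dyld calls this when images are mapped into the process at launch (not at compile time). Here is the source: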

void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}

1. Stepping into map_images, we see that it calls map_images_nolock(count, paths, mhdrs); step into that function.

2. map_images_nolock(unsigned mhCount, const char * const mhPaths[], const struct mach_header * const mhdrs[]) is very long. Scroll to the bottom, find the call _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses); and step into it.

3. void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses) reads the images, as the name suggests. It is also very long; roughly in the middle we find these comments:

// Discover classes. Fix up unresolved future classes. Mark bundle classes.
// Discover protocols. Fix up protocol refs.
// Discover categories. 
// Category discovery MUST BE LAST to avoid potential races 
    // when other threads call the new category code before 
    // this thread finishes its fixups.

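Roughly speaking, the runtime reads the classes, protocols, and categories from the image into memory through accessors such as _getObjc2ClassList, and maintains tables for them. Let's look at the code under the categories comment: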

    for (EACH_HEADER) {
        category_t **catlist = 
            _getObjc2CategoryList(hi, &count);
        bool hasClassProperties = hi->info()->hasCategoryClassProperties();

        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);

            if (!cls) {
                // Category's target class is missing (probably weak-linked).
                // Disavow any knowledge of this category.
                catlist[i] = nil;
                if (PrintConnecting) {
                    _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                 "missing weak-linked target class", 
                                 cat->name, cat);
                }
                continue;
            }

            // Process this category. 
            // First, register the category with its target class. 
            // Then, rebuild the class's method lists (etc) if 
            // the class is realized. 
            bool classExists = NO;
            if (cat->instanceMethods ||  cat->protocols  
                ||  cat->instanceProperties) 
            {
                addUnattachedCategoryForClass(cat, cls, hi);
                if (cls->isRealized()) {
                    remethodizeClass(cls);
                    classExists = YES;
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category -%s(%s) %s", 
                                 cls->nameForLogging(), cat->name, 
                                 classExists ? "on existing class" : "");
                }
            }

            if (cat->classMethods  ||  cat->protocols  
                ||  (hasClassProperties && cat->_classProperties)) 
            {
                addUnattachedCategoryForClass(cat, cls->ISA(), hi);
                if (cls->ISA()->isRealized()) {
                    remethodizeClass(cls->ISA());
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category +%s(%s)", 
                                 cls->nameForLogging(), cat->name);
                }
            }
        }
    }

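This registers each category with its host class (or metaclass) in a hash table. If the host class (or metaclass) is already realized, its method lists are rebuilt through remethodizeClass. Because category discovery runs last, the host classes have already been discovered by the time categories are registered. Step into remethodizeClass: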

static void remethodizeClass(Class cls)
{
    category_list *cats;
    bool isMeta;

    runtimeLock.assertLocked();

    isMeta = cls->isMetaClass();

    // Re-methodizing: check for more categories
    if ((cats = unattachedCategoriesForClass(cls, false/*not realizing*/))) {
        if (PrintConnecting) {
            _objc_inform("CLASS: attaching categories to class '%s' %s", 
                         cls->nameForLogging(), isMeta ? "(meta)" : "");
        }
        // Attach the categories' methods to cls
        attachCategories(cls, cats, true /*flush caches*/);        
        free(cats);
    }
}

As we can see, attachCategories is what adds the category's methods to the class. Step into attachCategories:

static void 
attachCategories(Class cls, category_list *cats, bool flush_caches)
{
    if (!cats) return;
    if (PrintReplacedMethods) printReplacements(cls, cats);

    bool isMeta = cls->isMetaClass();

    // fixme rearrange to remove these intermediate allocations
    // Array of method lists
    method_list_t **mlists = (method_list_t **)
        malloc(cats->count * sizeof(*mlists));
    // Array of property lists
    property_list_t **proplists = (property_list_t **)
        malloc(cats->count * sizeof(*proplists));
    // Array of protocol lists
    protocol_list_t **protolists = (protocol_list_t **)
        malloc(cats->count * sizeof(*protolists));

    // Count backwards through cats to get newest categories first
    int mcount = 0;
    int propcount = 0;
    int protocount = 0;
    int i = cats->count;
    bool fromBundle = NO;
    while (i--) {
        auto& entry = cats->list[i];

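        // Pick instance methods or class methods depending on isMeta,
        // and collect them into the combined method, property, and protocol arrays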
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            mlists[mcount++] = mlist;
            fromBundle |= entry.hi->isBundle();
        }

        property_list_t *proplist = 
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            proplists[propcount++] = proplist;
        }

        protocol_list_t *protolist = entry.cat->protocols;
        if (protolist) {
            protolists[protocount++] = protolist;
        }
    }
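    // Get the class's read-write data
    auto rw = cls->data();

    // Core step: prepare the lists, then attach every category's method lists
    // (instance methods for a class, class methods for a metaclass)
    // to the class's method lists
    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
    rw->methods.attachLists(mlists, mcount);
    free(mlists);
    if (flush_caches  &&  mcount > 0) flushCaches(cls);

    // Attach every category's properties to the class's property lists
    rw->properties.attachLists(proplists, propcount);
    free(proplists);

    // Attach every category's protocols to the class's protocol lists
    rw->protocols.attachLists(protolists, protocount);
    free(protolists);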
}

Now step into attachLists:

    void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            // Reallocate the array to hold the new total
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount;

            // memmove(dst, src, n) copies n bytes from src to dst. Here it
            // shifts the class's existing lists back by addedCount slots.
            memmove(array()->lists + addedCount, 
                              array()->lists, 
                              oldCount * sizeof(array()->lists[0]));

            // memcpy(dst, src, n) likewise copies from src to dst. Here it
            // places addedLists in the freed slots at the front of the array.
            memcpy( array()->lists, 
                            addedLists, 
                            addedCount * sizeof(array()->lists[0]));
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        } 
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
    }

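At this point the categories' method lists have been merged into the class's method lists. The merge also shows that when a category and its class define the same method, the class's method is not overwritten. The category's list is simply placed in front of the class's list, so method lookup finds the category's method first and that is the one that runs.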
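The prepend-then-lookup behaviour can be sketched outside the runtime. This is a simplified model, not the runtime's code: MethodList and ownerOf are hypothetical names invented for the illustration, and a std::vector stands in for the runtime's array_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for one method_list_t: selector names plus the owner
// that supplied them (the class itself or one of its categories).
struct MethodList {
    std::string owner;
    std::vector<std::string> selectors;
};

// Simplified model of attachLists: grow the array, shift the existing lists
// back by addedCount slots, then copy the new lists into the front.
void attachLists(std::vector<const MethodList*>& lists,
                 const MethodList* const* addedLists, std::size_t addedCount) {
    if (addedCount == 0) return;
    assert(addedLists != nullptr);
    std::size_t oldCount = lists.size();
    lists.resize(oldCount + addedCount);
    // Source and destination overlap, hence memmove rather than memcpy.
    std::memmove(lists.data() + addedCount, lists.data(),
                 oldCount * sizeof(lists[0]));
    std::memcpy(lists.data(), addedLists, addedCount * sizeof(lists[0]));
}

// Lookup in list order: the first list that contains the selector wins.
std::string ownerOf(const std::vector<const MethodList*>& lists,
                    const std::string& sel) {
    for (const MethodList* ml : lists)
        for (const std::string& s : ml->selectors)
            if (s == sel) return ml->owner;
    return "";
}
```

Attaching the class's own list first and a category's list second leaves the category's list at index 0, so a selector defined in both resolves to the category while the class's entry is still present further back.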

Once all the methods, protocols, and categories have been loaded, the +load phase begins.

Next, let's step into load_images:

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}

Here prepare_load_methods runs first, and then call_load_methods is called. Step into prepare_load_methods:

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();

    // Get the non-lazy classes (those with +load) and schedule them, superclasses first
    classref_t *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        schedule_class_load(remapClass(classlist[i]));
    }

    // Get the non-lazy categories (those with +load) and append them to the loadable list
    category_t **categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        realizeClass(cls);
        assert(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat);
    }
}

First, the class side:

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    assert(cls->isRealized());  // _read_images should realize

    // Already queued: return
    if (cls->data()->flags & RW_LOADED) return;

    // Ensure superclass-first ordering
    // Recursively schedule the superclass, all the way up to the root class (NSObject)
    schedule_class_load(cls->superclass);

    // Append the class to the loadable classes list
    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED); 
}

Then the category side:

    // Get the category list, in compile order
    category_t **categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);

    // Append each category to the loadable categories list
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        realizeClass(cls);
        assert(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat);
    }

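From this we can conclude that when a class is about to be added to the loadable classes array, the runtime first walks up to its superclass and adds that first. Because a superclass always precedes its subclasses in the array, a superclass's +load always runs before its subclass's +load. The order for categories is simple: whichever category was compiled first has its +load called first. call_load_methods then runs every queued class +load before any category +load.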
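The ordering rules can be modelled in a few lines. This is a sketch under assumptions: Cls, scheduleClassLoad, and loadOrder are hypothetical names, every class in the model is assumed to implement +load, and the category names stand for categories listed in compile order.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class: a name, its superclass, and a flag
// playing the role of RW_LOADED.
struct Cls {
    std::string name;
    Cls* superclass;
    bool loaded;

    explicit Cls(std::string n, Cls* s = nullptr)
        : name(std::move(n)), superclass(s), loaded(false) {}
};

// Mirrors schedule_class_load: recurse into the superclass first, then queue
// this class, and mark it so it is never queued twice.
void scheduleClassLoad(Cls* cls, std::vector<std::string>& loadable) {
    if (!cls) return;
    if (cls->loaded) return;
    assert(cls->superclass != cls);  // sanity check for the model only
    scheduleClassLoad(cls->superclass, loadable);
    loadable.push_back("+[" + cls->name + " load]");
    cls->loaded = true;
}

// Overall order: all class +load calls come before any category +load, and
// categories keep the order in which they were listed.
std::vector<std::string> loadOrder(const std::vector<Cls*>& classes,
                                   const std::vector<std::string>& categories) {
    std::vector<std::string> calls;
    for (Cls* cls : classes) scheduleClassLoad(cls, calls);
    for (const std::string& cat : categories)
        calls.push_back("+[" + cat + " load]");
    return calls;
}
```

Even if a subclass appears before its superclass in the class list, the recursion queues the superclass first, while the category order is left exactly as given.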

The +initialize method

void _class_initialize(Class cls)
{
    assert(!cls->isMetaClass());

    Class supercls;
    bool reallyInitialize = NO;

    // Make sure super is done initializing BEFORE beginning to initialize cls.
    // See note about deadlock above.
    // The superclass's +initialize is triggered first (recursively), then this
    // class's own, and the runtime triggers it only once per class.
    supercls = cls->superclass;
    if (supercls  &&  !supercls->isInitialized()) {
        _class_initialize(supercls);
    }

    // Try to atomically set CLS_INITIALIZING.
    {
        monitor_locker_t lock(classInitLock);
        if (!cls->isInitialized() && !cls->isInitializing()) {
            cls->setInitializing();
            reallyInitialize = YES;
        }
    }
    
    if (reallyInitialize) {
        // We successfully set the CLS_INITIALIZING bit. Initialize the class.
        
        // Record that we're initializing this class so we can message it.
        _setThisThreadIsInitializingClass(cls);

        if (MultithreadedForkChild) {
            // LOL JK we don't really call +initialize methods after fork().
            performForkChildInitialize(cls, supercls);
            return;
        }
        
        // Send the +initialize message.
        // Note that +initialize is sent to the superclass (again) if 
        // this class doesn't implement +initialize. 2157218
        if (PrintInitializing) {
            _objc_inform("INITIALIZE: thread %p: calling +[%s initialize]",
                         pthread_self(), cls->nameForLogging());
        }

        // Exceptions: A +initialize call that throws an exception 
        // is deemed to be a complete and successful +initialize.
        //
        // Only __OBJC2__ adds these handlers. !__OBJC2__ has a
        // bootstrapping problem of this versus CF's call to
        // objc_exception_set_functions().
#if __OBJC2__
        @try
#endif
        {
            callInitialize(cls);
            // ... (the rest of the function is omitted from this excerpt)

Here is the implementation of callInitialize:

void callInitialize(Class cls)
{
    ((void(*)(Class, SEL))objc_msgSend)(cls, SEL_initialize);
    asm("");
}

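How +initialize gets called:

1. Check whether this class has already been initialized. If it has, return without doing anything.

2. If it has not, recursively check whether its superclass has been initialized and initialize that first if necessary, then send +initialize to this class. The message is sent through objc_msgSend().

+initialize is called the first time a class receives a message, that is, on the objc_msgSend() path. If we override a class's +initialize, we find that it does not run at launch the way +load does. It runs only when the class is first used.

A major difference between +initialize and +load is that +initialize is invoked through objc_msgSend, which has two consequences. If a subclass does not implement +initialize, the superclass's +initialize is called on its behalf (so the superclass's +initialize may run more than once). If a category implements +initialize, it takes the place of the class's own +initialize.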
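Why a superclass's +initialize can run more than once is easier to see in a small model. The sketch below is an approximation, not the runtime's code: Cls and the log strings are hypothetical, locking and the separate "initializing" state are left out, and callInitialize imitates message dispatch by walking up the hierarchy to the first implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class in this sketch.
struct Cls {
    std::string name;
    Cls* superclass;
    bool implementsInitialize;  // does this class define its own +initialize?
    bool initialized;

    Cls(std::string n, Cls* s, bool impl)
        : name(std::move(n)), superclass(s),
          implementsInitialize(impl), initialized(false) {}
};

// Imitates objc_msgSend(cls, initialize): run the first implementation found
// while walking up from cls, and log whose implementation ran for which class.
void callInitialize(Cls* cls, std::vector<std::string>& log) {
    for (Cls* c = cls; c; c = c->superclass) {
        if (c->implementsInitialize) {
            log.push_back(c->name + " impl for " + cls->name);
            return;
        }
    }
}

// Imitates _class_initialize: superclass first, then this class, once each.
void classInitialize(Cls* cls, std::vector<std::string>& log) {
    assert(cls != nullptr);
    if (cls->superclass && !cls->superclass->initialized)
        classInitialize(cls->superclass, log);
    if (cls->initialized) return;
    cls->initialized = true;
    callInitialize(cls, log);
}
```

With a Person that implements +initialize and a Student subclass that does not, initializing Student logs Person's implementation twice: once for Person itself and once on behalf of Student. Initializing Student again logs nothing.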

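Why can't a category add properties directly?

A category works by having the runtime dynamically add the category's methods (and its other lists) to the class. No instance variable is added for a property declared in a category, so the property only declares a setter and a getter without implementing them.

The Objective-C runtime does provide a function, class_addIvar(), for adding an instance variable to a class, but the documentation specifically states: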

This function may only be called after objc_allocateClassPair and before objc_registerClassPair. Adding an instance variable to an existing class is not supported.

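In other words, this function can only be called while a class is being constructed. Once the class definition is complete, no more instance variables can be added. Compiled classes are loaded by the runtime when the program starts, which leaves no opportunity to call class_addIvar. Classes built dynamically at runtime can only be used after objc_registerClassPair has been called, so there is likewise no chance to add instance variables afterwards.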
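The "only between allocate and register" rule can be illustrated with a toy model. Everything here is hypothetical (ClassModel, addIvar, registerClass); it only imitates the documented contract of class_addIvar and objc_registerClassPair and does not call the real runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a class under construction; not the runtime's types.
struct ClassModel {
    std::vector<std::string> ivars;
    std::size_t instanceSize = 0;
    bool registered = false;  // set by the registerClass analogue below
};

// Analogue of class_addIvar: succeeds only while the class is unregistered,
// because the instance layout cannot change once the class is in use.
bool addIvar(ClassModel& cls, const std::string& name, std::size_t size) {
    assert(size > 0);
    if (cls.registered) return false;
    cls.ivars.push_back(name);
    cls.instanceSize += size;
    return true;
}

// Analogue of objc_registerClassPair: freezes the instance layout.
void registerClass(ClassModel& cls) { cls.registered = true; }
```

In this model an ivar added before registerClass is accepted and grows instanceSize, while the same call after registration returns false and leaves the layout untouched. A category is always attached to a class that is already defined, which is the situation the second call represents.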

Let's compare the structures of objc_class and category_t:

struct objc_class {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;

#if !__OBJC2__
    Class super_class                         OBJC2_UNAVAILABLE;  // superclass
    const char *name                          OBJC2_UNAVAILABLE;  // class name
    long version                              OBJC2_UNAVAILABLE;  // class version, 0 by default
    long info                                 OBJC2_UNAVAILABLE;  // class info: bit flags used at runtime
    long instance_size                        OBJC2_UNAVAILABLE;  // size of the class's instance variables

    struct objc_ivar_list *ivars              OBJC2_UNAVAILABLE;  // list of instance variables
    struct objc_method_list **methodLists     OBJC2_UNAVAILABLE;  // list of method definitions
    struct objc_cache *cache                  OBJC2_UNAVAILABLE;  // method cache
    struct objc_protocol_list *protocols      OBJC2_UNAVAILABLE;  // list of protocols
#endif

} OBJC2_UNAVAILABLE;

struct category_t {
    const char *name;
    classref_t cls;
    struct method_list_t *instanceMethods;      // instance methods
    struct method_list_t *classMethods;         // class methods
    struct protocol_list_t *protocols;          // protocols
    struct property_list_t *instanceProperties; // properties
    // Fields below this point are not always present on disk.
    struct property_list_t *_classProperties;

    method_list_t *methodsForMeta(bool isMeta) {
        if (isMeta) return classMethods;
        else return instanceMethods;
    }

    property_list_t *propertiesForMeta(bool isMeta, struct header_info *hi);
};

As we can see, category_t has nowhere to store instance variables, so a property added in a category is incomplete. Let's write some code:

@interface Person (Category) {
    NSString * _name;      // Error: Instance variables may not be placed in categories
}

@property (copy,nonatomic) NSString * name;

@end
@implementation Person (Category)

@end

// Call site
    Person * person = [[Person alloc] init];
    person.name = @"nick";
  
// Crashes at runtime:
2019-03-06 09:44:40.193970+0800 Demo[61353:6422033] -[Person setName:]: unrecognized selector sent to instance 0x600003dc2520
2019-03-06 09:44:40.198534+0800 Demo[61353:6422033] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[Person setName:]: unrecognized selector sent to instance 0x600003dc2520'

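So a property in a category does not synthesize an underscore-prefixed instance variable, and it does not implement the setter and getter either. It only declares them. Let's try adding the setter and getter ourselves: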

- (void)setName:(NSString *)name {
    // There is no backing instance variable to assign to
}
- (NSString *)name {
    return @"123";
}

// Call site
    Person * person = [[Person alloc] init];
    person.name = @"nick";
    NSLog(@"name = %@",person.name);

// Output:
2019-03-06 09:48:43.086810+0800 Demo[61450:6426832] name = 123

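So the assignment did not take effect. A normal setter assigns to _name and the getter returns _name. Since a category has nowhere to store _name, how can we make this work? We'll explore that in a later article.

With that, the three questions from the start of the article are all answered.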