iOS Internals: Analyzing objc_init() and read_images


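Preface

Our code is compiled into a Mach-O executable. When is the class information in that file loaded into memory? What is a category, and when are a category's methods attached to its class? With these questions in mind, we will work through the analysis step by step.

As we saw when studying dyld, _objc_init() makes one particularly important call, which registers the following callbacks: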

1.png

  • map_images: manages all symbols in the executable and its dynamic libraries, and loads classes, selectors, protocols, and categories;
  • load_images: runs the +load methods.

1. objc_init() flow analysis

We have already covered when _objc_init() is called. Its source is:

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}

1.environ_init()

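Reads the environment variables that affect the runtime. With a small change to the source we can print them. Add the code shown in the figure: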

2.png

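The console then lists the environment variables, for example:

  • OBJC_DISABLE_NONPOINTER_ISA: whether to disable the non-pointer isa optimization
  • OBJC_PRINT_LOAD_METHODS: whether to log calls to +load methods

For example, we can print an object's isa pointer, which points to its class, in binary.

The lowest bit is 1, so the isa optimization is enabled. Now set the environment variable OBJC_DISABLE_NONPOINTER_ISA to YES, as shown in the figure below: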

4.png

Run the program again and print the isa pointer once more, in binary:

5.png

The lowest bit is 0, so the isa pointer is no longer optimized.
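The "lowest bit" check can also be written as code. This is a minimal sketch, assuming the non-pointer flag lives in bit 0 of the isa word as the binary dumps above suggest; `looksLikeNonpointerIsa` and `looksLikeRawIsa` are hypothetical helper names, not runtime API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports whether bit 0 of a raw isa word is set.
// Assumption: bit 0 is the "non-pointer" flag, as in the dumps above.
inline bool looksLikeNonpointerIsa(std::uint64_t rawIsa) {
    return (rawIsa & 1u) != 0;
}

// Hypothetical helper: a raw isa is a plain, aligned class pointer,
// so under the assumption above its lowest bit is 0.
inline bool looksLikeRawIsa(std::uint64_t rawIsa) {
    return !looksLikeNonpointerIsa(rawIsa);
}
```

Passing the value printed in the debugger to `looksLikeNonpointerIsa` gives the same answer as reading the last binary digit by eye.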

We can also use the OBJC_PRINT_LOAD_METHODS environment variable to log +load methods: add it to the scheme and set it to YES.
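How such a YES/NO option might be read from the environment can be sketched as follows. This is an illustration only, assuming that exactly the value YES switches an option on; `envValueIsYes` and `envOptionEnabled` are hypothetical names, not functions from the runtime.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true only when the value is exactly "YES".
// Assumption: any other value, or an unset variable, leaves the option off.
inline bool envValueIsYes(const char *value) {
    return value != nullptr && std::strcmp(value, "YES") == 0;
}

// Hypothetical helper: looks an option up in the process environment.
inline bool envOptionEnabled(const char *name) {
    assert(name != nullptr);
    return envValueIsYes(std::getenv(name));
}
```

With the scheme setting described above, `envOptionEnabled("OBJC_PRINT_LOAD_METHODS")` would return true inside that process.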

2.tls_init()

Initializes thread-local storage (TLS) by binding the per-thread keys.

3.static_init()

Runs the C++ static constructors. libc calls _objc_init() before dyld would call our static constructors, so we have to run them ourselves.

static void static_init()
{
    size_t count;
    auto inits = getLibobjcInitializers(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        inits[i]();
    }
}
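The loop in static_init() is the usual "array of function pointers" pattern. Below is a minimal sketch of that pattern on its own; `Initializer`, `runInitializers`, `initA`, `initB`, and `initLog` are hypothetical names for illustration, and the array is built by hand rather than read from a Mach-O section.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the initializer function type.
using Initializer = void (*)();

// Runs `count` initializers in order, like the loop in static_init().
inline void runInitializers(const Initializer *inits, std::size_t count) {
    assert(inits != nullptr || count == 0);
    for (std::size_t i = 0; i < count; i++) {
        inits[i]();
    }
}

// Two toy initializers that record the order in which they ran.
inline int &initLog() { static int log = 0; return log; }
inline void initA() { initLog() = initLog() * 10 + 1; }
inline void initB() { initLog() = initLog() * 10 + 2; }
```

Calling `runInitializers` with the array `{ initA, initB }` runs them in array order, so the log ends up recording 1 followed by 2.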

4.runtime_init()

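  • unattachedCategories.init: initializes the table of categories that have not been attached yet
  • allocatedClasses.init: creates the in-memory table of allocated classes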
void runtime_init(void)
{
    objc::unattachedCategories.init(32);
    objc::allocatedClasses.init();
}

5.exception_init()

Initializes the objc exception-handling system and installs the callback used to catch and handle exceptions.

/***********************************************************************
* exception_init
* Initialize libobjc's exception handling system.
* Called by map_images().
**********************************************************************/
void exception_init(void)
{
    old_terminate = std::set_terminate(&_objc_terminate);
}

static void (*old_terminate)(void) = nil;
static void _objc_terminate(void)
{
    if (PrintExceptions) {
        _objc_inform("EXCEPTIONS: terminating");
    }

    if (! __cxa_current_exception_type()) {
        // No current exception.
        (*old_terminate)();
    }
    else {
        // There is a current exception. Check if it's an objc exception.
        @try {
            __cxa_rethrow();
        } @catch (id e) {
            // It's an objc object. Call Foundation's handler, if any.
            (*uncaught_handler)((id)e);
            (*old_terminate)();
        } @catch (...) {
            // It's not an objc object. Continue to C++ terminate.
            (*old_terminate)();
        }
    }
}

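When termination is triggered by an exception, the handler checks whether it is an objc exception; if it is, the callback uncaught_handler is invoked. A global search for uncaught_handler finds the function that sets it: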

objc_uncaught_exception_handler 
objc_setUncaughtExceptionHandler(objc_uncaught_exception_handler fn)
{
    objc_uncaught_exception_handler result = uncaught_handler;
    uncaught_handler = fn;
    return result;
}

At the OC level, we can install a callback by calling NSSetUncaughtExceptionHandler; the callback is assigned to uncaught_handler.
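Both exception_init() and objc_setUncaughtExceptionHandler() use the same save-and-replace idea: install a new handler and keep the previous one so it can still be called. The sketch below shows that idea in isolation. `UncaughtHandler`, `setUncaughtHandler`, `defaultHandler`, `myTerminate`, and `installTerminate` are hypothetical names; only `std::set_terminate` and `std::get_terminate` are real standard-library calls.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>

// Hypothetical handler type; the real one takes an `id`.
using UncaughtHandler = void (*)(void *exception);

inline void defaultHandler(void *) {}            // placeholder that does nothing
static UncaughtHandler gHandler = defaultHandler;

// Installs `fn` and returns the previous handler so callers can chain to it.
inline UncaughtHandler setUncaughtHandler(UncaughtHandler fn) {
    assert(fn != nullptr);
    UncaughtHandler previous = gHandler;
    gHandler = fn;
    return previous;
}

// The same idea applied to the standard terminate handler.
static std::terminate_handler gOldTerminate = nullptr;

inline void myTerminate() {
    if (gOldTerminate) gOldTerminate();   // hand over to the previous handler
    std::abort();                         // a terminate handler must not return
}

inline void installTerminate() {
    gOldTerminate = std::set_terminate(&myTerminate);
}
```

Returning the previous handler lets a caller restore it later or forward to it, which is what the (*old_terminate)() calls in _objc_terminate do.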

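6.cache_t::init()

Initializes the method-cache support. When HAVE_TASK_RESTARTABLE_RANGES is defined, it counts the entries in objc_restartableRanges and registers them with task_restartable_ranges_register; if registration fails, the process aborts.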

void cache_t::init()
{
#if HAVE_TASK_RESTARTABLE_RANGES
    mach_msg_type_number_t count = 0;
    kern_return_t kr;

    while (objc_restartableRanges[count].location) {
        count++;
    }

    kr = task_restartable_ranges_register(mach_task_self(),
                                          objc_restartableRanges, count);
    if (kr == KERN_SUCCESS) return;
    _objc_fatal("task_restartable_ranges_register failed (result 0x%x: %s)",
                kr, mach_error_string(kr));
#endif // HAVE_TASK_RESTARTABLE_RANGES
}
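The while loop in cache_t::init() counts the entries of a table that ends with a zero `location`. A minimal sketch of that counting pattern follows; `Range` and `countRanges` are hypothetical names, and the struct layout is an assumption made for the example, not the kernel's real type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one restartable-range entry.
struct Range {
    std::uint64_t location;  // 0 marks the end of the table
    std::uint16_t length;
};

// Counts entries up to, but not including, the first one whose location is 0.
inline std::size_t countRanges(const Range *ranges) {
    assert(ranges != nullptr);
    std::size_t count = 0;
    while (ranges[count].location) {
        count++;
    }
    return count;
}
```

A sentinel entry avoids storing the length separately, at the cost of requiring every table to end with a zero entry.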

7._imp_implementationWithBlock_init()

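Initializes the trampoline machinery. Normally this does nothing, because everything is initialized lazily, but for certain processes the trampolines dylib is loaded eagerly.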

void
_imp_implementationWithBlock_init(void)
{
#if TARGET_OS_OSX
    // Eagerly load libobjc-trampolines.dylib in certain processes. Some
    // programs (most notably QtWebEngineProcess used by older versions of
    // embedded Chromium) enable a highly restrictive sandbox profile which
    // blocks access to that dylib. If anything calls
    // imp_implementationWithBlock (as AppKit has started doing) then we'll
    // crash trying to load it. Loading it here sets it up before the sandbox
    // profile is enabled and blocks it.
    //
    // This fixes EA Origin (rdar://problem/50813789)
    // and Steam (rdar://problem/55286131)
    if (__progname &&
        (strcmp(__progname, "QtWebEngineProcess") == 0 ||
         strcmp(__progname, "Steam Helper") == 0)) {
        Trampolines.Initialize();
    }
#endif
}

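8._dyld_objc_notify_register: registering with dyld

When the application loads, _objc_init() is called. When it reaches _dyld_objc_notify_register, three functions are registered with dyld:

  • map_images: passed as &map_images, i.e. the address of the function. It manages everything in the executable and its dynamic libraries: classes, protocols, selectors, and categories.
  • load_images: passed by name without &. In C a function name used this way also evaluates to the function's address, so it is registered in the same way. It runs the +load methods.
  • unmap_image: called when dyld removes an image.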
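The registration step can be pictured as storing three function pointers for later use. The sketch below is an illustration under assumptions: `MappedFn`, `InitFn`, `UnmappedFn`, `Callbacks`, `registry`, and `registerCallbacks` are hypothetical, and the callback signatures are simplified rather than dyld's real ones. It also shows that `&f` and `f` yield the same pointer.

```cpp
#include <cassert>

// Hypothetical callback types; dyld's real signatures differ.
using MappedFn   = void (*)(unsigned count);
using InitFn     = void (*)(const char *path);
using UnmappedFn = void (*)(const char *path);

struct Callbacks {
    MappedFn   mapped;
    InitFn     init;
    UnmappedFn unmapped;
};

// Hypothetical registry that a loader would consult later.
inline Callbacks &registry() {
    static Callbacks callbacks = { nullptr, nullptr, nullptr };
    return callbacks;
}

// Stores the three callbacks; nothing is called at registration time.
inline void registerCallbacks(MappedFn m, InitFn i, UnmappedFn u) {
    assert(m && i && u);
    registry().mapped = m;
    registry().init = i;
    registry().unmapped = u;
}

// Toy callbacks for the example.
inline void onMapped(unsigned) {}
inline void onInit(const char *) {}
inline void onUnmapped(const char *) {}
```

`registerCallbacks(&onMapped, onInit, onUnmapped)` mirrors the mixed spelling in the original call: the first argument uses `&`, the others do not, and all three arrive as ordinary function pointers.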

2. _read_images analysis

map_images -> map_images_nolock -> _read_images

1. Overview

Digging into how map_images loads classes, and reading through its comments and code, leads to the core function _read_images. Its core code is:

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    header_info *hi;
    uint32_t hIndex;
    size_t count;
    size_t i;
    Class *resolvedFutureClasses = nil;
    size_t resolvedFutureClassCount = 0;
    static bool doneOnce;
    bool launchTime = NO;
    TimeLogger ts(PrintImageTimes);

    runtimeLock.assertLocked();

#define EACH_HEADER \
    hIndex = 0;         \
    hIndex < hCount && (hi = hList[hIndex]); \
    hIndex++
    // 1. One-time setup: runs only on the first call and creates the table that holds all named classes
    if (!doneOnce) {
        doneOnce = YES;
        launchTime = YES;

#if SUPPORT_NONPOINTER_ISA
        // Disable non-pointer isa under some conditions.

# if SUPPORT_INDEXED_ISA
        // Disable nonpointer isa if any image contains old Swift code
        for (EACH_HEADER) {
            if (hi->info()->containsSwift()  &&
                hi->info()->swiftUnstableVersion() < objc_image_info::SwiftVersion3)
            {
                DisableNonpointerIsa = true;
                if (PrintRawIsa) {
                    _objc_inform("RAW ISA: disabling non-pointer isa because "
                                 "the app or a framework contains Swift code "
                                 "older than Swift 3.0");
                }
                break;
            }
        }
# endif

# if TARGET_OS_OSX
        // Disable non-pointer isa if the app is too old
        // (linked before OS X 10.11)
//        if (!dyld_program_sdk_at_least(dyld_platform_version_macOS_10_11)) {
//            DisableNonpointerIsa = true;
//            if (PrintRawIsa) {
//                _objc_inform("RAW ISA: disabling non-pointer isa because "
//                             "the app is too old.");
//            }
//        }

        // Disable non-pointer isa if the app has a __DATA,__objc_rawisa section
        // New apps that load old extensions may need this.
        for (EACH_HEADER) {
            if (hi->mhdr()->filetype != MH_EXECUTE) continue;
            unsigned long size;
            if (getsectiondata(hi->mhdr(), "__DATA", "__objc_rawisa", &size)) {
                DisableNonpointerIsa = true;
                if (PrintRawIsa) {
                    _objc_inform("RAW ISA: disabling non-pointer isa because "
                                 "the app has a __DATA,__objc_rawisa section");
                }
            }
            break;  // assume only one MH_EXECUTE image
        }
# endif

#endif

        if (DisableTaggedPointers) {
            disableTaggedPointers();
        }
        
        initializeTaggedPointerObfuscator();

        if (PrintConnecting) {
            _objc_inform("CLASS: found %d classes during launch", totalClasses);
        }

        // namedClasses
        // Preoptimized classes don't go in this table.
        // 4/3 is NXMapTable's load factor
        // objc::unattachedCategories.init(32);
        // objc::allocatedClasses.init();
        
        int namedClassesSize = 
            (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
        gdb_objc_realized_classes =
            NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);

        ts.log("IMAGE TIMES: first time tasks");
    }

    // Fix up @selector references
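    // A SEL is a name plus an address
    // 2. Fix up the @selector references left over from compile time.
    // The same selector name can sit at a different address in each image,
    // so every reference is rewritten to the single SEL registered with the runtime.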
    static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) continue;

            bool isBundle = hi->isBundle();
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                SEL sel = sel_registerNameNoLock(name, isBundle);
                if (sels[i] != sel) {
                    sels[i] = sel;
                }
            }
        }
    }

    ts.log("IMAGE TIMES: fix up selector references");

    // Discover classes. Fix up unresolved future classes. Mark bundle classes.
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();
    // 3. Discover classes (readClass): register class names and handle classes that were moved
    for (EACH_HEADER) {
        if (! mustReadClasses(hi, hasDyldRoots)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }

        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

    ts.log("IMAGE TIMES: discover classes");

    // Fix up remapped classes
    // Class list and nonlazy class list remain unremapped.
    // Class refs and super refs are remapped for message dispatching.
    // 4. Fix up remapped classes: update class refs and super refs
    if (!noClassesRemapped()) {
        for (EACH_HEADER) {
            Class *classrefs = _getObjc2ClassRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
            // fixme why doesn't test future1 catch the absence of this?
            classrefs = _getObjc2SuperRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
        }
    }

    ts.log("IMAGE TIMES: remap classes");

#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
    // 5. Fix up old-style message references
    for (EACH_HEADER) {
        message_ref_t *refs = _getObjc2MessageRefs(hi, &count);
        if (count == 0) continue;

        if (PrintVtables) {
            _objc_inform("VTABLES: repairing %zu unsupported vtable dispatch "
                         "call sites in %s", count, hi->fname());
        }
        for (i = 0; i < count; i++) {
            fixupMessageRef(refs+i);
        }
    }

    ts.log("IMAGE TIMES: fix up objc_msgSend_fixup");
#endif


    // Discover protocols. Fix up protocol refs.
    // 6. Read protocols
    for (EACH_HEADER) {
        extern objc_class OBJC_CLASS_$_Protocol;
        Class cls = (Class)&OBJC_CLASS_$_Protocol;
        ASSERT(cls);
        NXMapTable *protocol_map = protocols();
        bool isPreoptimized = hi->hasPreoptimizedProtocols();

        // Skip reading protocols if this is an image from the shared cache
        // and we support roots
        // Note, after launch we do need to walk the protocol as the protocol
        // in the shared cache is marked with isCanonical() and that may not
        // be true if some non-shared cache binary was chosen as the canonical
        // definition
        if (launchTime && isPreoptimized) {
            if (PrintProtocols) {
                _objc_inform("PROTOCOLS: Skipping reading protocols in image: %s",
                             hi->fname());
            }
            continue;
        }

        bool isBundle = hi->isBundle();

        protocol_t * const *protolist = _getObjc2ProtocolList(hi, &count);
        for (i = 0; i < count; i++) {
            readProtocol(protolist[i], cls, protocol_map, 
                         isPreoptimized, isBundle);
        }
    }

    ts.log("IMAGE TIMES: discover protocols");

    // Fix up @protocol references
    // Preoptimized images may have the right 
    // answer already but we don't know for sure.
    // 7. Fix up @protocol references
    for (EACH_HEADER) {
        // At launch time, we know preoptimized image refs are pointing at the
        // shared cache definition of a protocol.  We can skip the check on
        // launch, but have to visit @protocol refs for shared cache images
        // loaded later.
        if (launchTime && hi->isPreoptimized())
            continue;
        protocol_t **protolist = _getObjc2ProtocolRefs(hi, &count);
        for (i = 0; i < count; i++) {
            remapProtocolRef(&protolist[i]);
        }
    }

    ts.log("IMAGE TIMES: fix up @protocol references");

    // Discover categories. Only do this after the initial category
    // attachment has been done. For categories present at startup,
    // discovery is deferred until the first load_images call after
    // the call to _dyld_objc_notify_register completes. rdar://problem/53119145
    // 8. Process categories
    if (didInitialAttachCategories) {
        for (EACH_HEADER) {
            load_categories_nolock(hi);
        }
    }

    ts.log("IMAGE TIMES: discover categories");

    // Category discovery MUST BE Late to avoid potential races
    // when other threads call the new category code before
    // this thread finishes its fixups.

    // +load handled by prepare_load_methods()

    // Realize non-lazy classes (for +load methods and static instances)
    // 9. Realize non-lazy classes
    for (EACH_HEADER) {
        classref_t const *classlist = hi->nlclslist(&count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;

            addClassTableEntry(cls);

            if (cls->isSwiftStable()) {
                if (cls->swiftMetadataInitializer()) {
                    _objc_fatal("Swift class %s with a metadata initializer "
                                "is not allowed to be non-lazy",
                                cls->nameForLogging());
                }
                // fixme also disallow relocatable classes
                // We can't disallow all Swift classes because of
                // classes like Swift.__EmptyArrayStorage
            }
            realizeClassWithoutSwift(cls, nil);
        }
    }

    ts.log("IMAGE TIMES: realize non-lazy classes");

    // Realize newly-resolved future classes, in case CF manipulates them
    // 10. Realize newly resolved future classes
    if (resolvedFutureClasses) {
        for (i = 0; i < resolvedFutureClassCount; i++) {
            Class cls = resolvedFutureClasses[i];
            if (cls->isSwiftStable()) {
                _objc_fatal("Swift class is not allowed to be future");
            }
            realizeClassWithoutSwift(cls, nil);
            cls->setInstancesRequireRawIsaRecursively(false/*inherited*/);
        }
        free(resolvedFutureClasses);
    }

    ts.log("IMAGE TIMES: realize future classes");

    if (DebugNonFragileIvars) {
        realizeAllClasses();
    }

    // ... (remainder of the function omitted)
}
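
To summarize, _read_images performs these ten steps:

  • 1: One-time setup, guarded by doneOnce
  • 2: Fix up the @selector references left over from compile time
  • 3: Discover classes and handle classes that were moved
  • 4: Fix up remapped classes (class refs and super refs)
  • 5: Fix up old-style message references
  • 6: Read protocols, when there are any: readProtocol
  • 7: Fix up @protocol references
  • 8: Process categories
  • 9: Realize non-lazy classes
  • 10: Realize newly resolved future classes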
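Two details of the listing are easy to miss: the EACH_HEADER macro expands to the three clauses of a for loop, and step 2 rewrites every selector reference to one canonical pointer per name. The sketch below combines both under simplifying assumptions. `Image`, `internSelector`, `EACH_IMAGE`, and `fixupSelectors` are hypothetical, and a `std::unordered_set` stands in for the runtime's selector table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical image record: just a list of selector-name references.
struct Image {
    std::vector<const char *> selRefs;
};

// Hypothetical selector table: returns one canonical pointer per name.
// Elements of std::unordered_set never move, so the pointer stays valid.
inline const char *internSelector(const char *name) {
    static std::unordered_set<std::string> table;
    return table.insert(name).first->c_str();
}

// The three clauses of a for loop packaged in a macro, as in the listing.
#define EACH_IMAGE \
    index = 0; \
    index < images.size() && (img = &images[index]); \
    index++

// Rewrites every selector reference to its canonical pointer and
// returns how many references had to be changed.
inline std::size_t fixupSelectors(std::vector<Image> &images) {
    std::size_t index;
    Image *img = nullptr;
    std::size_t fixed = 0;
    for (EACH_IMAGE) {
        for (const char *&ref : img->selRefs) {
            assert(ref != nullptr);
            const char *canonical = internSelector(ref);
            if (ref != canonical) {
                ref = canonical;
                fixed++;
            }
        }
    }
    return fixed;
}
```

After the fix-up, two images that both refer to the same selector name hold the same pointer, so selectors can be compared by address instead of by string.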

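2. Key steps in detail

1. One-time setup guarded by doneOnce

When doneOnce is NO, that is, on the first call, execution enters the if block and sets doneOnce to YES, so this block runs exactly once.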

        // A table gives fast lookup; all named classes are placed in it
        gdb_objc_realized_classes =
        NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);

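This creates a hash table, gdb_objc_realized_classes, in which all named classes are placed so that they can be looked up quickly. Despite its name, gdb_objc_realized_classes holds the named classes that are not in the dyld shared cache, whether or not they have been realized.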
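The idea of a name-to-class table sized from the class count can be sketched as follows. This is an illustration, not the runtime's implementation: `ClassInfo`, `NamedClassTable`, and `namedClassesCapacity` are hypothetical, and `std::unordered_map` stands in for NXMapTable. Only the `* 4 / 3` sizing expression is taken from the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a class record.
struct ClassInfo { std::string name; };

// Mirrors the sizing expression in the listing: count * 4 / 3 (integer math).
inline std::size_t namedClassesCapacity(std::size_t classCount) {
    return classCount * 4 / 3;
}

// Hypothetical name -> class table.
class NamedClassTable {
public:
    explicit NamedClassTable(std::size_t classCount) {
        table_.reserve(namedClassesCapacity(classCount));
    }
    // Returns false if the name is already present (a duplicate class name).
    bool insert(const ClassInfo *cls) {
        assert(cls != nullptr);
        return table_.emplace(cls->name, cls).second;
    }
    // Returns nullptr when no class with that name was registered.
    const ClassInfo *lookup(const std::string &name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : it->second;
    }
private:
    std::unordered_map<std::string, const ClassInfo *> table_;
};
```

Reserving capacity up front means the table does not need to grow while the classes found at launch are being inserted.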

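2. readClass: registering names and handling moved classes

This step registers the class names. It also handles the case where a class was moved but not deleted, which according to the source comment currently happens only when the new class resolved a future class.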

    for (EACH_HEADER) {
        if (! mustReadClasses(hi, hasDyldRoots)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }

        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            // Handle a moved class: readClass returned a different, non-nil
            // class, so the class was moved but not deleted.
            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

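_getObjc2ClassList reads the class list from the Mach-O image, and each class is then processed. Step into readClass to see its implementation:

6.png

To make this easier to study, we can filter cls down to the class we care about, LGPerson. In this step the class name is obtained through cls->nonlazyMangledName(), which is implemented as follows: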

    const char *nonlazyMangledName() const {
        return bits.safe_ro()->getName();
    }

    // If the class is already realized, go through rw to reach ro;
    // otherwise the data read from the Mach-O file is the ro itself.
    const class_ro_t *safe_ro() const {
        class_rw_t *maybe_rw = data();
        if (maybe_rw->flags & RW_REALIZED) {
            // maybe_rw is rw
            return maybe_rw->ro();
        } else {
            // maybe_rw is actually ro
            return (class_ro_t *)maybe_rw;
        }
    }

    const char *getName() const {
        return name.load(std::memory_order_acquire);
    }
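The safe_ro() logic works because the pointer returned by data() refers to either kind of record, and a flag in the leading word tells them apart. A minimal sketch of that "check the flag, then pick the interpretation" pattern follows. `ReadOnly`, `ReadWrite`, `kRealized`, `leadingFlags`, and `safeReadOnly` are hypothetical, and the two-field layouts are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kRealized = 1u << 31;  // hypothetical flag value

// Both records start with a 32-bit flags word, so the flag can be read
// before we know which of the two the pointer refers to.
struct ReadOnly  { std::uint32_t flags; const char *name; };
struct ReadWrite { std::uint32_t flags; const ReadOnly *ro; };

// Reads the leading flags word without assuming the record's type.
inline std::uint32_t leadingFlags(const void *data) {
    std::uint32_t flags;
    std::memcpy(&flags, data, sizeof flags);
    return flags;
}

// Returns the read-only record whether or not the class was "realized".
inline const ReadOnly *safeReadOnly(const void *data) {
    assert(data != nullptr);
    if (leadingFlags(data) & kRealized) {
        return static_cast<const ReadWrite *>(data)->ro;   // data is rw
    }
    return static_cast<const ReadOnly *>(data);            // data is ro
}
```

Either way the caller ends up with the read-only record, so reading the name works both before and after realization.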

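If the class has been realized, name is read through rw->ro; if it has not, the data read from the Mach-O file is cast to ro and name is read from that.

Continuing through the code, execution reaches addNamedClass, which adds the class name to the map of named non-meta classes: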

static void addNamedClass(Class cls, const char *name, Class replacing = nil)
{
    runtimeLock.assertLocked();
    Class old;
    if ((old = getClassExceptSomeSwift(name))  &&  old != replacing) {
        inform_duplicate(name, old, cls);

        // getMaybeUnrealizedNonMetaClass uses name lookups.
        // Classes not found by name lookup must be in the
        // secondary meta->nonmeta table.
        addNonMetaClass(cls);
    } else {
        // Insert the name -> class mapping into the named-class table
        NXMapInsert(gdb_objc_realized_classes, name, cls);
    }
    ASSERT(!(cls->data()->flags & RO_META));

    // wrong: constructed classes are already realized when they get here
    // ASSERT(!cls->isRealized());
}

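In other words, the name is inserted into the class table, which is the hash table NXMapTable *gdb_objc_realized_classes; think of it as a lookup table. It was created in step 1 (the one-time setup). addClassTableEntry is then called to add the class to the in-memory table of allocated classes. If addMeta is true, the class's metaclass is added as well: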

static void
addClassTableEntry(Class cls, bool addMeta = true)
{
    runtimeLock.assertLocked();

    // This class is allowed to be a known class via the shared cache or via
    // data segments, but it is not allowed to be in the dynamic table already.
    auto &set = objc::allocatedClasses.get();

    ASSERT(set.find(cls) == set.end());

    if (!isKnownClass(cls))
        set.insert(cls);
    if (addMeta)
        // Also add the metaclass; the nested call passes false, so it stops there
        addClassTableEntry(cls->ISA(), false);
}
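The addMeta parameter makes the recursion in addClassTableEntry exactly one level deep: the class is added, then its metaclass, and nothing further. A minimal sketch of that shape, with a hypothetical `Cls` record, `allocatedSet`, and `addEntry` standing in for the runtime's types:

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical class record: `isa` points at the metaclass.
struct Cls { Cls *isa; };

// Hypothetical stand-in for the table of allocated classes.
inline std::unordered_set<const Cls *> &allocatedSet() {
    static std::unordered_set<const Cls *> set;
    return set;
}

// Adds the class and, when addMeta is true, its metaclass as well.
// The nested call passes false, so the recursion stops after one level.
inline void addEntry(const Cls *cls, bool addMeta = true) {
    assert(cls != nullptr);
    allocatedSet().insert(cls);
    if (addMeta) {
        addEntry(cls->isa, false);
    }
}
```

Calling `addEntry(&person)` therefore registers `person` and its metaclass, but not the metaclass's own isa target.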

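The objc::allocatedClasses used in addClassTableEntry should look familiar: it appeared in runtime_init(), called from _objc_init(), where allocatedClasses.init creates the in-memory class table. Following the flow for the JhsPerson class filtered earlier, we print the class information. See the figure below:

7.png

As the result of readClass shows, the name has been added to the class information.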

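3. Locating the core flow

The analysis above shows that only step 8 (categories) and step 9 (realizing classes) actually involve class loading. We again filter by the class name LGPerson: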

const char *mangledName = cls->nonlazyMangledName();
    
    if (strcmp(mangledName, "LGPerson") == 0)
    {
        printf("LGPerson....");
    }

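Stepping through with breakpoints shows that several of the fix-up steps are never entered. Continuing, we reach code with some key comments, for example about realizing non-lazy classes and future classes. See the figure below: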

8.png

Continuing to run, execution unfortunately does not enter realizeClassWithoutSwift. Look at the implementation of hi->nlclslist(&count):

9.png

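It returns the list of non-lazy classes. What makes a class non-lazy? Implementing +load. Add a +load method to JhsPerson, run again, and filter for JhsPerson. See the figures below: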

10.png

11.png

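At this point, printing the class's ro data fails: the debugger reports that it cannot materialize the result variable because its memory cannot be read.

Clearly, realizeClassWithoutSwift is what we need to study next!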
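The difference between non-lazy and lazy classes can be pictured with a small model. This is only a sketch under assumptions: `Klass`, `realize`, `realizeNonLazy`, and `useClass` are hypothetical, a boolean stands in for "implements +load", and "first use" is modeled as an explicit call rather than a message send.

```cpp
#include <cassert>
#include <vector>

// Hypothetical class record for the sketch.
struct Klass {
    bool hasLoadMethod;   // stands in for "implements +load"
    bool realized;
};

inline void realize(Klass &cls) { cls.realized = true; }

// Launch-time pass: only non-lazy classes are realized here.
inline void realizeNonLazy(std::vector<Klass> &classes) {
    for (Klass &cls : classes) {
        if (cls.hasLoadMethod) realize(cls);
    }
}

// First use of a class: realize it on demand if that has not happened yet.
inline void useClass(Klass &cls) {
    if (!cls.realized) realize(cls);
    assert(cls.realized);
}
```

In this model a class without +load stays unrealized after the launch-time pass, which matches the observation that its ro data could not be printed until +load was added.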