012 - Application Loading (Part 1)


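What you can get from this article

Preface

Terms like "binary reordering" and "cold-launch optimization" come up constantly in the iOS job market these days, as if they were must-know topics on the way to growing as an iOS developer. All of these techniques work on the app launch process, so a solid grasp of how an application is loaded is a prerequisite for going further.

Source code needed

Introducing the app loading flow with an example

Create a project, declare a class in main.m, implement its +load method, and set a breakpoint in both +load and main.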

@interface FFPerson : NSObject

@end

@implementation FFPerson

+ (void)load {
    NSLog(@"%s",__func__);
}

@end

int main(int argc, char * argv[]) {
    @autoreleasepool {
        NSLog(@"%s",__func__);
    }
    return UIApplicationMain(argc, argv, nil, NSStringFromClass([AppDelegate class]));
}
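The same "runs before main" ordering can be observed without Objective-C. The sketch below is only an analogy, not the +load mechanism itself: a C++ global with a constructor is also run during the initializer phase, before main. The names eventLog, LoadProbe, gProbe and markMainEntered are made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: a log that is safe to touch during static initialization
// (a function-local static avoids initialization-order problems).
std::vector<std::string>& eventLog() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical probe type: its constructor runs before main(), much like +load.
struct LoadProbe {
    LoadProbe() { eventLog().push_back("static initializer"); }
};

// The global instance forces the constructor to run during program start-up.
static LoadProbe gProbe;

// Call this as the first statement of main() to record when main was reached.
void markMainEntered() {
    // By the time main runs, the static initializer has already logged itself.
    assert(!eventLog().empty() && "static initializers run before main");
    eventLog().push_back("main");
}
```

Calling markMainEntered() at the top of main leaves "static initializer" in the log ahead of "main", which mirrors the order in which the two breakpoints above are hit.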

Illustration:

Simulator

Application loading intro.png

Device (iPhone)

App loading flow intro (device).png

The app loading flow, summarized

Simulator

  • dyld`_dyld_start ->
  • dyld`dyldbootstrap::start ->
  • dyld`dyld::_main ->
  • dyld`dyld::useSimulatorDyld ->
  • dyld_sim`dyld::_main ->
  • dyld_sim`dyld::initializeMainExecutable() ->
  • dyld_sim`ImageLoader::runInitializers ->
  • dyld_sim`ImageLoader::processInitializers ->
  • dyld_sim`ImageLoader::recursiveInitialization ->
  • dyld_sim`dyld::notifySingle ->
  • libobjc.A.dylib`load_images

Device (iPhone)

  • dyld`_dyld_start ->
  • dyld`dyldbootstrap::start ->
  • dyld`dyld::_main ->
  • dyld`dyld::initializeMainExecutable() ->
  • dyld`ImageLoader::runInitializers ->
  • dyld`ImageLoader::processInitializers ->
  • dyld`ImageLoader::recursiveInitialization ->
  • dyld`dyld::notifySingle ->
  • libobjc.A.dylib`load_images

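Comparing the two gives two conclusions:

  1. The simulator and the device launch slightly differently: on the simulator, dyld hands off to a separate loader, dyld_sim, through dyld::useSimulatorDyld. I have not looked into exactly how the two differ, how they compare in efficiency, or why it is done this way.

  2. Loading starts in dyld and ends up in libobjc, so dyld does not do all the work alone; other libraries cooperate with it to complete the process.

dyld basics

dyld: the dynamic linker

As the call stacks above show, almost every frame belongs to dyld, so dyld plays a central role in application loading. It is worth understanding what dyld is, how it has evolved, and which version is current.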

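The evolution of dyld

Source: WWDC 2017, App Startup Time: Past, Present, and Future

  • dyld 1.0 (1996-2004)
    • Shipped in NeXTStep 3.3
    • Played a fairly limited role; it predates the standardization of the POSIX dlopen call
    • Written before most systems used C++ dynamic libraries
    • Prebinding was added in macOS Cheetah (10.0)
  • dyld 2.0 (2004-2007)
    • Shipped in macOS Tiger
    • A complete rewrite
    • Correctly supports C++ initializer semantics
    • Has full dlopen and dlsym implementations
    • Designed for speed, so it performs only limited sanity checking
    • Security was strengthened
    • Thanks to the large speed gain, the amount of prebinding could be reduced
    • Unlike dyld 1, which edited your program's data, prebinding now edits only system libraries, and only when a software update is installed
  • dyld 2.x (2007-2017)
    • Added many architectures and platforms (x86, x86_64, arm, arm64)
    • Improved security in several ways
    • Added code signing and ASLR (address space layout randomization)
    • Improved performance (prebinding was removed in favor of the shared cache)
  • dyld 3 (2017-present)
    • A completely new dynamic linker
    • According to the talk, Apple's systems from 2017 onward move to dyld 3
    • Performance: make launch as fast as possible
    • Security improved by design
    • Testability and reliability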

Architecture diagrams

dyld2.jpeg

dyld3.jpeg

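How dyld 2 loads a program

  • Step 1, Parse Mach-O header / Find dependencies: parse the Mach-O headers to learn which libraries are needed, then recursively find the libraries that those libraries need in turn, until the complete set of dylibs has been found. An ordinary iOS app needs 3-600 dylibs, so there is a lot of data to process.
  • Step 2, Map Mach-O files: map all the Mach-O files into the address space, that is, into memory.
  • Step 3, Perform symbol lookups: for example, if the program uses printf, dyld checks whether printf is in a system library, finds its address, and copies that address into the function pointer in your program.
  • Step 4, Bind and rebase: fix up the pointers. Because images are loaded at randomized addresses, every pointer has to be adjusted by the base address.
  • Step 5, Run initializers: run all the initializers.
  • Step 6: get ready to run main.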
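Step 1 above (finding dependencies recursively) can be sketched with a toy dependency table. This is only an illustration of the traversal idea, not dyld's code: DependencyTable, collectDependencies and imagesToLoad are hypothetical names, and real dyld reads the dependencies from Mach-O load commands rather than from a map.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy dependency table: image name -> names of the dylibs it links against.
using DependencyTable = std::map<std::string, std::vector<std::string>>;

// Depth-first walk that records every image reachable from `image` exactly once.
void collectDependencies(const DependencyTable& table, const std::string& image,
                         std::set<std::string>& seen, std::vector<std::string>& order) {
    if (!seen.insert(image).second) return;   // already visited (this also breaks cycles)
    order.push_back(image);
    auto it = table.find(image);
    if (it == table.end()) return;            // leaf image with no recorded dependencies
    for (const std::string& dep : it->second)
        collectDependencies(table, dep, seen, order);
}

// Returns the main executable followed by everything it pulls in, transitively.
std::vector<std::string> imagesToLoad(const DependencyTable& table,
                                      const std::string& mainExecutable) {
    assert(!mainExecutable.empty());
    std::set<std::string> seen;
    std::vector<std::string> order;
    collectDependencies(table, mainExecutable, seen, order);
    return order;
}
```

The `seen` set is what keeps the walk finite when two libraries depend on each other, and it also ensures a library shared by many images is processed only once.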

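How dyld 3 differs after optimizing dyld 2

  • Perform symbol lookups moves up to become the second step, and the result is written to disk as a launch closure.
  • dyld 3 consists of three parts: an out-of-process Mach-O parser and compiler (the red part in the figure), an in-process engine that executes launch closures, and a launch closure caching service. Most launches use the cache and never need to invoke the out-of-process parser and compiler. A launch closure is simpler than Mach-O: it is a memory-mapped file that needs no complicated parsing.
  • The out-of-process compiler first resolves all search paths, all rpaths, and all environment variables, then parses the Mach-O binaries and performs all symbol lookups, and uses the results to build the launch closure.
  • dyld 3 is also a small in-process engine. This part lives in the process; all it does is validate the launch closure, map in the dylibs, and jump to main. Unlike dyld 2, it does not need to parse Mach-O headers or perform symbol lookups to launch an app, which speeds up launch considerably.
  • Finally, dyld 3 also has a launch closure cache service: closures for system apps are built directly into the shared cache.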
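The "validate the cached closure, rebuild only when it is missing or stale" idea can be sketched as a tiny cache. Everything here is a simplified assumption for illustration: ToyClosure and ToyClosureCache are hypothetical, and a plain string stands in for the executable's cdHash that the real validity check compares.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a launch closure: it only remembers the hash of
// the executable it was built for.
struct ToyClosure {
    std::string builtForHash;
};

class ToyClosureCache {
public:
    // Returns a closure matching `currentHash`, rebuilding only when the cached
    // one is missing or was built for a different version of the executable.
    const ToyClosure& closureFor(const std::string& path, const std::string& currentHash) {
        assert(!path.empty());
        auto it = cache_.find(path);
        if (it != cache_.end() && it->second.builtForHash == currentHash)
            return it->second;                  // fast path: cached closure is still valid
        ++builds_;                              // slow path: the expensive parse-and-resolve work
        cache_[path] = ToyClosure{currentHash};
        return cache_[path];
    }

    // Number of times the slow path ran.
    int builds() const { return builds_; }

private:
    std::map<std::string, ToyClosure> cache_;
    int builds_ = 0;
};
```

Launching the same unchanged app twice hits the slow path once; changing the hash (for example after an app update) forces exactly one rebuild.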

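The launch process (on a device)

_dyld_start

This happens at the assembly level and sets up the arguments for dyldbootstrap::start. Run the app and look at the disassembly:

dyld_bootstrap.png

The same entry point in the dyld source: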

#if __arm64__ && !TARGET_OS_SIMULATOR
	.text
	.align 2
	.globl __dyld_start
__dyld_start:
	mov 	x28, sp
	and     sp, x28, #~15		// force 16-byte alignment of stack
	mov	x0, #0
	mov	x1, #0
	stp	x1, x0, [sp, #-16]!	// make aligned terminating frame
	mov	fp, sp			// set up fp to point to terminating frame
	sub	sp, sp, #16             // make room for local variables
#if __LP64__
	ldr     x0, [x28]               // get app's mh into x0
	ldr     x1, [x28, #8]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
	add     x2, x28, #16            // get argv into x2
#else
	ldr     w0, [x28]               // get app's mh into x0
	ldr     w1, [x28, #4]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
	add     w2, w28, #8             // get argv into x2
#endif
	adrp	x3,___dso_handle@page
	add 	x3,x3,___dso_handle@pageoff // get dyld's mh into x3
	mov	x4,sp                   // x4 has &startGlue

	// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
	bl	__ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm

dyldbootstrap::start

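Bootstraps dyld itself and prepares it for loading the application and its libraries.

// This is the code that bootstraps dyld. This work is normally done for a program by dyld and crt.
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
				const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{

    // Emit kdebug tracepoint to indicate dyld bootstrap has started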
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

	// if kernel had to slide dyld, we need to fix up load sensitive locations
	// we have to do this before using any global variables
    rebaseDyld(dyldsMachHeader);

	// kernel sets up env pointer to be just past end of argv array
	const char** envp = &argv[argc+1];
	
	// kernel sets up apple pointer to be just past end of envp array
	const char** apple = envp;
	while(*apple != NULL) { ++apple; }
	++apple;

	// set up random value for stack canary
	__guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
	// run all C++ initializers inside dyld
	runDyldInitializers(argc, argv, envp, apple);
#endif

	_subsystem_init(apple);

	// now that we are done bootstrapping dyld, call dyld's main
	uintptr_t appsSlide = appsMachHeader->getSlide();
	return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
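The pointer arithmetic that start() uses to find envp and apple is easy to try on a synthetic array. The sketch below assumes the layout the comments in the source describe (argv, NULL, envp entries, NULL, apple entries, NULL); KernelVectors and splitKernelVectors are hypothetical names for this illustration.

```cpp
#include <cassert>

// Hypothetical result type for the sketch.
struct KernelVectors {
    const char** envp;
    const char** apple;
};

// Mirrors the arithmetic in start(): argv is NULL-terminated, envp starts right
// after that NULL, and apple starts right after the NULL that ends envp.
KernelVectors splitKernelVectors(int argc, const char* argv[]) {
    assert(argv != nullptr && argc >= 0);
    const char** envp = &argv[argc + 1];      // skip argv[0..argc-1] and its NULL terminator
    const char** apple = envp;
    while (*apple != nullptr) { ++apple; }    // walk to the NULL that terminates envp
    ++apple;                                  // step past it to the first apple string
    return KernelVectors{envp, apple};
}
```

For a block laid out as {"app", "-v", NULL, "HOME=/", "LANG=C", NULL, "executable_path=/app", NULL} with argc equal to 2, envp points at "HOME=/" and apple at "executable_path=/app".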

dyld::_main

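(Only the key parts of the source are excerpted; the full function is at lines 6452-7305 of the source file.) A reloadAllImages figure is attached at the end of the article.

  • Check the system
  • Gather configuration: get the main executable's Mach-O header and slide
  • Set up the context: all of these variables are stored in gLinkContext
  • Determine whether the process is restricted; if it is, the environment variables in envp may change, so the context is set up again
  • Load the shared cache, based on what the Mach-O header says
  • Load frameworks
  • If loading with dyld 3
    • Launch with the closure, find the main executable's main function, and return it
  • If not loading with dyld 3
    • Instantiate the main executable: instantiateFromLoadedImage()
    • Add the main executable to sAllImages
    • Load inserted dylibs (if any)
    • Link the main executable
    • Link the inserted dylibs
    • Bind symbols
    • Run the initializers: initializeMainExecutable()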
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
		int argc, const char* argv[], const char* envp[], const char* apple[], 
		uintptr_t* startGlue)
{
        // buffer for the main executable's cdHash
	uint8_t mainExecutableCDHashBuffer[20];
        // grab the cdHash of the main executable from the environment
	const uint8_t* mainExecutableCDHash = nullptr;
	if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
		unsigned bufferLenUsed;
		if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
			mainExecutableCDHash = mainExecutableCDHashBuffer;
	}
        // gather host info; mainExecutableSlide is the main executable's slide (its ASLR offset)
	getHostInfo(mainExecutableMH, mainExecutableSlide);
        
        // record the main executable's header and slide in globals
        // (a runtime address is the link-time address plus this slide)
        uintptr_t result = 0;
	sMainExecutableMachHeader = mainExecutableMH;
	sMainExecutableSlide = mainExecutableSlide;
        
        // set up the context: all of these variables are saved into gLinkContext
        setContext(mainExecutableMH, argc, argv, envp, apple);
        
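        // decide whether the process is restricted; envp holds the environment variables
        configureProcessRestrictions(mainExecutableMH, envp);
        // check whether dyld3 should be forced
        // AMFI: Apple Mobile File Integrity
        if ( dyld3::internalInstall() ) {
            // body omitted; see lines 6667-6678 of the source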
	}
#if TARGET_OS_OSX
    // for a restricted process the environment variables may have changed, so set the context up again
    if ( !gLinkContext.allowEnvVarsPrint && !gLinkContext.allowEnvVarsPath && !gLinkContext.allowEnvVarsSharedCache ) {
		pruneEnvironmentVariables(envp, &apple);
		setContext(mainExecutableMH, argc, argv, envp, apple);
	}
	else
#endif
	{
                // check the environment variables
		checkEnvironmentVariables(envp);
                // set default values for unset fallback paths
		defaultUninitializedFallbackPaths(envp);
	}
        
        // if DYLD_PRINT_OPTS or DYLD_PRINT_ENV is set in the project, print the options or environment before any loading happens
        if ( sEnv.DYLD_PRINT_OPTS )
		printOptions(argv);
	if ( sEnv.DYLD_PRINT_ENV ) 
		printEnvironmentVariables(envp);
                
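        // check from the Mach-O header whether the shared region is disabled; the full source
        // then maps the shared cache (UIKit, Foundation, ...) with mapSharedCache(), omitted here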
	checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
        
#if !TARGET_OS_SIMULATOR
        // dyld3 or not: ClosureMode::Off means dyld3 closures are not used, anything else takes the dyld3 path
	if ( sClosureMode == ClosureMode::Off ) {
		if ( gLinkContext.verboseWarnings )
			dyld::log("dyld: not using closures\n");
	} else {
                // record that the launch mode is "using closure"
		sLaunchModeUsed = DYLD_LAUNCH_MODE_USING_CLOSURE;
                // set up the closure that tells dyld3 how to launch the main executable
		const dyld3::closure::LaunchClosure* mainClosure = nullptr;
		dyld3::closure::LoadedFileInfo mainFileInfo;
		mainFileInfo.fileContent = mainExecutableMH;
		mainFileInfo.path = sExecPath;
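                // (some code omitted)
                // first, if a shared cache is loaded, look for a prebuilt closure in it
		if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
                        // look up the closure for this executable path in the shared cache
			mainClosure = sSharedCacheLoadInfo.loadAddress->findClosure(sExecPath);
                        // if one was found, note that it came from the OS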
			if ( mainClosure != nullptr )
				sLaunchModeUsed |= DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
		}
                
                // a closure was found but it is no longer valid
                if ( (mainClosure != nullptr) && !closureValid(mainClosure, mainFileInfo, mainExecutableCDHash, true, envp) ) {
                        // discard it
			mainClosure = nullptr;
			sLaunchModeUsed &= ~DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
		}
                
                // if we didn't find a valid cache closure then try to build a new one
		if ( (mainClosure == nullptr) && allowClosureRebuilds ) {
			// if forcing closures, and no closure in cache, or it is invalid, check for a cached closure
			if ( !sForceInvalidSharedCacheClosureFormat )
				mainClosure = findCachedLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
			if ( mainClosure == nullptr ) {
				// no cached closure either, so build one
				mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
				if ( mainClosure != nullptr )
					sLaunchModeUsed |= DYLD_LAUNCH_MODE_BUILT_CLOSURE_AT_LAUNCH;
			}
		}
                // at this point mainClosure has been looked up, or built if that was needed
                // launch with mainClosure
                bool launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
											  mainExecutableSlide, argc, argv, envp, apple, diag, &result, startGlue, &closureOutOfDate, &recoverable);
                  
                // if the launch failed because the closure is out of date
                if ( !launched && closureOutOfDate && allowClosureRebuilds ) {
			// build a fresh closure
			mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
                        // and try to launch with it again
                        launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
												 mainExecutableSlide, argc, argv, envp, apple, diag, &result, startGlue, &closureOutOfDate, &recoverable);
                 }
                 // on success, result holds the address of the main executable's main function, which is returned
                 if ( launched ) {
			gLinkContext.startedInitializingMainExecutable = true;
			if (sSkipMain)
				result = (uintptr_t)&fake_main;
			return result;
		 }
            }
#endif
  // could not use closure info, launch the old (dyld2) way
  reloadAllImages:
        // instantiate an ImageLoader for the main executable
        sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
	gLinkContext.mainExecutable = sMainExecutable;
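        // record whether the main executable has a code signature load command
	gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
        
        // load any inserted libraries (DYLD_INSERT_LIBRARIES); in a jailbroken environment this can be modified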
	if( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
		for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
		loadInsertedDylib(*lib);
	}
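        // link the main executable and, recursively, the dylibs it depends on
        link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
        
        // link any inserted libraries
        // do this after linking main executable so that any dylibs pulled in by inserted
        // dylibs (e.g. libSystem) will not be in front of dylibs the program uses
        if ( sInsertedDylibCount > 0 ) {
		for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                        // inserted dylibs sit at sAllImages[i+1], because index 0 is the main executable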
			ImageLoader* image = sAllImages[i+1];
			link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
			image->setNeverUnloadRecursive();
		}
		if ( gLinkContext.allowInterposing ) {
			// only INSERTED libraries can interpose
                        // register interposing info after all inserted libraries are bound so chaining works
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				image->registerInterposing(gLinkContext);
			}
		}
	}
#if SUPPORT_ACCELERATE_TABLES
	if ( (sAllCacheImagesProxy != NULL) && ImageLoader::haveInterposingTuples() ) {
            // accelerator tables cannot be used together with interposing, so reset all images
            resetAllImages();
            // and jump back to reloadAllImages to load everything again
            goto reloadAllImages;
	}
#endif

        // bind the inserted dylibs
	if ( sInsertedDylibCount > 0 ) {
		for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
			ImageLoader* image = sAllImages[i+1];
			image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true, nullptr);
		}
	}
        
        // weak symbol binding
        sMainExecutable->weakBind(gLinkContext);
        
#if SUPPORT_OLD_CRT_INITIALIZATION
	// Old way is to run initializers via a callback from crt1.o
	if ( ! gRunInitializersOldWay ) 
		initializeMainExecutable(); 
#else
	// run all initializers
	initializeMainExecutable(); 
#endif
}
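_main reads executable_cdhash out of the apple vector through _simple_getenv. As a rough idea of what such a lookup involves, here is a small re-implementation over a NULL-terminated array of "key=value" strings. findVectorValue is a hypothetical helper written for this article, not dyld's function.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical lookup over a NULL-terminated vector of "key=value" strings.
// Returns a pointer to the value, or nullptr when the key is absent.
const char* findVectorValue(const char* const vector[], const char* key) {
    assert(vector != nullptr && key != nullptr);
    const std::size_t keyLen = std::strlen(key);
    for (const char* const* entry = vector; *entry != nullptr; ++entry) {
        // Require '=' right after the key so that a key which is only a prefix
        // of a longer key (e.g. "executable" vs "executable_path") does not match.
        if (std::strncmp(*entry, key, keyLen) == 0 && (*entry)[keyLen] == '=')
            return *entry + keyLen + 1;
    }
    return nullptr;
}
```

With a vector such as {"executable_path=/app", "executable_cdhash=0a1b", NULL}, looking up "executable_cdhash" yields "0a1b", while "executable" and "missing" both yield nullptr.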

mapSharedCache (shared cache loading)

// iOS always requires the shared cache
static void mapSharedCache(uintptr_t mainExecutableSlide)
{
	dyld3::SharedCacheOptions opts;
	opts.cacheDirOverride	= sSharedCacheOverrideDir;
	opts.forcePrivate		= (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion);
#if __x86_64__ && !TARGET_OS_SIMULATOR
	opts.useHaswell			= sHaswell;
#else
	opts.useHaswell			= false;
#endif
	opts.verbose			= gLinkContext.verboseMapping;
    // <rdar://problem/32031197> respect -disable_aslr boot-arg
    // <rdar://problem/56299169> kern.bootargs is now blocked
	opts.disableASLR		= (mainExecutableSlide == 0) && dyld3::internalInstall(); // infer ASLR is off if main executable is not slid
	loadDyldCache(opts, &sSharedCacheLoadInfo);
}

loadDyldCache (shared cache loading)

bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    results->loadAddress        = 0;
    results->slide              = 0;
    results->errorMessage       = nullptr;

#if TARGET_OS_SIMULATOR
    // simulator only supports mmap()ing the cache privately into the process
    return mapCachePrivate(options, results);
#else
    if ( options.forcePrivate ) {
        // mmap the cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: the shared cache is already mapped, so there is nothing more to do
        bool hasError = false;
        if ( reuseExistingCache(options, results) ) {
            hasError = (results->errorMessage != nullptr);
        } else {
            // slow path: this is the first process to load the shared cache, so call mapCacheSystemWide
            hasError = mapCacheSystemWide(options, results);
        }
        return hasError;
    }
#endif
}

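instantiateFromLoadedImage (instantiating the main executable)

// The kernel maps in the main executable before dyld gets control.
// We need to make an ImageLoader* for the already mapped in main executable.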
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
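        // create the ImageLoader for the main executable's Mach-O
	ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
        // add the image to sAllImages, which is why the first entry in sAllImages is the main executable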
	addImage(image);
	return (ImageLoaderMachO*)image;
}

sniffLoadCommands (instantiating the main executable)

// determine if this mach-o file has classic or compressed LINKEDIT and number of segments it has
void ImageLoaderMachO::sniffLoadCommands(const macho_header* mh, const char* path, bool inCache, bool* compressed,
											unsigned int* segCount, unsigned int* libCount, const LinkContext& context,
											const linkedit_data_command** codeSigCmd,
											const encryption_info_command** encryptCmd)
{
	*compressed = false;
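        // number of segments
	*segCount = 0;
        // number of dependent libraries
	*libCount = 0;
        // code signature load command
	*codeSigCmd = NULL;
        // encryption info load command
	*encryptCmd = NULL;
        
        // only the key checks are kept; the load command walk in between is omitted
        
        // more than 255 segments is an error
	if ( *segCount > 255 )
		dyld::throwf("malformed mach-o image: more than 255 segments in %s", path);

	// more than 4095 dependent libraries is an error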
	if ( *libCount > 4095 )
		dyld::throwf("malformed mach-o image: more than 4095 dependent libraries in %s", path);
}

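ImageLoader::link (linking the main executable, its dylibs, and inserted dylibs)

All the time spent in dyld's load and link phases can be printed by setting an environment variable (DYLD_PRINT_STATISTICS, which initializeMainExecutable checks).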

void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
{

	
	// clear error strings
	(*context.setErrorStrings)(0, NULL, NULL, NULL);
        // record the start time
	uint64_t t0 = mach_absolute_time();
        // recursively load the libraries this image depends on, then send a batch notification
	this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
	context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);

	// we only do the loading step for preflights
	if ( preflightOnly )
		return;

	uint64_t t1 = mach_absolute_time();
	context.clearAllDepths();
	this->updateDepth(context.imageCount());

	__block uint64_t t2, t3, t4, t5;
	{
		dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
		t2 = mach_absolute_time();
                // rebase: fix up internal pointers for the ASLR slide
		this->recursiveRebaseWithAccounting(context);
		context.notifyBatch(dyld_image_state_rebased, false);

		t3 = mach_absolute_time();
		if ( !context.linkingMainExecutable )
                        // bind the non-lazy symbols
			this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

		t4 = mach_absolute_time();
		if ( !context.linkingMainExecutable )
                        // bind weak symbols
			this->weakBind(context);
		t5 = mach_absolute_time();
	}

	// interpose any dynamically loaded images
	if ( !context.linkingMainExecutable && (fgInterposingTuples.size() != 0) ) {
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
                // recursively apply interposing
		this->recursiveApplyInterposing(context);
	}

	// now that all fixups are done, make __DATA_CONST segments read-only
	if ( !context.linkingMainExecutable )
		this->recursiveMakeDataReadOnly(context);

    if ( !context.linkingMainExecutable )
        context.notifyBatch(dyld_image_state_bound, false);
	uint64_t t6 = mach_absolute_time();

	if ( context.registerDOFs != NULL ) {
		std::vector<DOFInfo> dofs;
		this->recursiveGetDOFSections(context, dofs);
                // register the DOF sections (DTrace probes)
		context.registerDOFs(dofs);
	}
        // record the end time
	uint64_t t7 = mach_absolute_time();

	// clear error strings
	(*context.setErrorStrings)(0, NULL, NULL, NULL);

        // accumulate per-phase totals; with the right environment variable set, dyld prints these timings (as shown in the WWDC talk)
	fgTotalLoadLibrariesTime += t1 - t0;
	fgTotalRebaseTime += t3 - t2;
	fgTotalBindTime += t4 - t3;
	fgTotalWeakBindTime += t5 - t4;
	fgTotalDOF += t7 - t6;
	
	// done with initial dylib loads
	fgNextPIEDylibAddress = 0;
}
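The t0...t7 bookkeeping in link() boils down to "time each phase and add it to a running total". A portable sketch of that pattern, using std::chrono instead of mach_absolute_time, might look like the following. PhaseTimer is a hypothetical class that plays the role of the fgTotal* counters; it is not part of dyld.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <string>

// Hypothetical accumulator standing in for fgTotalRebaseTime, fgTotalBindTime, ...
class PhaseTimer {
public:
    // Runs `work` and adds its wall-clock duration to the running total for `phase`.
    void measure(const std::string& phase, const std::function<void()>& work) {
        assert(work && "work must be callable");
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        totals_[phase] += std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0);
        ++calls_[phase];
    }

    // Total time recorded for `phase` (zero if the phase never ran).
    std::chrono::nanoseconds total(const std::string& phase) const {
        auto it = totals_.find(phase);
        return it == totals_.end() ? std::chrono::nanoseconds{0} : it->second;
    }

    // Number of times `phase` was measured.
    int calls(const std::string& phase) const {
        auto it = calls_.find(phase);
        return it == calls_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, std::chrono::nanoseconds> totals_;
    std::map<std::string, int> calls_;
};
```

Totals accumulate across calls, just as the fgTotal* counters add up over every image that goes through link(), so a report printed at the end covers the whole launch.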

dyld::initializeMainExecutable() (beyond my ability for now, not broken down in detail)

void initializeMainExecutable()
{
	// record that we've reached this step
	gLinkContext.startedInitializingMainExecutable = true;

	// run initializers for any inserted dylibs
	ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
	initializerTimes[0].count = 0;
	const size_t rootCount = sImageRoots.size();
	if ( rootCount > 1 ) {
		for(size_t i=1; i < rootCount; ++i) {
			sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
		}
	}
	
	// run initializers for main executable and everything it brings up 
	sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
	
	// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
	if ( gLibSystemHelpers != NULL ) 
		(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

	// dump info if requested
	if ( sEnv.DYLD_PRINT_STATISTICS )
		ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
	if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
		ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}

ImageLoader::runInitializers (beyond my ability for now, not broken down in detail)

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
	uint64_t t1 = mach_absolute_time();
	mach_port_t thisThread = mach_thread_self();
	ImageLoader::UninitedUpwards up;
	up.count = 1;
	up.imagesAndPaths[0] = { this, this->getPath() };
	processInitializers(context, thisThread, timingInfo, up);
	context.notifyBatch(dyld_image_state_initialized, false);
	mach_port_deallocate(mach_task_self(), thisThread);
	uint64_t t2 = mach_absolute_time();
	fgTotalInitTime += (t2 - t1);
}

ImageLoader::processInitializers (beyond my ability for now, not broken down in detail)

// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
									 InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
	uint32_t maxImageCount = context.imageCount()+2;
	ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
	ImageLoader::UninitedUpwards& ups = upsBuffer[0];
	ups.count = 0;
	// Calling recursive init on all images in images list, building a new list of
	// uninitialized upward dependencies.
	for (uintptr_t i=0; i < images.count; ++i) {
		images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
	}
	// If any upward dependencies remain, init them.
	if ( ups.count > 0 )
		processInitializers(context, thisThread, timingInfo, ups);
}

ImageLoader::recursiveInitialization (beyond my ability for now, not broken down in detail)

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
	recursive_lock lock_info(this_thread);
	recursiveSpinLock(lock_info);

	if ( fState < dyld_image_state_dependents_initialized-1 ) {
		uint8_t oldState = fState;
		// break cycles
		fState = dyld_image_state_dependents_initialized-1;
		try {
			// initialize lower level libraries first
			for(unsigned int i=0; i < libraryCount(); ++i) {
				ImageLoader* dependentImage = libImage(i);
				if ( dependentImage != NULL ) {
					// don't try to initialize stuff "above" me yet
					if ( libIsUpward(i) ) {
						uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
						uninitUps.count++;
					}
					else if ( dependentImage->fDepth >= fDepth ) {
						dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
					}
                }
			}
			
			// record termination order
			if ( this->needsTermination() )
				context.terminationRecorder(this);

			// let objc know we are about to initialize this image
			uint64_t t1 = mach_absolute_time();
			fState = dyld_image_state_dependents_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
			
			// initialize this image
			bool hasInitializers = this->doInitialization(context);

			// let anyone know we finished initializing this image
			fState = dyld_image_state_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_initialized, this, NULL);
			
			if ( hasInitializers ) {
				uint64_t t2 = mach_absolute_time();
				timingInfo.addTime(this->getShortName(), t2-t1);
			}
		}
		catch (const char* msg) {
			// this image is not initialized
			fState = oldState;
			recursiveSpinUnLock();
			throw;
		}
	}
	
	recursiveSpinUnLock();
}
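The core of recursiveInitialization is "initialize the libraries below me first, then myself, and mark myself early so that cycles terminate". A stripped-down sketch of just that ordering, leaving out locking, upward dependencies and depth checks, could look like this. ToyImage and initializeRecursively are hypothetical names, and the `visited` flag only loosely plays the role of fState.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy image: its dependencies plus a flag used to break cycles.
struct ToyImage {
    std::vector<std::string> dependencies;
    bool visited = false;
};

// Initializes everything `name` depends on before `name` itself, appending each
// image to `order` at the point where its own initializers would run.
void initializeRecursively(std::map<std::string, ToyImage>& images, const std::string& name,
                           std::vector<std::string>& order) {
    assert(!name.empty());
    ToyImage& image = images[name];   // std::map references stay valid across insertions
    if (image.visited) return;        // already initialized, or in progress further up the stack
    image.visited = true;             // mark first so that a cycle back to this image stops here
    for (const std::string& dep : image.dependencies)
        initializeRecursively(images, dep, order);
    order.push_back(name);            // dependencies are done: run this image's initializers
}
```

Because the main executable sits at the top of the graph, it is always initialized last, which is why libraries such as libSystem and libobjc are ready before the app's own +load methods and C++ static initializers run.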

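dyld::notifySingle (beyond my ability for now, not broken down in detail)

Following the call chain, load_images cannot be found inside this function, because load_images is not a dyld function; it belongs to libobjc. What the function does contain is the key callback (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());. The call is guarded by a check that sNotifyObjCInit is not NULL, which means sNotifyObjCInit must be assigned somewhere.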

static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
	//dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
	std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
	if ( handlers != NULL ) {
		dyld_image_info info;
		info.imageLoadAddress	= image->machHeader();
		info.imageFilePath		= image->getRealPath();
		info.imageFileModDate	= image->lastModified();
		for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
			const char* result = (*it)(state, 1, &info);
			if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
				//fprintf(stderr, "  image rejected by handler=%p\n", *it);
				// make copy of thrown string so that later catch clauses can free it
				const char* str = strdup(result);
				throw str;
			}
		}
	}
	if ( state == dyld_image_state_mapped ) {
		// <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
		// <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
		if (!image->inSharedCache()
			|| (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
			dyld_uuid_info info;
			if ( image->getUUID(info.imageUUID) ) {
				info.imageLoadAddress = image->machHeader();
				addNonSharedCacheImageUUID(info);
			}
		}
	}
	if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
		uint64_t t0 = mach_absolute_time();
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
		(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		uint64_t t1 = mach_absolute_time();
		uint64_t t2 = mach_absolute_time();
		uint64_t timeInObjC = t1-t0;
		uint64_t emptyTime = (t2-t1)*100;
		if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
			timingInfo->addTime(image->getShortName(), timeInObjC);
		}
	}
    // mach message csdlc about dynamically unloaded images
	if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
		notifyKernel(*image, false);
		const struct mach_header* loadAddress[] = { image->machHeader() };
		const char* loadPath[] = { image->getPath() };
		notifyMonitoringDyld(true, 1, loadAddress, loadPath);
	}
}

registerObjCNotifiers (related to notifySingle)

Searching this file for sNotifyObjCInit leads to the function registerObjCNotifiers.

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
	// record functions to call
	sNotifyObjCMapped	= mapped;
	sNotifyObjCInit		= init;
	sNotifyObjCUnmapped = unmapped;

	// call 'mapped' function with all images mapped so far
	try {
		notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
	}
	catch (const char* msg) {
		// ignore request to abort during registration
	}

	// <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
	for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
		ImageLoader* image = *it;
		if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
			dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
			(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		}
	}
}
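The pattern in notifySingle and registerObjCNotifiers is a guarded callback with replay: the loader calls the callback only if one is registered, and registration replays it for images that were initialized earlier. A minimal sketch under those assumptions follows. ToyLoader, InitCallback, notifiedImages and toyLoadImages are hypothetical stand-ins for dyld, _dyld_objc_notify_init and libobjc's load_images.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback type standing in for _dyld_objc_notify_init.
using InitCallback = void (*)(const char* path);

class ToyLoader {
public:
    // Called by the loader when an image is about to run its initializers.
    void imageInitialized(const std::string& path) {
        assert(!path.empty());
        initialized_.push_back(path);
        if (onInit_ != nullptr)          // same kind of guard as the sNotifyObjCInit != NULL check
            onInit_(path.c_str());
    }

    // Called by the runtime: remember the callback, then replay it for every
    // image that was initialized before registration happened.
    void registerInitCallback(InitCallback callback) {
        onInit_ = callback;
        if (onInit_ == nullptr) return;
        for (const std::string& path : initialized_)
            onInit_(path.c_str());
    }

private:
    InitCallback onInit_ = nullptr;
    std::vector<std::string> initialized_;
};

// Hypothetical receiver standing in for libobjc's load_images: it just records paths.
std::vector<std::string>& notifiedImages() {
    static std::vector<std::string> paths;
    return paths;
}

void toyLoadImages(const char* path) { notifiedImages().push_back(path); }
```

The replay step is what keeps early images from being missed: anything initialized before the runtime registered itself still gets its callback once registration happens.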

_dyld_objc_notify_register (related to notifySingle)

registerObjCNotifiers receives init as its second parameter, so something must be calling it. A global search for the caller finds:

Global search for registerObjCNotifiers.png Source:

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
	dyld::registerObjCNotifiers(mapped, init, unmapped);
}

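_dyld_objc_notify_register

This function registers the notification callbacks. A global search of the dyld source turns up no caller, so set a symbolic breakpoint on it in the project to see which library calls it. The call comes from libobjc (from _objc_init), which passes load_images as the init callback.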

load_images (beyond my ability for now, not broken down in detail)

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}

image.png

Add symbolic breakpoint.png

App launch flow chart (rough)

Application loading flow chart.png

Appendix

reloadAllImages

All the images loaded during the launch of a basic app

reloadAllImages.png