iOS12 Application Loading: dyld

  • App launch is something every iOS developer has to understand. This article walks through the launch flow to show how an application is loaded.

Official source code

Application loading

  • Loading flow diagram

image.png

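  1. Source files: .c, .m, .cpp, and so on.
  2. Preprocessing: comments are removed, conditional compilation is resolved, headers are expanded, and macros are replaced.
  3. Compilation: lexical analysis, syntax analysis, and generation of the intermediate representation (IR), ending with a .s assembly file.
  4. Assembly: the .s file is translated into machine code, producing a .o object file.
  5. Linking: all .o files and the libraries they link against are combined into a Mach-O executable.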

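Dynamic and static libraries

  • Static library: at link time, the object files produced by the assembler and the referenced library are linked together and packaged into the executable
  • Dynamic library: not linked into the object code at build time; it is loaded only when the program runs
Static library characteristics
  1. The static library is packaged into the executable, so after a successful build the program can run on its own without depending on the external environment
  2. The resulting binary is larger, and whenever the static library is updated the program must be rebuilt
Dynamic library characteristics
  1. Reduces the size of the packaged app
  2. Shared content, shared resources
  3. The program can be updated by updating the dynamic library
  4. The executable cannot run on its own; it depends on the external environment

Static vs. dynamic library diagram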

image.png

Exploring the dyld loading flow

  • Since we want to explore what happens before main, start by setting a breakpoint in main

F9ACF8AD-F72E-4E80-AE0E-14E36D297D9C.png

  • The stack trace shows that execution begins in the start function of libdyld.dylib and then goes straight to main.

  • Set a breakpoint in +load and run

Simulator

96372CCD-4196-4A76-88CC-8594E1D3E02F.png

Flow

  1. _dyld_start
  2. dyldbootstrap::start
  3. dyld::_main
  4. dyld::useSimulatorDyld
  5. start_sim
  6. dyld::_main
  7. dyld::initializeMainExecutable
  8. ImageLoader::runInitializers
  9. ImageLoader::processInitializers
  10. ImageLoader::recursiveInitialization
  11. dyld::notifySingle
  12. load_images
  13. +[ViewController load]

Device

69C58ECA-0A2F-4C97-AAE8-30C87F078C57.png

Flow

  1. _dyld_start
  2. dyldbootstrap::start
  3. dyld::_main
  4. dyld::initializeMainExecutable
  5. ImageLoader::runInitializers
  6. ImageLoader::processInitializers
  7. ImageLoader::recursiveInitialization
  8. dyld::notifySingle
  9. load_images
  10. +[ViewController load]

We focus on the device flow

  • _dyld_start assembly source

6F494F78-1B29-4E9C-86E8-834BD59FB0A1.png

The _dyld_start assembly calls into dyldbootstrap; look at the source

16E3EFD6-250F-4B27-807C-2C4711A1BAF6.png

dyldbootstrap is a namespace, so we can search for the start function in dyldInitialization.cpp

7DB9645E-101C-4BA6-9A9A-1141F102D1F2.png


// rebase dyld
rebaseDyld(dyldsMachHeader);

// set up the stack canary (stack overflow protection)
__guard_setup(apple);

// get the main executable's virtual memory slide
uintptr_t appsSlide = appsMachHeader->getSlide();

// return the value returned by dyld::_main
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);

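  • Rebase dyld: as soon as the app launches, the system applies ASLR and loads images at a random offset. dyld itself must be rebased because it needs to find its own data at the addresses it actually occupies in the current process
  • Call dyld::_main and return its result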
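
To make the idea of a slide concrete, here is a minimal sketch of what rebasing amounts to. It is a toy model, not dyld's rebaseDyld: the ToyImage type and both function names are hypothetical, and real rebasing walks fixup information stored in the Mach-O file. The only point is that every pointer stored inside an image is off by the same amount, the slide, once ASLR moves the image.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical toy model of an image; this is not a dyld type.
struct ToyImage {
    uint64_t preferredBase;              // address the linker assumed at build time
    uint64_t actualBase;                 // address chosen at launch (ASLR)
    std::vector<uint64_t> internalPtrs;  // pointers stored inside the image
};

// The slide is the distance between where the image is and where it expected to be.
int64_t computeSlide(const ToyImage& img) {
    return static_cast<int64_t>(img.actualBase) - static_cast<int64_t>(img.preferredBase);
}

// Rebasing adds the slide to every pointer stored inside the image.
void rebase(ToyImage& img) {
    const int64_t slide = computeSlide(img);
    for (uint64_t& p : img.internalPtrs)
        p = static_cast<uint64_t>(static_cast<int64_t>(p) + slide);
}
```

In this model, an image built for base 0x100000000 but loaded at 0x104a3c000 has a slide of 0x4a3c000, and every stored pointer moves by exactly that amount.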

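dyld::_main source analysis

  • System checks
  • Configuration: get the main executable's Mach-O header and slide
  • Set up the context: store all of these variables in gLinkContext
  • Check whether the process is restricted; for a restricted process the environment variables in envp may change, so the context is set again
  • Load the shared cache, based on the Mach-O header
  • Load frameworks
  • If launching with dyld3:
  1. Launch from the closure, find the main executable's main function, and return it
  • If not launching with dyld3:
  1. Instantiate the main executable: instantiateFromLoadedImage()
  2. Add the main executable to sAllImages
  3. Insert dynamic libraries (if any)
  4. Link the main executable
  5. Link the dynamic libraries
  6. Bind symbols
  7. Run the initializers: initializeMainExecutable()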
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
		int argc, const char* argv[], const char* envp[], const char* apple[], 
		uintptr_t* startGlue)

        // buffer for the main executable's cdHash
	uint8_t mainExecutableCDHashBuffer[20];
        // read the main executable's cdHash from the apple[] parameters
	const uint8_t* mainExecutableCDHash = nullptr;
	if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
		unsigned bufferLenUsed;
		if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
			mainExecutableCDHash = mainExecutableCDHashBuffer;
	}
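        // configuration: collect host info from the main executable's Mach-O header and slide (the ASLR offset)
	getHostInfo(mainExecutableMH, mainExecutableSlide);
        
        // save the header and slide
        // addresses inside the image are resolved by adding the slide
        uintptr_t result = 0;
	sMainExecutableMachHeader = mainExecutableMH;
	sMainExecutableSlide = mainExecutableSlide;
        
        // set up the context: store all of these variables in gLinkContext
        setContext(mainExecutableMH, argc, argv, envp, apple);
        
        // configure process restrictions; envp holds the environment variables
        configureProcessRestrictions(mainExecutableMH, envp);
        // check whether dyld3 should be forced
        // file protection: AMFI (Apple Mobile File Integrity)
        if ( dyld3::internalInstall() ) {
            // body omitted here; lines 6667-6678 of the source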
	}
#if TARGET_OS_OSX
    // for a restricted process the environment variables in envp may change, so set the context again
    if ( !gLinkContext.allowEnvVarsPrint && !gLinkContext.allowEnvVarsPath && !gLinkContext.allowEnvVarsSharedCache ) {
		pruneEnvironmentVariables(envp, &apple);
		setContext(mainExecutableMH, argc, argv, envp, apple);
	}
	else
#endif
	{
                // check the environment variables
		checkEnvironmentVariables(envp);
                // set default values for the fallback paths that are still uninitialized
		defaultUninitializedFallbackPaths(envp);
	}
        
        // if DYLD_PRINT_OPTS or DYLD_PRINT_ENV is set in the project's environment variables, the information is printed before loading
        if ( sEnv.DYLD_PRINT_OPTS )
		printOptions(argv);
	if ( sEnv.DYLD_PRINT_ENV ) 
		printEnvironmentVariables(envp);
                
        // read the Mach-O header to check whether the shared cache (UIKit, Foundation, and so on) can be disabled, before it is loaded
	checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
        
#if !TARGET_OS_SIMULATOR
        // choose the loader: ClosureMode::Off means dyld3 is not used; the else branch is dyld3
	if ( sClosureMode == ClosureMode::Off ) {
		if ( gLinkContext.verboseWarnings )
			dyld::log("dyld: not using closures\n");
	} else {
                // set the launch mode to closure
		sLaunchModeUsed = DYLD_LAUNCH_MODE_USING_CLOSURE;
                // set up the closure that tells dyld3 how to load the main executable
		const dyld3::closure::LaunchClosure* mainClosure = nullptr;
		dyld3::closure::LoadedFileInfo mainFileInfo;
		mainFileInfo.fileContent = mainExecutableMH;
		mainFileInfo.path = sExecPath;
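                // some code omitted
                // first check whether the shared cache already contains a closure for this executable
		if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
                        // look up the closure for sExecPath in the shared cache
			mainClosure = sSharedCacheLoadInfo.loadAddress->findClosure(sExecPath);
                        // if one was found, record that the closure came from the OS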
			if ( mainClosure != nullptr )
				sLaunchModeUsed |= DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
		}
                
                // the closure exists but fails validation
                if ( (mainClosure != nullptr) && !closureValid(mainClosure, mainFileInfo, mainExecutableCDHash, true, envp) ) {
                        // discard it
			mainClosure = nullptr;
			sLaunchModeUsed &= ~DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
		}
                
                // no valid closure was found in the shared cache, so try to find or build one
		if ( (mainClosure == nullptr) && allowClosureRebuilds ) {
			// if forcing closures, and no closure in cache, or it is invalid, check for cached closure
			if ( !sForceInvalidSharedCacheClosureFormat )
				mainClosure = findCachedLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
			if ( mainClosure == nullptr ) {
				// no cached closure either, so build a new one
				mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
				if ( mainClosure != nullptr )
					sLaunchModeUsed |= DYLD_LAUNCH_MODE_BUILT_CLOSURE_AT_LAUNCH;
			}
		}
                // after the lookups above (and a build if nothing was found), mainClosure is settled
                // launch with mainClosure
                bool launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
											  mainExecutableSlide, argc, argv, envp, apple, diag, &result, startGlue, &closureOutOfDate, &recoverable);
                  
                // the launch failed and the closure is out of date
                if ( !launched && closureOutOfDate && allowClosureRebuilds ) {
			// build a new closure
			mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
                        // then launch again with the new closure
                        launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
												 mainExecutableSlide, argc, argv, envp, apple, diag, &result, startGlue, &closureOutOfDate, &recoverable);
                 }
                 // on success, the main executable's main function has been found and its address is returned
                 if ( launched ) {
			gLinkContext.startedInitializingMainExecutable = true;
			if (sSkipMain)
				result = (uintptr_t)&fake_main;
			return result;
		 }
            }
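  // not launching with dyld3
  reloadAllImages:
        // instantiate the main executable
        sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
	gLinkContext.mainExecutable = sMainExecutable;
        // record whether the main executable has a code signature load command
	gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
        
        // load inserted dylibs; in a jailbroken environment DYLD_INSERT_LIBRARIES can be modified to inject libraries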
	if( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
		for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
		loadInsertedDylib(*lib);
	}
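        // link the main executable together with the dylibs it depends on
        link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
        
        // link any inserted libraries
        // do this after linking main executable so that any dylibs pulled in by inserted
        // dylibs (e.g. libSystem) will not be in front of dylibs the program uses
        if ( sInsertedDylibCount > 0 ) {
		for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                        // inserted dylibs sit in sAllImages starting at index i+1, because index 0 is the main executable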
			ImageLoader* image = sAllImages[i+1];
			link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
			image->setNeverUnloadRecursive();
		}
		if ( gLinkContext.allowInterposing ) {
			// only INSERTED libraries can interpose
                        // register interposing info after all inserted libraries are bound so chaining works
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				image->registerInterposing(gLinkContext);
			}
		}
	}
#if SUPPORT_ACCELERATE_TABLES
	if ( (sAllCacheImagesProxy != NULL) && ImageLoader::haveInterposingTuples() ) {
            // reset all images in preparation for reloading
            resetAllImages();
            // interposing is in use, so jump back to reloadAllImages and load everything again
            goto reloadAllImages;
	}
#endif

        // bind the inserted dylibs
	if ( sInsertedDylibCount > 0 ) {
		for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
			ImageLoader* image = sAllImages[i+1];
			image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true, nullptr);
		}
	}
        
        // weak symbol binding
        sMainExecutable->weakBind(gLinkContext);
        
  #if SUPPORT_OLD_CRT_INITIALIZATION
	// Old way is to run initializers via a callback from crt1.o
	if ( ! gRunInitializersOldWay ) 
            initializeMainExecutable(); 
	#else
            // run all initializers
            initializeMainExecutable(); 
	#endif
}
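  • iOS requires the shared cache. The shared cache holds system-level dynamic libraries such as UIKit and CoreFoundation. Dynamic libraries you create yourself and third-party libraries are not placed in the shared cache 87F8F4B0-1363-4C56-8095-D2039E944BFB.png
  • checkSharedRegionDisable checks, per platform, whether the shared cache may be disabled
  • mapSharedCache loads the shared cache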

4E37DCB5-89E4-48F5-8B30-5B753541D31C.png

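  • The source of checkSharedRegionDisable states plainly that iOS cannot run without the shared cache.
  • Next, let's look at how the shared cache is actually loaded. mapSharedCache calls loadDyldCache, which does the loading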

4ABBA91D-A778-4BA7-91A7-2BC19EA9386A.png

bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    results->loadAddress        = 0;
    results->slide              = 0;
    results->errorMessage       = nullptr;

#if TARGET_OS_SIMULATOR
    // simulator only supports mmap()ing cache privately into process
    return mapCachePrivate(options, results);
#else
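    // if forcePrivate is set, the cache is forced private
    // the shared region is not used; the system libraries you need are mapped into the current process and used only by this process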
    if ( options.forcePrivate ) {
        // mmap cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: when cache is already mapped into shared region
        bool hasError = false;
        // if the system libraries you need are already in the shared cache, use them directly; nothing else to do
        if ( reuseExistingCache(options, results) ) {
            hasError = (results->errorMessage != nullptr);
        } else {
        // if this is the first time the shared cache is loaded, load it now
            // slow path: this is first process to load cache
            hasError = mapCacheSystemWide(options, results);
        }
        return hasError;
    }
#endif
}
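The shared cache is loaded in one of three ways
  1. Forced private: forcePrivate = YES. The cache is mapped only into the current app's process, not into the shared region
  2. Shared cache already loaded: if the libraries you depend on are already loaded in the shared cache, they can be used directly with no further work
  3. First load: if the shared cache has not been mapped yet (this is the first process to load it), it is mapped system wide
  • iOS requires the shared cache, which holds only system libraries. The shared cache exists so that multiple processes can share the same system libraries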
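
The three cases can be summarized as a small decision function. This is a sketch of the branching in loadDyldCache only; the CachePath enum and chooseCachePath are hypothetical names, and the two booleans stand in for options.forcePrivate and the result of reuseExistingCache.

```cpp
// Hypothetical labels for the three paths taken by loadDyldCache.
enum class CachePath { Private, ReuseExisting, MapSystemWide };

// forcePrivate stands in for options.forcePrivate;
// alreadyMapped stands in for reuseExistingCache() succeeding.
CachePath chooseCachePath(bool forcePrivate, bool alreadyMapped) {
    if (forcePrivate)
        return CachePath::Private;        // mmap the cache into this process only
    if (alreadyMapped)
        return CachePath::ReuseExisting;  // fast path: cache is already in the shared region
    return CachePath::MapSystemWide;      // slow path: first process to load the cache
}
```

Note that forcePrivate wins even when the cache is already mapped, which matches the order of the checks in the listing above. On the simulator the source always takes the private path.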

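dyld3 and dyld2

  • dyld3 is also called closure mode; it loads faster and more efficiently. Since iOS 11 system apps have been launched with dyld3, and since iOS 13 third-party apps are launched with dyld3 as well
  // decide whether to launch in closure mode, that is, dyld3: with closures enabled dyld3 is used, with ClosureMode::Off dyld2 is used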
  if ( sClosureMode == ClosureMode::Off ) {
    //dyld2
    if ( gLinkContext.verboseWarnings )
            dyld::log("dyld: not using closures\n");
  } else {
    // dyld3: DYLD_LAUNCH_MODE_USING_CLOSURE means closure mode
    sLaunchModeUsed = DYLD_LAUNCH_MODE_USING_CLOSURE;
    const dyld3::closure::LaunchClosure* mainClosure = nullptr;
    dyld3::closure::LoadedFileInfo mainFileInfo;
    mainFileInfo.fileContent = mainExecutableMH;
    mainFileInfo.path = sExecPath;
    ...
    // first look in the shared cache for a dyld3 mainClosure
    if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
            mainClosure = sSharedCacheLoadInfo.loadAddress->findClosure(sExecPath);
            ...
    }
 
   ...
    // if the shared cache has one, verify that the closure is still valid
    if ( (mainClosure != nullptr) && !closureValid(mainClosure, mainFileInfo, 
    mainExecutableCDHash, true, envp) ) {
            mainClosure = nullptr;
            sLaunchModeUsed &= ~DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
    }
    
    bool allowClosureRebuilds = false;
    if ( sClosureMode == ClosureMode::On ) {
            allowClosureRebuilds = true;
    } 
    ...
    
    // no valid closure was found in the shared cache, so one is built automatically
    if ( (mainClosure == nullptr) && allowClosureRebuilds ) {
        ...
        if ( mainClosure == nullptr ) { 
        // build a mainClosure
        mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, 
        bootToken);
        if ( mainClosure != nullptr )
                sLaunchModeUsed |= DYLD_LAUNCH_MODE_BUILT_CLOSURE_AT_LAUNCH;
        }
    }
   
    // try using launch closure
    // dyld3 launch begins
    if ( mainClosure != nullptr ) {
        CRSetCrashLogMessage("dyld3: launch started");
        ...
        // launch with launchWithClosure
        bool launched = launchWithClosure(mainClosure, 
        sSharedCacheLoadInfo.loadAddress,(dyld3::MachOLoaded*)mainExecutableMH,...);
         // launch failed
        if ( !launched && closureOutOfDate && allowClosureRebuilds ) {
                // closure is out of date, build new one
                // the launch failed, so build mainClosure again
                mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, 
                envp, bootToken);
                if ( mainClosure != nullptr ) {
                    ...
                    // launch with dyld3 again
                    launched = launchWithClosure(mainClosure,  sSharedCacheLoadInfo.loadAddress,
                    (dyld3::MachOLoaded*)mainExecutableMH,...);
                }
            }
            if ( launched ) {
                    gLinkContext.startedInitializingMainExecutable = true;
                    if (sSkipMain)
                    // launch succeeded: result (the address of main, or fake_main when sSkipMain is set) is returned
                    result = (uintptr_t)&fake_main;
                    return result;
            }
            else {  
            // launch failed
            }
    }
}
  • A dyld3 launch is attempted several times with fallbacks along the way, so in practice it rarely fails

  • If dyld3 is not used, dyld2 mode is used instead

{
// could not use closure info, launch old way
	// use dyld2 mode rather than dyld3
	sLaunchModeUsed = 0;
	// install gdb notifier
	stateToHandlers(dyld_image_state_dependents_mapped, sBatchHandlers)->push_back(notifyGDB);
	stateToHandlers(dyld_image_state_mapped, sSingleHandlers)->push_back(updateAllImages);
	// make initial allocations large enough that it is unlikely to need to be re-alloced
	sImageRoots.reserve(16);
	sAddImageCallbacks.reserve(4);
	sRemoveImageCallbacks.reserve(4);
	sAddLoadImageCallbacks.reserve(4);
	sImageFilesNeedingTermination.reserve(16);
	sImageFilesNeedingDOFUnregistration.reserve(8);

#if !TARGET_OS_SIMULATOR
#ifdef WAIT_FOR_SYSTEM_ORDER_HANDSHAKE
	// ... file generation process
	WAIT_FOR_SYSTEM_ORDER_HANDSHAKE(dyld::gProcessInfo->systemOrderFlag);
#endif
#endif
	try {
		// add dyld itself to UUID list
		addDyldImageToUUIDList();
                ...
 }


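dyld3 launch flow

  1. Look up the dyld3 closure instance, mainClosure, in the shared cache
  2. Verify that mainClosure is valid
  3. If it is not, look for a valid cached mainClosure; if one exists, launch with it directly
  4. If there is none, build a mainClosure
  5. Launch with mainClosure, which starts dyld3
  6. Once the launch succeeds, the main executable is up; result is the address of main and is returned to dyldbootstrap::start
  7. Execution then enters main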
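
The lookup order in steps 1 to 4 can be sketched as a pure function. Everything here is a simplified model: ToyLaunchInputs and pickClosureSource are hypothetical names, each boolean stands in for the result of one real call (named in the comments), and the retry after an out-of-date closure is left out.

```cpp
#include <string>

// Hypothetical summary of what each lookup returned during one launch.
struct ToyLaunchInputs {
    bool sharedCacheHasClosure;    // findClosure() returned non-null
    bool sharedCacheClosureValid;  // closureValid() passed
    bool allowClosureRebuilds;     // rebuilding is permitted in this mode
    bool cachedClosureFound;       // findCachedLaunchClosure() returned non-null
    bool buildSucceeds;            // buildLaunchClosure() returned non-null
};

// Returns which closure the launch would use, or the dyld2 fallback.
std::string pickClosureSource(const ToyLaunchInputs& in) {
    if (in.sharedCacheHasClosure && in.sharedCacheClosureValid)
        return "closure from shared cache";
    if (in.allowClosureRebuilds) {
        if (in.cachedClosureFound)
            return "cached closure";
        if (in.buildSucceeds)
            return "closure built at launch";
    }
    return "dyld2 fallback";  // no usable closure, launch the old way
}
```

The fallback branch corresponds to the "could not use closure info, launch old way" block in the source.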

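Instantiating the main executable

  • dyld3 and dyld2 go through the same overall flow; dyld3 just uses closure mode.
  • The word image appears all over the source; it refers to an image file
  • An image file is a Mach-O file mapped from disk into memory; in other words, a Mach-O file that has been loaded into memory is called an image
		// instantiate ImageLoader for main executable
                // instantiate the main executable
		sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
		gLinkContext.mainExecutable = sMainExecutable;
                // record whether the main executable is code signed
		gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);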


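Instantiating the main executable means loading the parts of it that dyld needs into memory. instantiateMainExecutable returns an instance of type ImageLoader, and dyld then records whether the main executable carries a code signature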

static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
	// try mach-o loader
//	if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
		ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
		addImage(image);
		return (ImageLoaderMachO*)image;
//	}
	
//	throw "main executable not a known format";
}

The instantiated image is added to the image array; the main executable's image is the first one added. Next, look at ImageLoaderMachO::instantiateMainExecutable

ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
	//dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
	//	sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
	bool compressed;
	unsigned int segCount;// number of segments
	unsigned int libCount;// number of dependent dylibs
	const linkedit_data_command* codeSigCmd;// code signature info
	const encryption_info_command* encryptCmd;// encryption info
        // parse the load commands
	sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
	// instantiate concrete class based on content of load commands
	if ( compressed ) // pick the initialization path accordingly
		return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
	else
#if SUPPORT_CLASSIC_MACHO
		return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
		throw "missing LC_DYLD_INFO load command";
#endif
}

	// fSegmentsArrayCount is only 8-bits
        // at most 255 segments
	if ( *segCount > 255 )
		dyld::throwf("malformed mach-o image: more than 255 segments in %s", path);

	// fSegmentsArrayCount is only 8-bits
        // at most 4095 dependent dylibs
	if ( *libCount > 4095 )
		dyld::throwf("malformed mach-o image: more than 4095 dependent libraries in %s", path);
               

        // make sure libSystem is counted as a dependency
	if ( needsAddedLibSystemDepency(*libCount, mh) )
		*libCount = 1;
  • sniffLoadCommands reads the segment and load command information and performs some validation
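
The counting and the two limits can be sketched without real Mach-O headers. In this toy version the load commands are reduced to a list of command IDs; ToyCounts and sniff are hypothetical names, while the two constants carry the values defined in <mach-o/loader.h>. The real sniffLoadCommands walks the actual load command structures and checks far more than this.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Load command IDs with the values defined in <mach-o/loader.h>.
constexpr uint32_t kLC_SEGMENT_64 = 0x19;
constexpr uint32_t kLC_LOAD_DYLIB = 0xC;

// Hypothetical result type for this sketch.
struct ToyCounts {
    unsigned segCount = 0;
    unsigned libCount = 0;
};

// Counts segments and dependent dylibs, then applies the same limits as the excerpt above.
ToyCounts sniff(const std::vector<uint32_t>& commandIds) {
    ToyCounts counts;
    for (uint32_t cmd : commandIds) {
        if (cmd == kLC_SEGMENT_64)
            ++counts.segCount;
        else if (cmd == kLC_LOAD_DYLIB)
            ++counts.libCount;
    }
    if (counts.segCount > 255)
        throw std::runtime_error("malformed mach-o image: more than 255 segments");
    if (counts.libCount > 4095)
        throw std::runtime_error("malformed mach-o image: more than 4095 dependent libraries");
    return counts;
}
```

Other command IDs are simply skipped here, so a list such as two segments, one dylib and one unrelated command yields segCount 2 and libCount 1.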

Inserting dynamic libraries

		// load any inserted libraries
                // load the dylibs listed in DYLD_INSERT_LIBRARIES
		if	( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
			for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
				loadInsertedDylib(*lib);
		}
		// record count of inserted libraries so that a flat search will look at 
		// inserted libraries, then main, then others.    
                // number of inserted dylibs; the first image is the main executable, hence the -1
		sInsertedDylibCount = sAllImages.size()-1;

Linking the main executable

		// link main executable
                // start linking the main executable
		gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
		if ( mainExcutableAlreadyRebased ) {
			// previous link() on main executable has already adjusted its internal pointers for ASLR
			// work around that by rebasing by inverse amount
                        // the main executable was already rebased by an earlier link(), so rebase by the inverse amount to undo it
			sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
		}
#endif
                // link the main executable
		link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
		sMainExecutable->setNeverUnloadRecursive();
		if ( sMainExecutable->forceFlat() ) {
			gLinkContext.bindFlat = true;
			gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
		}
  • link calls ImageLoader::link. ImageLoader is responsible for loading image files (the main executable and dynamic libraries); each image corresponds to one instance of the ImageLoader class
void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool 
preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath) {
	 
	...
	// recursively load all dependent dylibs
	this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
	context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);
	 ...
	__block uint64_t t2, t3, t4, t5;
	{
		dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
		t2 = mach_absolute_time();
		// recursive rebase
		this->recursiveRebaseWithAccounting(context);
		context.notifyBatch(dyld_image_state_rebased, false);

		t3 = mach_absolute_time();
		if ( !context.linkingMainExecutable )
		     // recursively bind non-lazy symbols
		     this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

		t4 = mach_absolute_time();
		if ( !context.linkingMainExecutable )
			// weak binding
			this->weakBind(context);
		t5 = mach_absolute_time();
	}

	... 
        // the time spent linking dylibs is measured along the way and can be printed by setting an environment variable
}

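  • Main steps of link
  1. Recursively load all dynamic libraries
  2. Recursively rebase the images
  3. Recursively bind non-lazy symbols
  4. Weak bind
  • recursiveLoadLibraries loads the dynamic libraries recursively; let's see how the recursion works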
void ImageLoader::recursiveLoadLibraries(const LinkContext& context, bool 
preflightOnly, const RPathChain& loaderRPaths, const char* loadPath){
    ...
    // get list of libraries this image needs
    // get the dylibs this image depends on
    DependentLibraryInfo libraryInfos[fLibraryCount]; 
    this->doGetDependentLibraries(libraryInfos);

    // get list of rpaths that this image adds
    // get the rpaths (run-path search paths) this image adds, used to locate its dylibs
    std::vector<const char*> rpathsFromThisImage;
    this->getRPaths(context, rpathsFromThisImage);
    const RPathChain thisRPaths(&loaderRPaths, &rpathsFromThisImage);

    // load the dylibs this image depends on
    for(unsigned int i=0; i < fLibraryCount; ++i){
      ...
      dependentLib = context.loadLibrary(requiredLibInfo.name, true, this->getPath(),
      &thisRPaths, cacheIndex);
      // record the loaded dylib
      setLibImage(i, dependentLib, depLibReExported, requiredLibInfo.upward);	 
      ...
    }

    // tell each dependent dylib to load the dylibs it needs in turn
    for(unsigned int i=0; i < libraryCount(); ++i) {
            ImageLoader* dependentImage = libImage(i);
            if ( dependentImage != NULL ) {
                 dependentImage->recursiveLoadLibraries(context, preflightOnly,
                 thisRPaths, libraryInfos[i].name);
            }
    }
}

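  • Main steps of loading dynamic libraries
  1. Get the dylibs the current image depends on and the paths used to locate them
  2. Load the dylibs the image depends on and record them
  3. Tell each of those dylibs to load the dylibs it needs in turn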
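
The three steps can be sketched over a plain dependency map. This is a toy model, assuming library names in place of ImageLoader objects; DepGraph and recursiveLoad are hypothetical names, and rpath resolution is left out entirely. It follows the same two-pass shape as the source: first load all direct dependencies, then recurse into each of them.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency graph: image name -> names of the dylibs it links against.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Pass 1 loads the direct dependencies of `image` (each library only once);
// pass 2 asks every newly loaded dependency to load its own dependencies.
void recursiveLoad(const DepGraph& deps, const std::string& image,
                   std::set<std::string>& loaded, std::vector<std::string>& loadOrder) {
    auto it = deps.find(image);
    if (it == deps.end())
        return;  // no dependencies recorded for this image
    std::vector<std::string> newlyLoaded;
    for (const std::string& lib : it->second) {
        if (loaded.insert(lib).second) {
            loadOrder.push_back(lib);
            newlyLoaded.push_back(lib);
        }
    }
    for (const std::string& lib : newlyLoaded)
        recursiveLoad(deps, lib, loaded, loadOrder);
}
```

Because direct dependencies are loaded before the recursion starts, the load order is by level first, then depth, which differs from the initialization order discussed later.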

Linking dynamic libraries

		if ( sInsertedDylibCount > 0 ) {
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
                                // link the dylib; a dylib may itself depend on other dylibs
				link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
				image->setNeverUnloadRecursive();
			}
			if ( gLinkContext.allowInterposing ) {
				// only INSERTED libraries can interpose
				// register interposing info after all inserted libraries are bound so chaining works
				for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
					ImageLoader* image = sAllImages[i+1];
					image->registerInterposing(gLinkContext);
				}
			}
		}

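  • Linking the inserted dylibs follows essentially the same logic as linking the main executable. Images are taken from sAllImages starting at index 1, because index 0 is the main executable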

B6458C69-04A7-4F15-B3C7-FFA71E4AD952.png

In the image list we can see that the first entry is the main executable
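
The i+1 indexing is easy to get wrong, so here is a minimal sketch of it. The function name and the use of strings instead of ImageLoader pointers are assumptions for illustration; the model only holds for the moment right after the inserted dylibs are loaded, when the list contains the main executable followed by the inserted dylibs and nothing else.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: given a snapshot of the image list taken right after
// DYLD_INSERT_LIBRARIES was processed, return the inserted dylibs.
std::vector<std::string> insertedDylibs(const std::vector<std::string>& allImages) {
    std::vector<std::string> inserted;
    if (allImages.empty())
        return inserted;
    const std::size_t insertedCount = allImages.size() - 1;  // index 0 is the main executable
    for (std::size_t i = 0; i < insertedCount; ++i)
        inserted.push_back(allImages[i + 1]);                 // same i+1 indexing as the dyld loop
    return inserted;
}
```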

Weak binding the main executable

		sMainExecutable->weakBind(gLinkContext);
		gLinkContext.linkingMainExecutable = false;

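  • While the main executable is being linked, linkingMainExecutable = true, so the weak binding inside link is skipped for the main executable. Once the dylibs have all been weak bound, the main executable is weak bound last

Running initializers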

	#if SUPPORT_OLD_CRT_INITIALIZATION
		// Old way is to run initializers via a callback from crt1.o
		if ( ! gRunInitializersOldWay ) 
			initializeMainExecutable(); 
	#else
		// run all initializers
		initializeMainExecutable(); 
	#endif

Returning the main function


	if (sSkipMain) {
		notifyMonitoringDyldMain();
		if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
			dyld3::kdebug_trace_dyld_duration_end(launchTraceID, DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, 0, 0, 2);
		}
		ARIADNEDBG_CODE(220, 1);
                // the address returned as main; with sSkipMain set it is fake_main rather than the program's real main
		result = (uintptr_t)&fake_main;
		*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
	}

	return result;

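Once the address of main has been obtained, execution enters main

Recap of the dyld loading flow

  1. dyld::_main
  2. Configure environment variables
  3. Load the shared cache
  4. Instantiate the main executable
  5. Insert dynamic libraries
  6. Link the main executable
  7. Link the dynamic libraries
  8. Weak bind the main executable
  9. Run initializers
  10. Return the main function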

initializeMainExecutable

void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initialzers for any inserted dylibs
    // run the initializers of all inserted dylibs
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    // run the dylibs' initializers first
    if ( rootCount > 1 ) {
            for(size_t i=1; i < rootCount; ++i) {
               sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
            }
    }

    // run initializers for main executable and everything it brings up 
    // run the main executable's initializers
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);

    ...
}
  • The dylibs' initializers run first, then the main executable's initializers

  • The runInitializers method


void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
	uint64_t t1 = mach_absolute_time();
	mach_port_t thisThread = mach_thread_self();
	ImageLoader::UninitedUpwards up;
	up.count = 1;
	up.imagesAndPaths[0] = { this, this->getPath() };
        // recursively initialize this image and the images it depends on
	processInitializers(context, thisThread, timingInfo, up);
	context.notifyBatch(dyld_image_state_initialized, false);
	mach_port_deallocate(mach_task_self(), thisThread);
	uint64_t t2 = mach_absolute_time();
	fgTotalInitTime += (t2 - t1);
}

processInitializers

void ImageLoader::processInitializers(const LinkContext& context, mach_port_t 
thisThread,InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
	uint32_t maxImageCount = context.imageCount()+2;
	ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
	ImageLoader::UninitedUpwards& ups = upsBuffer[0];
	ups.count = 0;
	// Calling recursive init on all images in images list, building a new list of
	// uninitialized upward dependencies.
        // recursively visit every image in the list and initialize any that are not yet initialized
	for (uintptr_t i=0; i < images.count; ++i) {
		images.imagesAndPaths[i].first->recursiveInitialization(context, 
                thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
	}
	// If any upward dependencies remain, init them.
        // to make sure every upward dependency is initialized, run again over the images that were deferred
	if ( ups.count > 0 )
		processInitializers(context, thisThread, timingInfo, ups);
}

recursiveInitialization
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t 
this_thread, const char* pathToInitialize,
InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
	recursive_lock lock_info(this_thread);
	recursiveSpinLock(lock_info);
        ...
        // initialize lower level libraries first
         // initialize the deepest dependencies first
        for(unsigned int i=0; i < libraryCount(); ++i) {
            ImageLoader* dependentImage = libImage(i);
            if ( dependentImage != NULL ) {
                // don't try to initialize stuff "above" me yet
                if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, 
                        libPath(i) };
                        uninitUps.count++;
                }
                else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, 
                        libPath(i), timingInfo, uninitUps);
                }
           }
        }
			
        // this image is about to be initialized
        uint64_t t1 = mach_absolute_time();
        fState = dyld_image_state_dependents_initialized;
        oldState = fState;
        context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

        // initialize this image
        // run this image's initializers
        bool hasInitializers = this->doInitialization(context);
        // this image has finished initializing
        // let anyone know we finished initializing this image
        fState = dyld_image_state_initialized;
        oldState = fState;
        context.notifySingle(dyld_image_state_initialized, this, NULL);

	... 
	 
	recursiveSpinUnLock();
}
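  1. The dylib images to initialize come from libImage(), and the data behind libImage() is what setLibImage stored in recursiveLoadLibraries while the dylibs were being loaded
  2. Libraries are initialized according to their dependency depth: the deepest one is initialized first, and each initialization handles one image
  3. Each image calls context.notifySingle, which calls load_images, which in turn calls the +load methods
  4. doInitialization runs the image's own initializers, after the libraries it depends on have been initialized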
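
The ordering described above is a post-order walk of the dependency graph. The sketch below models it with library names; ToyDeps and recursiveInit are hypothetical names, and upward dependencies, locking, and depth checks are all left out. It records the three events per image in the order recursiveInitialization produces them.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency graph: image name -> names of the dylibs it depends on.
using ToyDeps = std::map<std::string, std::vector<std::string>>;

// Dependencies are initialized before the image itself (post-order).
void recursiveInit(const ToyDeps& deps, const std::string& image,
                   std::set<std::string>& visited, std::vector<std::string>& events) {
    if (!visited.insert(image).second)
        return;  // already initialized or in progress
    auto it = deps.find(image);
    if (it != deps.end()) {
        for (const std::string& dep : it->second)
            recursiveInit(deps, dep, visited, events);
    }
    events.push_back("notify dependents_initialized: " + image);  // +load methods run from here
    events.push_back("doInitialization: " + image);               // static initializers run here
    events.push_back("notify initialized: " + image);
}
```

For a chain App -> libA -> libSystem, all three events for libSystem come first and all three for App come last, and within one image the +load notification precedes its static initializers.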
notifySingle
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
	//dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
	std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
	if ( handlers != NULL ) {
		dyld_image_info info;
		info.imageLoadAddress	= image->machHeader();
		info.imageFilePath		= image->getRealPath();
		info.imageFileModDate	= image->lastModified();
		for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
			const char* result = (*it)(state, 1, &info);
			if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
				//fprintf(stderr, "  image rejected by handler=%p\n", *it);
				// make copy of thrown string so that later catch clauses can free it
				const char* str = strdup(result);
				throw str;
			}
		}
	}
	if ( state == dyld_image_state_mapped ) {
		// <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
		// <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
		if (!image->inSharedCache()
			|| (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
			dyld_uuid_info info;
			if ( image->getUUID(info.imageUUID) ) {
				info.imageLoadAddress = image->machHeader();
				addNonSharedCacheImageUUID(info);
			}
		}
	}
	if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
		uint64_t t0 = mach_absolute_time();
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
		(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		uint64_t t1 = mach_absolute_time();
		uint64_t t2 = mach_absolute_time();
		uint64_t timeInObjC = t1-t0;
		uint64_t emptyTime = (t2-t1)*100;
		if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
			timingInfo->addTime(image->getShortName(), timeInObjC);
		}
	}
    // mach message csdlc about dynamically unloaded images
	if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
		notifyKernel(*image, false);
		const struct mach_header* loadAddress[] = { image->machHeader() };
		const char* loadPath[] = { image->getPath() };
		notifyMonitoringDyld(true, 1, loadAddress, loadPath);
	}
}
Where sNotifyObjCInit is assigned
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
	// record functions to call
	sNotifyObjCMapped	= mapped;
        // assigned here
	sNotifyObjCInit		= init;
	sNotifyObjCUnmapped = unmapped;

	// call 'mapped' function with all images mapped so far
	try {
		notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
	}
	catch (const char* msg) {
		// ignore request to abort during registration
	}

	// <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
	for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
		ImageLoader* image = *it;
		if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
			dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
			(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		}
	}
}

_dyld_objc_notify_register

截屏2021-09-03 下午7.55.25.png

  1. doInitialization --> doModInitFunctions --> libSystem_initializer --> libdispatch_init --> _os_object_init --> _objc_init --> _dyld_objc_notify_register --> registerObjCNotifiers
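  2. libSystem_initializer is in the libSystem system library
  3. libdispatch_init and _os_object_init are in the libdispatch system library
  4. _objc_init is in the libobjc system library, that is, the objc source
  • Next, find out what *sNotifyObjCInit actually executes. Start with where it is assigned: _objc_init calls _dyld_objc_notify_register, so the value is passed in from _objc_init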

14AA6E31-AE98-4FFD-9457-3EDC897C7A6C.png

  • sNotifyObjCInit = load_images, so invoking sNotifyObjCInit is really a call to load_images. Let's look at load_images next
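
The relationship between the two libraries is a plain callback registration. The sketch below models it with hypothetical names throughout (toyRegisterNotifier for registerObjCNotifiers, toyLoadImages for load_images, toyNotifySingle for the dependents_initialized branch of notifySingle); it is not the real API and it omits the mapped and unmapped callbacks.

```cpp
#include <string>
#include <vector>

// Hypothetical callback type, standing in for _dyld_objc_notify_init.
using ToyInitCallback = void (*)(const std::string& imagePath);

static ToyInitCallback gNotifyInit = nullptr;  // plays the role of sNotifyObjCInit
static std::vector<std::string> gLoadLog;      // records what the callback did

// The loader side: store the callback handed over by the runtime.
static void toyRegisterNotifier(ToyInitCallback init) {
    gNotifyInit = init;
}

// The runtime side: what gets called for each image (stands in for load_images).
static void toyLoadImages(const std::string& imagePath) {
    gLoadLog.push_back("+load methods in " + imagePath);
}

// The loader side again: notify only if a callback has been registered.
static void toyNotifySingle(const std::string& imagePath) {
    if (gNotifyInit != nullptr)
        gNotifyInit(imagePath);
}
```

Before registration a notification does nothing, which is why the order of initialization matters: the runtime has to register its callback before the images that contain +load methods reach this point.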

3CB3C8A5-94B5-425E-84E6-1DAC314ABC27.png

  • As its name suggests, call_load_methods calls the +load methods. Let's look at call_load_methods next

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}
  • call_class_loads(void)
static void call_class_loads(void)
{
    int i;
    
    // Detach current loadable list.
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    loadable_classes_used = 0;
    
    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue; 

        if (PrintLoading) {
            _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
        }
        (*load_method)(cls, @selector(load));
    }
    
    // Destroy the detached list.
    if (classes) free(classes);
}

61384EC1-1E36-4A79-8A6A-331F1BA4D8B6.png

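  • Both classes and categories have their +load methods called. From the call order here we can draw the following conclusions
  1. A class's +load runs before a category's +load; category +load methods start only after the class +load methods have finished
  2. Class +load methods run in compile order: whichever class is compiled first has its +load called first
  3. Category +load methods run in compile order: whichever category is compiled first has its +load called first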
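
The loop structure of call_load_methods can be modeled with two queues. This is a sketch under simplifying assumptions: ToyLoadQueues and callLoadMethods are hypothetical names, the queues are assumed to already be in compile order, and nothing is added to them while the loop runs, so the outer loop makes a single pass.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for the runtime's loadable class and category lists.
struct ToyLoadQueues {
    std::deque<std::string> classes;     // assumed to be in compile order
    std::deque<std::string> categories;  // assumed to be in compile order
};

// Mirrors the shape of call_load_methods: drain all classes, then run categories once,
// and repeat while anything is still queued.
std::vector<std::string> callLoadMethods(ToyLoadQueues queues) {
    std::vector<std::string> calls;
    do {
        while (!queues.classes.empty()) {
            calls.push_back("+[" + queues.classes.front() + " load]");
            queues.classes.pop_front();
        }
        std::deque<std::string> detached;
        detached.swap(queues.categories);  // detach the list before calling, like the runtime does
        for (const std::string& category : detached)
            calls.push_back("+[" + category + " load]");
    } while (!queues.classes.empty() || !queues.categories.empty());
    return calls;
}
```

With classes Person and ViewController and one category Person(Extra), the calls come out as Person, ViewController, then Person(Extra), matching conclusion 1 above.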

The _objc_init flow

Work backwards from _objc_init to reconstruct the whole flow. _objc_init is called by _os_object_init; search the libdispatch source globally for _os_object_init

  • _os_object_init

void _os_object_init(void)
{
    _objc_init();
    Block_callbacks_RR callbacks = {
        sizeof(Block_callbacks_RR),
        (void (*)(const void *))&objc_retain,
        (void (*)(const void *))&objc_release,
        (void (*)(const void *))&_os_objc_destructInstance
    };
    _Block_use_RR2(&callbacks);
#if DISPATCH_COCOA_COMPAT
    const char *v = getenv("OBJC_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("DISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("LIBDISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
#endif
}

  • _os_object_init does call _objc_init. _os_object_init is in turn called by libdispatch_init; let's verify that next

image.png

  • libdispatch_init does call _os_object_init. libdispatch_init is called by libSystem_initializer, which is in the libSystem system library

image.png

  • libSystem_initializer does call libdispatch_init. libSystem_initializer is called by doModInitFunctions, which is in the dyld source

image.png

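libSystem's initializer must run first, before any others. doModInitFunctions calls all the C++ initializer functions

  • doModInitFunctions calls the global C++ initializers, and it runs after the +load methods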
bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
	CRSetCrashLogMessage2(this->getPath());

	// mach-o has -init and static initializers
	doImageInit(context);
	doModInitFunctions(context);
	
	CRSetCrashLogMessage2(NULL);
	
	return (fHasDashInit || fHasInitializers);
}
  • recursiveInitialization calls doInitialization, which brings us back into the recursive method, so everything connects F6446030-E862-417F-BD2D-E6A3C49CEB38.png

Call flow of the +load method

  • _dyld_start --> dyldbootstrap::start --> dyld::_main --> initializeMainExecutable --> runInitializers --> processInitializers --> recursiveInitialization --> notifySingle --> load_images --> +[ViewController load]

Call flow of _objc_init

  • doInitialization --> doModInitFunctions --> libSystem_initializer --> libdispatch_init --> _os_object_init --> _objc_init --> _dyld_objc_notify_register --> registerObjCNotifiers

The two call flows meet through doInitialization and notifySingle and together form one complete flow