9.iOS底层探索之APP加载流程(DYLD)

1,507 阅读18分钟

我们都知道main函数是程序启动的入口函数,但是main函数之前还会调用一些其他的函数,比如+ (void)load函数,+ (void)initialize函数,__attribute__((constructor)) void func(){}构造函数(c++的)等等,这里我们来探究main函数之前到底做了什么?

一、静态库和动态库

1.编译过程

  • 预编译:主要是宏替换,导入的头文件替换成头文件里面的代码,将#开头的预编译指令展开,比如#define,#include,#import
  • 编译:对预编译处理过的文件进行词法分析语法分析语义分析,并进行源代码优化,然后生成汇编代码;
  • 汇编:通过汇编器将汇编代码转换为机器可以执行的指令,并生成目标文件.o文件
  • 链接:将目标文件和用到的静态库动态库链接成可执行文件。

2.静态库

静态库:链接时完整的拷贝到可执行文件,会多次拷贝,造成冗余,使包变的更大 如.a.lib都是静态库

3.动态库

3.1 动态库:链接时不复制,程序运行时由系统加在到内存中,供系统调用,系统只需加载一次,多次使用,共用节省内存。

.dylib.framework都是动态库 系统的framework是动态的,开发者创建的framework是静态的

3.2 优势:
  • 减少APP包的体积
  • 共享内容,节约资源
  • 更新动态库,达到更新APP的目的
3.3 常见的动态库
  • UIKit
  • Foundation
  • libsystem
  • libDispatch
  • libobjc

二、dylddyld_shared_cache

1.dyld简介

dyld(The dynamic link editor)是苹果的动态链接器,负责程序的链接及加载工作,是苹果操作系统的重要组成部分,存在于MacOS系统的(/usr/lib/dyld)目录下。在应用被编译打包成可执行文件格式的Mach-O文件之后 ,交由dyld负责链接,加载程序

2.dyld_shared_cache

由于不止一个程序需要使用UIKit系统动态库,所以不可能在每个程序加载时都去加载所有的系统动态库。为了优化程序启动速度和利用动态库缓存,苹果从iOS3.1之后,将所有系统库(私有与公有)编译成一个大的缓存文件,这就是dyld_shared_cache,该缓存文件存在iOS系统下的/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arn64目录下,这个文件大约1.61G大小

三、dyld加载流程

1.初探

我们知道loadmain函数之前,给它下一个断点,查看函数调用栈

发现第一个调用的是_dyld_start,通过dyld源码来探索

2._dyld_start

在源码中全局搜索_dyld_start,在dyldStartup.s找到,它是由汇编实现的,查看arm64的源码

在末尾又看到了熟悉的字眼dyldbootstrap::start

3.dyldbootstrap::start

发现中间有2个冒号,这是c++的命名空间,即startdyldbootstrap这个命名空间作用域里函数 全局搜索dyldbootstrap,找到namespace dyldbootstrapctrl +shift+j,快速定位文件,在文件中找到start函数。

//
//  This is code to bootstrap dyld.  This work in normally done for a program by dyld and crt.
//  In dyld we have to do this manually.
//
uintptr_t start(const struct macho_header* appsMachHeader, int argc, const char* argv[], 
				intptr_t slide, const struct macho_header* dyldsMachHeader,
				uintptr_t* startGlue)
{
	// if kernel had to slide dyld, we need to fix up load sensitive locations
	// we have to do this before using any global variables
	//计算出偏移值
    slide = slideOfMainExecutable(dyldsMachHeader);
    bool shouldRebase = slide != 0;
#if __has_feature(ptrauth_calls)
    shouldRebase = true;
#endif
    if ( shouldRebase ) {
        //重定向
        rebaseDyld(dyldsMachHeader, slide);
    }

	// allow dyld to use mach messaging
	//初始化消息,允许dyld使用mach消息传递
	mach_init();

	// kernel sets up env pointer to be just past end of agv array
	const char** envp = &argv[argc+1];
	
	// kernel sets up apple pointer to be just past end of envp array
	const char** apple = envp;
	while(*apple != NULL) { ++apple; }
	++apple;

	// set up random value for stack canary
	//栈溢出保护
	__guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
	// run all C++ initializers inside dyld
	runDyldInitializers(dyldsMachHeader, slide, argc, argv, envp, apple);
#endif

	// now that we are done bootstrapping dyld, call dyld's main
	uintptr_t appsSlide = slideOfMainExecutable(appsMachHeader);
	return dyld::_main(appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
  • 根据slideOfMainExecutable计算出slide, 通过slide判定是否需要重定位;这里的slide是根据ASLR技术 计算出的一个随机值,使得程序每一次运行的偏移值都不一样,防止攻击者通过固定地址发起恶意攻击
  • rebaseDyld进行重定向
  • mach_init初始化消息,允许dyld使用mach消息传递
  • __guard_setup栈溢出保护
  • 计算appsMachHeader的偏移,调用dyld::_main函数

4.dyld::_main

查看源码
//
// Entry point for dyld.  The kernel loads dyld and jumps to __dyld_start which
// sets up some registers and call this function.
//
// Returns address of main() in target program which __dyld_start jumps to
//
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
		int argc, const char* argv[], const char* envp[], const char* apple[], 
		uintptr_t* startGlue)
{
	if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
		launchTraceID = dyld3::kdebug_trace_dyld_duration_start(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, (uint64_t)mainExecutableMH, 0, 0);
	}

    // Grab the cdHash of the main executable from the environment
    ///🌹从环境中获取主可执行文件的cdHash
	uint8_t mainExecutableCDHashBuffer[20];
	const uint8_t* mainExecutableCDHash = nullptr;
	if ( hexToBytes(_simple_getenv(apple, "executable_cdhash"), 40, mainExecutableCDHashBuffer) )
		mainExecutableCDHash = mainExecutableCDHashBuffer;

	// Trace dyld's load
	notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
#if !TARGET_IPHONE_SIMULATOR
	// Trace the main executable's load
	notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
#endif

	uintptr_t result = 0;
	///🌹保存传入的可执行文件的头部(是一个struct macho_header结构体),后面根据头部访问信息
	sMainExecutableMachHeader = mainExecutableMH;
	sMainExecutableSlide = mainExecutableSlide;
#if __MAC_OS_X_VERSION_MIN_REQUIRED
	// if this is host dyld, check to see if iOS simulator is being run
	const char* rootPath = _simple_getenv(envp, "DYLD_ROOT_PATH");
	if ( (rootPath != NULL) ) {
		// look to see if simulator has its own dyld
		char simDyldPath[PATH_MAX]; 
		strlcpy(simDyldPath, rootPath, PATH_MAX);
		strlcat(simDyldPath, "/usr/lib/dyld_sim", PATH_MAX);
		int fd = my_open(simDyldPath, O_RDONLY, 0);
		if ( fd != -1 ) {
			const char* errMessage = useSimulatorDyld(fd, mainExecutableMH, simDyldPath, argc, argv, envp, apple, startGlue, &result);
			if ( errMessage != NULL )
				halt(errMessage);
			return result;
		}
	}
#endif

	CRSetCrashLogMessage("dyld: launch started");
    /// 🌹根据可执行文件头部,参数等设置上下文信息
	setContext(mainExecutableMH, argc, argv, envp, apple);

	// Pickup the pointer to the exec path.
	/// 🌹获取可执行文件路径
	sExecPath = _simple_getenv(apple, "executable_path");

	// <rdar://problem/13868260> Remove interim apple[0] transition code from dyld
	if (!sExecPath) sExecPath = apple[0];
	
	if ( sExecPath[0] != '/' ) {
		// have relative path, use cwd to make absolute
		char cwdbuff[MAXPATHLEN];
	    if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
			// maybe use static buffer to avoid calling malloc so early...
			char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
			strcpy(s, cwdbuff);
			strcat(s, "/");
			strcat(s, sExecPath);
			sExecPath = s;
		}
	}

	// Remember short name of process for later logging
	/// 🌹获取可执行文件的名字
	sExecShortName = ::strrchr(sExecPath, '/');
	if ( sExecShortName != NULL )
		++sExecShortName;
	else
		sExecShortName = sExecPath;
    /// 🌹检测进程是否受限
    configureProcessRestrictions(mainExecutableMH);

#if __MAC_OS_X_VERSION_MIN_REQUIRED
    if ( !gLinkContext.allowEnvVarsPrint && !gLinkContext.allowEnvVarsPath && !gLinkContext.allowEnvVarsSharedCache ) {
		pruneEnvironmentVariables(envp, &apple);
		// set again because envp and apple may have changed or moved
		setContext(mainExecutableMH, argc, argv, envp, apple);
	}
	else
#endif
	{
	    /// 🌹检查设置的环境变量
		checkEnvironmentVariables(envp);
		defaultUninitializedFallbackPaths(envp);
	}
#if __MAC_OS_X_VERSION_MIN_REQUIRED
	if (  ((dyld3::MachOFile*)mainExecutableMH)->supportsPlatform(dyld3::Platform::iOSMac)
	  && !((dyld3::MachOFile*)mainExecutableMH)->supportsPlatform(dyld3::Platform::macOS)) {
		gLinkContext.rootPaths = parseColonList("/System/iOSSupport", NULL);
		gLinkContext.marzipan = true;
		if ( sEnv.DYLD_FALLBACK_LIBRARY_PATH == sLibraryFallbackPaths )
			sEnv.DYLD_FALLBACK_LIBRARY_PATH = sRestrictedLibraryFallbackPaths;
		if ( sEnv.DYLD_FALLBACK_FRAMEWORK_PATH == sFrameworkFallbackPaths )
			sEnv.DYLD_FALLBACK_FRAMEWORK_PATH = sRestrictedFrameworkFallbackPaths;
	}
#endif
    /// 🌹如果设置了DYLD_PRINT_OPTS环境变量,则打印参数
	if ( sEnv.DYLD_PRINT_OPTS )
		printOptions(argv);
	/// 🌹如果设置了DYLD_PRINT_ENV环境变量,则打印环境变量
	if ( sEnv.DYLD_PRINT_ENV ) 
		printEnvironmentVariables(envp);
	/// 🌹根据Mach-O头部获取当前运行架构信息
	getHostInfo(mainExecutableMH, mainExecutableSlide);

	// load shared cache
	/// 🌹检查共享缓存是否开启,iOS中必须开启
	checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
#if TARGET_IPHONE_SIMULATOR
	// <HACK> until <rdar://30773711> is fixed
	gLinkContext.sharedRegionMode = ImageLoader::kUsePrivateSharedRegion;
	// </HACK>
#endif
	if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
	    /// 🌹映射共享缓存到共享区域
		mapSharedCache();
	}
	bool cacheCompatible = (sSharedCacheLoadInfo.loadAddress == nullptr) || (sSharedCacheLoadInfo.loadAddress->header.formatVersion == dyld3::closure::kFormatVersion);
	if ( cacheCompatible && (sEnableClosures || inWhiteList(sExecPath)) ) {
		const dyld3::closure::LaunchClosure* mainClosure = nullptr;
		dyld3::closure::LoadedFileInfo mainFileInfo;
		mainFileInfo.fileContent = mainExecutableMH;
		mainFileInfo.path = sExecPath;
		// FIXME: If we are saving this closure, this slice offset/length is probably wrong in the case of FAT files.
		mainFileInfo.sliceOffset = 0;
		mainFileInfo.sliceLen = std::numeric_limits<__typeof(mainFileInfo.sliceLen)>::max();
		struct stat mainExeStatBuf;
		if ( ::stat(sExecPath, &mainExeStatBuf) == 0 ) {
			mainFileInfo.inode = mainExeStatBuf.st_ino;
			mainFileInfo.mtime = mainExeStatBuf.st_mtime;
		}
		// check for closure in cache first
		if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
			mainClosure = sSharedCacheLoadInfo.loadAddress->findClosure(sExecPath);
			if ( gLinkContext.verboseWarnings && (mainClosure != nullptr) )
				dyld::log("dyld: found closure %p (size=%lu) in dyld shared cache\n", mainClosure, mainClosure->size());
		}
	#if !TARGET_IPHONE_SIMULATOR
		if ( (mainClosure == nullptr) || !closureValid(mainClosure, mainFileInfo, mainExecutableCDHash, true, envp) ) {
			mainClosure = nullptr;
			if ( sEnableClosures || isStagedApp((dyld3::MachOFile*)mainExecutableMH, sExecPath) ) {
				// if forcing closures, and no closure in cache, or it is invalid, check for cached closure
				mainClosure = findCachedLaunchClosure(mainExecutableCDHash, mainFileInfo, envp);
				if ( mainClosure == nullptr ) {
					// if  no cached closure found, build new one
					mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp);
				}
			}
		}
	#endif
		// try using launch closure
		if ( mainClosure != nullptr ) {
			CRSetCrashLogMessage("dyld3: launch started");
			bool launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
											  mainExecutableSlide, argc, argv, envp, apple, &result, startGlue);
	#if !TARGET_IPHONE_SIMULATOR
			if ( !launched ) {
				// closure is out of date, build new one
				mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp);
				if ( mainClosure != nullptr ) {
					launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
												 mainExecutableSlide, argc, argv, envp, apple, &result, startGlue);
				}
			}
	#endif
			if ( launched ) {
#if __has_feature(ptrauth_calls)
				// start() calls the result pointer as a function pointer so we need to sign it.
				result = (uintptr_t)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif
				if (sSkipMain)
					result = (uintptr_t)&fake_main;
				return result;
			}
			else {
				if ( gLinkContext.verboseWarnings )
					dyld::log("dyld: unable to use closure %p\n", mainClosure);
			}
		}
	}
	else {
		if ( gLinkContext.verboseWarnings )
			dyld::log("dyld: not using closure because shared cache format version does not match dyld's\n");
	}
	// could not use closure info, launch old way



	// install gdb notifier
	stateToHandlers(dyld_image_state_dependents_mapped, sBatchHandlers)->push_back(notifyGDB);
	stateToHandlers(dyld_image_state_mapped, sSingleHandlers)->push_back(updateAllImages);
	// make initial allocations large enough that it is unlikely to need to be re-alloced
	sImageRoots.reserve(16);
	sAddImageCallbacks.reserve(4);
	sRemoveImageCallbacks.reserve(4);
	sAddLoadImageCallbacks.reserve(4);
	sImageFilesNeedingTermination.reserve(16);
	sImageFilesNeedingDOFUnregistration.reserve(8);

#if !TARGET_IPHONE_SIMULATOR
#ifdef WAIT_FOR_SYSTEM_ORDER_HANDSHAKE
	// <rdar://problem/6849505> Add gating mechanism to dyld support system order file generation process
	WAIT_FOR_SYSTEM_ORDER_HANDSHAKE(dyld::gProcessInfo->systemOrderFlag);
#endif
#endif


	try {
		// add dyld itself to UUID list
		/// 🌹添加dyld到UUID列表
		addDyldImageToUUIDList();

#if SUPPORT_ACCELERATE_TABLES
#if __arm64e__
		// Disable accelerator tables when we have threaded rebase/bind, which is arm64e executables only for now.
		if (sMainExecutableMachHeader->cpusubtype == CPU_SUBTYPE_ARM64_E)
			sDisableAcceleratorTables = true;
#endif
		bool mainExcutableAlreadyRebased = false;
		if ( (sSharedCacheLoadInfo.loadAddress != nullptr) && !dylibsCanOverrideCache() && !sDisableAcceleratorTables && (sSharedCacheLoadInfo.loadAddress->header.accelerateInfoAddr != 0) ) {
			struct stat statBuf;
			if ( ::stat(IPHONE_DYLD_SHARED_CACHE_DIR "no-dyld2-accelerator-tables", &statBuf) != 0 )
				sAllCacheImagesProxy = ImageLoaderMegaDylib::makeImageLoaderMegaDylib(&sSharedCacheLoadInfo.loadAddress->header, sSharedCacheLoadInfo.slide, mainExecutableMH, gLinkContext);
		}
/// 🌹 🌹 🌹
reloadAllImages:
#endif

		CRSetCrashLogMessage(sLoadingCrashMessage);
		// instantiate ImageLoader for main executable
		/// 🌹加载主可执行文件并生成一个ImageLoader实例对象
		sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
		gLinkContext.mainExecutable = sMainExecutable;
		gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

#if TARGET_IPHONE_SIMULATOR
		// check main executable is not too new for this OS
		{
			if ( ! isSimulatorBinary((uint8_t*)mainExecutableMH, sExecPath) ) {
				throwf("program was built for a platform that is not supported by this runtime");
			}
			uint32_t mainMinOS = sMainExecutable->minOSVersion();

			// dyld is always built for the current OS, so we can get the current OS version
			// from the load command in dyld itself.
			uint32_t dyldMinOS = ImageLoaderMachO::minOSVersion((const mach_header*)&__dso_handle);
			if ( mainMinOS > dyldMinOS ) {
	#if TARGET_OS_WATCH
				throwf("app was built for watchOS %d.%d which is newer than this simulator %d.%d",
						mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
						dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
	#elif TARGET_OS_TV
				throwf("app was built for tvOS %d.%d which is newer than this simulator %d.%d",
						mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
						dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
	#else
				throwf("app was built for iOS %d.%d which is newer than this simulator %d.%d",
						mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
						dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
	#endif
			}
		}
#endif


	#if __MAC_OS_X_VERSION_MIN_REQUIRED
		// <rdar://problem/22805519> be less strict about old mach-o binaries
		uint32_t mainSDK = sMainExecutable->sdkVersion();
		gLinkContext.strictMachORequired = (mainSDK >= DYLD_MACOSX_VERSION_10_12) || gLinkContext.allowInsertFailures;
	#else
		// simulators, iOS, tvOS, and watchOS are always strict
		gLinkContext.strictMachORequired = true;
	#endif

	#if SUPPORT_ACCELERATE_TABLES
		sAllImages.reserve((sAllCacheImagesProxy != NULL) ? 16 : INITIAL_IMAGE_COUNT);
	#else
		sAllImages.reserve(INITIAL_IMAGE_COUNT);
	#endif

		// Now that shared cache is loaded, setup an versioned dylib overrides
	#if SUPPORT_VERSIONED_PATHS
	    /// 🌹检查库的版本是否有更新,有则覆盖原有的
		checkVersionedPaths();
	#endif


		// dyld_all_image_infos image list does not contain dyld
		// add it as dyldPath field in dyld_all_image_infos
		// for simulator, dyld_sim is in image list, need host dyld added
#if TARGET_IPHONE_SIMULATOR
		// get path of host dyld from table of syscall vectors in host dyld
		void* addressInDyld = gSyscallHelpers;
#else
		// get path of dyld itself
		void*  addressInDyld = (void*)&__dso_handle;
#endif
		char dyldPathBuffer[MAXPATHLEN+1];
		int len = proc_regionfilename(getpid(), (uint64_t)(long)addressInDyld, dyldPathBuffer, MAXPATHLEN);
		if ( len > 0 ) {
			dyldPathBuffer[len] = '\0'; // proc_regionfilename() does not zero terminate returned string
			if ( strcmp(dyldPathBuffer, gProcessInfo->dyldPath) != 0 )
				gProcessInfo->dyldPath = strdup(dyldPathBuffer);
		}

		// load any inserted libraries
		/// 🌹加载所有DYLD_INSERT_LIBRARIES指定的库
		if	( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
			for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
			    /// 🌹加载插入动态库
				loadInsertedDylib(*lib);
		}
		// record count of inserted libraries so that a flat search will look at 
		// inserted libraries, then main, then others.
		sInsertedDylibCount = sAllImages.size()-1;

		// link main executable
		/// 🌹链接主程序
		gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
		if ( mainExcutableAlreadyRebased ) {
			// previous link() on main executable has already adjusted its internal pointers for ASLR
			// work around that by rebasing by inverse amount
			sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
		}
#endif
		link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
		sMainExecutable->setNeverUnloadRecursive();
		if ( sMainExecutable->forceFlat() ) {
			gLinkContext.bindFlat = true;
			gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
		}

		// link any inserted libraries
		/// 🌹链接所有插入的动态库
		// do this after linking main executable so that any dylibs pulled in by inserted 
		// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
		if ( sInsertedDylibCount > 0 ) {
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				/// 🌹链接
				link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
				image->setNeverUnloadRecursive();
			}
			// only INSERTED libraries can interpose
			// register interposing info after all inserted libraries are bound so chaining works
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				image->registerInterposing(gLinkContext);
			}
		}

		// <rdar://problem/19315404> dyld should support interposition even without DYLD_INSERT_LIBRARIES
		for (long i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
			ImageLoader* image = sAllImages[i];
			if ( image->inSharedCache() )
				continue;
			image->registerInterposing(gLinkContext);
		}
	#if SUPPORT_ACCELERATE_TABLES
		if ( (sAllCacheImagesProxy != NULL) && ImageLoader::haveInterposingTuples() ) {
			// Accelerator tables cannot be used with implicit interposing, so relaunch with accelerator tables disabled
			ImageLoader::clearInterposingTuples();
			// unmap all loaded dylibs (but not main executable)
			for (long i=1; i < sAllImages.size(); ++i) {
				ImageLoader* image = sAllImages[i];
				if ( image == sMainExecutable )
					continue;
				if ( image == sAllCacheImagesProxy )
					continue;
				image->setCanUnload();
				ImageLoader::deleteImage(image);
			}
			// note: we don't need to worry about inserted images because if DYLD_INSERT_LIBRARIES was set we would not be using the accelerator table
			sAllImages.clear();
			sImageRoots.clear();
			sImageFilesNeedingTermination.clear();
			sImageFilesNeedingDOFUnregistration.clear();
			sAddImageCallbacks.clear();
			sRemoveImageCallbacks.clear();
			sAddLoadImageCallbacks.clear();
			sDisableAcceleratorTables = true;
			sAllCacheImagesProxy = NULL;
			sMappedRangesStart = NULL;
			mainExcutableAlreadyRebased = true;
			gLinkContext.linkingMainExecutable = false;
			resetAllImages();
			goto reloadAllImages;
		}
	#endif

		// apply interposing to initial set of images
		for(int i=0; i < sImageRoots.size(); ++i) {
			sImageRoots[i]->applyInterposing(gLinkContext);
		}
		ImageLoader::applyInterposingToDyldCache(gLinkContext);
		gLinkContext.linkingMainExecutable = false;

		// Bind and notify for the main executable now that interposing has been registered
		uint64_t bindMainExecutableStartTime = mach_absolute_time();
		sMainExecutable->recursiveBindWithAccounting(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
		uint64_t bindMainExecutableEndTime = mach_absolute_time();
		ImageLoaderMachO::fgTotalBindTime += bindMainExecutableEndTime - bindMainExecutableStartTime;
		gLinkContext.notifyBatch(dyld_image_state_bound, false);

		// Bind and notify for the inserted images now interposing has been registered
		if ( sInsertedDylibCount > 0 ) {
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
			}
		}
		
		// <rdar://problem/12186933> do weak binding only after all inserted images linked
		sMainExecutable->weakBind(gLinkContext);

		// If cache has branch island dylibs, tell debugger about them
		if ( (sSharedCacheLoadInfo.loadAddress != NULL) && (sSharedCacheLoadInfo.loadAddress->header.mappingOffset >= 0x78) && (sSharedCacheLoadInfo.loadAddress->header.branchPoolsOffset != 0) ) {
			uint32_t count = sSharedCacheLoadInfo.loadAddress->header.branchPoolsCount;
			dyld_image_info info[count];
			const uint64_t* poolAddress = (uint64_t*)((char*)sSharedCacheLoadInfo.loadAddress + sSharedCacheLoadInfo.loadAddress->header.branchPoolsOffset);
			// <rdar://problem/20799203> empty branch pools can be in development cache
			if ( ((mach_header*)poolAddress)->magic == sMainExecutableMachHeader->magic ) {
				for (int poolIndex=0; poolIndex < count; ++poolIndex) {
					uint64_t poolAddr = poolAddress[poolIndex] + sSharedCacheLoadInfo.slide;
					info[poolIndex].imageLoadAddress = (mach_header*)(long)poolAddr;
					info[poolIndex].imageFilePath = "dyld_shared_cache_branch_islands";
					info[poolIndex].imageFileModDate = 0;
				}
				// add to all_images list
				addImagesToAllImages(count, info);
				// tell gdb about new branch island images
				gProcessInfo->notification(dyld_image_adding, count, info);
			}
		}

		CRSetCrashLogMessage("dyld: launch, running initializers");
	#if SUPPORT_OLD_CRT_INITIALIZATION
		// Old way is to run initializers via a callback from crt1.o
		if ( ! gRunInitializersOldWay ) 
			initializeMainExecutable(); 
	#else
		// run all initializers
		/// 🌹执行所有初始化方法
		initializeMainExecutable(); 
	#endif

		// notify any montoring proccesses that this process is about to enter main()
		/// 通知任何引导过程此过程将进入main()
		if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
			dyld3::kdebug_trace_dyld_duration_end(launchTraceID, DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, 0, 0, 2);
		}
		/// 🌹通知监听dyld main
		notifyMonitoringDyldMain();

		// find entry point for main executable
		/// 🌹找到主可执行文件的入口
		result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
		if ( result != 0 ) {
			// main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
			if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
				*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
			else
				halt("libdyld.dylib support not present for LC_MAIN");
		}
		else {
			// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
			result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
			*startGlue = 0;
		}
#if __has_feature(ptrauth_calls)
		// start() calls the result pointer as a function pointer so we need to sign it.
		result = (uintptr_t)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif
	}
	catch(const char* message) {
		syncAllImages();
		halt(message);
	}
	catch(...) {
		dyld::log("dyld: launch failed\n");
	}

	CRSetCrashLogMessage("dyld2 mode");

	if (sSkipMain) {
		if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
			dyld3::kdebug_trace_dyld_duration_end(launchTraceID, DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, 0, 0, 2);
		}
		result = (uintptr_t)&fake_main;
		*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
	}
	
	return result;
}


} // namespace

dyld::_main主要做了如下事情理:

4.1 环境变量相关处理
从环境中获取主可执行文件的cdHash
mainExecutableCDHash = mainExecutableCDHashBuffer;

从环境中获取主可执行文件的头部信息是macho_header结构体  和偏移量
sMainExecutableMachHeader = mainExecutableMH;
sMainExecutableSlide = mainExecutableSlide;

根据可执行文件头部,参数等设置上下文信息
setContext(mainExecutableMH, argc, argv, envp, apple);

获取可执行文件路径
sExecPath = _simple_getenv(apple, "executable_path");

检测进程是否受限
configureProcessRestrictions(mainExecutableMH);
	
检查设置的环境变量
checkEnvironmentVariables(envp);
defaultUninitializedFallbackPaths(envp);
		
如果设置了DYLD_PRINT_OPTS环境变量,则打印参数
if ( sEnv.DYLD_PRINT_OPTS )
    printOptions(argv);
如果设置了DYLD_PRINT_ENV环境变量,则打印环境变量
if ( sEnv.DYLD_PRINT_ENV ) 
    printEnvironmentVariables(envp);
    
根据Mach-O头部获取当前运行架构信息
getHostInfo(mainExecutableMH, mainExecutableSlide);
4.2 加载共享缓存

checkSharedRegionDisable检查是否开启共享缓存(iOS 下不会被禁用) mapSharedCache加载共享缓存库,其中调用loadDyldCache函数有这么几种情况:

  • 仅加载到当前进程mapCachePrivate(模拟器仅支持加载到当前进程)
  • 共享缓存是第一次被加载,就去做加载操作mapCacheSystemWide
  • 共享缓存不是第一次被加载,那么就不做任何处理
bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    results->loadAddress        = 0;
    results->slide              = 0;
    results->errorMessage       = nullptr;

#if TARGET_IPHONE_SIMULATOR
    // simulator only supports mmap()ing cache privately into process
    /// 🌹模拟器仅支持加载到当前进程
    return mapCachePrivate(options, results);
#else
    if ( options.forcePrivate ) {
        // mmap cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: when cache is already mapped into shared region
        bool hasError = false;
        /// 🌹共享缓存不是第一次被加
        if ( reuseExistingCache(options, results) ) {
            hasError = (results->errorMessage != nullptr);
        } else {
            // slow path: this is first process to load cache
            /// 🌹共享缓存是第一次被加载
            hasError = mapCacheSystemWide(options, results);
        }
        return hasError;
    }
#endif
}
4.3 将dyld本身添加到UUID列表

addDyldImageToUUIDList() 将dyld本身添加到UUID列表

4.4 reloadAllImages
4.4.1 实例化主程序instantiateFromLoadedImage -> instantiateMainExecutable -> sniffLoadCommands
// The kernel maps in main executable before dyld gets control.  We need to 
// make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
	// try mach-o loader
	//🌹isCompatibleMachO 里就会通过 header 里的 magic , cputype , cpusubtype 去检测是否兼容 
	if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
	        //🌹在dyld获得控制之前,内核映射到主可执行文件中。我们需要为主可执行文件中已经映射的文件创建一个ImageLoader*。
		ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
		///添加镜像
		addImage(image);
		return (ImageLoaderMachO*)image;
	}
	
	throw "main executable not a known format";
}

instantiateMainExecutable中调用ImageLoaderMachO::sniffLoadCommands,这才是真正实例化主程序的函数

// determine if this mach-o file has classic or compressed LINKEDIT and number of segments it has
void ImageLoaderMachO::sniffLoadCommands(const macho_header* mh, const char* path, bool inCache, bool* compressed,
											unsigned int* segCount, unsigned int* libCount, const LinkContext& context,
											const linkedit_data_command** codeSigCmd,
											const encryption_info_command** encryptCmd)
{
	*compressed = false;
	*segCount = 0;
	*libCount = 0;
	*codeSigCmd = NULL;
	*encryptCmd = NULL;
	
     
        ...
    


	// fSegmentsArrayCount is only 8-bits
	if ( *segCount > 255 )
		dyld::throwf("malformed mach-o image: more than 255 segments in %s", path);

	// fSegmentsArrayCount is only 8-bits
	if ( *libCount > 4095 )
		dyld::throwf("malformed mach-o image: more than 4095 dependent libraries in %s", path);

	if ( needsAddedLibSystemDepency(*libCount, mh) )
		*libCount = 1;
}

这里几个参数都与MachO有关:

  • compressed:根据LC_DYLD_INFO_ONYL来决定
  • segCount: MachO文件中segment数量,最多不超过255个
  • libCount: MachO文件中依赖的动态库的数量, LC_LOAD_DYLIB(Foundation / UIKit ..) , 最大不能超过 4095 个.
  • codeSigCmd: 签名信息
  • encryptCmd: 加密信息,即应用加壳如cryptid
4.4.2 加载插入动态库
// load any inserted libraries
if ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
    for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
        loadInsertedDylib(*lib);
}
//记录插入动态库的数量 
sInsertedDylibCount = sAllImages.size()-1;

遍历DYLD_INSERT_LIBRARIES环境变量,调用loadInsertedDylib加载,越狱的插件就是基于这个原理来实现只需要下载插件就可以影响到应用,有部分防护手段就用到了这个环境变量

4.4.3 链接库
// link main executable
///🌹 链接主可执行文件
gLinkContext.linkingMainExecutable = true;
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
sMainExecutable->setNeverUnloadRecursive();
if ( sMainExecutable->forceFlat() ) {
	gLinkContext.bindFlat = true;
	gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
}

// link any inserted libraries
///🌹 链接所有插入的库
// do this after linking main executable so that any dylibs pulled in by inserted 
// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
if ( sInsertedDylibCount > 0 ) {
	for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
		ImageLoader* image = sAllImages[i+1];
		link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
		image->setNeverUnloadRecursive();
	}
	// only INSERTED libraries can interpose
	// register interposing info after all inserted libraries are bound so chaining works
	for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
		ImageLoader* image = sAllImages[i+1];
		image->registerInterposing(gLinkContext);
	}
}
...
// Bind and notify for the inserted images now interposing has been registered
if ( sInsertedDylibCount > 0 ) {
	for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
		ImageLoader* image = sAllImages[i+1];
		image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
	}
}

// do weak binding only after all inserted images linked
///🌹 其他库链接完成之后 进行弱绑定
sMainExecutable->weakBind(gLinkContext);
  • 进入 link 函数 , link 函数中有一系列 recursiveLoadLibraries , recursiveBindWithAccounting -> recursiveBind , 也就是递归进行符号绑定的过程 .
  • link 函数执行完毕之后会调用 sMainExecutable->weakBind(gLinkContext); 进行弱绑定 也就是说弱绑定一定发生在 其他库链接绑定完成之后 .

接下来是最重要的初始化主程序initializeMainExecutable

4.5 运行初始化主程序initializeMainExecutable
4.5.1 initializeMainExecutable
void initializeMainExecutable()
{
	...
	// run initializers for main executable and everything it brings up 
	sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
	
	...
}

initializeMainExecutable调用runInitializers

4.5.2 runInitializers
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
	uint64_t t1 = mach_absolute_time();
	mach_port_t thisThread = mach_thread_self();
	ImageLoader::UninitedUpwards up;
	up.count = 1;
	up.images[0] = this;
	processInitializers(context, thisThread, timingInfo, up);
	context.notifyBatch(dyld_image_state_initialized, false);
	mach_port_deallocate(mach_task_self(), thisThread);
	uint64_t t2 = mach_absolute_time();
	fgTotalInitTime += (t2 - t1);
}

runInitializers调用processInitializers初始化准备

4.5.3 processInitializers初始化准备
// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
									 InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
	uint32_t maxImageCount = context.imageCount()+2;
	ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
	ImageLoader::UninitedUpwards& ups = upsBuffer[0];
	ups.count = 0;
	// Calling recursive init on all images in images list, building a new list of
	// uninitialized upward dependencies.
	for (uintptr_t i=0; i < images.count; ++i) {
	        /// 🌹递归初始局
		images.images[i]->recursiveInitialization(context, thisThread, images.images[i]->getPath(), timingInfo, ups);
	}
	// If any upward dependencies remain, init them.
	if ( ups.count > 0 )
		processInitializers(context, thisThread, timingInfo, ups);
}

点进去却只有声明,shift+cmd+O搜索recursiveInitialization

4.5.4 recursiveInitialization递归初始化镜像
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
	...
			
 
	// let objc know we are about to initialize this image
	///通知objc 我们要初始化这个镜像
	uint64_t t1 = mach_absolute_time();
	fState = dyld_image_state_dependents_initialized;
	oldState = fState;
	context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
	
	// initialize this image
	///初始化镜像
	bool hasInitializers = this->doInitialization(context);

	// let anyone know we finished initializing this image
	/// 通知任何人我们完成映像的初始化
	fState = dyld_image_state_initialized;
	oldState = fState;
	context.notifySingle(dyld_image_state_initialized, this, NULL);
			
	 
}

使用notifySingle通知要初始化和初始化完毕,notifySingle很有可能是个回调函数

4.5.5 notifySingle 通知单个
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
	...
	
	if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
		uint64_t t0 = mach_absolute_time();
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
		(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		uint64_t t1 = mach_absolute_time();
		uint64_t t2 = mach_absolute_time();
		uint64_t timeInObjC = t1-t0;
		uint64_t emptyTime = (t2-t1)*100;
		if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
			timingInfo->addTime(image->getShortName(), timeInObjC);
		}
	}
    // mach message csdlc about dynamically unloaded images
	if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
		notifyKernel(*image, false);
		const struct mach_header* loadAddress[] = { image->machHeader() };
		const char* loadPath[] = { image->getPath() };
		notifyMonitoringDyld(true, 1, loadAddress, loadPath);
	}
}

根据函数调用栈,notifySingle下一步应该是load_images,但里面并没有,却发现了和image有关的(*sNotifyObjCInit)(image->getRealPath(), image->machHeader()),全局搜索sNotifyObjCInit,发现在如下函数赋值的

4.5.6 registerObjCNotifiers注册通知
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
	// record functions to call
	sNotifyObjCMapped	= mapped;
	sNotifyObjCInit		= init;
	sNotifyObjCUnmapped = unmapped;
    ...
}

registerObjCNotifiers_dyld_objc_notify_register函数中被调用,这个函数只在运行时提供给objc使用

4.5.7 _dyld_objc_notify_register
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
	dyld::registerObjCNotifiers(mapped, init, unmapped);
}

_dyld_objc_notify_register却再也没找到了,我们给_dyld_objc_notify_register下个符号断点

发现_dyld_objc_notify_register之前还调用了好多函数,回到递归初始化函数recursiveInitializationnotifySingle后面忽视了一个重要函数doInitialization

4.5.8 doInitialization遍历初始化
bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
	CRSetCrashLogMessage2(this->getPath());

	// mach-o has -init and static initializers
	doImageInit(context);
	doModInitFunctions(context);
	
	CRSetCrashLogMessage2(NULL);
	
	return (fHasDashInit || fHasInitializers);
}

里面有2个重要函数doImageInitdoModInitFunctions

void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
	...
			switch (cmd->cmd) {
				case LC_ROUTINES_COMMAND:///🌹 🌹 🌹处理所有的镜像
					Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide); 
					
					if ( ! dyld::gProcessInfo->libSystemInitialized ) {
						// <rdar://problem/17973316> libSystem initializer must run first
						dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
					}
					{
						dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
						func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
					}
void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
	if ( cmd->cmd == LC_SEGMENT_COMMAND ) { /// 🌹 🌹 🌹只处理SEGMENT数据
		if ( type == S_MOD_INIT_FUNC_POINTERS ) {/// 🌹 🌹 🌹只处理_MOD_INIT_FUNCT数据
					
                	if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                		// <rdar://problem/17973316> libSystem initializer must run first
                		const char* installPath = getInstallPath();
                	
                	bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
			{
				dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
				func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
			}
							
                	bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
                	/// haveLibSystemHelpersBefore 和 haveLibSystemHelpersAfter赋值相同,估计是上面大括号作用域中做了处理,改变了dyld::gLibSystemHelpers的值
                	if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
                		// now safe to use malloc() and other calls in libSystem.dylib
                		/// 🌹 🌹 🌹判断libSystem初始化完成 可以安全的调用libSystem.dylib里面的malloc()和其他函数了
                		dyld::gProcessInfo->libSystemInitialized = true;
                	}
		
}

doImageInit -> libSystem_initializer —> libdispatch_init —> _os_object_init —> _objc_init

doModInitFunctions内部调用__mod_init_funcs,也就是constructor方法(c++构造方法),这种构造方法存储在 Mach-O__DATA 段 , __mod_init_func节中

4.5.9 _objc_init

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

_objc_init里面调用了_dyld_objc_notify_register,那么sNotifyObjCInit就是load_images的调用了

4.6 通知dyld的main,寻找主程序入口
notifyMonitoringDyldMain();

// find entry point for main executable
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();

四、总结