Exploring Objective-C Internals (16): Application Loading


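Application Loading

Dynamic and Static Libraries

  • Library: executable code in binary form that the operating system loads into memory
  • On iOS there are two kinds of library: static libraries and dynamic libraries
  • Static library formats: .a and .framework
  • Dynamic library formats: .framework and .dylib
  • Static library: at link time its code is copied in full into the executable, so every executable that uses it carries its own duplicate copy
  • Dynamic library: it is not copied into the executable at link time. Instead it is loaded into memory when the program runs, it is loaded only once, and processes share it. Most of the iOS system libraries are dynamic libraries, which saves memory
  • When an app is packaged, the signature of each dynamic library is checked. Debug builds do not report an error, but the signature must include a TeamIdentifier; if verification fails, loading fails with an "image not found" error
  • The signature can be inspected with codesign: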
codesign -dv /path/to/YourApp.app
codesign -dv /path/to/youFramework.framework

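The Compilation Process

  • To understand how libraries are loaded into memory, we need to look at dyld, the linker Apple uses

What Is dyld

  • Full name: the dynamic link editor. dyld is Apple's dynamic linker and an important part of Apple's operating systems. Once the kernel has finished preparing the process, it hands the remaining work over to dyld. dyld is also open source: anyone can download the source from Apple's website and read it to understand how it works
  • The linking process:
  • The app launches
  • dyld loads the system libraries
  • The runtime registers its callbacks in _objc_init
  • Images are loaded recursively
  • The main function is called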
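
To see which images dyld has actually loaded into a process, the public functions declared in <mach-o/dyld.h> can be queried at runtime. The following is a minimal sketch, assuming an Apple platform; the helper name loadedImageNames is hypothetical, and on any other platform it simply returns an empty list.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>
#if defined(__APPLE__)
#include <mach-o/dyld.h>
#endif

// Hypothetical helper: returns the path of every image that dyld has
// loaded into the current process. On non-Apple platforms the dyld API
// is unavailable, so the list is empty.
std::vector<std::string> loadedImageNames() {
    std::vector<std::string> names;
#if defined(__APPLE__)
    const std::uint32_t count = _dyld_image_count();
    for (std::uint32_t i = 0; i < count; ++i) {
        const char* name = _dyld_get_image_name(i);
        if (name != nullptr) {
            names.emplace_back(name);
        }
    }
#endif
    return names;
}
```

Printing the returned list inside an app typically shows the main executable together with the system dylibs it depends on, which is the set of images the steps above refer to.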

The dyld Flow

Before the +load Method

  • We all know that a class's +load method runs before main. Set a breakpoint in a +load method and look at the call stack that leads up to it:
  • _dyld_start
  • dyldbootstrap::start
  • dyld::_main
  • dyld::initializeMainExecutable
  • ImageLoader::runInitializers
  • ImageLoader::processInitializers
  • ImageLoader::recursiveInitialization
  • dyld::notifySingle
  • load_images
  • +load
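
Running before main is not unique to +load. As a rough C++ analogy (an illustration only, not code from dyld or the runtime), a function marked with the GCC/Clang attribute __attribute__((constructor)) is also run while its image is being initialized, before main is entered. All names in this sketch are hypothetical.

```cpp
#include <cassert>

// Hypothetical flag. It is zero-initialized at load time and then set
// by the initializer below, which runs before main().
static bool g_initializerRan = false;

// GCC/Clang extension: the loader calls this function as part of the
// image's initializers, i.e. before main() starts.
__attribute__((constructor))
static void earlyInitializer() {
    g_initializerRan = true;
}

// Reports whether the initializer has already run.
bool initializerRanBeforeMain() {
    return g_initializerRan;
}
```

Calling initializerRanBeforeMain() as the first statement of main should already report true, in the same way that a breakpoint in +load is hit before main.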

Before _objc_init

  • The whole flow is: system -> _dyld_start -> dyldbootstrap::start -> dyld::_main -> dyld::initializeMainExecutable -> ImageLoader::runInitializers -> ImageLoader::processInitializers -> ImageLoader::recursiveInitialization -> ImageLoaderMachO::doInitialization -> ImageLoaderMachO::doModInitFunctions -> libSystem_initializer -> libdispatch_init -> _os_object_init -> _objc_init (which runs environ_init(), tls_init(), static_init(), and so on) -> _dyld_objc_notify_register
  • _objc_init registers for notifications by calling _dyld_objc_notify_register(&map_images, load_images, unmap_image)
  • In other words, the dyld notifications are registered inside _objc_init, and load_images and map_images are passed to dyld at that point
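
The registration handshake can be sketched as a small callback registry. This is a heavily simplified model written for illustration, not dyld's implementation: NotifierModel, registerCallbacks, notifyDependentsInitialized, and runRegistrationDemo are hypothetical names, and the real callbacks take Mach-O headers rather than strings.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for dyld's side of the handshake.
struct NotifierModel {
    using MappedCallback = std::function<void(const std::vector<std::string>&)>;
    using InitCallback = std::function<void(const std::string&)>;

    std::vector<std::string> loadedImages;  // images that are already mapped
    MappedCallback mapped;                  // plays the role of map_images
    InitCallback init;                      // plays the role of load_images

    // Models _dyld_objc_notify_register: store the callbacks, then
    // immediately report the images that are already loaded.
    void registerCallbacks(MappedCallback onMapped, InitCallback onInit) {
        mapped = std::move(onMapped);
        init = std::move(onInit);
        mapped(loadedImages);
    }

    // Models notifySingle(dyld_image_state_dependents_initialized, ...):
    // the init callback fires just before an image's own initializers run.
    void notifyDependentsInitialized(const std::string& image) {
        if (init) {
            init(image);
        }
    }
};

// Runs the model once and returns the order in which callbacks fired.
std::vector<std::string> runRegistrationDemo() {
    std::vector<std::string> log;
    NotifierModel dyldModel;
    dyldModel.loadedImages = {"libSystem", "libobjc"};
    dyldModel.registerCallbacks(
        [&log](const std::vector<std::string>& images) {
            log.push_back("map_images:" + std::to_string(images.size()));
        },
        [&log](const std::string& image) {
            log.push_back("load_images:" + image);
        });
    dyldModel.notifyDependentsInitialized("App");
    return log;
}
```

In this model the mapped callback fires during registration itself, and the init callback fires later, once per image, which matches the order seen in the call stack: notifySingle, then load_images, then +load.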

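Flow Chart

  • dyld first maps the image
  • It then links the image and recursively maps all of the image's dependencies
  • Rebase: relocate the image, that is, correct the addresses inside the image for the location where it was actually loaded
  • Bind: resolve the rebased image's references to symbols in other images
  • Recursively initialize the images. This stage needs load_images, which is implemented in the runtime (objc) source
  • During the recursive initialization, dyld checks whether libSystem has been initialized; if it has not, libSystem is initialized first, which in turn runs libdispatch_init and _objc_init
  • _objc_init then calls _dyld_objc_notify_register, handing map_images, load_images, and unmap_image to dyld so that dyld can call them
  • map_images is invoked immediately after registration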
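
The rebase and bind steps can be illustrated with plain pointer arithmetic. This is a minimal sketch under simplifying assumptions: real Mach-O images encode their fix-ups in load commands, while here the pointer slots and the export table are ordinary containers, and computeSlide, rebasePointers, and bindSymbol are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// The slide is the distance between where the image was linked to live
// (its preferred base) and where it was actually loaded.
std::intptr_t computeSlide(std::uintptr_t preferredBase, std::uintptr_t actualBase) {
    return static_cast<std::intptr_t>(actualBase - preferredBase);
}

// Rebase: every pointer that refers to a location inside the image
// itself is corrected by adding the slide.
void rebasePointers(std::vector<std::uintptr_t>& slots, std::intptr_t slide) {
    for (std::uintptr_t& slot : slots) {
        slot += static_cast<std::uintptr_t>(slide);
    }
}

// Bind: a pointer that refers to a symbol in another image is filled in
// by looking the symbol up in that image's exports. Returns false when
// the symbol cannot be found.
bool bindSymbol(std::uintptr_t& slot, const std::string& symbol,
                const std::map<std::string, std::uintptr_t>& exports) {
    const auto it = exports.find(symbol);
    if (it == exports.end()) {
        return false;
    }
    slot = it->second;
    return true;
}
```

Rebase only needs the image's own slide, whereas bind needs the other images to be mapped already, which is why dependencies are mapped before these two steps.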

_dyld_start -> dyldbootstrap::start

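  • Start at the very first entry point, _dyld_start. Its disassembly shows that it actually calls dyldbootstrap::start, so we can follow the dyld source to trace the rest of the flow
  • Opening the source, we find that __dyld_start is implemented in assembly, with separate handling for each CPU architecture. Only the arm version is excerpted here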
#if __arm__
    .text
    .align 2
__dyld_start:
    mov	r8, sp		// save stack pointer
    sub	sp, #16		// make room for outgoing parameters
    bic     sp, sp, #15	// force 16-byte alignment

    // call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
    ldr	r0, [r8]	// r0 = mach_header
    ldr	r1, [r8, #4]	// r1 = argc
    add	r2, r8, #8	// r2 = argv
    adr	r3, __dyld_start
    sub	r3 ,r3, #0x1000 // r3 = dyld_mh
    add	r4, sp, #12
    str	r4, [sp, #0]	// [sp] = &startGlue

    bl	__ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
    ldr	r5, [sp, #12]
    cmp	r5, #0
    bne	Lnew

    // traditional case, clean up stack and jump to result
    add	sp, r8, #4	// remove the mach_header argument.
    bx	r0		// jump to the program's entry point
  • The comments in the source state clearly that this calls dyldbootstrap::start. dyldbootstrap is a namespace, so we can go straight to the corresponding C++ code

dyldbootstrap::start

//
//  This is code to bootstrap dyld.  This work in normally done for a program by dyld and crt.
//  In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
				const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{

    // Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
    // i.e. mark that dyld bootstrap has started
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

	// if kernel had to slide dyld, we need to fix up load sensitive locations
	// we have to do this before using any global variables
	// i.e. rebase dyld itself: fix up its slide-dependent locations,
	// which must happen before any global variable is touched
    rebaseDyld(dyldsMachHeader);

	// kernel sets up env pointer to be just past end of agv array
	// i.e. envp starts right after argv's terminating NULL entry
	const char** envp = &argv[argc+1];
	
	// kernel sets up apple pointer to be just past end of envp array
	// i.e. walk past envp's terminating NULL entry to reach the apple array
	const char** apple = envp;
	while(*apple != NULL) { ++apple; }
	++apple;

	// set up random value for stack canary
	// i.e. seed the stack canary with a random value
	__guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
	// run all C++ initializers inside dyld
	// i.e. run every C++ static initializer that lives inside dyld itself
	runDyldInitializers(argc, argv, envp, apple);
#endif
	// initialize the statically linked (.a) library subsystems
	_subsystem_init(apple);

	// now that we are done bootstrapping dyld, call dyld's main
	// i.e. bootstrapping is finished, so hand off to dyld::_main
	uintptr_t appsSlide = appsMachHeader->getSlide();
	return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
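  • Inside start, dyld performs a lot of initialization
  • It rebases (relocates and fixes up) dyld itself
  • Finally it calls the most important function, dyld::_main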
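
The pointer arithmetic that start uses to find envp and apple relies on the kernel laying out argv, envp, and the apple strings back to back, each terminated by a NULL entry. The sketch below reproduces just that walk on an ordinary array; LaunchPointers and locateEnvAndApple are hypothetical names, and the array contents in any test are made-up sample data.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical result type: where the environment and the apple
// strings begin inside the block that starts with argv.
struct LaunchPointers {
    const char** envp;
    const char** apple;
};

// Mirrors the arithmetic in dyldbootstrap::start. The block is assumed
// to be laid out as: argv[0..argc-1], NULL, env strings, NULL,
// apple strings, NULL.
LaunchPointers locateEnvAndApple(int argc, const char* argv[]) {
    // argv[argc] is the NULL terminator, so envp begins one slot later.
    const char** envp = &argv[argc + 1];

    // Walk to envp's NULL terminator, then step over it.
    const char** apple = envp;
    while (*apple != nullptr) {
        ++apple;
    }
    ++apple;

    return {envp, apple};
}
```

Given a block such as { "MyApp", "-flag", NULL, "HOME=/tmp", "LANG=C", NULL, "executable_path=/bin/MyApp", NULL } and argc equal to 2, envp lands on "HOME=/tmp" and apple lands on the "executable_path=..." entry.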

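The dyld::_main Function

  • Stepping into _main, we find that the function body is very long and hard to read
  • A practical approach is to start from the return value, result, and work backwards to the places where result is assigned
  • Following result shows that it is closely tied to sMainExecutable, so that is the variable to explore. As the name suggests, it represents the main program
  • First find where sMainExecutable is declared and initialized: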
static MainExecutablePointerType	sMainExecutable = NULL;

sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
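  • addDyldImageToUUIDList(); adds the dyld image to the UUID list
  • mapSharedCache(mainExecutableSlide); loads the shared cache
  • loadInsertedDylib(*lib); loads the inserted dynamic libraries
  • sMainExecutable->rebase(gLinkContext, -mainExecutableSlide); rebases the main executable
  • sMainExecutable->weakBind(gLinkContext); performs weak binding, which is done only after all inserted images have been linked; it is followed by sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);
  • initializeMainExecutable(); runs all initializers
  • notifyMonitoringDyldMain(); notifies any monitoring process that this process is about to enter main()
  • result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD(); stores the main executable's entry point, which leads to main(), in result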

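Summary of _main

  • Set up the runtime environment
  • Load the shared cache
  • Instantiate the main program
  • Load the inserted dynamic libraries
  • Link the inserted images
  • Weak-bind the main program
  • Run the initializers
  • Notify that the process is about to enter main()
  • Return the entry point that leads to main()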

dyld::_main -> initializeMainExecutable

void initializeMainExecutable()
{
	// record that we've reached this step
	gLinkContext.startedInitializingMainExecutable = true;

	// run initialzers for any inserted dylibs
	ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
	initializerTimes[0].count = 0;
	const size_t rootCount = sImageRoots.size();
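	// take the number of root images and run the initializers of every
	// root after index 0 (per the comment above: the inserted dylibs)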
	if ( rootCount > 1 ) {
		for(size_t i=1; i < rootCount; ++i) {
			sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
		}
	}
	
	// run initializers for main executable and everything it brings up 
	// i.e. initialize the main program and everything it depends on
	sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
	
	// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
	// i.e. register the cxa_atexit() handler
	if ( gLibSystemHelpers != NULL ) 
		(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

	// dump info if requested
	// i.e. print timing statistics when the environment variables ask for them
	if ( sEnv.DYLD_PRINT_STATISTICS )
		ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
	if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
		ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
  • As the code shows, this function calls runInitializers internally

initializeMainExecutable -> runInitializers

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    processInitializers(context, thisThread, timingInfo, up);
    // notify that this batch of images has reached the initialized state
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}

runInitializers -> notifyBatch

static void notifyBatch(dyld_image_states state, bool preflightOnly)
{
	notifyBatchPartial(state, false, NULL, preflightOnly, false);
}

runInitializers -> processInitializers

void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
									 InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
	uint32_t maxImageCount = context.imageCount()+2;
	ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
	ImageLoader::UninitedUpwards& ups = upsBuffer[0];
	ups.count = 0;
	// Calling recursive init on all images in images list, building a new list of
	// uninitialized upward dependencies.
	for (uintptr_t i=0; i < images.count; ++i) {
		// start the recursive initialization
		images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
	}
	// If any upward dependencies remain, init them.
	if ( ups.count > 0 )
		processInitializers(context, thisThread, timingInfo, ups);
}

processInitializers -> recursiveInitialization

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
	recursive_lock lock_info(this_thread);
	recursiveSpinLock(lock_info);

	if ( fState < dyld_image_state_dependents_initialized-1 ) {
		uint8_t oldState = fState;
		// break cycles
		fState = dyld_image_state_dependents_initialized-1;
		try {
			// initialize lower level libraries first
			for(unsigned int i=0; i < libraryCount(); ++i) {
				ImageLoader* dependentImage = libImage(i);
				if ( dependentImage != NULL ) {
					// don't try to initialize stuff "above" me yet
					if ( libIsUpward(i) ) {
						uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
						uninitUps.count++;
					}
					else if ( dependentImage->fDepth >= fDepth ) {
						dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
					}
                }
			}
			
			// record termination order
			// i.e. remember the order in which images must later be terminated
			if ( this->needsTermination() )
				context.terminationRecorder(this);

			// let objc know we are about to initialize this image
			// i.e. notify objc that this image is about to be initialized
			uint64_t t1 = mach_absolute_time();
			fState = dyld_image_state_dependents_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
			
			// initialize this image
			// i.e. run this image's own initializers
			bool hasInitializers = this->doInitialization(context);

			// let anyone know we finished initializing this image
			// i.e. notify that initialization of this image has finished
			fState = dyld_image_state_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_initialized, this, NULL);
			
			if ( hasInitializers ) {
				uint64_t t2 = mach_absolute_time();
				timingInfo.addTime(this->getShortName(), t2-t1);
			}
		}
		catch (const char* msg) {
			// this image is not initialized
			fState = oldState;
			recursiveSpinUnLock();
			throw;
		}
	}
	
	recursiveSpinUnLock();
}
  • dyld_image_state_dependents_initialized: notifies objc that the image is about to be initialized
  • doInitialization: initializes the image
  • dyld_image_state_initialized: notifies objc that initialization of the image has finished
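
The ordering that recursiveInitialization produces (dependencies first, with each image's own initialization bracketed by two notifications) can be modelled in a few lines. This is a simplified sketch, not dyld's code: it ignores upward dependencies, depth checks, and locking, and ImageModel, recursiveInit, and initOrderDemo are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified image: a name plus the libraries it links against.
struct ImageModel {
    std::string name;
    std::vector<ImageModel*> dependents;
    bool initStarted = false;  // plays the role of the fState check

    ImageModel(std::string n, std::vector<ImageModel*> d)
        : name(std::move(n)), dependents(std::move(d)) {}
};

// Depth-first initialization: lower-level libraries first, then the
// image itself, with a notification before and after its initializers.
void recursiveInit(ImageModel& image, std::vector<std::string>& events) {
    if (image.initStarted) {
        return;  // already handled; this also breaks dependency cycles
    }
    image.initStarted = true;

    for (ImageModel* dependent : image.dependents) {
        recursiveInit(*dependent, events);
    }

    events.push_back(image.name + ": dependents_initialized");  // notifySingle
    events.push_back(image.name + ": doInitialization");
    events.push_back(image.name + ": initialized");             // notifySingle
}

// Builds a tiny dependency graph and returns the resulting event order.
std::vector<std::string> initOrderDemo() {
    ImageModel libSystem("libSystem", {});
    ImageModel libobjc("libobjc", {&libSystem});
    ImageModel app("App", {&libobjc, &libSystem});

    std::vector<std::string> events;
    recursiveInit(app, events);
    return events;
}
```

In this model libSystem is fully initialized before libobjc, and libobjc before the app, even though initialization is requested on the app first. That is the same reason _objc_init has already run by the time the app's +load methods are called.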