A Brief Look at .so Loading on Android 10


Overview

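What does a Java-to-native call boil down to? At the machine level it is a CPU branch instruction, such as bl on ARM, whose operand is the target address in memory. To obtain that address, the function must first be in memory: the .so that contains it has to be loaded, and the function's address is then found relative to the .so's base address. Some background is assumed, chiefly the ELF file format, which tells us how to map the file into memory and how to locate a function's symbol. ELF is well documented elsewhere, so it is not covered in detail here. Loading happens in two stages: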

  1. Load: read the ELF information and map the PT_LOAD segments
  2. Link: pre-link, then relocate

Java entry point

System.loadLibrary(String libname)

	-->Runtime.loadLibrary0(Class<?> fromClass, String libname)

		-->Runtime.loadLibrary0(ClassLoader loader, Class<?> callerClass, String libname)

			-->Runtime.nativeLoad(String filename, ClassLoader loader)

				-->Runtime.nativeLoad(String filename, ClassLoader loader, Class<?> caller);

The final, three-argument nativeLoad is a native method; it corresponds to Runtime_nativeLoad in Runtime.c

static JNINativeMethod gMethods[] = {
  FAST_NATIVE_METHOD(Runtime, freeMemory, "()J"),
  FAST_NATIVE_METHOD(Runtime, totalMemory, "()J"),
  FAST_NATIVE_METHOD(Runtime, maxMemory, "()J"),
  NATIVE_METHOD(Runtime, nativeGc, "()V"),
  NATIVE_METHOD(Runtime, nativeExit, "(I)V"),
  NATIVE_METHOD(Runtime, nativeLoad,
                "(Ljava/lang/String;Ljava/lang/ClassLoader;Ljava/lang/Class;)"
                    "Ljava/lang/String;"),
};

As the registration table above shows, the Java layer ends up in Runtime_nativeLoad in Runtime.c. From here on we are in native code.

Native layer

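The key method on the native side is LoadNativeLibrary, which is also where JNI_OnLoad gets called. The library itself is opened through do_dlopen, which enters the linker, where loading and linking take place.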

Inside the linker, find_libraries orchestrates the whole process: reading, loading, linking, and the remaining bookkeeping all start from this function.

find_libraries

  • Step 0: prepare.

    Preparation: create the LoadTaskList and make sure the soinfos array is allocated

  • Step 1: expand the list of load_tasks to include all DT_NEEDED libraries (do not load them just yet)

    Walk the LoadTaskList, appending every DT_NEEDED library to it and reading each library's key ELF information; most of the work is done by find_library_internal

  • Step 2: Load libraries in random order (see b/24047022)

    Maps each library into memory, indirectly calling ElfReader::Load

  • Step 3: pre-link all DT_NEEDED libraries in breadth first order.

    Preparation for linking: reads the key entries of the .dynamic section, mainly through soinfo::prelink_image()

  • Step 4: Construct the global group. Note: DF_1_GLOBAL bit of a library is determined at step 3.

  • Step 5: Collect roots of local_groups.

  • Step 6: Link all local groups

    Links every local group, mainly through soinfo::link_image

  • Step 7: Mark all load_tasks as linked and increment refcounts for references between load_groups (at this point it does not matter if referenced load_groups were loaded by previous dlopen or as part of this one on step 6)

    Wrap-up work, such as setting the linked flag and adjusting reference counts

    Linker namespaces

bool find_libraries(android_namespace_t *ns,
                    soinfo *start_with,
                    const char *const library_names[],
                    size_t library_names_count,
                    soinfo *soinfos[],
                    std::vector<soinfo *> *ld_preloads,
                    size_t ld_preloads_count,
                    int rtld_flags,
                    const android_dlextinfo *extinfo,
                    bool add_as_children,
                    bool search_linked_namespaces,
                    std::vector<android_namespace_t *> *namespaces) {
    // Step 0: prepare.
    std::unordered_map<const soinfo *, ElfReader> readers_map;
    LoadTaskList load_tasks;

    // Create a LoadTask for each entry in library_names
    for (size_t i = 0; i < library_names_count; ++i) {
        const char *name = library_names[i];
        load_tasks.push_back(LoadTask::create(name, start_with, ns, &readers_map));
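    }

    // list of libraries to link - see step 2.
    size_t soinfos_count = 0;

    ZipArchiveCache zip_archive_cache;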

    // Step 1: expand the list of load_tasks to include
    // all DT_NEEDED libraries (do not load them just yet)
    for (size_t i = 0; i < load_tasks.size(); ++i) {
        LoadTask *task = load_tasks[i];
        soinfo *needed_by = task->get_needed_by();

        // Decide whether this task is a DT_NEEDED dependency; for the root tasks, needed_by is start_with
        bool is_dt_needed = needed_by != nullptr && (needed_by != start_with || add_as_children);
        task->set_extinfo(is_dt_needed ? nullptr : extinfo);
        task->set_dt_needed(is_dt_needed);

        // Note: start from the namespace that is stored in the LoadTask. This namespace
        // is different from the current namespace when the LoadTask is for a transitive
        // dependency and the lib that created the LoadTask is not found in the
        // current namespace but in one of the linked namespace.
        // Read the library's ELF metadata (nothing is mapped yet)
        if (!find_library_internal(const_cast<android_namespace_t *>(task->get_start_from()),
                                   task,
                                   &zip_archive_cache,
                                   &load_tasks,
                                   rtld_flags,
                                   search_linked_namespaces || is_dt_needed)) {
            return false;
        }

        soinfo *si = task->get_soinfo();

        if (is_dt_needed) {
            needed_by->add_child(si);
        }

        // When ld_preloads is not null, the first
        // ld_preloads_count libs are in fact ld_preloads.
        if (ld_preloads != nullptr && soinfos_count < ld_preloads_count) {
            ld_preloads->push_back(si);
        }

        if (soinfos_count < library_names_count) {
            soinfos[soinfos_count++] = si;
        }
    }

    // Step 2: Load libraries in random order (see b/24047022)
    LoadTaskList load_list;
    for (auto &&task : load_tasks) {
        soinfo *si = task->get_soinfo();
        auto pred = [&](const LoadTask *t) {
            return t->get_soinfo() == si;
        };

        if (!si->is_linked() &&
            std::find_if(load_list.begin(), load_list.end(), pred) == load_list.end()) {
            load_list.push_back(task);
        }
    }
    bool reserved_address_recursive = false;
    if (extinfo) {
        reserved_address_recursive = extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS_RECURSIVE;
    }
    if (!reserved_address_recursive) {
        // Shuffle the load order in the normal case, but not if we are loading all
        // the libraries to a reserved address range.
        shuffle(&load_list);
    }

    // Set up address space parameters.
    address_space_params extinfo_params, default_params;
    size_t relro_fd_offset = 0;
    if (extinfo) {
        if (extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS) {
            extinfo_params.start_addr = extinfo->reserved_addr;
            extinfo_params.reserved_size = extinfo->reserved_size;
            extinfo_params.must_use_address = true;
        } else if (extinfo->flags & ANDROID_DLEXT_RESERVED_ADDRESS_HINT) {
            extinfo_params.start_addr = extinfo->reserved_addr;
            extinfo_params.reserved_size = extinfo->reserved_size;
        }
    }

    for (auto &&task : load_list) {
        address_space_params *address_space =
                (reserved_address_recursive || !task->is_dt_needed()) ? &extinfo_params : &default_params;
        // Map the library into memory
        if (!task->load(address_space)) {
            return false;
        }
    }

    // Step 3: pre-link all DT_NEEDED libraries in breadth first order.
    // (prelink_image walks .dynamic and caches the entries that are needed later)
    for (auto &&task : load_tasks) {
        soinfo *si = task->get_soinfo();
        // Pre-link
        if (!si->is_linked() && !si->prelink_image()) {
            return false;
        }
        register_soinfo_tls(si);
    }
  
      // Step 4: Construct the global group. Note: DF_1_GLOBAL bit of a library is
    // determined at step 3.

    .....
    // Step 4-2: Gather all DF_1_GLOBAL libs which were newly loaded during this
    // run. These will be the new member of the global group
    // (the newly loaded DF_1_GLOBAL libs are collected in new_global_group_members)
    soinfo_list_t new_global_group_members;
    for (auto &&task : load_tasks) {
        soinfo *si = task->get_soinfo();
        if (!si->is_linked() && (si->get_dt_flags_1() & DF_1_GLOBAL) != 0) {
            new_global_group_members.push_back(si);
        }
    }

    // Step 4-3: Add the new global group members to all the linked namespaces
    // (each new global group member becomes visible in every linked namespace)
    if (namespaces != nullptr) {
        for (auto linked_ns : *namespaces) {
            for (auto si : new_global_group_members) {
                if (si->get_primary_namespace() != linked_ns) {
                    linked_ns->add_soinfo(si);
                    si->add_secondary_namespace(linked_ns);
                }
            }
        }
    }

    // Step 5: Collect roots of local_groups.
    // Whenever needed_by->si link crosses a namespace boundary it forms its own local_group.
    // Here we collect new roots to link them separately later on. Note that we need to avoid
    // collecting duplicates. Also the order is important. They need to be linked in the same
    // BFS order we link individual libraries.
    std::vector<soinfo *> local_group_roots;
    if (start_with != nullptr && add_as_children) {
        local_group_roots.push_back(start_with);
    } else {
        CHECK(soinfos_count == 1);
        local_group_roots.push_back(soinfos[0]);
    }

    for (auto &&task : load_tasks) {
        soinfo *si = task->get_soinfo();
        soinfo *needed_by = task->get_needed_by();
        bool is_dt_needed = needed_by != nullptr && (needed_by != start_with || add_as_children);
        android_namespace_t *needed_by_ns =
                is_dt_needed ? needed_by->get_primary_namespace() : ns;

        if (!si->is_linked() && si->get_primary_namespace() != needed_by_ns) {
            auto it = std::find(local_group_roots.begin(), local_group_roots.end(), si);
            if (it == local_group_roots.end()) {
                local_group_roots.push_back(si);
            }
        }
    }

    // Step 6: Link all local groups
    for (auto root : local_group_roots) {
        soinfo_list_t local_group;
        android_namespace_t *local_group_ns = root->get_primary_namespace();

        walk_dependencies_tree(root,
                               [&](soinfo *si) {
                                   if (local_group_ns->is_accessible(si)) {
                                       local_group.push_back(si);
                                       return kWalkContinue;
                                   } else {
                                       return kWalkSkip;
                                   }
                               });

        soinfo_list_t global_group = local_group_ns->get_global_group();
        bool linked = local_group.visit([&](soinfo *si) {
            // Even though local group may contain accessible soinfos from other namespaces
            // we should avoid linking them (because if they are not linked -> they
            // are in the local_group_roots and will be linked later).
            if (!si->is_linked() && si->get_primary_namespace() == local_group_ns) {
                const android_dlextinfo *link_extinfo = nullptr;
                if (si == soinfos[0] || reserved_address_recursive) {
                    // Only forward extinfo for the first library unless the recursive
                    // flag is set.
                    link_extinfo = extinfo;
                }
                if (!si->link_image(global_group, local_group, link_extinfo, &relro_fd_offset) ||
                    !get_cfi_shadow()->AfterLoad(si, solist_get_head())) {
                    return false;
                }
            }

            return true;
        });

        if (!linked) {
            return false;
        }
    }

    // Step 7: Mark all load_tasks as linked and increment refcounts
    // for references between load_groups (at this point it does not matter if
    // referenced load_groups were loaded by previous dlopen or as part of this
    // one on step 6)
    if (start_with != nullptr && add_as_children) {
        start_with->set_linked();
    }

    for (auto &&task : load_tasks) {
        soinfo *si = task->get_soinfo();
        si->set_linked();
    }

    for (auto &&task : load_tasks) {
        soinfo *si = task->get_soinfo();
        soinfo *needed_by = task->get_needed_by();
        if (needed_by != nullptr &&
            needed_by != start_with &&
            needed_by->get_local_group_root() != si->get_local_group_root()) {
            si->increment_ref_count();
        }
    }


    return true;
}
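
Step 1 relies on a simple trick: the loop index walks load_tasks while new tasks are appended to its tail, which yields a breadth-first traversal of the dependency graph. The sketch below illustrates only that pattern. NeededMap and expand_load_tasks are hypothetical names, and the sketch deduplicates by name, whereas the linker creates a LoadTask per DT_NEEDED entry and reuses an already loaded soinfo.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for "the DT_NEEDED entries of each library".
using NeededMap = std::map<std::string, std::vector<std::string>>;

// Expand the task list breadth-first: the index walks the vector while
// newly discovered dependencies are appended to its tail.
std::vector<std::string> expand_load_tasks(const std::string& root,
                                           const NeededMap& needed) {
    std::vector<std::string> load_tasks{root};
    std::set<std::string> seen{root};
    for (std::size_t i = 0; i < load_tasks.size(); ++i) {
        // Copy the name: push_back below may reallocate the vector.
        const std::string current = load_tasks[i];
        auto it = needed.find(current);
        if (it == needed.end()) {
            continue;  // no dependencies recorded for this library
        }
        for (const std::string& dep : it->second) {
            if (seen.insert(dep).second) {  // simplification: dedupe by name
                load_tasks.push_back(dep);
            }
        }
    }
    return load_tasks;
}
```

With libapp.so needing libfoo.so and libc.so, and libfoo.so needing libc.so and libm.so, the resulting order is libapp.so, libfoo.so, libc.so, libm.so.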

find_library_internal

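Its main job is to read the library's section and segment information by calling load_library. Despite the name, load_library does not map anything into memory; it only prepares for the real load.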

static bool find_library_internal(android_namespace_t *ns,
                                  LoadTask *task,
                                  ZipArchiveCache *zip_archive_cache,
                                  LoadTaskList *load_tasks,
                                  int rtld_flags,
                                  bool search_linked_namespaces) {
    soinfo *candidate;

    // If the library is already loaded, return true right away
    if (find_loaded_library_by_soname(ns, task->get_name(), search_linked_namespaces, &candidate)) {
        task->set_soinfo(candidate);
        return true;
    }

    // Library might still be loaded, the accurate detection
    // of this fact is done by load_library.
    // (it may be loaded under a different name; load_library detects that)
    if (load_library(ns, task, zip_archive_cache, load_tasks, rtld_flags, search_linked_namespaces)) {
        return true;
    }

    // TODO(dimitry): workaround for http://b/26394120 (the grey-list)
    // Greylist check
    if (ns->is_greylist_enabled() && is_greylisted(ns, task->get_name(), task->get_needed_by())) {
        // For the libs in the greylist, switch to the default namespace and then
        // try the load again from there. The library could be loaded from the
        // default namespace or from another namespace (e.g. runtime) that is linked
        // from the default namespace.
        ns = &g_default_namespace;
        if (load_library(ns, task, zip_archive_cache, load_tasks, rtld_flags,
                         search_linked_namespaces)) {
            return true;
        }
    }
    // END OF WORKAROUND

    if (search_linked_namespaces) {
        // if a library was not found - look into linked namespaces
        // preserve current dlerror in the case it fails.
        DlErrorRestorer dlerror_restorer;
        for (auto &linked_namespace : ns->linked_namespaces()) {
            if (find_library_in_linked_namespace(linked_namespace, task)) {
                if (task->get_soinfo() == nullptr) {
                    // try to load the library - once namespace boundary is crossed
                    // we need to load a library within separate load_group
                    // to avoid using symbols from foreign namespace while.
                    //
                    // However, actual linking is deferred until when the global group
                    // is fully identified and is applied to all namespaces.
                    // Otherwise, the libs in the linked namespace won't get symbols from
                    // the global group.
                    if (load_library(linked_namespace.linked_namespace(), task, zip_archive_cache, load_tasks,
                                     rtld_flags, false)) {
                        LD_LOG(
                                kLogDlopen, "find_library_internal(ns=%s, task=%s): Found in linked namespace %s",
                                ns->get_name(), task->get_name(), linked_namespace.linked_namespace()->get_name());
                        return true;
                    }
                } else {
                    // lib is already loaded
                    return true;
                }
            }
        }
    }

    return false;
}
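
The lookup order in find_library_internal can be summarized as: try the current namespace first, then fall back to the namespaces linked from it. The following is a minimal sketch of that order only. Namespace and find_provider are hypothetical, and the greylist fallback as well as the accessibility checks the linker applies to linked namespaces are left out.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a namespace; this is not android_namespace_t.
struct Namespace {
    std::string name;
    std::set<std::string> libraries;       // libraries this namespace can load
    std::vector<const Namespace*> linked;  // namespaces linked from this one
};

// Returns the name of the namespace that provides `lib`, or "" if none does.
std::string find_provider(const Namespace& ns, const std::string& lib,
                          bool search_linked_namespaces) {
    // 1. The current namespace always wins.
    if (ns.libraries.count(lib) != 0) {
        return ns.name;
    }
    // 2. Only then are the linked namespaces consulted, in order.
    if (search_linked_namespaces) {
        for (const Namespace* linked_ns : ns.linked) {
            if (linked_ns->libraries.count(lib) != 0) {
                return linked_ns->name;
            }
        }
    }
    return "";
}
```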

load_library

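Called by find_library_internal in the previous step. The excerpt below is the inner overload that takes realpath; the overload called above opens the file first and then forwards to it.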

Main work: some validation, allocating the soinfo, reading the section and segment information, and walking DT_NEEDED to create further LoadTasks

DT_NEEDED entries name the other .so libraries that this library depends on

static bool load_library(android_namespace_t *ns,
                         LoadTask *task,
                         LoadTaskList *load_tasks,
                         int rtld_flags,
                         const std::string &realpath,
                         bool search_linked_namespaces) {
    off64_t file_offset = task->get_file_offset();
    const char *name = task->get_name();
    const android_dlextinfo *extinfo = task->get_extinfo();
  
    // Validation checks (elided here; they also fill in file_stat)
    .....
      
    // Allocate the soinfo
    soinfo *si = soinfo_alloc(ns, realpath.c_str(), &file_stat, file_offset, rtld_flags);
    if (si == nullptr) {
        return false;
    }

    task->set_soinfo(si);

    // Read the ELF header and some of the segments.
    // (ELF header, program headers, section headers and the dynamic section)
    if (!task->read(realpath.c_str(), file_stat.st_size)) {
        soinfo_free(si);
        task->set_soinfo(nullptr);
        return false;
    }

    // find and set DT_RUNPATH and dt_soname
    // Note that these field values are temporary and are
    // going to be overwritten on soinfo::prelink_image
    // with values from PT_LOAD segments.
    // Walk .dynamic to pick up DT_RUNPATH and DT_SONAME
    const ElfReader &elf_reader = task->get_elf_reader();
    for (const ElfW(Dyn) *d = elf_reader.dynamic(); d->d_tag != DT_NULL; ++d) {
        if (d->d_tag == DT_RUNPATH) {
            si->set_dt_runpath(elf_reader.get_string(d->d_un.d_val));
        }
        if (d->d_tag == DT_SONAME) {
            si->set_soname(elf_reader.get_string(d->d_un.d_val));
        }
    }
  
    // Walk DT_NEEDED, create a LoadTask for each entry and append it to load_tasks.
    // Note that the needed_by argument of LoadTask::create is the soinfo being loaded.
    for_each_dt_needed(task->get_elf_reader(), [&](const char *name) {
        load_tasks->push_back(LoadTask::create(name, si, ns, task->get_readers_map()));
    });

    return true;
}

soinfo_alloc

The core function for allocating a soinfo

soinfo *soinfo_alloc(android_namespace_t *ns, const char *name,
                     const struct stat *file_stat, off64_t file_offset,
                     uint32_t rtld_flags) {
    if (strlen(name) >= PATH_MAX) {
        async_safe_fatal("library name \"%s\" too long", name);
    }

    TRACE("name %s: allocating soinfo for ns=%p", name, ns);

    soinfo *si = new(g_soinfo_allocator.alloc()) soinfo(ns, name, file_stat, file_offset, rtld_flags);

    solist_add_soinfo(si);

    si->generate_handle();
    ns->add_soinfo(si);

    TRACE("name %s: allocated soinfo @ %p", name, si);
    return si;
}

ElfReader::Read

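This function is the core of reading a .so. In order, it reads and verifies the ELF header, then reads the program header table, the section header table, and the dynamic section.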

The difficulty lies mostly in knowing the ELF file format; the reading itself is just a matter of following offsets.

bool ElfReader::Read(const char *name, int fd, off64_t file_offset, off64_t file_size) {
    if (did_read_) {
        return true;
    }
    name_ = name;
    fd_ = fd;
    file_offset_ = file_offset;
    file_size_ = file_size;

    if (ReadElfHeader() &&
        VerifyElfHeader() &&
        ReadProgramHeaders() &&
        ReadSectionHeaders() &&
        ReadDynamicSection()) {
        did_read_ = true;
    }

    return did_read_;
}
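
To make the "just offsets" point concrete, here is a minimal sketch of the kind of check VerifyElfHeader starts with, applied to an in-memory buffer. It only inspects e_ident (magic, class, byte order). check_elf_ident and ElfCheck are hypothetical names, and the constants are written out by hand instead of taken from <elf.h>; the linker's own verification covers more fields than this.

```cpp
#include <cstddef>
#include <cstring>

// e_ident layout as defined by the ELF specification.
constexpr std::size_t kEiClass = 4;         // index of EI_CLASS
constexpr std::size_t kEiData = 5;          // index of EI_DATA
constexpr std::size_t kEiNident = 16;       // size of e_ident
constexpr unsigned char kElfClass32 = 1;    // ELFCLASS32
constexpr unsigned char kElfClass64 = 2;    // ELFCLASS64
constexpr unsigned char kElfData2Lsb = 1;   // ELFDATA2LSB (little-endian)

enum class ElfCheck { kOk, kTooShort, kBadMagic, kBadClass, kNotLittleEndian };

// Checks the identification bytes at the very start of an ELF file.
ElfCheck check_elf_ident(const unsigned char* buf, std::size_t size) {
    if (size < kEiNident) {
        return ElfCheck::kTooShort;
    }
    if (std::memcmp(buf, "\177ELF", 4) != 0) {  // 0x7f 'E' 'L' 'F'
        return ElfCheck::kBadMagic;
    }
    if (buf[kEiClass] != kElfClass32 && buf[kEiClass] != kElfClass64) {
        return ElfCheck::kBadClass;
    }
    if (buf[kEiData] != kElfData2Lsb) {
        return ElfCheck::kNotLittleEndian;
    }
    return ElfCheck::kOk;
}
```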

for_each_dt_needed

Walks the dynamic section and invokes the callback for each DT_NEEDED entry

template<typename F>
static void for_each_dt_needed(const ElfReader &elf_reader, F action) {
    for (const ElfW(Dyn) *d = elf_reader.dynamic(); d->d_tag != DT_NULL; ++d) {
        if (d->d_tag == DT_NEEDED) {
            action(fix_dt_needed(elf_reader.get_string(d->d_un.d_val), elf_reader.name()));
        }
    }
}
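
The same walk can be reproduced on hand-built data. In the sketch below, DynEntry is a simplified stand-in for ElfW(Dyn), collect_needed is a hypothetical helper, and the string table is an ordinary character array; a DT_NEEDED value is an offset into that table.

```cpp
#include <cstdint>
#include <string>
#include <vector>

constexpr std::int64_t kDtNull = 0;    // DT_NULL terminates the array
constexpr std::int64_t kDtNeeded = 1;  // DT_NEEDED

// Simplified stand-in for ElfW(Dyn): a tag plus a value.
struct DynEntry {
    std::int64_t tag;
    std::uint64_t val;
};

// Collects the names of all DT_NEEDED entries; each value is an
// offset into the dynamic string table.
std::vector<std::string> collect_needed(const DynEntry* dynamic,
                                        const char* strtab) {
    std::vector<std::string> needed;
    for (const DynEntry* d = dynamic; d->tag != kDtNull; ++d) {
        if (d->tag == kDtNeeded) {
            needed.emplace_back(strtab + d->val);
        }
    }
    return needed;
}
```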

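As shown above, everything reached through find_library_internal is preparation for loading: reading the ELF information, allocating the soinfo, and collecting DT_NEEDED. Next comes the actual loading of the .so.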

Segment loading

ELF has a linking view and a loading (execution) view; for loading, only the segments matter

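Loading boils down to walking the loadable segments and mmap-ing each one into memory. The only real work is computing where each segment is mapped and with which permissions.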

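Each segment has its own rwx permissions once in memory, and permissions apply per page, so segments must be page-aligned in memory. To save space, the file may have been built without that alignment in its on-disk layout, so the loader has to handle alignment itself. Alignment is where most of the difficulty lies; the details are not covered here.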
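
The alignment arithmetic is small enough to write out. The sketch below mirrors what the PAGE_START, PAGE_OFFSET and PAGE_END macros compute, assuming a 4096-byte page, and derives the three arguments that matter for mapping the file-backed part of a segment. FileMapping and file_mapping are hypothetical names used only for this illustration.

```cpp
#include <cstdint>

constexpr std::uint64_t kPageSize = 4096;  // assumed page size

// Round down to the start of the containing page.
constexpr std::uint64_t page_start(std::uint64_t addr) {
    return addr & ~(kPageSize - 1);
}
// Offset of the address within its page.
constexpr std::uint64_t page_offset(std::uint64_t addr) {
    return addr & (kPageSize - 1);
}
// Round up to the next page boundary (unchanged if already aligned).
constexpr std::uint64_t page_end(std::uint64_t addr) {
    return page_start(addr + kPageSize - 1);
}

struct FileMapping {
    std::uint64_t map_addr;     // page-aligned address the mapping starts at
    std::uint64_t file_offset;  // page-aligned offset into the file
    std::uint64_t length;       // bytes to map from the file
};

// Computes the file-backed mapping of one segment, following the same
// arithmetic as the loader: both ends are widened to page boundaries.
inline FileMapping file_mapping(std::uint64_t load_bias, std::uint64_t p_vaddr,
                                std::uint64_t p_offset, std::uint64_t p_filesz) {
    const std::uint64_t file_page_start = page_start(p_offset);
    return FileMapping{page_start(p_vaddr + load_bias), file_page_start,
                       p_offset + p_filesz - file_page_start};
}
```

For a segment with p_vaddr 0x1e30, p_offset 0xe30 and p_filesz 0x200 loaded at bias 0x70000000, the mapping starts at 0x70001000, reads the file from offset 0, and covers 0x1030 bytes, so the bytes before the segment's own data are mapped along with it.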

The function below is the heart of loading; it is not examined line by line

bool ElfReader::LoadSegments() {
  for (size_t i = 0; i < phdr_num_; ++i) {
    const ElfW(Phdr)* phdr = &phdr_table_[i];

    // Only PT_LOAD segments are mapped.
    if (phdr->p_type != PT_LOAD) {
      continue;
    }

    // Segment addresses in memory.
    ElfW(Addr) seg_start = phdr->p_vaddr + load_bias_;
    ElfW(Addr) seg_end   = seg_start + phdr->p_memsz;

    ElfW(Addr) seg_page_start = PAGE_START(seg_start);
    ElfW(Addr) seg_page_end   = PAGE_END(seg_end);

    ElfW(Addr) seg_file_end   = seg_start + phdr->p_filesz;

    // File offsets.
    ElfW(Addr) file_start = phdr->p_offset;
    ElfW(Addr) file_end   = file_start + phdr->p_filesz;

    ElfW(Addr) file_page_start = PAGE_START(file_start);
    ElfW(Addr) file_length = file_end - file_page_start;

    if (file_length != 0) {
      int prot = PFLAGS_TO_PROT(phdr->p_flags);
      void* seg_addr = mmap64(reinterpret_cast<void*>(seg_page_start),
                            file_length,
                            prot,
                            MAP_FIXED|MAP_PRIVATE,
                            fd_,
                            file_offset_ + file_page_start);
      if (seg_addr == MAP_FAILED) {
        return false;
      }
    }

    // if the segment is writable, and does not end on a page boundary,
    // zero-fill it until the page limit.
    if ((phdr->p_flags & PF_W) != 0 && PAGE_OFFSET(seg_file_end) > 0) {
      memset(reinterpret_cast<void*>(seg_file_end), 0, PAGE_SIZE - PAGE_OFFSET(seg_file_end));
    }

    seg_file_end = PAGE_END(seg_file_end);

    // seg_file_end is now the first page address after the file
    // content. If seg_end is larger, we need to zero anything
    // between them. This is done by using a private anonymous
    // map for all extra pages.
    if (seg_page_end > seg_file_end) {
      size_t zeromap_size = seg_page_end - seg_file_end;
      void* zeromap = mmap(reinterpret_cast<void*>(seg_file_end),
                           zeromap_size,
                           PFLAGS_TO_PROT(phdr->p_flags),
                           MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE,
                           -1,
                           0);
      prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, zeromap, zeromap_size, ".bss");
    }
  }
  return true;
}
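
The permissions passed to mmap come from the segment's p_flags. A minimal sketch of that translation follows, written as a function rather than a macro. The PF_* values are the ones defined by the ELF specification, spelled out here so the example does not depend on <elf.h>; pflags_to_prot is a hypothetical name.

```cpp
#include <sys/mman.h>

#include <cstdint>

// Segment permission bits from the ELF specification.
constexpr std::uint32_t kPfX = 0x1;  // PF_X: executable
constexpr std::uint32_t kPfW = 0x2;  // PF_W: writable
constexpr std::uint32_t kPfR = 0x4;  // PF_R: readable

// Translates ELF segment flags into mmap/mprotect protection bits.
constexpr int pflags_to_prot(std::uint32_t p_flags) {
    return ((p_flags & kPfX) ? PROT_EXEC : 0) |
           ((p_flags & kPfW) ? PROT_WRITE : 0) |
           ((p_flags & kPfR) ? PROT_READ : 0);
}
```

A typical code segment (PF_R | PF_X) therefore maps as PROT_READ | PROT_EXEC, and a data segment (PF_R | PF_W) as PROT_READ | PROT_WRITE.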

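At this point the .so, together with the libraries it depends on, is finally in memory. Next comes dynamic linking, which essentially means fixing up the addresses of functions and data.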

soinfo::prelink_image

Preparation for dynamic linking: extracts information from the dynamic section, essentially by walking .dynamic

bool soinfo::prelink_image() {
    /* Extract the dynamic section information */
    ElfW(Word) dynamic_flags = 0;
    phdr_table_get_dynamic_section(phdr, phnum, load_bias, &dynamic, &dynamic_flags);

    /* We can't log anything until the linker is relocated */
    /* (so logging is skipped while the linker itself is being linked) */
    bool relocating_linker = (flags_ & FLAG_LINKER) != 0;
    if (!relocating_linker) {
        INFO("[ Linking \"%s\" ]", get_realpath());
        DEBUG("si->base = %p si->flags = 0x%08x", reinterpret_cast<void *>(base), flags_);
    }

    // Extract useful information from dynamic section.
    // Note that: "Except for the DT_NULL element at the end of the array,
    // and the relative order of DT_NEEDED elements, entries may appear in any order."
    //
    // source: http://www.sco.com/developers/gabi/1998-04-29/ch5.dynamic.html
    uint32_t needed_count = 0;
    for (ElfW(Dyn) *d = dynamic; d->d_tag != DT_NULL; ++d) {
        DEBUG("d = %p, d[0](tag) = %p d[1](val) = %p",
              d, reinterpret_cast<void *>(d->d_tag), reinterpret_cast<void *>(d->d_un.d_val));
        switch (d->d_tag) {
            case DT_SONAME:
                // this is parsed after we have strtab initialized (see below).
                break;
            case DT_HASH:
                nbucket_ = reinterpret_cast<uint32_t *>(load_bias + d->d_un.d_ptr)[0];
                nchain_ = reinterpret_cast<uint32_t *>(load_bias + d->d_un.d_ptr)[1];
                bucket_ = reinterpret_cast<uint32_t *>(load_bias + d->d_un.d_ptr + 8);
                chain_ = reinterpret_cast<uint32_t *>(load_bias + d->d_un.d_ptr + 8 + nbucket_ * 4);
                break;
            case DT_STRTAB:
                strtab_ = reinterpret_cast<const char *>(load_bias + d->d_un.d_ptr);
                break;
            case DT_STRSZ:
                strtab_size_ = d->d_un.d_val;
                break;
            case DT_SYMTAB:
                symtab_ = reinterpret_cast<ElfW(Sym) *>(load_bias + d->d_un.d_ptr);
                break;
            case DT_SYMENT:
                if (d->d_un.d_val != sizeof(ElfW(Sym))) {
                    DL_ERR("invalid DT_SYMENT: %zd in \"%s\"",
                           static_cast<size_t>(d->d_un.d_val), get_realpath());
                    return false;
                }
                break;
            case DT_JMPREL:
                plt_rel_ = reinterpret_cast<ElfW(Rel) *>(load_bias + d->d_un.d_ptr);
                break;

            case DT_PLTRELSZ:
                plt_rel_count_ = d->d_un.d_val / sizeof(ElfW(Rel));
                break;
            case DT_RELR:
                relr_ = reinterpret_cast<ElfW(Relr) *>(load_bias + d->d_un.d_ptr);
                break;

            case DT_RELRSZ:
                relr_count_ = d->d_un.d_val / sizeof(ElfW(Relr));
                break;
            case DT_INIT:
                init_func_ = reinterpret_cast<linker_ctor_function_t>(load_bias + d->d_un.d_ptr);
                DEBUG("%s constructors (DT_INIT) found at %p", get_realpath(), init_func_);
                break;

            case DT_FINI:
                fini_func_ = reinterpret_cast<linker_dtor_function_t>(load_bias + d->d_un.d_ptr);
                DEBUG("%s destructors (DT_FINI) found at %p", get_realpath(), fini_func_);
                break;

            case DT_INIT_ARRAY:
                init_array_ = reinterpret_cast<linker_ctor_function_t *>(load_bias + d->d_un.d_ptr);
                DEBUG("%s constructors (DT_INIT_ARRAY) found at %p", get_realpath(), init_array_);
                break;

            case DT_INIT_ARRAYSZ:
                init_array_count_ = static_cast<uint32_t>(d->d_un.d_val) / sizeof(ElfW(Addr));
                break;

            case DT_FINI_ARRAY:
                fini_array_ = reinterpret_cast<linker_dtor_function_t *>(load_bias + d->d_un.d_ptr);
                DEBUG("%s destructors (DT_FINI_ARRAY) found at %p", get_realpath(), fini_array_);
                break;

            case DT_FINI_ARRAYSZ:
                fini_array_count_ = static_cast<uint32_t>(d->d_un.d_val) / sizeof(ElfW(Addr));
                break;
            // Further cases (DT_NEEDED, DT_GNU_HASH, DT_REL, DT_RELSZ, ...) are omitted from this excerpt.
        }
    }

    // Sanity checks.
    // Checks: hash table, string table, symbol table
    if (relocating_linker && needed_count != 0) {
        DL_ERR("linker cannot have DT_NEEDED dependencies on other libraries");
        return false;
    }
    if (nbucket_ == 0 && gnu_nbucket_ == 0) {
        DL_ERR("empty/missing DT_HASH/DT_GNU_HASH in \"%s\" "
               "(new hash type from the future?)", get_realpath());
        return false;
    }
    if (strtab_ == nullptr) {
        DL_ERR("empty/missing DT_STRTAB in \"%s\"", get_realpath());
        return false;
    }
    if (symtab_ == nullptr) {
        DL_ERR("empty/missing DT_SYMTAB in \"%s\"", get_realpath());
        return false;
    }

    // second pass - parse entries relying on strtab
    // Use the strtab obtained above to resolve soname and runpath
    for (ElfW(Dyn) *d = dynamic; d->d_tag != DT_NULL; ++d) {
        switch (d->d_tag) {
            case DT_SONAME:
                set_soname(get_string(d->d_un.d_val));
                break;
            case DT_RUNPATH:
                set_dt_runpath(get_string(d->d_un.d_val));
                break;
        }
    }

    // Before M release linker was using basename in place of soname.
    // In the case when dt_soname is absent some apps stop working
    // because they can't find dt_needed library by soname.
    // This workaround should keep them working. (Applies only
    // for apps targeting sdk version < M.) Make an exception for
    // the main executable and linker; they do not need to have dt_soname.
    // TODO: >= O the linker doesn't need this workaround.
    // Compatibility handling
    if (soname_ == nullptr &&
        this != solist_get_somain() &&
        (flags_ & FLAG_LINKER) == 0 &&
        get_application_target_sdk_version() < __ANDROID_API_M__) {
        soname_ = basename(realpath_.c_str());
        DL_WARN_documented_change(__ANDROID_API_M__,
                                  "missing-soname-enforced-for-api-level-23",
                                  "\"%s\" has no DT_SONAME (will use %s instead)",
                                  get_realpath(), soname_);

        // Don't call add_dlwarning because a missing DT_SONAME isn't important enough to show in the UI
    }
    return true;
}
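
The DT_HASH case above reads nbucket, nchain, the bucket array and the chain array straight out of the section, in that order. To show how those four pieces are used, here is a minimal sketch of a SysV-style hash table built and queried in memory. elf_hash follows the hash function from the ELF specification; HashTable, build_table and lookup are hypothetical helpers, and a vector of names stands in for the real symtab and strtab.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// The SysV ELF hash function.
std::uint32_t elf_hash(const char* name) {
    std::uint32_t h = 0;
    for (const unsigned char* p = reinterpret_cast<const unsigned char*>(name);
         *p != 0; ++p) {
        h = (h << 4) + *p;
        const std::uint32_t g = h & 0xf0000000u;
        h ^= g;
        h ^= g >> 24;
    }
    return h;
}

struct HashTable {
    std::vector<std::string> names;     // stand-in for symtab + strtab
    std::vector<std::uint32_t> bucket;  // nbucket entries: head of each chain
    std::vector<std::uint32_t> chain;   // nchain entries: next symbol index
};

// Builds the table; nbucket must be greater than zero.
HashTable build_table(const std::vector<std::string>& symbols,
                      std::uint32_t nbucket) {
    HashTable t;
    t.names.push_back("");  // symbol index 0 is reserved (STN_UNDEF)
    t.names.insert(t.names.end(), symbols.begin(), symbols.end());
    t.bucket.assign(nbucket, 0);
    t.chain.assign(t.names.size(), 0);
    for (std::uint32_t i = 1; i < t.names.size(); ++i) {
        const std::uint32_t b = elf_hash(t.names[i].c_str()) % nbucket;
        t.chain[i] = t.bucket[b];  // prepend to this bucket's chain
        t.bucket[b] = i;
    }
    return t;
}

// Returns the symbol index, or 0 (STN_UNDEF) if the name is absent.
std::uint32_t lookup(const HashTable& t, const char* name) {
    const std::uint32_t h = elf_hash(name);
    for (std::uint32_t i = t.bucket[h % t.bucket.size()]; i != 0; i = t.chain[i]) {
        if (t.names[i] == name) {
            return i;
        }
    }
    return 0;
}
```

A chain ends when it reaches index 0, which is why the first symbol table entry is reserved.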

soinfo::link_image

The dynamic linking step proper: reads the relocation information and patches memory accordingly

bool soinfo::link_image(const soinfo_list_t &global_group, const soinfo_list_t &local_group,
                        const android_dlextinfo *extinfo, size_t *relro_fd_offset) {
    if (is_image_linked()) {
        // already linked.
        return true;
    }

    local_group_root_ = local_group.front();
    if (local_group_root_ == nullptr) {
        local_group_root_ = this;
    }

    if ((flags_ & FLAG_LINKER) == 0 && local_group_root_ == this) {
        target_sdk_version_ = get_application_target_sdk_version();
    }

    VersionTracker version_tracker;

    if (!version_tracker.init(this)) {
        return false;
    }

#if !defined(__LP64__)
  	//DT_TEXTREL
    if (has_text_relocations) {
        // Fail if app is targeting M or above.
        // (text relocations are rejected outright in that case)
        int app_target_api_level = get_application_target_sdk_version();
        if (app_target_api_level >= __ANDROID_API_M__) {
            return false;
        }
        // Make segments writable to allow text relocations to work properly. We will later call
        // phdr_table_protect_segments() after all of them are applied.
        // Lift the write protection on the segments
        if (phdr_table_unprotect_segments(phdr, phnum, load_bias) < 0) {
            return false;
        }
    }
#endif

    // DT_ANDROID_REL
    if (android_relocs_ != nullptr) {
        .....
    }

  	// DT_RELR
    if (relr_ != nullptr) {
        if (!relocate_relr()) {
            return false;
        }
    }

#if defined(USE_RELA)
	......
#else
    // DT_REL
    if (rel_ != nullptr) {
        if (!relocate(version_tracker,
                      plain_reloc_iterator(rel_, rel_count_), global_group, local_group)) {
            return false;
        }
    }
  	//DT_JMPREL
    if (plt_rel_ != nullptr) {
        if (!relocate(version_tracker,
                      plain_reloc_iterator(plt_rel_, plt_rel_count_), global_group, local_group)) {
            return false;
        }
    }
#endif
...
    DEBUG("[ finished linking %s ]", get_realpath());

#if !defined(__LP64__)
    if (has_text_relocations) {
        // All relocations are done, we can protect our segments back to read-only.
        // (restores the original segment permissions)
        if (phdr_table_protect_segments(phdr, phnum, load_bias) < 0) {
            return false;
        }
    }
#endif

    ....

    notify_gdb_of_load(this);
    set_image_linked();
    return true;
}
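
What relocate does to memory can be illustrated on a toy image. The sketch below models two common relocation kinds in REL style, where the addend is stored in the slot being patched: a relative relocation adds the load bias to the slot, and a jump-slot relocation stores the resolved symbol address. RelocType, Reloc and apply_relocations are hypothetical, the image is a plain vector of words, and symbol resolution is reduced to a map lookup.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins for R_*_RELATIVE and R_*_JUMP_SLOT.
enum class RelocType { kRelative, kJumpSlot };

struct Reloc {
    std::size_t slot;    // index of the word to patch (stands in for r_offset)
    RelocType type;
    std::string symbol;  // only used by kJumpSlot
};

// Applies the relocations to `image`; returns false on a bad slot or an
// unresolved symbol.
bool apply_relocations(std::vector<std::uint64_t>& image, std::uint64_t load_bias,
                       const std::vector<Reloc>& relocs,
                       const std::map<std::string, std::uint64_t>& symbols) {
    for (const Reloc& r : relocs) {
        if (r.slot >= image.size()) {
            return false;
        }
        switch (r.type) {
            case RelocType::kRelative:
                // REL style: the slot already holds the link-time address.
                image[r.slot] += load_bias;
                break;
            case RelocType::kJumpSlot: {
                auto it = symbols.find(r.symbol);
                if (it == symbols.end()) {
                    return false;  // simplification: no weak symbols here
                }
                image[r.slot] = it->second;
                break;
            }
        }
    }
    return true;
}
```

This is why the libraries a .so depends on must be mapped before it is linked: the jump-slot case needs the final address of each imported symbol.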

Debugging

For convenience, the linker's logging can be enabled, which makes the process easier to study:

adb shell setprop debug.ld.all dlerror,dlopen

Summary

At this point the whole loading process is complete.

Reading the ELF information: this requires a clear understanding of the ELF format

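Loading the ELF: essentially mapping the segments that need loading into memory. In the end it is an mmap call that maps a range of the ELF file into memory; just mind the alignment and the permissions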

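Dynamic linking: applying fix-ups to the ELF image, driven by the relocation tables. Because every ELF involved is already in memory at this point, the address of each function and variable is easy to find, which makes the fix-ups straightforward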

After loading comes initialization of the ELF, that is, running init_array and calling JNI_OnLoad. That is not covered here; interested readers can look it up.

This article is the result of personal study; corrections are welcome.