Mac OS X ABI Mach-O File Format Reference(3)


Mac OS X ABI Mach-O File Format Reference(1)

Mac OS X ABI Mach-O File Format Reference(2)

Symbol Table and Related Data Structures

Two load commands, LC_SYMTAB and LC_DYSYMTAB, describe the size and location of the symbol tables, along with additional metadata. The other data structures listed in this section represent the symbol tables themselves.

  • symtab_command

    Defines the attributes of the LC_SYMTAB load command. Describes the size and location of the symbol table data structures. Declared in /usr/include/mach-o/loader.h.

    /*
     * The symtab_command contains the offsets and sizes of the link-edit 4.3BSD
     * "stab" style symbol table information as described in the header files
     * <nlist.h> and <stab.h>.
     */
    struct symtab_command {
    	uint32_t	cmd;		/* LC_SYMTAB */
    	uint32_t	cmdsize;	/* sizeof(struct symtab_command) */
    	uint32_t	symoff;		/* symbol table offset */
    	uint32_t	nsyms;		/* number of symbol table entries */
    	uint32_t	stroff;		/* string table offset */
    	uint32_t	strsize;	/* string table size in bytes */
    };
    

    Fields

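    • cmd

      Common to all load command structures. For this structure, set to LC_SYMTAB.

    • cmdsize

      Common to all load command structures. For this structure, set to sizeof(symtab_command).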

    • symoff

      An integer containing the byte offset from the start of the file to the location of the symbol table entries. The symbol table is an array of nlist data structures.

    • nsyms

      An integer indicating the number of entries in the symbol table.

    • stroff

      An integer containing the byte offset from the start of the image to the location of the string table.

    • strsize

      An integer indicating the size (in bytes) of the string table.

    Discussion

    LC_SYMTAB should exist in both statically linked and dynamically linked file types.

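
As a rough illustration of how the four fields above are used together, the sketch below checks that both tables fit inside the file and looks up a symbol name by its string table index. The struct symtab_command_sketch and the helpers symtab_fits and symbol_name are hypothetical names introduced here; the struct is a local copy so the example builds without <mach-o/loader.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the documented fields (assumption: on macOS you would
 * include <mach-o/loader.h> and use struct symtab_command instead). */
struct symtab_command_sketch {
    uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize;
};

/* Returns 1 if the symbol table and string table both lie entirely inside
 * a file of file_size bytes. entry_size is the size of one symbol table
 * entry (struct nlist or struct nlist_64). */
static int symtab_fits(const struct symtab_command_sketch *st,
                       uint64_t file_size, uint64_t entry_size)
{
    uint64_t sym_end, str_end;
    assert(st != NULL);
    sym_end = (uint64_t)st->symoff + (uint64_t)st->nsyms * entry_size;
    str_end = (uint64_t)st->stroff + st->strsize;
    return sym_end <= file_size && str_end <= file_size;
}

/* Returns the NUL-terminated name stored at index strx of the string table,
 * or NULL if strx is out of range or the name runs past the table end. */
static const char *symbol_name(const char *strtab, uint32_t strsize,
                               uint32_t strx)
{
    assert(strtab != NULL);
    if (strx >= strsize)
        return NULL;
    if (memchr(strtab + strx, '\0', strsize - strx) == NULL)
        return NULL;
    return strtab + strx;
}
```

In use, strtab would point at file_base + stroff, and strx would come from the n_strx field of a symbol table entry.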

  • nlist

    Describes an entry in the symbol table for 32-bit architectures. Declared in /usr/include/mach-o/nlist.h. See also nlist_64.

    struct nlist {
    	union {
    #ifndef __LP64__
    		char *n_name;	/* for use when in-core */
    #endif
    		uint32_t n_strx;	/* index into the string table */
    	} n_un;
    	uint8_t n_type;		/* type flag, see below */
    	uint8_t n_sect;		/* section number or NO_SECT */
    	int16_t n_desc;		/* see <mach-o/stab.h> */
    	uint32_t n_value;	/* value of this symbol (or stab offset) */
    };
    
    /*
     * This is the symbol table entry structure for 64-bit architectures.
     */
    struct nlist_64 {
        union {
            uint32_t  n_strx; /* index into the string table */
        } n_un;
        uint8_t n_type;        /* type flag, see below */
        uint8_t n_sect;        /* section number or NO_SECT */
        uint16_t n_desc;       /* see <mach-o/stab.h> */
        uint64_t n_value;      /* value of this symbol (or stab offset) */
    };
    

    Fields

    • n_un

      A union that holds an index into the string table, n_strx. To specify an empty string (""), set this value to 0. The n_name field is not used in Mach-O files.

    • n_type

      A byte value consisting of data accessed using four bit masks:

      /*
       * The n_type field really contains four fields:
       *	unsigned char N_STAB:3,
       *		      N_PEXT:1,
       *		      N_TYPE:3,
       *		      N_EXT:1;
       * which are used via the following masks.
       */
      #define	N_STAB	0xe0  /* if any of these bits set, a symbolic debugging entry */
      #define	N_PEXT	0x10  /* private external symbol bit */
      #define	N_TYPE	0x0e  /* mask for the type bits */
      #define	N_EXT	0x01  /* external symbol bit, set for external symbols */
      
      /*
       * Only symbolic debugging entries have some of the N_STAB bits set and if any
       * of these bits are set then it is a symbolic debugging entry (a stab).  In
       * which case then the values of the n_type field (the entire field) are given
       * in <mach-o/stab.h>
       */
      
      /*
       * Values for N_TYPE bits of the n_type field.
       */
      #define	N_UNDF	0x0		/* undefined, n_sect == NO_SECT */
      #define	N_ABS	0x2		/* absolute, n_sect == NO_SECT */
      #define	N_SECT	0xe		/* defined in section number n_sect */
      #define	N_PBUD	0xc		/* prebound undefined (defined in a dylib) */
      #define N_INDR	0xa		/* indirect */
      
      • N_STAB (0xe0): If any of these 3 bits are set, the symbol is a symbolic debugging table (stab) entry. In that case, the entire n_type field is interpreted as a stab value. See /usr/include/mach-o/stab.h for valid stab values.

      • N_PEXT (0x10): If this bit is on, this symbol is marked as having limited global scope. When the file is fed to the static linker, the linker clears the N_EXT bit for each symbol with the N_PEXT bit set. (The ld option -keep_private_externs turns off this behavior.) With Mac OS X GCC, you can use the __private_extern__ function attribute to set this bit.

      • N_TYPE (0x0e): These bits define the type of the symbol.

      • N_EXT (0x01): If this bit is on, this symbol is an external symbol, a symbol that is either defined outside this file or that is defined in this file but can be referenced by other files.

      Values for the N_TYPE field include:

      • N_UNDF (0x0): The symbol is undefined. Undefined symbols are symbols referenced in this module but defined in a different module. The n_sect field is set to NO_SECT.

      • N_ABS (0x2): The symbol is absolute. The linker does not change the value of an absolute symbol. The n_sect field is set to NO_SECT.

      • N_SECT (0xe): The symbol is defined in the section number given in n_sect.

      • N_PBUD (0xc): The symbol is undefined and the image is using a prebound value for the symbol. The n_sect field is set to NO_SECT.

      • N_INDR (0xa): The symbol is defined to be the same as another symbol. The n_value field is an index into the string table specifying the name of the other symbol. When that symbol is linked, both this and the other symbol have the same defined type and value.

    • n_sect

      An integer specifying the number of the section that this symbol can be found in, or NO_SECT if the symbol is not to be found in any section of this image. The sections are contiguously numbered across segments, starting from 1, according to the order they appear in the LC_SEGMENT load commands.

    • n_desc

      A 16-bit value providing additional information about the nature of this symbol for non-stab symbols. The reference flags can be accessed using the REFERENCE_TYPE mask (0xF); the REFERENCE_FLAG constants they take are declared in /usr/include/mach-o/nlist.h.

    • n_value

      An integer that contains the value of the symbol. The format of this value is different for each type of symbol table entry (as specified by the n_type field). For the N_SECT symbol type, n_value is the address of the symbol. See the description of the n_type field for information on other possible values.

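
The field descriptions above can be read as a small decoding procedure: test N_STAB first, and only for non-stab entries split n_type into its N_PEXT, N_TYPE, and N_EXT parts and mask n_desc with REFERENCE_TYPE. The sketch below is one possible way to write that down. The struct sym_kind and the function decode_symbol are hypothetical names, and the macros are re-declared locally (copied from the listing above) so the example builds without <mach-o/nlist.h>.

```c
#include <stdint.h>

/* Masks copied from the listing above; REFERENCE_TYPE is the 0xF mask
 * mentioned in the n_desc description. */
#define N_STAB 0xe0
#define N_PEXT 0x10
#define N_TYPE 0x0e
#define N_EXT  0x01
#define REFERENCE_TYPE 0xf

/* Hypothetical holder for the decoded pieces of one symbol table entry. */
struct sym_kind {
    int is_stab;        /* any N_STAB bit set: n_type is a stab value */
    int is_private_ext; /* N_PEXT bit */
    int is_external;    /* N_EXT bit */
    uint8_t type;       /* n_type & N_TYPE, e.g. N_UNDF or N_SECT */
    uint8_t ref_type;   /* n_desc & REFERENCE_TYPE */
};

static struct sym_kind decode_symbol(uint8_t n_type, uint16_t n_desc)
{
    struct sym_kind k = {0, 0, 0, 0, 0};

    if (n_type & N_STAB) {
        /* For stabs the whole byte is the stab value, so the other
         * masks do not apply. */
        k.is_stab = 1;
        return k;
    }
    k.is_private_ext = (n_type & N_PEXT) != 0;
    k.is_external = (n_type & N_EXT) != 0;
    k.type = (uint8_t)(n_type & N_TYPE);
    k.ref_type = (uint8_t)(n_desc & REFERENCE_TYPE);
    return k;
}
```

For example, an n_type of 0x0f decodes as an external symbol of type N_SECT (0xe), while 0x24 has an N_STAB bit set and is treated as a stab.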

    Discussion

    Common symbols must be of type N_UNDF and must have the N_EXT bit set. The n_value for a common symbol is the size (in bytes) of the data of the symbol. In C, a common symbol is a variable that is declared but not initialized in this file. Common symbols can appear only in MH_OBJECT Mach-O files.

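
A minimal sketch of the common-symbol test described above. The function name common_symbol_size is hypothetical. It also assumes that an undefined external symbol whose n_value is 0 is an ordinary undefined symbol rather than a common one, which is my reading of the comments in <mach-o/nlist.h> and is not stated in this section.

```c
#include <stdint.h>

/* Masks copied from the n_type listing earlier in this section. */
#define N_STAB 0xe0
#define N_TYPE 0x0e
#define N_EXT  0x01
#define N_UNDF 0x0

/* Returns the size in bytes of a common symbol, or 0 if the entry is not
 * a common symbol. Assumption: n_value == 0 means "plain undefined". */
static uint64_t common_symbol_size(uint8_t n_type, uint64_t n_value)
{
    if (n_type & N_STAB)
        return 0;               /* debugging entry */
    if ((n_type & N_TYPE) != N_UNDF)
        return 0;               /* defined, absolute, indirect, ... */
    if (!(n_type & N_EXT))
        return 0;               /* common symbols must be external */
    return n_value;
}
```

Because common symbols appear only in MH_OBJECT files, a caller would normally apply this check only after confirming the file type.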

  • dysymtab_command

    The data structure for the LC_DYSYMTAB load command. It describes the sizes and locations of the parts of the symbol table used for dynamic linking. Declared in /usr/include/mach-o/loader.h.

    struct dysymtab_command {
        uint32_t cmd;	/* LC_DYSYMTAB */
        uint32_t cmdsize;	/* sizeof(struct dysymtab_command) */
    
        /*
         * The symbols indicated by symoff and nsyms of the LC_SYMTAB load command
         * are grouped into the following three groups:
         *    local symbols (further grouped by the module they are from)
         *    defined external symbols (further grouped by the module they are from)
         *    undefined symbols
         *
         * The local symbols are used only for debugging.  The dynamic binding
         * process may have to use them to indicate to the debugger the local
         * symbols for a module that is being bound.
         *
         * The last two groups are used by the dynamic binding process to do the
         * binding (indirectly through the module table and the reference symbol
         * table when this is a dynamically linked shared library file).
         */
        uint32_t ilocalsym;	/* index to local symbols */
        uint32_t nlocalsym;	/* number of local symbols */
    
        uint32_t iextdefsym;/* index to externally defined symbols */
        uint32_t nextdefsym;/* number of externally defined symbols */
    
        uint32_t iundefsym;	/* index to undefined symbols */
        uint32_t nundefsym;	/* number of undefined symbols */
    
        /*
         * For the for the dynamic binding process to find which module a symbol
         * is defined in the table of contents is used (analogous to the ranlib
         * structure in an archive) which maps defined external symbols to modules
         * they are defined in.  This exists only in a dynamically linked shared
         * library file.  For executable and object modules the defined external
         * symbols are sorted by name and is use as the table of contents.
         */
        uint32_t tocoff;	/* file offset to table of contents */
        uint32_t ntoc;	/* number of entries in table of contents */
    
        /*
         * To support dynamic binding of "modules" (whole object files) the symbol
         * table must reflect the modules that the file was created from.  This is
         * done by having a module table that has indexes and counts into the merged
         * tables for each module.  The module structure that these two entries
         * refer to is described below.  This exists only in a dynamically linked
         * shared library file.  For executable and object modules the file only
         * contains one module so everything in the file belongs to the module.
         */
        uint32_t modtaboff;	/* file offset to module table */
        uint32_t nmodtab;	/* number of module table entries */
    
        /*
         * To support dynamic module binding the module structure for each module
         * indicates the external references (defined and undefined) each module
         * makes.  For each module there is an offset and a count into the
         * reference symbol table for the symbols that the module references.
         * This exists only in a dynamically linked shared library file.  For
         * executable and object modules the defined external symbols and the
         * undefined external symbols indicates the external references.
         */
        uint32_t extrefsymoff;	/* offset to referenced symbol table */
        uint32_t nextrefsyms;	/* number of referenced symbol table entries */
    
        /*
         * The sections that contain "symbol pointers" and "routine stubs" have
         * indexes and (implied counts based on the size of the section and fixed
         * size of the entry) into the "indirect symbol" table for each pointer
         * and stub.  For every section of these two types the index into the
         * indirect symbol table is stored in the section header in the field
         * reserved1.  An indirect symbol table entry is simply a 32bit index into
         * the symbol table to the symbol that the pointer or stub is referring to.
         * The indirect symbol table is ordered to match the entries in the section.
         */
        uint32_t indirectsymoff; /* file offset to the indirect symbol table */
        uint32_t nindirectsyms;  /* number of indirect symbol table entries */
    
        /*
         * To support relocating an individual module in a library file quickly the
         * external relocation entries for each module in the library need to be
         * accessed efficiently.  Since the relocation entries can't be accessed
         * through the section headers for a library file they are separated into
         * groups of local and external entries further grouped by module.  In this
         * case the presents of this load command who's extreloff, nextrel,
         * locreloff and nlocrel fields are non-zero indicates that the relocation
         * entries of non-merged sections are not referenced through the section
         * structures (and the reloff and nreloc fields in the section headers are
         * set to zero).
         *
         * Since the relocation entries are not accessed through the section headers
         * this requires the r_address field to be something other than a section
         * offset to identify the item to be relocated.  In this case r_address is
         * set to the offset from the vmaddr of the first LC_SEGMENT command.
         * For MH_SPLIT_SEGS images r_address is set to the the offset from the
         * vmaddr of the first read-write LC_SEGMENT command.
         *
         * The relocation entries are grouped by module and the module table
         * entries have indexes and counts into them for the group of external
         * relocation entries for that the module.
         *
         * For sections that are merged across modules there must not be any
         * remaining external relocation entries for them (for merged sections
         * remaining relocation entries must be local).
         */
        uint32_t extreloff;	/* offset to external relocation entries */
        uint32_t nextrel;	/* number of external relocation entries */
    
        /*
         * All the local relocation entries are grouped together (they are not
         * grouped by their module since they are only used if the object is moved
         * from it staticly link edited address).
         */
        uint32_t locreloff;	/* offset to local relocation entries */
        uint32_t nlocrel;	/* number of local relocation entries */
    
    };	
    

    Fields

    • cmd

      Common to all load command structures. For this structure, set to LC_DYSYMTAB.

    • cmdsize

      Common to all load command structures. For this structure, set to sizeof(dysymtab_command).

    • ilocalsym

      An integer indicating the index of the first symbol in the group of local symbols.

    • nlocalsym

      An integer indicating the total number of symbols in the group of local symbols.

    • iextdefsym

      An integer indicating the index of the first symbol in the group of defined external symbols.

    • nextdefsym

      An integer indicating the total number of symbols in the group of defined external symbols.

    • iundefsym

      An integer indicating the index of the first symbol in the group of undefined external symbols.

    • nundefsym

      An integer indicating the total number of symbols in the group of undefined external symbols.

    • tocoff

      An integer indicating the byte offset from the start of the file to the table of contents data.

    • ntoc

      An integer indicating the number of entries in the table of contents.

    • modtaboff

      An integer indicating the byte offset from the start of the file to the module table data.

    • nmodtab

      An integer indicating the number of entries in the module table.

    • extrefsymoff

      An integer indicating the byte offset from the start of the file to the external reference table data.

    • nextrefsyms

      An integer indicating the number of entries in the external reference table.

    • indirectsymoff

      An integer indicating the byte offset from the start of the file to the indirect symbol table data.

    • nindirectsyms

      An integer indicating the number of entries in the indirect symbol table.

    • extreloff

      An integer indicating the byte offset from the start of the file to the external relocation table data.

    • nextrel

      An integer indicating the number of entries in the external relocation table.

    • locreloff

      An integer indicating the byte offset from the start of the file to the local relocation table data.

    • nlocrel

      An integer indicating the number of entries in the local relocation table.

    Discussion

    The LC_DYSYMTAB load command contains a set of indexes into the symbol table and a set of file offsets that define the location of several other tables. Fields for tables not used in the file should be set to 0. These tables are described in “Dynamic Code Generation” in Mach-O Programming Topics.
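
To make the index fields concrete, the sketch below classifies a symbol table index into one of the three groups (local, defined external, undefined) using the index/count pairs from LC_DYSYMTAB. The names dysymtab_ranges, sym_group, and symbol_group are hypothetical; the struct holds only the six fields needed here rather than the full dysymtab_command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of dysymtab_command used by this sketch (local copy). */
struct dysymtab_ranges {
    uint32_t ilocalsym, nlocalsym;
    uint32_t iextdefsym, nextdefsym;
    uint32_t iundefsym, nundefsym;
};

enum sym_group { GROUP_NONE, GROUP_LOCAL, GROUP_EXTDEF, GROUP_UNDEF };

/* True if idx lies in [first, first + count), written to avoid overflow. */
static int in_range(uint32_t idx, uint32_t first, uint32_t count)
{
    return idx >= first && idx - first < count;
}

/* Reports which of the three LC_DYSYMTAB groups a symbol index belongs to. */
static enum sym_group symbol_group(const struct dysymtab_ranges *d,
                                   uint32_t idx)
{
    assert(d != NULL);
    if (in_range(idx, d->ilocalsym, d->nlocalsym))
        return GROUP_LOCAL;
    if (in_range(idx, d->iextdefsym, d->nextdefsym))
        return GROUP_EXTDEF;
    if (in_range(idx, d->iundefsym, d->nundefsym))
        return GROUP_UNDEF;
    return GROUP_NONE;
}
```

An index that falls in none of the ranges (GROUP_NONE) would suggest either a malformed file or an index taken from a different table.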

  • dylib_table_of_contents

    Describes an entry in the table of contents of a dynamic shared library. Declared in /usr/include/mach-o/loader.h.

    /* a table of contents entry */
    struct dylib_table_of_contents {
        uint32_t symbol_index;	/* the defined external symbol
    				   (index into the symbol table) */
        uint32_t module_index;	/* index into the module table this symbol
    				   is defined in */
    };	
    

    Fields

    • symbol_index

      An index into the symbol table indicating the defined external symbol to which this entry refers.

    • module_index

      An index into the module table indicating the module in which this defined external symbol is defined.

  • dylib_module

    Describes a module table entry for a dynamic shared library for 32-bit architectures. Declared in /usr/include/mach-o/loader.h.

    /* a module table entry */
    struct dylib_module {
        uint32_t module_name;	/* the module name (index into string table) */
    
        uint32_t iextdefsym;	/* index into externally defined symbols */
        uint32_t nextdefsym;	/* number of externally defined symbols */
        uint32_t irefsym;		/* index into reference symbol table */
        uint32_t nrefsym;		/* number of reference symbol table entries */
        uint32_t ilocalsym;		/* index into symbols for local symbols */
        uint32_t nlocalsym;		/* number of local symbols */
    
        uint32_t iextrel;		/* index into external relocation entries */
        uint32_t nextrel;		/* number of external relocation entries */
    
        uint32_t iinit_iterm;	/* low 16 bits are the index into the init
    				   section, high 16 bits are the index into
    			           the term section */
        uint32_t ninit_nterm;	/* low 16 bits are the number of init section
    				   entries, high 16 bits are the number of
    				   term section entries */
    
        uint32_t			/* for this module address of the start of */
    	objc_module_info_addr;  /*  the (__OBJC,__module_info) section */
        uint32_t			/* for this module size of */
    	objc_module_info_size;	/*  the (__OBJC,__module_info) section */
    };	
    
    /* a 64-bit module table entry */
    struct dylib_module_64 {
        uint32_t module_name;	/* the module name (index into string table) */
    
        uint32_t iextdefsym;	/* index into externally defined symbols */
        uint32_t nextdefsym;	/* number of externally defined symbols */
        uint32_t irefsym;		/* index into reference symbol table */
        uint32_t nrefsym;		/* number of reference symbol table entries */
        uint32_t ilocalsym;		/* index into symbols for local symbols */
        uint32_t nlocalsym;		/* number of local symbols */
    
        uint32_t iextrel;		/* index into external relocation entries */
        uint32_t nextrel;		/* number of external relocation entries */
    
        uint32_t iinit_iterm;	/* low 16 bits are the index into the init
    				   section, high 16 bits are the index into
    				   the term section */
        uint32_t ninit_nterm;      /* low 16 bits are the number of init section
    				  entries, high 16 bits are the number of
    				  term section entries */
    
        uint32_t			/* for this module size of */
            objc_module_info_size;	/*  the (__OBJC,__module_info) section */
        uint64_t			/* for this module address of the start of */
            objc_module_info_addr;	/*  the (__OBJC,__module_info) section */
    };
    

    Fields

    • module_name

      An index to an entry in the string table indicating the name of the module.

    • iextdefsym

      The index into the symbol table of the first defined external symbol provided by this module.

    • nextdefsym

      The number of defined external symbols provided by this module.

    • irefsym

      The index into the external reference table of the first entry provided by this module.

    • nrefsym

      The number of external reference entries provided by this module.

    • ilocalsym

      The index into the symbol table of the first local symbol provided by this module.

    • nlocalsym

      The number of local symbols provided by this module.

    • iextrel

      The index into the external relocation table of the first entry provided by this module.

    • nextrel

      The number of entries in the external relocation table that are provided by this module.

    • iinit_iterm

      Contains both the index into the module initialization section (the low 16 bits) and the index into the module termination section (the high 16 bits) to the pointers for this module.

    • ninit_nterm

      Contains both the number of pointers in the module initialization section (the low 16 bits) and the number of pointers in the module termination section (the high 16 bits) for this module.

    • objc_module_info_addr

      The statically linked address of the start of the data for this module in the __module_info section in the __OBJC segment.

    • objc_module_info_size

      The number of bytes of data for this module that are used in the __module_info section in the __OBJC segment.
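
The iinit_iterm and ninit_nterm fields each pack two 16-bit values into one 32-bit word. A minimal sketch of unpacking them follows; the struct init_term and the function unpack_init_term are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical holder for the four unpacked 16-bit values. */
struct init_term {
    uint16_t iinit;  /* index into the init section (low 16 bits) */
    uint16_t iterm;  /* index into the term section (high 16 bits) */
    uint16_t ninit;  /* number of init pointers (low 16 bits) */
    uint16_t nterm;  /* number of term pointers (high 16 bits) */
};

static struct init_term unpack_init_term(uint32_t iinit_iterm,
                                         uint32_t ninit_nterm)
{
    struct init_term r;
    r.iinit = (uint16_t)(iinit_iterm & 0xffffu);
    r.iterm = (uint16_t)(iinit_iterm >> 16);
    r.ninit = (uint16_t)(ninit_nterm & 0xffffu);
    r.nterm = (uint16_t)(ninit_nterm >> 16);
    return r;
}
```

The same function applies to both dylib_module and dylib_module_64, since these two fields are 32 bits wide in each.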

  • dylib_reference

    Defines the attributes of an external reference table entry for the external reference entries provided by a module in a shared library. Declared in /usr/include/mach-o/loader.h.

    /* 
     * The entries in the reference symbol table are used when loading the module
     * (both by the static and dynamic link editors) and if the module is unloaded
     * or replaced.  Therefore all external symbols (defined and undefined) are
     * listed in the module's reference table.  The flags describe the type of
     * reference that is being made.  The constants for the flags are defined in
     * <mach-o/nlist.h> as they are also used for symbol table entries.
     */
    struct dylib_reference {
        uint32_t isym:24,		/* index into the symbol table */
        		  flags:8;	/* flags to indicate the type of reference */
    };
    

    Fields

    • isym

      An index into the symbol table for the symbol being referenced.

    • flags

      A constant for the type of reference being made. Use the same REFERENCE_FLAG constants as described in the nlist structure description.

Relocation Data Structures

Relocation is the process of moving symbols to a different address. When the static linker moves a symbol (a function or an item of data) to a different address, it needs to change all the references to that symbol to use the new address. The relocation entries in a Mach-O file contain offsets in the file to addresses that need to be relocated when the contents of the file are relocated. The addresses stored in CPU instructions can be absolute or relative. Each relocation entry specifies the exact format of the address. When creating the intermediate object file, the compiler generates one or more relocation entries for every instruction that contains an address. Because relocation to symbols at fixed addresses, and to relative addresses for position independent references, does not occur at runtime, the static linker typically removes some or all the relocation entries when building the final product.

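
As a deliberately simplified picture of what "changing a reference to use the new address" means, the sketch below adjusts one 4-byte absolute address stored inside a section buffer by the distance its target moved. The function slide_address32 is a hypothetical helper, not part of any linker. It assumes the stored address is in host byte order and ignores PC-relative and scattered cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Adds delta to the 32-bit absolute address stored at section[offset].
 * offset plays the role of r_address in an MH_OBJECT file.
 * Returns 0 on success, -1 if the 4-byte item does not fit in the section. */
static int slide_address32(uint8_t *section, size_t size, size_t offset,
                           int32_t delta)
{
    uint32_t addr;

    assert(section != NULL);
    if (offset > size || size - offset < sizeof addr)
        return -1;
    memcpy(&addr, section + offset, sizeof addr);  /* unaligned-safe read */
    addr += (uint32_t)delta;                       /* wraps modulo 2^32 */
    memcpy(section + offset, &addr, sizeof addr);
    return 0;
}
```

A real linker would choose the width from r_length and handle r_pcrel differently; this example only covers the absolute, 4-byte case.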

Note: In the Mac OS X x86-64 environment scattered relocations are not used. Compiler-generated code uses mostly external relocations, in which the r_extern bit is set to 1 and the r_symbolnum field contains the symbol-table index of the target label.

  • relocation_info

    Describes an item in the file that uses an address that needs to be updated when the address is changed. Declared in /usr/include/mach-o/reloc.h.

    /*
     * Format of a relocation entry of a Mach-O file.  Modified from the 4.3BSD
     * format.  The modifications from the original format were changing the value
     * of the r_symbolnum field for "local" (r_extern == 0) relocation entries.
     * This modification is required to support symbols in an arbitrary number of
     * sections not just the three sections (text, data and bss) in a 4.3BSD file.
     * Also the last 4 bits have had the r_type tag added to them.
     */
    struct relocation_info {
       int32_t	r_address;	/* offset in the section to what is being
    				   relocated */
       uint32_t     r_symbolnum:24,	/* symbol index if r_extern == 1 or section
    				   ordinal if r_extern == 0 */
    		r_pcrel:1, 	/* was relocated pc relative already */
    		r_length:2,	/* 0=byte, 1=word, 2=long, 3=quad */
    		r_extern:1,	/* does not include value of sym referenced */
    		r_type:4;	/* if not 0, machine specific relocation type */
    };
    

    Fields

    • r_address

      In MH_OBJECT files, this is an offset from the start of the section to the item containing the address requiring relocation. If the high bit of this field is set (which you can check using the R_SCATTERED bit mask), the relocation_info structure is actually a scattered_relocation_info structure.

      In images used by the dynamic linker, this is an offset from the virtual memory address of the data of the first segment_command that appears in the file (not necessarily the one with the lowest address). For images with the MH_SPLIT_SEGS flag set, this is an offset from the virtual memory address of the data of the first read/write segment_command.

    • r_symbolnum

      Indicates either an index into the symbol table (when the r_extern field is set to 1) or a section number (when the r_extern field is set to 0). As previously mentioned, sections are ordered from 1 to 255 in the order in which they appear in the LC_SEGMENT load commands. This field is set to R_ABS for relocation entries for absolute symbols, which need no relocation.

    • r_pcrel

      Indicates whether the item containing the address to be relocated is part of a CPU instruction that uses PC-relative addressing.

      For addresses contained in PC-relative instructions, the CPU adds the address of the instruction to the address contained in the instruction.

    • r_length

      Indicates the length of the item containing the address to be relocated. The following table lists r_length values and the corresponding address length.

      Value   Address length
      0       1 byte
      1       2 bytes
      2       4 bytes
      3       4 bytes. See description for the PPC_RELOC_BR14 r_type in scattered_relocation_info.

    • r_extern

      Indicates whether the r_symbolnum field is an index into the symbol table (1) or a section number (0).

    • r_type
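
Two of the checks described in the fields above are easy to express in code: testing the high bit of r_address to tell a scattered entry from a plain one, and turning r_length into a byte count. The sketch below is a hedged illustration. The helpers reloc_is_scattered and reloc_item_size are hypothetical, and the R_SCATTERED value is assumed to be the high-bit mask 0x80000000 as I recall it from <mach-o/reloc.h>. Because the meaning of r_length value 3 differs between the table (4 bytes, PPC-specific) and the struct comment (quad), the helper returns 0 for it and leaves the decision to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of the mask named in the r_address description. */
#define R_SCATTERED 0x80000000u

/* Nonzero if the entry is really a scattered_relocation_info. */
static int reloc_is_scattered(int32_t r_address)
{
    return ((uint32_t)r_address & R_SCATTERED) != 0;
}

/* Byte width of the relocated item for r_length 0, 1, 2 (1, 2, 4 bytes).
 * Returns 0 for r_length 3, whose meaning is architecture-specific. */
static unsigned reloc_item_size(unsigned r_length)
{
    assert(r_length <= 3);      /* r_length is a 2-bit field */
    return r_length <= 2 ? 1u << r_length : 0u;
}
```

A parser would typically call reloc_is_scattered on the first 32-bit word of each entry before deciding which of the two structures to interpret it as.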

  • scattered_relocation_info

    Describes an item in the file that uses either a nonzero constant or two addresses in its relocatable expression, and that needs to be updated if the addresses that it uses are changed. This information is needed to reconstruct the addresses that make up the relocatable expression’s value in order to change the addresses independently of each other. Declared in /usr/include/mach-o/reloc.h.

    struct scattered_relocation_info {
    #ifdef __BIG_ENDIAN__
       uint32_t	r_scattered:1,	/* 1=scattered, 0=non-scattered (see above) */
    		r_pcrel:1, 	/* was relocated pc relative already */
    		r_length:2,	/* 0=byte, 1=word, 2=long, 3=quad */
    		r_type:4,	/* if not 0, machine specific relocation type */
       		r_address:24;	/* offset in the section to what is being
    				   relocated */
       int32_t	r_value;	/* the value the item to be relocated is
    				   refering to (without any offset added) */
    #endif /* __BIG_ENDIAN__ */
    #ifdef __LITTLE_ENDIAN__
       uint32_t
       		r_address:24,	/* offset in the section to what is being
    				   relocated */
    		r_type:4,	/* if not 0, machine specific relocation type */
    		r_length:2,	/* 0=byte, 1=word, 2=long, 3=quad */
    		r_pcrel:1, 	/* was relocated pc relative already */
    		r_scattered:1;	/* 1=scattered, 0=non-scattered (see above) */
       int32_t	r_value;	/* the value the item to be relocated is
    				   referring to (without any offset added) */
    #endif /* __LITTLE_ENDIAN__ */
    };
    

    Fields

    • r_scattered

      If this bit is 0, this structure is actually a relocation_info (page 49) structure.

      如果这个位是0,那么这个结构实际上是一个relocation_info(第49页)结构。

    • r_address

      In MH_OBJECT files, this is an offset from the start of the section to the item containing the address requiring relocation. If the high bit of this field is clear (which you can check using the R_SCATTERED bit mask), this structure is actually a relocation_info (page 49) structure.

      In images used by the dynamic linker, this is an offset from the virtual memory address of the data of the first segment_command (page 20) that appears in the file (not necessarily the one with the lowest address). For images with the MH_SPLIT_SEGS flag set, this is an offset from the virtual memory address of data of the first read/write segment_command (page 20).

      Since this field is only 24 bits long, the offset in this field can never be larger than 0x00FFFFFF, thus limiting the size of the relocatable contents of this image to 16 megabytes.

      在MH_OBJECT文件中,这是从节的开始到包含需要重新定位的地址的项的偏移量。如果这个字段的高位是清除的(可以使用r_scatter位掩码检查),那么这个结构实际上是一个relocation_info(第49页)结构。 在动态链接器使用的图像中,这是与文件中出现的第一个segment_command(第20页)的数据的虚拟内存地址的偏移量(不一定是地址最低的那个)。对于设置了MH_SPLIT_SEGS标志的图像,这是第一个读/写段_command(第20页)数据的虚拟内存地址的偏移量。 由于该字段只有24位长,因此该字段中的偏移量永远不能大于0x00FFFFFF,从而将此图像的可重定位内容的大小限制为16 mb。

    • r_pcrel

      Indicates whether the item containing the address to be relocated is part of a CPU instruction that uses PC-relative addressing.

      For addresses contained in PC-relative instructions, the CPU adds the address of the instruction to the address contained in the instruction.

      指示包含要重新定位的地址的项是否属于使用pc相对寻址的CPU指令的一部分。 对于pc相关指令中包含的地址,CPU将指令的地址添加到指令中包含的地址中。

    • r_length

      Indicates the length of the item containing the address to be relocated. A value of 0 indicates a single byte; a value of 1 indicates a 2-byte address, and a value of 2 indicates a 4-byte address.

      指示包含要重新定位的地址的项的长度。值为0表示单个字节;值1表示2字节地址,值2表示4字节地址。

    • r_type

      Indicates the type of relocation to be performed. Possible values for this field are shared between this structure and the relocation_info data structure; see the description of the r_type field in the relocation_info (page 49) data structure for more details.

      指示要执行的重定位类型。这个字段的可能值在这个结构和relocation_info数据结构之间共享;有关详细信息,请参阅relocation_info(第49页)数据结构中r_type字段的描述。

    • r_value

      The address of the relocatable expression for the item in the file that needs to be updated if the address is changed. For relocatable expressions with the difference of two section addresses, the address from which to subtract (in mathematical terms, the minuend) is contained in the first relocation entry and the address to subtract (the subtrahend) is contained in the second relocation entry.

      如果地址更改,则需要更新文件中项的可重定位表达式的地址。对于两个节地址不同的可重定位表达式,要减去的地址(数学术语为被减数)包含在第一个重定位项中,要减去的地址(减数)包含在第二个重定位项中。
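
    The field layout above can be sketched as plain shifts and masks, which avoids depending on how a compiler lays out bit fields. This is a minimal sketch, not Apple's code: the names SKETCH_R_SCATTERED, scattered_view, and decode_scattered are hypothetical, the two words are assumed to be already in host byte order, and an r_length of 3 is left uninterpreted because its meaning depends on the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as R_SCATTERED in <mach-o/reloc.h>: the high bit of the first word. */
#define SKETCH_R_SCATTERED 0x80000000u

/* Hypothetical host-side view of one scattered entry (not an Apple type). */
struct scattered_view {
    uint32_t address;      /* r_address: 24-bit offset */
    uint32_t type;         /* r_type: 4 bits */
    uint32_t length_bytes; /* r_length as a byte count, 0 if not interpreted */
    bool     pcrel;        /* r_pcrel */
    uint32_t value;        /* r_value, kept as an unsigned address */
};

/*
 * word0 and word1 are the two 32-bit words of the entry, assumed to be in
 * host byte order. Returns false when the high bit is clear, meaning the
 * entry is really a relocation_info structure.
 */
static bool decode_scattered(uint32_t word0, uint32_t word1,
                             struct scattered_view *out)
{
    assert(out != NULL);
    if ((word0 & SKETCH_R_SCATTERED) == 0)
        return false;

    uint32_t r_length = (word0 >> 28) & 0x3u;

    out->address = word0 & 0x00FFFFFFu;
    out->type    = (word0 >> 24) & 0xFu;
    out->pcrel   = ((word0 >> 30) & 0x1u) != 0;
    /* 0 -> 1 byte, 1 -> 2 bytes, 2 -> 4 bytes; 3 is left as 0 here. */
    out->length_bytes = (r_length < 3u) ? (1u << r_length) : 0u;
    out->value   = word1;
    return true;
}
```

    Because both endian variants of the structure place r_scattered in the most significant bit of the first word, the same masks apply on big-endian and little-endian hosts once the word has been loaded as an integer.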

    Discussion

    Mach-O relocation data structures support two types of relocatable expressions in machine code and data:

    Mach-O重定位数据结构支持两种类型的可重定位表达式:

    • Symbol address + constant. The most typical form of relocation is referencing a symbol’s address with no constant added. In this case, the value of the constant expression is 0.

      最典型的重定位形式是引用没有添加常量的符号地址。在本例中,常量表达式的值为0。

    • Address of section y - address of section x + constant. This is the section difference form of relocation, which supports position-independent code.

      节差形式的移位。这种形式的重新定位支持与位置无关的代码。
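
    The section difference form can be worked through with a little arithmetic. The sketch below assumes 32-bit addresses and uses hypothetical helper names; it shows why both addresses are needed: the constant can only be recovered, and each section can only be moved independently, when the minuend and subtrahend are both known.

```c
#include <assert.h>
#include <stdint.h>

/* Recover the constant from a stored "y - x + constant" item, given the
 * two addresses recorded in the pair of relocation entries. */
static uint32_t sectdiff_constant(uint32_t stored, uint32_t addr_y,
                                  uint32_t addr_x)
{
    return stored - (addr_y - addr_x);
}

/* New stored value after section y moves by slide_y and section x moves
 * by slide_x. Unsigned arithmetic wraps, so negative slides also work
 * when passed as two's-complement values. */
static uint32_t sectdiff_relocate(uint32_t stored, uint32_t slide_y,
                                  uint32_t slide_x)
{
    return stored + slide_y - slide_x;
}

/* Worked example with made-up addresses. */
void sectdiff_example(void)
{
    uint32_t y = 0x3000u, x = 0x1000u, constant = 8u;
    uint32_t stored = y - x + constant;            /* 0x2008 */

    assert(sectdiff_constant(stored, y, x) == 8u);
    /* y moves up by 0x100, x moves up by 0x40. */
    assert(sectdiff_relocate(stored, 0x100u, 0x40u) == 0x20C8u);
}
```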

Static Archive Libraries

This section describes the file format used for static archive libraries. Mac OS X uses a format derived from the original BSD static archive library format, with a few minor additions. See the discussion for the ranlib data structure for more information.

本节描述用于静态存档库的文件格式。Mac OS X使用的格式是从原始的BSD静态存档库格式派生出来的,只添加了一些小功能。有关更多信息,请参阅ranlib数据结构的讨论。

  • ranlib

    Defines the attributes of a static archive library symbol table entry. Declared in /usr/include/mach-o/ranlib.h.

    /*
     * Structure of the __.SYMDEF table of contents for an archive.
     * __.SYMDEF begins with a uint32_t giving the size in bytes of the ranlib
     * structures which immediately follow, and then continues with a string
     * table consisting of a uint32_t giving the number of bytes of strings which
     * follow and then the strings themselves.  The ran_strx fields index the
     * string table whose first byte is numbered 0.
     */
    struct	ranlib {
        union {
    	uint32_t	ran_strx;	/* string table index of */
    #ifndef __LP64__
    	char		*ran_name;	/* symbol defined by */
    #endif
        } ran_un;
        uint32_t		ran_off;	/* library member at this offset */
    };
    

    Fields

    • ran_strx

      The index number (zero-based) of the string in the string table that follows the array of ranlib data structures.

      在ranlib数据结构数组后面的字符串表中字符串的索引号(从零开始)。

    • ran_name

      The byte offset, from the start of the file, at which the symbol name can be found. This field is not used in Mach-O files.

      从文件开始的字节偏移量,在该偏移量处可以找到符号名。此字段在Mach-O文件中不使用。

    • ran_off

      The byte offset, from the start of the file, at which the header line for the member containing this symbol can be found.

      从文件开始的字节偏移量,在此位置可以找到包含此符号的成员的头行。

    Discussion

    A static archive library begins with the file identifier string !<arch>, followed by a newline character (ASCII value 0x0A). The file identifier string is followed by a series of member files. Each member consists of a fixed-length header line followed by the file data. The header line is 60 bytes long and is divided into six fixed-length fields plus a 2-byte terminator, as shown in this example header line:

    静态存档库以文件标识符字符串!开始,后跟一个换行符(ASCII值0x0A)。文件标识符字符串后面跟着一系列成员文件。每个成员由一个固定长度的头行和文件数据组成。头行为60字节长,分为5个固定长度的字段,如下例头行所示:

              grapple.c       999514211   501   20    100644  167       `
    

    The last 2 bytes of the header line are a grave accent (`) character (ASCII value 0x60) and a newline character. All header fields are defined in ASCII and padded with spaces to the full length of the field. All fields are defined in decimal notation, except for the file mode field, which is defined in octal. These are the descriptions for each field:

    头行最后两个字节是一个重重音(')字符(ASCII值0x60)和一个换行字符。所有头字段都是用ASCII定义的,并用空格填充到字段的完整长度。所有字段都是用十进制记数法定义的,文件模式字段除外,它是用八进制定义的。以下是每个领域的描述:

    • The name field (16 bytes) contains the name of the file. If the name is longer than 16 bytes or contains a space character, the actual name is written directly after the header line and the name field contains the string #1/ followed by the length of that name. To keep the archive entries aligned to 8-byte boundaries, the length that follows the #1/ is rounded up to a multiple of 8 bytes and the name that follows the header is padded with null bytes.

      name字段(16字节)包含文件的名称。如果名称大于16字节或包含空格字符,则实际名称应该直接写在标题行之后,name字段应该包含字符串#1/后跟长度。为了使归档条目对齐到8个字节的边界,#1/后面的名称的长度四舍五入为8个字节,头部后面的名称用空字节填充。

    • The modified date field (12 bytes) is taken from the st_mtime field returned by the stat system call.

      修改后的日期字段(12字节)取自stat系统调用返回的st_time字段。

    • The user ID field (6 bytes) is taken from the st_uid field returned by the stat system call.

      用户ID字段(6字节)取自stat系统调用返回的st_uid字段。

    • The group ID field (6 bytes) is taken from the st_gid field returned by the stat system call.

      组ID字段(6字节)取自stat系统调用返回的st_gid字段。

    • The file mode field (8 bytes) is taken from the st_mode field returned by the stat system call. This field is written in octal notation.

      文件模式字段(8字节)取自stat系统调用返回的st_mode字段。这个字段用八进制符号表示。

    • The file size field (10 bytes) is taken from the st_size field returned by the stat system call.

      文件大小字段(8字节)取自stat系统调用返回的st_size字段。
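
    The header fields described above can be sketched as a small parser. This is only an illustration: the struct and function names are hypothetical, and the field widths follow the layout declared in <ar.h> (16, 12, 6, 6, 8, and 10 bytes, followed by the 2-byte terminator), which adds up to the 60-byte header line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define AR_HEADER_LEN 60

/* Hypothetical parsed form of one member header line. */
struct ar_member_header {
    char          name[17];      /* name field, trailing spaces removed */
    unsigned long mtime;         /* decimal */
    unsigned long uid;           /* decimal */
    unsigned long gid;           /* decimal */
    unsigned long mode;          /* octal in the file */
    unsigned long size;          /* decimal */
    unsigned long ext_name_len;  /* length after "#1/", 0 if not used */
};

/* Copy a space-padded field into a NUL-terminated buffer and convert it. */
static unsigned long field_to_ulong(const char *field, size_t width, int base)
{
    char buf[17];
    assert(width < sizeof buf);
    memcpy(buf, field, width);
    buf[width] = '\0';
    return strtoul(buf, NULL, base);
}

/* Returns 0 on success, -1 if the 2-byte terminator is missing. */
static int parse_ar_header(const char *hdr, struct ar_member_header *out)
{
    assert(hdr != NULL && out != NULL);
    if (hdr[58] != '`' || hdr[59] != '\n')
        return -1;

    memcpy(out->name, hdr, 16);
    out->name[16] = '\0';
    for (int i = 15; i >= 0 && out->name[i] == ' '; i--)
        out->name[i] = '\0';

    out->mtime = field_to_ulong(hdr + 16, 12, 10);
    out->uid   = field_to_ulong(hdr + 28, 6, 10);
    out->gid   = field_to_ulong(hdr + 34, 6, 10);
    out->mode  = field_to_ulong(hdr + 40, 8, 8);
    out->size  = field_to_ulong(hdr + 48, 10, 10);

    /* "#1/<n>" means the real name (n bytes) follows the header line. */
    out->ext_name_len = (strncmp(out->name, "#1/", 3) == 0)
                            ? strtoul(out->name + 3, NULL, 10)
                            : 0;
    return 0;
}
```

    When ext_name_len is nonzero, the name bytes are part of the member data counted by the size field, so a reader would skip that many bytes before the object file contents.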

    The first member in a static archive library is always the symbol table describing the contents of the rest of the member files. This member is always called either __.SYMDEF or __.SYMDEF SORTED (note the two leading underscores and the period). The name used depends on the sort order of the symbol table. The older variant, __.SYMDEF, contains entries in the same order that they appear in the object files. The newer variant, __.SYMDEF SORTED, contains entries in alphabetical order, which allows the static linker to load the symbols faster.

    静态存档库中的第一个成员总是描述其余成员文件内容的符号表。这个成员总是被称为__。SYMDEF或__。SYMDEF排序(注意两个前导下划线和句号)。使用的名称取决于符号表的排序顺序。__年长的变体。symdef -包含与它们在目标文件中出现的顺序相同的条目。__新的变体。按字母顺序包含条目,这允许静态链接器更快地加载符号。

    The __.SYMDEF and __.SYMDEF SORTED archive members contain an array of ranlib data structures preceded by the size of that array in bytes (a 4-byte integer). The array is followed by a string table of null-terminated strings, which is preceded by the size in bytes of the entire string table (again, a 4-byte integer).

    __。已排序的SYMDEF archive成员包含一个ranlib数据结构数组,前面是数组中项数的字节长度(一个长整数,4个字节)。数组后面是一个以null结尾的字符串字符串表,它的前面是整个字符串表的字节长度(同样是一个4字节长的整数)。

    The string table is an array of C strings, each terminated by a null byte. The ranlib declarations can be found in /usr/include/mach-o/ranlib.h.
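
    The member layout just described (array size, ranlib array, string table size, strings) can be walked with a few bounds checks. The following is a sketch under stated assumptions: the names symdef_entry and symdef_lookup are hypothetical, each ranlib entry is taken to be two 32-bit values (8 bytes), and the member contents are assumed to be in host byte order already.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: one symbol and the member that defines it. */
struct symdef_entry {
    const char *name;           /* points into the string table */
    uint32_t    member_offset;  /* ran_off */
};

/* Load a 32-bit value without assuming the pointer is aligned. */
static uint32_t load_u32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* body/body_len: contents of the __.SYMDEF member (after its header line
 * and any extended name). Returns 0 on success, -1 on malformed input or
 * an out-of-range index. */
static int symdef_lookup(const uint8_t *body, size_t body_len,
                         uint32_t index, struct symdef_entry *out)
{
    assert(body != NULL && out != NULL);
    if (body_len < 4)
        return -1;

    uint32_t ranlib_bytes = load_u32(body);
    if (ranlib_bytes % 8 != 0 || ranlib_bytes > body_len - 4)
        return -1;

    size_t str_hdr = 4 + (size_t)ranlib_bytes;  /* offset of string size */
    if (body_len - str_hdr < 4)
        return -1;

    uint32_t str_bytes = load_u32(body + str_hdr);
    const uint8_t *strings = body + str_hdr + 4;
    if (str_bytes > body_len - str_hdr - 4)
        return -1;

    if (index >= ranlib_bytes / 8)
        return -1;

    const uint8_t *entry = body + 4 + (size_t)index * 8;
    uint32_t strx = load_u32(entry);       /* ran_strx */
    uint32_t off  = load_u32(entry + 4);   /* ran_off */

    /* The name must start inside the table and be NUL-terminated in it. */
    if (strx >= str_bytes ||
        memchr(strings + strx, '\0', str_bytes - strx) == NULL)
        return -1;

    out->name = (const char *)(strings + strx);
    out->member_offset = off;
    return 0;
}
```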

    Special Considerations

    Prior to the advent of libtool, a tool called ranlib was used to generate the symbol table. ranlib has since been integrated into libtool. See the man page for libtool for more information.

    在libtool出现之前,使用了一个名为ranlib的工具来生成符号表。ranlib已经集成到libtool中。有关更多信息,请参见libtool的手册页。

Universal Binaries and 32-bit/64-bit PowerPC Binaries

The standard development tools accept as parameters two kinds of binaries:

标准开发工具接受两种二进制文件作为参数:

  • Object files targeted at one architecture. These include Mach-O files, static libraries, and dynamic libraries.

    目标文件针对一个体系结构。其中包括Mach-O文件、静态库和动态库。

  • Binaries targeted at more than one architecture. These binaries contain compiled code and data for one of these system types:

    针对多个体系结构的二进制文件。这些二进制文件包含以下系统类型的编译代码和数据:

    • PowerPC-based (32-bit and 64-bit) Macintosh computers. Binaries that contain code for both 32-bit and 64-bit PowerPC-based Macintosh computers are known as PPC/PPC64 binaries.

      基于powerpc(32位和64位)的Macintosh计算机。包含两者代码的二进制文件 基于32位和64位powerpc的Macintosh计算机被称为PPC/PPC64二进制文件

    • Intel-based and PowerPC-based (32-bit, 64-bit, or both) Macintosh computers. Binaries that contain code for both Intel-based and PowerPC-based Macintosh computers are known as universal binaries.

      基于intel和基于powerpc(32位、64位或两者都有)的Macintosh计算机。二进制文件 包含基于intel和基于powerpc的Macintosh计算机的代码 通用二进制文件

    Each object file is stored as a contiguous run of bytes at an offset from the beginning of the binary. These binaries use a simple archive format to store the object files, with a special header at the beginning of the file that lets the various runtime tools quickly find the code appropriate for the current architecture.

    每个目标文件都存储为一组连续的字节,以二进制文件开头的偏移量为单位。它们使用一种简单的归档格式来存储这两个目标文件,并在文件的开头使用一个特殊的头,以允许各种运行时工具快速找到适合当前体系结构的代码。

A binary that contains code for more than one architecture always begins with a fat_header (page 56) data structure, followed by one fat_arch (page 56) data structure per architecture (two for a two-architecture binary) and the actual data for the architectures contained in the file. All data in these data structures is stored in big-endian byte order.

包含多个体系结构代码的二进制文件总是以fat_header(第56页)数据结构开始,然后是两个fat_arch(第56页)数据结构和文件中包含的体系结构的实际数据。这些数据结构中的所有数据都以大端字节顺序存储。

  • fat_header

    Defines the layout of a binary that contains code for more than one architecture. Declared in the header /usr/include/mach-o/fat.h.

    #define FAT_MAGIC	0xcafebabe
    #define FAT_CIGAM	0xbebafeca	/* NXSwapLong(FAT_MAGIC) */
    
    struct fat_header {
    	uint32_t	magic;		/* FAT_MAGIC or FAT_MAGIC_64 */
    	uint32_t	nfat_arch;	/* number of structs that follow */
    };
    

    Fields

    • magic

      An integer containing the value 0xCAFEBABE in big-endian byte order format. On a big-endian host CPU, this can be validated using the constant FAT_MAGIC; on a little-endian host CPU, it can be validated using the constant FAT_CIGAM.

      包含值0xCAFEBABE的整数,采用大端字节顺序格式。在大端主机CPU上,可以使用常量FAT_MAGIC验证这一点;在little-endian主机CPU上,可以使用常量FAT_CIGAM验证它。

    • nfat_arch

      An integer specifying the number of fat_arch (page 56) data structures that follow. This is the number of architectures contained in this binary.

      一个整数,指定后面的fat_arch(第56页)数据结构的数量。这是这个二进制文件中包含的体系结构的数量。

    Discussion

    The fat_header data structure is placed at the start of a binary that contains code for multiple architectures. Directly following the fat_header data structure is a set of fat_arch (page 56) data structures, one for each architecture included in the binary.

    Regardless of the content this data structure describes, all its fields are stored in big-endian byte order.

    fat_header数据结构位于包含多个体系结构代码的二进制文件的开头。直接跟随fat_header数据结构的是一组fat_arch(第56页)数据结构,每个结构都包含在二进制文件中。 不管这个数据结构描述的内容是什么,它的所有字段都以大端字节顺序存储。
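
    Because the header is always big-endian, one way to avoid the FAT_MAGIC/FAT_CIGAM distinction is to assemble each field from its bytes. The sketch below takes that approach; read_be32 and fat_arch_count are hypothetical helper names, and SKETCH_FAT_MAGIC simply repeats the FAT_MAGIC value shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FAT_MAGIC 0xcafebabeu  /* same value as FAT_MAGIC */

/* Assemble a big-endian 32-bit value; the result is host-order on any CPU. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns nfat_arch, or -1 if the buffer does not start with a fat_header. */
static int64_t fat_arch_count(const uint8_t *file, size_t file_len)
{
    assert(file != NULL);
    if (file_len < 8 || read_be32(file) != SKETCH_FAT_MAGIC)
        return -1;
    return (int64_t)read_be32(file + 4);
}
```

    Reading the raw bytes this way gives the same answer on PowerPC and Intel hosts, at the cost of not reusing the struct definition directly.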

  • fat_arch

    Describes the location within the binary of an object file targeted at a single architecture. Declared in /usr/include/mach-o/fat.h.

    描述以单一架构为目标的目标文件二进制文件中的位置。

    struct fat_arch {
    	cpu_type_t	cputype;	/* cpu specifier (int) */
    	cpu_subtype_t	cpusubtype;	/* machine specifier (int) */
    	uint32_t	offset;		/* file offset to this object file */
    	uint32_t	size;		/* size of this object file */
    	uint32_t	align;		/* alignment as a power of 2 */
    };
    

    Fields

    • cputype

      An enumeration value of type cpu_type_t. Specifies the CPU family.

      类型cpu_type_t的枚举值。指定CPU族

    • cpusubtype

      An enumeration value of type cpu_subtype_t. Specifies the specific member of the CPU family on which this entry may be used or a constant specifying all members.

      类型cpu_subtype_t的枚举值。指定可以使用此条目的CPU家族的特定成员,或指定所有成员的常量。

    • offset

      Offset to the beginning of the data for this CPU.

      偏移到此CPU的数据开头。

    • size

      Size of the data for this CPU.

      这个CPU的数据大小。

    • align

      The power-of-2 alignment for the offset, within the binary, of the object file for the architecture specified in cputype. This is required to ensure that, if this binary is changed, the contents it retains are correctly aligned for virtual memory paging and other uses.

      对cputype中指定的体系结构的目标文件的偏移量进行2对齐的能力 在二进制。这是为了确保,如果这个二进制文件被更改,它保留的内容被正确对齐,以用于虚拟内存分页和其他用途。

    Discussion

    An array of fat_arch data structures appears directly after the fat_header (page 56) data structure of a binary that contains object files for multiple architectures.

    Regardless of the content this data structure describes, all its fields are stored in big-endian byte order.

    fat_arch数据结构数组直接出现在包含多个体系结构目标文件的二进制文件的fat_header(第56页)数据结构之后。 不管这个数据结构描述的内容是什么,它的所有字段都以大端字节顺序存储。
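
    Putting the two structures together, a reader can locate the object file for one architecture by indexing the fat_arch array and checking the entry against the file size. This is a minimal sketch with hypothetical names (fat_arch_host, read_fat_arch); it assumes each fat_arch entry is five 32-bit big-endian fields (20 bytes) and that align is an exponent, so the offset must be a multiple of 2 raised to align.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-order copy of one fat_arch entry. */
struct fat_arch_host {
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t offset;
    uint32_t size;
    uint32_t align;
};

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reads entry `index` of the fat_arch array that follows the fat_header.
 * Returns 0 on success, -1 if the file is not a fat binary, the index is
 * out of range, or the entry is misaligned or points outside the file. */
static int read_fat_arch(const uint8_t *file, size_t file_len,
                         uint32_t index, struct fat_arch_host *out)
{
    assert(file != NULL && out != NULL);
    if (file_len < 8 || be32(file) != 0xcafebabeu)
        return -1;
    if (index >= be32(file + 4))            /* nfat_arch */
        return -1;
    if (index >= (file_len - 8) / 20)       /* entry must fit in the file */
        return -1;

    const uint8_t *p = file + 8 + (size_t)index * 20;
    out->cputype    = (int32_t)be32(p);
    out->cpusubtype = (int32_t)be32(p + 4);
    out->offset     = be32(p + 8);
    out->size       = be32(p + 12);
    out->align      = be32(p + 16);

    if (out->align > 31)
        return -1;
    if (out->offset % ((uint32_t)1 << out->align) != 0)
        return -1;
    if (out->offset > file_len || out->size > file_len - out->offset)
        return -1;
    return 0;
}
```

    On success, the object file for that architecture occupies the bytes from offset to offset + size in the buffer and can be handed to an ordinary Mach-O parser.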