LLVM DataLayout String Syntax in BNF


A DataLayout string describes the data layout of a target machine. It is supplied when a specific target machine, e.g. SparcTargetMachine, is constructed.

SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
  : DataLayout("E-p:32:32-f128:128:128"),
    Subtarget(M, FS), InstrInfo(Subtarget),
    FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
}

In this example, the string "E-p:32:32-f128:128:128" is the DataLayout description that specifies the data layout of SparcTargetMachine.

What does that mean? Here is a quote from the LLVM documentation.

Hyphens separate portions of the DataLayout description string.

  1. An upper-case “E” in the string indicates a big-endian target data model. A lower-case “e” indicates little-endian.

  2. “p:” is followed by pointer information: size, ABI alignment, and preferred alignment. If only two figures follow “p:”, then the first value is pointer size, and the second value is both ABI and preferred alignment.

  3. Then a letter for numeric type alignment: “i”, “f”, “v”, or “a” (corresponding to integer, floating point, vector, or aggregate). “i”, “v”, or “a” are followed by ABI alignment and preferred alignment. “f” is followed by three values: the first indicates the size of a long double, then ABI alignment, and then ABI preferred alignment.
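The rule for “p:” quoted above can be made concrete with a small sketch. It is not LLVM code: PointerSpec, parsePointerSpec, splitAll and parseUnsigned are names made up for this illustration, and the sketch only handles the three figures the quote mentions (any further field is rejected rather than interpreted).

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type for this sketch; not an LLVM class.
struct PointerSpec {
  unsigned addressSpace; // digits between 'p' and the first ':' (0 if absent)
  unsigned sizeInBits;   // first figure after "p:"
  unsigned abiAlign;     // second figure
  unsigned prefAlign;    // third figure, or abiAlign when it is omitted
};

// Split `s` at every occurrence of `sep`.
static std::vector<std::string> splitAll(const std::string &s, char sep) {
  std::vector<std::string> parts;
  std::string part;
  std::istringstream in(s);
  while (std::getline(in, part, sep))
    parts.push_back(part);
  return parts;
}

// Parse a short run of decimal digits; returns false on anything else.
static bool parseUnsigned(const std::string &s, unsigned &out) {
  if (s.empty() || s.size() > 9)
    return false;
  unsigned value = 0;
  for (char c : s) {
    if (c < '0' || c > '9')
      return false;
    value = value * 10 + static_cast<unsigned>(c - '0');
  }
  out = value;
  return true;
}

// Decode one pointer item such as "p:32:32" or "p1:64:64:128".
// Returns false if the item does not have the expected shape.
bool parsePointerSpec(const std::string &item, PointerSpec &spec) {
  std::vector<std::string> f = splitAll(item, ':');
  if (f.size() < 3 || f.size() > 4 || f[0].empty() || f[0][0] != 'p')
    return false;
  PointerSpec result{0, 0, 0, 0};
  std::string as = f[0].substr(1);
  if (!as.empty() && !parseUnsigned(as, result.addressSpace))
    return false;
  if (!parseUnsigned(f[1], result.sizeInBits) ||
      !parseUnsigned(f[2], result.abiAlign))
    return false;
  // Only two figures: the second one is both ABI and preferred alignment.
  result.prefAlign = result.abiAlign;
  if (f.size() == 4 && !parseUnsigned(f[3], result.prefAlign))
    return false;
  spec = result;
  return true;
}
```

For the "p:32:32" item of the Sparc string, this gives a pointer size of 32, an ABI alignment of 32, and a preferred alignment that falls back to 32.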

But what if I want the full specification of the DataLayout description string? Unfortunately, I could not find a document that covers it, so I did it the hard way: reading the code that parses the DataLayout description string.

Here is the code:

/* /wtsc/llvm-project/llvm/lib/IR/DataLayout.cpp */
...
void DataLayout::parseSpecifier(StringRef Desc) {
  StringRepresentation = std::string(Desc);
  while (!Desc.empty()) {
    // Split at '-'.
    std::pair<StringRef, StringRef> Split = split(Desc, '-');
    Desc = Split.second;
    // Split at ':'.
    Split = split(Split.first, ':');
    // Aliases used below.
    StringRef &Tok  = Split.first;  // Current token.
    StringRef &Rest = Split.second; // The rest of the string.
    ...
    char Specifier = Tok.front();
    Tok = Tok.substr(1);
    switch (Specifier) {
    case 's':
      // Deprecated, but ignoring here to preserve loading older textual llvm
      // ASM file
      break;
    ...
    }}}
...
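To see what the two split steps in that loop produce, here is a standalone sketch using only the standard library. splitOnce, Item and tokenize are hypothetical names for this illustration. splitOnce only approximates the split() helper used above: it does no error reporting, and the sketch does not interpret any of the fields.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for split(): cut `s` at the first `sep` and return
// {text before, text after}. The second half is empty if `sep` is absent.
static std::pair<std::string, std::string> splitOnce(const std::string &s,
                                                     char sep) {
  std::string::size_type pos = s.find(sep);
  if (pos == std::string::npos)
    return {s, std::string()};
  return {s.substr(0, pos), s.substr(pos + 1)};
}

// One hyphen-separated item of the description string.
struct Item {
  char specifier;   // first character of the item, e.g. 'E', 'p', 'f'
  std::string tok;  // remainder of the first colon-separated field
  std::string rest; // everything after the first ':'
};

// Walk the description the way the loop above does: peel off one item at
// each '-', then separate its first field from the rest at ':'.
std::vector<Item> tokenize(std::string desc) {
  std::vector<Item> items;
  while (!desc.empty()) {
    std::pair<std::string, std::string> split = splitOnce(desc, '-');
    desc = split.second;
    split = splitOnce(split.first, ':');
    if (split.first.empty())
      continue; // this sketch skips empty items instead of reporting them
    items.push_back(
        Item{split.first[0], split.first.substr(1), split.second});
  }
  return items;
}
```

Calling tokenize("E-p:32:32-f128:128:128") yields three items: 'E' with nothing else, 'p' with an empty tok and rest "32:32", and 'f' with tok "128" and rest "128:128".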

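The full parseSpecifier function is quite long, but in the end it just parses the input DataLayout description according to the following syntax, given in simplified BNF (Backus–Naur form). The grammar is simplified in that it lists every field of an item; the parser itself lets some of them be omitted, for example the pointer address space and the preferred alignment.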

DescriptionString ::= DescList
DescList          ::= DescItem DescListRest
DescListRest      ::= '-' DescItem DescListRest | ''
DescItem          ::= NonIntegralAddressSpaces
                  |   ASMFile
                  |   BigEndian
                  |   LittleEndian
                  |   Pointer
                  |   Integer
                  |   Vector
                  |   FloatingPoint
                  |   Aggregate
                  |   NativeIntegerTypes
                  |   StackNaturalAlignment
                  |   FunctionPointer
                  |   FunctionAddressSpace
                  |   DefaultStackAllocaAddressSpace
                  |   Mangling

NonIntegralAddressSpaces ::= 'ni' ':' AddressSpace AddressSpaceRest
AddressSpaceRest         ::= ':' AddressSpace AddressSpaceRest | ''

ASMFile        ::= 's'
BigEndian      ::= 'E'
LittleEndian   ::= 'e'
Pointer        ::= 'p' AddressSpace ':' Size ':' ABIAlignment ':' PreferredAlignment ':' IndexSize
Integer        ::= 'i' Size ':' ABIAlignment ':' PreferredAlignment
Vector         ::= 'v' Size ':' ABIAlignment ':' PreferredAlignment
FloatingPoint  ::= 'f' Size ':' ABIAlignment ':' PreferredAlignment
Aggregate      ::= 'a' Size ':' ABIAlignment ':' PreferredAlignment

NativeIntegerTypes      ::= 'n' LegalIntWidths
LegalIntWidths          ::= Size LegalIntWidthsRest
LegalIntWidthsRest      ::= ':' Size LegalIntWidthsRest | ''
StackNaturalAlignment   ::= 'S' Size
FunctionPointer         ::= 'F' FunctionPtrAlignType Size
FunctionPtrAlignType    ::= Independent | MultipleOfFunctionAlign
Independent             ::= 'i'
MultipleOfFunctionAlign ::= 'n'

FunctionAddressSpace           ::= 'P' Size
DefaultStackAllocaAddressSpace ::= 'A' Size

Mangling           ::= 'm' ':' ManglingType
ManglingType       ::= MM_ELF | MM_MachO | MM_Mips
                   |   MM_WinCOFF | MM_WinCOFFX86 | MM_XCOFF
MM_ELF             ::= 'e'
MM_MachO           ::= 'o'
MM_Mips            ::= 'm'
MM_WinCOFF         ::= 'w'
MM_WinCOFFX86      ::= 'x'
MM_XCOFF           ::= 'a'
AddressSpace       ::= integer
Size               ::= integer
ABIAlignment       ::= integer
PreferredAlignment ::= integer
IndexSize          ::= integer
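As a quick way to read the grammar, the sketch below maps one hyphen-separated item to the DescItem alternative it would fall under, looking only at its leading characters. classifyItem is a hypothetical name for this illustration, the fields after the specifier are not validated, and the "ni" spelling for non-integral address spaces is taken from the LLVM language reference rather than from the grammar above.

```cpp
#include <string>

// Return the name of the DescItem alternative that `item` starts like,
// or "Unknown". Only the specifier is inspected, not the fields after it.
std::string classifyItem(const std::string &item) {
  if (item.empty())
    return "Unknown";
  // The first colon-separated field decides between "ni" and plain 'n'.
  std::string first = item.substr(0, item.find(':'));
  if (first == "ni")
    return "NonIntegralAddressSpaces";
  switch (item[0]) {
  case 's': return "ASMFile";
  case 'E': return "BigEndian";
  case 'e': return "LittleEndian";
  case 'p': return "Pointer";
  case 'i': return "Integer";
  case 'v': return "Vector";
  case 'f': return "FloatingPoint";
  case 'a': return "Aggregate";
  case 'n': return "NativeIntegerTypes";
  case 'S': return "StackNaturalAlignment";
  case 'F': return "FunctionPointer";
  case 'P': return "FunctionAddressSpace";
  case 'A': return "DefaultStackAllocaAddressSpace";
  case 'm': return "Mangling";
  default:  return "Unknown";
  }
}
```

Applied to the three items of the Sparc string, this returns BigEndian for "E", Pointer for "p:32:32", and FloatingPoint for "f128:128:128".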

That is the whole story of the DataLayout description string specification.