Steps for Creating an LLVM Backend

This article shows the steps for creating an LLVM backend.

1 Register a new LLVM Backend.

Details are in "LLVM Target Registration Minimal Setup". After this setup, LLVM can recognize the newly added backend, but it cannot yet allocate a target machine for it. The next step is therefore to add the target machine and its related parts.
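
To make the difference between "recognized" and "can allocate a target machine" concrete, here is a minimal, self-contained sketch of a registry. It is not the LLVM API: `ToyRegistry`, `ToyTarget`, and `ToyTargetMachine` are hypothetical names that only model the idea. In LLVM itself, registration goes through `TargetRegistry` and helper templates such as `RegisterTarget` and `RegisterTargetMachine`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy model of a target registry. All names here are hypothetical and
// only mimic the idea behind LLVM's registration; this is not the LLVM API.
struct ToyTargetMachine {
  std::string Name;
};

struct ToyTarget {
  std::string Name;
  std::string Description;
  // Stays empty until a target machine factory is registered.
  std::function<std::unique_ptr<ToyTargetMachine>()> CreateTargetMachine;
};

class ToyRegistry {
  std::map<std::string, ToyTarget> Targets;

public:
  // Step 1: make the backend known by name.
  void registerTarget(const std::string &Name, const std::string &Desc) {
    Targets[Name] = ToyTarget{Name, Desc, nullptr};
  }

  // Step 2: attach a factory that builds the target machine.
  void registerTargetMachine(
      const std::string &Name,
      std::function<std::unique_ptr<ToyTargetMachine>()> Ctor) {
    auto It = Targets.find(Name);
    assert(It != Targets.end() && "register the target first");
    It->second.CreateTargetMachine = std::move(Ctor);
  }

  // Returns nullptr for an unknown target name.
  const ToyTarget *lookup(const std::string &Name) const {
    auto It = Targets.find(Name);
    return It == Targets.end() ? nullptr : &It->second;
  }
};
```

After step 1 alone, `lookup("riscv32")` succeeds but `CreateTargetMachine` is still empty, which mirrors the state described above.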

2 Set up Target machine structures.

2.1 classes

  1. RISCVTargetMachine:LLVMTargetMachine:TargetMachine

    /// Primary interface to the complete machine description for the target
    /// machine.  All target-specific information should be accessible through this
    /// interface.
    class TargetMachine {};
    

    /// This class describes a target machine that is implemented with the LLVM
    /// target-independent code generator.
    class LLVMTargetMachine : public TargetMachine {};

  2. RISCVPassConfig:TargetPassConfig:ImmutablePass:ModulePass:Pass

    /// Target-Independent Code Generator Pass Configuration Options.
    ///
    /// This is an ImmutablePass solely for the purpose of exposing CodeGen options
    /// to the internals of other CodeGen passes.
    class TargetPassConfig : public ImmutablePass {};
    
  3. RISCVSubtarget:RISCVGenSubtargetInfo:TargetSubtargetInfo:MCSubtargetInfo

    /// TargetSubtargetInfo - Generic base class for all target subtargets.  All
    /// Target-specific options that control code generation and printing should
    /// be exposed through a TargetSubtargetInfo-derived class.
    ///
    class TargetSubtargetInfo : public MCSubtargetInfo {};
    
  4. RISCVInstrInfo:RISCVGenInstrInfo:TargetInstrInfo:MCInstrInfo

    /// TargetInstrInfo - Interface to description of machine instruction set
    ///
    class TargetInstrInfo : public MCInstrInfo {};
    
  5. RISCVRegisterInfo:RISCVGenRegisterInfo:TargetRegisterInfo:MCRegisterInfo

    /// TargetRegisterInfo base class - We assume that the target defines a static
    /// array of TargetRegisterDesc objects that represent all of the machine
    /// registers that the target has.  As such, we simply have to track a pointer
    /// to this array so that we can turn register number into a register
    /// descriptor.
    ///
    class TargetRegisterInfo : public MCRegisterInfo {};
    
  6. RISCVDAGToDAGISel:SelectionDAGISel:MachineFunctionPass

    /// SelectionDAGISel - This is the common base class used for SelectionDAG-based
    /// pattern-matching instruction selectors.
    class SelectionDAGISel : public MachineFunctionPass {};
    
  7. RISCVTargetLowering:TargetLowering:TargetLoweringBase

    /// This class defines information used to lower LLVM code to legal SelectionDAG
    /// operators that the target instruction selector can accept natively.
    ///
    /// This class also defines callbacks that targets must implement to lower
    /// target-specific constructs to SelectionDAG operators.
    class TargetLowering : public TargetLoweringBase {};
    
  8. RISCVFrameLowering:TargetFrameLowering

    /// Information about stack frame layout on the target.  It holds the direction
    /// of stack growth, the known stack alignment on entry to each function, and
    /// the offset to the locals area.
    ///
    /// The offset to the local area is the offset from the stack pointer on
    /// function entry to the first location where function data (local variables,
    /// spill locations) can be stored.
    class TargetFrameLowering {};
    
  9. RISCVELFTargetObjectFile:TargetLoweringObjectFileELF:TargetLoweringObjectFile:MCObjectFileInfo
  10. RISCVMachineFunctionInfo:MachineFunctionInfo

    /// MachineFunctionInfo - This class can be derived from and used by targets to
    /// hold private target-specific information for each MachineFunction.  Objects
    /// of type are accessed/created with MF::getInfo and destroyed when the
    /// MachineFunction is destroyed.
    struct MachineFunctionInfo {};
    

2.2 Target Description files (.td)

  1. RISCV Scheduling Definitions
  2. RISCV Instruction Info
  3. RISCV Register files
  4. RISCV Calling Conventions

    // The RISC-V calling convention is handled with custom code in
    // RISCVISelLowering.cpp (CC_RISCV).
    def CSR_ILP32_LP64
    : CalleeSavedRegs<(add X1, X3, X4, X8, X9, (sequence "X%u", 18, 27))>;
    

    def CSR_ILP32F_LP64F
    : CalleeSavedRegs<(add CSR_ILP32_LP64,
                       F8_F, F9_F, (sequence "F%u_F", 18, 27))>;

    def CSR_ILP32D_LP64D
    : CalleeSavedRegs<(add CSR_ILP32_LP64,
                       F8_D, F9_D, (sequence "F%u_D", 18, 27))>;

  5. RISCV Processors
  6. RISCV Target
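
The `(sequence "X%u", 18, 27)` form in the calling convention definitions is TableGen shorthand for an inclusive run of register names. The following sketch expands `CSR_ILP32_LP64` by hand to show which registers it names. The helper functions `expandSequence` and `calleeSavedILP32` are hypothetical and exist only for this illustration.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Expands a TableGen-style (sequence "Fmt", From, To) into register names.
// The helper name is hypothetical; the range is inclusive on both ends.
std::vector<std::string> expandSequence(const char *Fmt, unsigned From,
                                        unsigned To) {
  std::vector<std::string> Names;
  for (unsigned I = From; I <= To; ++I) {
    char Buf[32];
    std::snprintf(Buf, sizeof(Buf), Fmt, I);
    Names.emplace_back(Buf);
  }
  return Names;
}

// Builds the register list that CSR_ILP32_LP64 describes:
// (add X1, X3, X4, X8, X9, (sequence "X%u", 18, 27))
std::vector<std::string> calleeSavedILP32() {
  std::vector<std::string> Regs = {"X1", "X3", "X4", "X8", "X9"};
  for (const std::string &R : expandSequence("X%u", 18, 27))
    Regs.push_back(R);
  return Regs;
}
```

The result has 15 entries, X1, X3, X4, X8, X9 and X18 through X27. In RISC-V ABI names these are ra, gp, tp, s0, s1 and s2 through s11.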

2.3 Afterward:

After this setup, LLVM can lower IR to machine instructions (MachineInstr), but it cannot yet emit them as assembly or object files.
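
The classes from section 2.1 hang together roughly as follows: the target machine hands out a subtarget, and the subtarget owns the instruction, register, frame, and lowering information. The sketch below models only that ownership with hypothetical `Toy*` types. The accessor names echo the LLVM ones, but the 32-register count, the 16-byte stack alignment, and the pass names are assumptions chosen for illustration, not values taken from the article.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-ins for the per-target classes; every type here is hypothetical.
struct ToyRegisterInfo {
  unsigned NumRegs = 32; // assumption: 32 integer registers
};

struct ToyInstrInfo {
  ToyRegisterInfo RI; // the instruction info owns the register info
  const ToyRegisterInfo &getRegisterInfo() const { return RI; }
};

struct ToyFrameLowering {
  unsigned StackAlign = 16; // assumption: 16-byte stack alignment
};

struct ToyTargetLowering {
  // An ADDI-style immediate must fit in 12 signed bits.
  bool isLegalAddImmediate(int64_t Imm) const {
    return Imm >= -2048 && Imm <= 2047;
  }
};

// The subtarget bundles everything that depends on CPU and features.
class ToySubtarget {
  ToyInstrInfo InstrInfo;
  ToyFrameLowering FrameLowering;
  ToyTargetLowering TLInfo;

public:
  const ToyInstrInfo *getInstrInfo() const { return &InstrInfo; }
  const ToyRegisterInfo *getRegisterInfo() const {
    return &InstrInfo.getRegisterInfo();
  }
  const ToyFrameLowering *getFrameLowering() const { return &FrameLowering; }
  const ToyTargetLowering *getTargetLowering() const { return &TLInfo; }
};

// The target machine is the entry point; the pass list stands in for
// what a pass config would assemble.
class ToyTargetMachine {
  ToySubtarget Subtarget;

public:
  const ToySubtarget *getSubtargetImpl() const { return &Subtarget; }
  std::vector<std::string> buildCodeGenPipeline() const {
    return {"instruction-selection", "register-allocation",
            "prologue-epilogue-insertion"};
  }
};
```

Code that needs target details always starts from the target machine and walks down, for example `TM.getSubtargetImpl()->getTargetLowering()->isLegalAddImmediate(2047)`.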

3 Set up MC Layer

3.1 MC Layer: MC Project

  1. The major components.
    1. the instruction printer (MCInstPrinter API): MCInst -> assembly line in a .s file
    2. the instruction encoder (MCCodeEmitter API): MCInst -> binary code with relocations, written to a raw_ostream
    3. the instruction parser (MCTargetAsmParser:MCAsmParserExtension API): assembly line in a .s file -> MCInst
    4. the instruction decoder (MCDisassembler API): binary code -> MCInst
    5. the assembly parser: .s file -> MCStreamer
    6. the assembler backend (MCAsmStreamer:MCStreamer, MCAssembler, MCAsmBackend API)
    7. the compiler integration
  2. MCStreamer, MCAssembler -> MCSection -> MCFragment -> MCInst
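
The printer and encoder directions in the list above can be illustrated with one RISC-V instruction. The sketch below is a toy, not the MC API: `ToyInst`, `printInst`, and `encodeInst` are hypothetical names, only `addi` is handled, and no relocations are produced. The bit layout is the standard RISC-V I-type format.

```cpp
#include <cstdint>
#include <string>

// Toy stand-in for an MCInst holding one I-type instruction:
// addi rd, rs1, imm. All names in this block are hypothetical.
struct ToyInst {
  unsigned Rd;
  unsigned Rs1;
  int32_t Imm; // expected to fit in 12 signed bits
};

// Printer direction: instruction -> one assembly line.
std::string printInst(const ToyInst &I) {
  return "addi x" + std::to_string(I.Rd) + ", x" + std::to_string(I.Rs1) +
         ", " + std::to_string(I.Imm);
}

// Encoder direction: instruction -> 32-bit instruction word.
// I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode.
uint32_t encodeInst(const ToyInst &I) {
  const uint32_t Opcode = 0x13; // OP-IMM
  const uint32_t Funct3 = 0x0;  // ADDI
  const uint32_t Imm12 = static_cast<uint32_t>(I.Imm) & 0xFFF;
  return (Imm12 << 20) | (I.Rs1 << 15) | (Funct3 << 12) | (I.Rd << 7) |
         Opcode;
}
```

For `ToyInst{10, 10, 1}`, the printer yields `addi x10, x10, 1` and the encoder yields `0x00150513`. Both functions read the same in-memory instruction, which is the reason the MC layer places a single instruction representation between the printer and the encoder.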

3.2 classes

  1. RISCVMCExpr:MCTargetExpr:MCExpr

    /// This is an extension point for target-specific MCExpr subclasses to
    /// implement.
    ///
    /// NOTE: All subclasses are required to have trivial destructors because
    /// MCExprs are bump pointer allocated and not destructed.
    class MCTargetExpr : public MCExpr {};
    
  2. RISCVMCAsmInfo:MCAsmInfoELF:MCAsmInfo

    /// This class is intended to be used as a base class for asm
    /// properties and features specific to the target.
    class MCAsmInfo {};
    
  3. RISCVAsmBackend:MCAsmBackend

    /// Generic interface to target specific assembler backends.
    class MCAsmBackend {};
    
  4. RISCVTargetELFStreamer:RISCVTargetStreamer:MCTargetStreamer

    /// Target specific streamer interface. This is used so that targets can
    /// implement support for target specific assembly directives.
    ///
    /// If target foo wants to use this, it should implement 3 classes:
    /// * FooTargetStreamer : public MCTargetStreamer
    /// * FooTargetAsmStreamer : public FooTargetStreamer
    /// * FooTargetELFStreamer : public FooTargetStreamer
    ///
    /// FooTargetStreamer should have a pure virtual method for each directive. For
    /// example, for a ".bar symbol_name" directive, it should have
    /// virtual emitBar(const MCSymbol &Symbol) = 0;
    ///
    /// The FooTargetAsmStreamer and FooTargetELFStreamer classes implement the
    /// method. The assembly streamer just prints ".bar symbol_name". The object
    /// streamer does whatever is needed to implement .bar in the object file.
    ///
    /// In the assembly printer and parser the target streamer can be used by
    /// calling getTargetStreamer and casting it to FooTargetStreamer:
    ///
    /// MCTargetStreamer &TS = OutStreamer.getTargetStreamer();
    /// FooTargetStreamer &ATS = static_cast<FooTargetStreamer &>(TS);
    ///
    /// The base classes FooTargetAsmStreamer and FooTargetELFStreamer should
    /// *never* be treated differently. Callers should always talk to a
    /// FooTargetStreamer.
    class MCTargetStreamer {};
    
  5. RISCVInstPrinter:MCInstPrinter

    /// This is an instance of a target assembly language printer that
    /// converts an MCInst to valid target assembly syntax.
    class MCInstPrinter {};
    
  6. RISCVTargetStreamer:MCTargetStreamer
  7. RISCVELFObjectWriter:MCELFObjectTargetWriter:MCObjectTargetWriter

    /// Base class for classes that define behaviour that is specific to both the
    /// target and the object format.
    class MCObjectTargetWriter {};
    
  8. RISCVMCCodeEmitter:MCCodeEmitter

    /// MCCodeEmitter - Generic instruction encoding interface.
    class MCCodeEmitter {};
    

3.3 Afterward:

At this point, the compiler can compile an IR module into RISCV assembly or an ELF object file.
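
The MCTargetStreamer comment in section 3.2 describes a three-class pattern: an abstract target streamer, plus one implementation for assembly output and one for object output. A minimal sketch of that pattern follows, using the `Foo` names and the `.bar` directive from the comment. The function `emitModule` and the vector used as object file state are assumptions made for this example.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Abstract interface: one pure virtual method per target directive.
class FooTargetStreamer {
public:
  virtual ~FooTargetStreamer() = default;
  virtual void emitBar(const std::string &Symbol) = 0;
};

// Assembly flavour: just prints the directive text.
class FooTargetAsmStreamer : public FooTargetStreamer {
  std::ostream &OS;

public:
  explicit FooTargetAsmStreamer(std::ostream &OS) : OS(OS) {}
  void emitBar(const std::string &Symbol) override {
    OS << "\t.bar\t" << Symbol << "\n";
  }
};

// Object flavour: records what the directive means for the object file.
// The vector is a hypothetical stand-in for real object file state.
class FooTargetELFStreamer : public FooTargetStreamer {
  std::vector<std::string> &BarSymbols;

public:
  explicit FooTargetELFStreamer(std::vector<std::string> &Syms)
      : BarSymbols(Syms) {}
  void emitBar(const std::string &Symbol) override {
    BarSymbols.push_back(Symbol);
  }
};

// Callers only ever talk to the abstract FooTargetStreamer.
void emitModule(FooTargetStreamer &TS) { TS.emitBar("my_symbol"); }
```

Because `emitModule` takes the base class, the same code produces a `.bar my_symbol` line in the assembly case and an object file side effect in the ELF case.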

4 Set up Asm Parser:

4.1 classes:

  1. RISCVAsmParser:MCTargetAsmParser:MCAsmParserExtension

    /// MCTargetAsmParser - Generic interface to target specific assembly parsers.
    class MCTargetAsmParser : public MCAsmParserExtension {};
    
  2. RISCVOperand:MCParsedAsmOperand

    /// MCParsedAsmOperand - This abstract class represents a source-level assembly
    /// instruction operand.  It should be subclassed by target-specific code.  This
    /// base class is used by target-independent clients and is the interface
    /// between parsing an asm instruction and recognizing it.
    class MCParsedAsmOperand {};
    

4.2 Afterward:

You can now parse RISCV assembly into the MC layer and later turn it into an object file.
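
As a rough picture of what the parser direction involves, the sketch below turns one line of assembly text into an in-memory instruction and rejects operands that are out of range. It handles only `addi` with `xN` register names. `ToyInst` and `parseAddi` are hypothetical names, and a real asm parser works on a token stream and a generated matcher rather than on `sscanf`.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Toy stand-in for a parsed instruction; the names are hypothetical.
struct ToyInst {
  unsigned Rd;
  unsigned Rs1;
  int Imm;
};

// Parses "addi xN, xM, imm". Returns std::nullopt on any syntax or range
// error. Simplification: whitespace between the mnemonic and the first
// operand is optional here.
std::optional<ToyInst> parseAddi(const std::string &Line) {
  unsigned Rd = 0, Rs1 = 0;
  int Imm = 0;
  int Consumed = 0;
  // %n records how many characters were read, so trailing junk is caught.
  if (std::sscanf(Line.c_str(), " addi x%u , x%u , %d %n", &Rd, &Rs1, &Imm,
                  &Consumed) != 3)
    return std::nullopt;
  if (static_cast<std::string::size_type>(Consumed) != Line.size())
    return std::nullopt;
  // Operand validation: 32 registers, 12-bit signed immediate.
  if (Rd > 31 || Rs1 > 31 || Imm < -2048 || Imm > 2047)
    return std::nullopt;
  return ToyInst{Rd, Rs1, Imm};
}
```

Range checks like the ones at the end correspond loosely to the job of the operand class (RISCVOperand in the list above), which knows whether a parsed value is valid for a given operand kind.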

5 Set up Disassembler:

5.1 classes:

  1. RISCVDisassembler:MCDisassembler

    /// Superclass for all disassemblers. Consumes a memory region and provides an
    /// array of assembly instructions.
    class MCDisassembler {};
    

5.2 Afterward:

You can now disassemble an object file into the MC layer and later turn it into an assembly file.
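
The decoder direction is the inverse of encoding: take a 32-bit word, check the fixed opcode bits, and extract the operand fields. The sketch below does this for the RISC-V `addi` instruction only. `ToyInst` and `decodeAddi` are hypothetical names, and a real disassembler is driven by decoder tables generated from the .td files rather than by hand-written field extraction.

```cpp
#include <cstdint>
#include <optional>

// Toy stand-in for a decoded instruction; the names are hypothetical.
struct ToyInst {
  unsigned Rd;
  unsigned Rs1;
  int32_t Imm;
};

// Decodes a 32-bit word as "addi rd, rs1, imm".
// Returns std::nullopt if the word is not an ADDI encoding.
std::optional<ToyInst> decodeAddi(uint32_t Word) {
  const uint32_t Opcode = Word & 0x7F;
  const uint32_t Funct3 = (Word >> 12) & 0x7;
  if (Opcode != 0x13 || Funct3 != 0x0)
    return std::nullopt;

  ToyInst I;
  I.Rd = (Word >> 7) & 0x1F;
  I.Rs1 = (Word >> 15) & 0x1F;
  // Extract imm[11:0] and sign-extend it from 12 bits.
  int32_t Imm = static_cast<int32_t>((Word >> 20) & 0xFFF);
  if (Imm & 0x800)
    Imm -= 0x1000;
  I.Imm = Imm;
  return I;
}
```

Decoding `0x00150513` gives rd = 10, rs1 = 10, imm = 1, which a printer would render as `addi x10, x10, 1`.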