LLVM Backend Structure

There is an excellent article explaining the structure of an LLVM backend: Tutorial: Creating an LLVM Backend for the Cpu0 Architecture. The following is my understanding and summary of that article.

An LLVM backend translates LLVM IR into a target Instruction Set Architecture (ISA), e.g. ARM or MIPS. So what does an ISA consist of? The answer shapes how an LLVM backend is constructed.

First, a set of native instructions that can be executed by a CPU implementing that ISA.

Second, a set of registers that those instructions operate on.

Third, the memory model, which describes how memory is accessed on that ISA.

Fourth, the calling convention, which governs how one routine calls another as a subroutine: how arguments are passed and how the return value is delivered.
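To make the calling convention point concrete, here is a minimal sketch of one common scheme: the first few arguments go into registers and the rest go onto the stack. The `ArgLocation` struct, the `assignArguments` function, and the register names are all hypothetical, made up for illustration. This is not the Cpu0 or LCC convention, and it is not LLVM's API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Where one argument lives at the call site (hypothetical model).
struct ArgLocation {
    bool inRegister;         // true: passed in a register, false: on the stack
    std::string reg;         // register name when inRegister is true
    std::size_t stackOffset; // byte offset of the stack slot otherwise
};

// Assign locations for `argCount` word-sized arguments. The first arguments
// take the given argument registers in order; the remaining ones are placed
// in consecutive stack slots of `wordSize` bytes each.
std::vector<ArgLocation> assignArguments(std::size_t argCount,
                                         const std::vector<std::string>& argRegs,
                                         std::size_t wordSize) {
    std::vector<ArgLocation> locations;
    std::size_t nextStackOffset = 0;
    for (std::size_t i = 0; i < argCount; ++i) {
        if (i < argRegs.size()) {
            locations.push_back({true, argRegs[i], 0});
        } else {
            locations.push_back({false, "", nextStackOffset});
            nextStackOffset += wordSize;
        }
    }
    return locations;
}
```

For example, calling `assignArguments(3, {"A0", "A1"}, 4)` places the first two arguments in the made-up registers `A0` and `A1` and the third at stack offset 0. A real backend has to make the same kind of decision, only with more cases (argument types, alignment, return values).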

Fifth, the scheduling of instructions. A CPU has several functional units, so instructions may execute in parallel or sequentially, and each takes a different number of CPU cycles. Based on this scheduling information, the compiler reorders instructions to gain performance.
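Why reordering helps can be seen with a toy cost model. The sketch below assumes a single-issue, in-order pipeline in which an instruction cannot issue before its producers have finished. The `Instr` struct, the `cyclesFor` function, and the latencies are assumptions chosen for illustration, not figures for any real CPU and not LLVM's scheduler.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One instruction in a toy program (hypothetical model).
struct Instr {
    int latency;                   // cycles until the result is available
    std::vector<std::size_t> deps; // program indices of the producers it needs
};

// Simulate a single-issue, in-order pipeline. At most one instruction issues
// per cycle, and an instruction stalls until all of its producers are done.
// `order` lists program indices in issue order and must place every producer
// before its consumers. Returns the cycle at which the last result is ready.
int cyclesFor(const std::vector<Instr>& program,
              const std::vector<std::size_t>& order) {
    std::vector<int> done(program.size(), 0); // cycle each result is ready
    int nextIssue = 0;
    int finish = 0;
    for (std::size_t idx : order) {
        int issue = nextIssue;
        for (std::size_t d : program[idx].deps)
            issue = std::max(issue, done[d]);
        done[idx] = issue + program[idx].latency;
        nextIssue = issue + 1;
        finish = std::max(finish, done[idx]);
    }
    return finish;
}
```

Take a program with two 3-cycle loads (indices 0 and 2), each followed by a 1-cycle add that uses it (indices 1 and 3). Issued in source order `{0, 1, 2, 3}`, the model reports 8 cycles, because each add waits for its load. Issued as `{0, 2, 1, 3}`, the second load fills the wait and the model reports 5 cycles. The latency information is what lets a compiler find the second order.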

To sum up:

  1. Instruction Set

  2. Register Set

  3. Memory Model

  4. Calling Convention

  5. Instruction Itineraries
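One way to picture these five pieces is as plain data that the rest of the compiler queries. The sketch below is a hypothetical, heavily simplified model; the `TargetDesc`, `RegisterDesc`, and `InstructionDesc` types and their fields are invented for illustration and are not how LLVM itself represents a target.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one register.
struct RegisterDesc {
    std::string name;
    bool calleeSaved; // must a called routine preserve it?
};

// Hypothetical description of one instruction.
struct InstructionDesc {
    std::string mnemonic;
    int numOperands;
    int latency; // stand-in for itinerary data (5)
};

// Everything this toy backend knows about its target.
struct TargetDesc {
    std::vector<InstructionDesc> instructions;  // 1. instruction set
    std::vector<RegisterDesc> registers;        // 2. register set
    int addressableUnitBits;                    // 3. stand-in for the memory model
    std::vector<std::string> argumentRegisters; // 4. part of the calling convention
};

// Example query: the names of the registers a callee has to preserve.
std::vector<std::string> calleeSavedRegisters(const TargetDesc& target) {
    std::vector<std::string> names;
    for (const RegisterDesc& reg : target.registers) {
        if (reg.calleeSaved)
            names.push_back(reg.name);
    }
    return names;
}
```

The design point is the split between description and use: the data says what the target looks like, and small query functions such as `calleeSavedRegisters` answer the questions each compile stage asks.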

So we need to give the LLVM backend this information through target description files (*.td), and supply corresponding classes that expose it, with custom behavior where needed, at the different compilation stages. The following images are based on the above article, with some refinement.

Here are the target description files and their relations.

And the following image shows the related classes:

From these two images, we can see the minimal work needed to build a simple LLVM backend.

In the next post, I will use these two images as a guide to write a simplified LLVM backend targeting the LCC (Low Cost Computer) ISA, which is the computer model used in the book C and C++ under the Hood.

LCC is a modification and extension of the LC-3 model of Yale N. Patt and Sanjay J. Patel in their book Introduction to Computing Systems. --C and C++ under the Hood