Notes about TableGen and its Data Structures

1 Notes about TableGen

  1. TableGen has two parts: a frontend and a backend.
  2. The frontend parses TableGen source files (.td) and collects the classes and records they define.
  3. A class is an abstract record, declared with the `class` keyword.
  4. A record is a concrete record, defined with the `def` keyword, and is usually instantiated from one or more classes.
  5. The semantics of classes and records are not fixed by the frontend; they are defined by the TableGen backend that consumes them.
  6. After the source files are parsed, the collected classes and records are kept in an instance of RecordKeeper. A particular backend then processes that data. E.g. CodeGenDAGISel uses the data to generate the MatcherTable.
  7. The syntax of TableGen source files is designed to facilitate the definition of classes and records.
  8. The syntax of a class is similar to that of a C++ class, but with instance data members only. In addition, a class can take template arguments as in C++, with value parameters only, and supports multiple inheritance in which inherited field values can be overridden. These features allow the concept described by a class to be parameterised.
  9. A record definition either instantiates a record from classes or defines a plain record without any class. Either way, a record is a collection of data fields.
  10. Each field has a name, a value, and a type for that value.
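The frontend/backend split described in the notes above can be sketched in plain C++. This is only a toy model, not the LLVM code: `MiniRecord`, `MiniKeeper`, `runFrontend` and `emitRegEnum` are hypothetical names, the "frontend" builds records by hand instead of parsing a .td file, and field values are kept as plain strings.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real TableGen types.
struct MiniRecord {
  std::string Name;
  std::vector<std::string> SuperClasses;      // names of the classes it derives from
  std::map<std::string, std::string> Fields;  // field name -> printed value
};

struct MiniKeeper {
  std::map<std::string, MiniRecord> Defs;     // concrete records only
};

// "Frontend": the real one parses .td files. Here the records are built by
// hand, as if they had been parsed from:
//   class Reg<string n> { string AsmName = n; }
//   def R0 : Reg<"r0">;
//   def R1 : Reg<"r1">;
MiniKeeper runFrontend() {
  MiniKeeper K;
  K.Defs["R0"] = MiniRecord{"R0", {"Reg"}, {{"AsmName", "r0"}}};
  K.Defs["R1"] = MiniRecord{"R1", {"Reg"}, {{"AsmName", "r1"}}};
  return K;
}

// "Backend": gives the records a meaning. This one emits a C++ enum with one
// enumerator per def that derives from the class named "Reg".
std::string emitRegEnum(const MiniKeeper &K) {
  std::ostringstream OS;
  OS << "enum Reg {";
  for (const auto &Entry : K.Defs) {
    const std::vector<std::string> &SCs = Entry.second.SuperClasses;
    if (std::find(SCs.begin(), SCs.end(), "Reg") != SCs.end())
      OS << ' ' << Entry.first << ',';
  }
  OS << " };";
  return OS.str();
}
```

With these two hand-built records, `emitRegEnum(runFrontend())` returns the string `enum Reg { R0, R1, };`. The frontend knows nothing about registers; only the backend decides that a def derived from `Reg` becomes an enumerator.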

2 Data Structures

2.1 RecordKeeper

An instance of the RecordKeeper class acts as the container for all the classes and records parsed and collected by TableGen.

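class RecordKeeper {
  using RecordMap = std::map<std::string, std::unique_ptr<Record>, std::less<>>;
  // One map for the abstract classes and one for the concrete records,
  // both keyed by name.
  RecordMap Classes, Defs;
  std::map<std::string, Init *, std::less<>> ExtraGlobals;
  // Counter used to generate names for anonymous records.
  unsigned AnonCounter = 0;
};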
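To see how such a container behaves, here is a minimal standalone sketch (C++17) that mirrors the two maps and the counter. `MiniRecordKeeper`, its member functions, and the reduced `Record` struct are assumptions made for illustration and are not the LLVM API; the `anonymous_N` naming scheme is likewise only an assumed example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Reduced to a name and a flag for this sketch.
struct Record {
  std::string Name;
  bool IsClass;
};

class MiniRecordKeeper {
  using RecordMap = std::map<std::string, std::unique_ptr<Record>, std::less<>>;
  RecordMap Classes, Defs;
  unsigned AnonCounter = 0;

public:
  // Both return nullptr if a record with that name already exists.
  Record *addClass(std::string Name) { return add(Classes, std::move(Name), true); }
  Record *addDef(std::string Name) { return add(Defs, std::move(Name), false); }

  // std::less<> is a transparent comparator, so find() accepts a
  // std::string_view without building a temporary std::string.
  Record *getClass(std::string_view Name) const { return lookup(Classes, Name); }
  Record *getDef(std::string_view Name) const { return lookup(Defs, Name); }

  // Assumed naming scheme for anonymous records.
  std::string newAnonymousName() {
    return "anonymous_" + std::to_string(AnonCounter++);
  }

private:
  static Record *add(RecordMap &M, std::string Name, bool IsClass) {
    auto R = std::make_unique<Record>(Record{Name, IsClass});
    auto Res = M.emplace(std::move(Name), std::move(R));
    return Res.second ? Res.first->second.get() : nullptr;
  }
  static Record *lookup(const RecordMap &M, std::string_view Name) {
    auto It = M.find(Name);
    return It == M.end() ? nullptr : It->second.get();
  }
};
```

The maps own the records through `std::unique_ptr`, so the raw `Record *` handed out by the lookups stay valid for as long as the keeper lives. Classes and defs live in separate maps, so a class and a def may share a name without clashing.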

2.2 Record

Each class or record built by TableGen is represented by an instance of the Record class. The RecordKeeper instance contains one map for the classes and one for the records. The primary data members of a record are the record name, the vector of field names and their values, and the vector of superclasses of the record.

class Record {
  static unsigned LastID;
  Init *Name;
  // Location where record was instantiated, followed by the location of
  // multiclass prototypes used.
  SmallVector<SMLoc, 4> Locs;
  SmallVector<Init *, 0> TemplateArgs;
  SmallVector<RecordVal, 0> Values;
  // All superclasses in the inheritance forest in reverse preorder (yes, it
  // must be a forest; diamond-shaped inheritance is not allowed).
  SmallVector<std::pair<Record *, SMRange>, 0> SuperClasses;
  // Tracks Record instances. Not owned by Record.
  RecordKeeper &TrackedRecords;
  DefInit *TheInit = nullptr;
  // Unique record ID.
  unsigned ID;
  bool IsAnonymous;
  bool IsClass;
};
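The three primary members (name, fields, superclasses) can be modelled in a few lines. The sketch below is a simplification under stated assumptions: `MiniRecord` and `MiniField` are hypothetical names, values are plain strings rather than `Init *`, the superclass order is illustrative only, and inheriting simply copies the parent's fields so that a later assignment overrides them.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct MiniField {
  std::string Name;
  std::string Value;  // the real Record stores an Init * here
};

class MiniRecord {
  std::string Name;
  std::vector<MiniField> Values;
  std::vector<const MiniRecord *> SuperClasses;  // every ancestor, not only direct parents
  unsigned ID;
  bool IsClass;

  // Shared counter that hands out unique IDs, in the spirit of Record::LastID.
  static unsigned nextID() {
    static unsigned LastID = 0;
    return LastID++;
  }

public:
  MiniRecord(std::string N, bool Class)
      : Name(std::move(N)), ID(nextID()), IsClass(Class) {}

  // Inherit from a class: remember it and all of its ancestors, then copy
  // its fields into this record.
  void addSuperClass(const MiniRecord &C) {
    assert(C.IsClass && "only classes can be superclasses");
    for (const MiniRecord *A : C.SuperClasses)
      SuperClasses.push_back(A);
    SuperClasses.push_back(&C);
    for (const MiniField &F : C.Values)
      setValue(F.Name, F.Value);
  }

  // Overwrite the field if it exists, otherwise append it.
  void setValue(const std::string &Field, const std::string &V) {
    for (MiniField &F : Values)
      if (F.Name == Field) {
        F.Value = V;
        return;
      }
    Values.push_back(MiniField{Field, V});
  }

  const MiniField *getValue(const std::string &Field) const {
    for (const MiniField &F : Values)
      if (F.Name == Field)
        return &F;
    return nullptr;
  }

  bool isSubClassOf(const MiniRecord &C) const {
    for (const MiniRecord *A : SuperClasses)
      if (A == &C)
        return true;
    return false;
  }

  unsigned getID() const { return ID; }
  const std::string &getName() const { return Name; }
};
```

Because every record stores its full list of ancestors, `isSubClassOf` is a flat scan rather than a walk up the hierarchy, which is the main reason for keeping all superclasses in one vector.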

2.3 RecordVal (better name: RecordField)

Each field of a record is stored in an instance of the RecordVal class. A RecordVal instance contains the name of the field, stored in an Init instance. It also contains the value of the field, likewise stored in an Init.

class RecordVal {
  Init *Name;
  // The type of the field, with a one-bit flag packed into the same word.
  PointerIntPair<RecTy *, 1, bool> TyAndPrefix;
  Init *Value;
};
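The `PointerIntPair<RecTy *, 1, bool>` member stores a type pointer and a single flag in one machine word. The following is a minimal sketch of that packing idea, not LLVM's implementation: `PtrAndBit` and `FakeRecTy` are hypothetical names, and the code assumes the pointee is at least 2-byte aligned so that the lowest pointer bit is always zero and can hold the flag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for RecTy; alignas guarantees a free low bit.
struct alignas(8) FakeRecTy {
  int Kind;
};

// Stores a pointer and one bool in a single word by reusing the pointer's
// lowest bit, which is zero for any suitably aligned object.
class PtrAndBit {
  std::uintptr_t Word = 0;

public:
  PtrAndBit(FakeRecTy *P, bool B) {
    auto Raw = reinterpret_cast<std::uintptr_t>(P);
    assert((Raw & 1u) == 0 && "pointer must be at least 2-byte aligned");
    Word = Raw | static_cast<std::uintptr_t>(B);
  }

  // Mask the flag bit away to recover the original pointer.
  FakeRecTy *getPointer() const {
    return reinterpret_cast<FakeRecTy *>(Word & ~static_cast<std::uintptr_t>(1));
  }

  bool getInt() const { return (Word & 1u) != 0; }
};

// On mainstream platforms the pair costs no more than a bare pointer.
static_assert(sizeof(PtrAndBit) == sizeof(void *), "flag needs no extra storage");
```

The payoff is size: a RecordVal built this way needs three words instead of three words plus a padded bool, which matters when a backend holds many thousands of fields in memory.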

2.4 RecTy (better name: FieldTy)

The RecTy class is used to represent the types of field values. It is the base class for a series of subclasses, one for each of the available field types.

class RecTy {
  /// Subclass discriminator (for dyn_cast<> et al.)
  enum RecTyKind {
    BitRecTyKind,
    BitsRecTyKind,
    CodeRecTyKind,
    IntRecTyKind,
    StringRecTyKind,
    ListRecTyKind,
    DagRecTyKind,
    RecordRecTyKind
  };
  RecTyKind Kind;
  ListRecTy *ListTy = nullptr;
};

class BitRecTy : public RecTy {
  static BitRecTy Shared;
  BitRecTy() : RecTy(BitRecTyKind) {}
};
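The `Kind` discriminator together with a single shared instance per simple type is the usual LLVM-style RTTI pattern. Below is a standalone sketch of that pattern under a few assumptions: `Ty`, `BitTy` and `IntTy` are hypothetical names, the `isa`/`dyn_cast` templates are hand-rolled stand-ins for `llvm::isa<>` and `llvm::dyn_cast<>`, and the shared instance is a function-local static rather than a static data member.

```cpp
#include <cassert>

class Ty {
public:
  // Subclass discriminator, one enumerator per concrete type.
  enum TyKind { BitKind, IntKind };
  TyKind getKind() const { return Kind; }

protected:
  explicit Ty(TyKind K) : Kind(K) {}

private:
  TyKind Kind;
};

class BitTy : public Ty {
  BitTy() : Ty(BitKind) {}  // private: instances come from get() only

public:
  static bool classof(const Ty *T) { return T->getKind() == BitKind; }
  // One shared instance serves every 'bit' field.
  static BitTy *get() {
    static BitTy Shared;
    return &Shared;
  }
};

class IntTy : public Ty {
  IntTy() : Ty(IntKind) {}

public:
  static bool classof(const Ty *T) { return T->getKind() == IntKind; }
  static IntTy *get() {
    static IntTy Shared;
    return &Shared;
  }
};

// Minimal stand-ins for llvm::isa<> and llvm::dyn_cast<>.
template <typename To> bool isa(const Ty *T) { return To::classof(T); }

template <typename To> To *dyn_cast(Ty *T) {
  return isa<To>(T) ? static_cast<To *>(T) : nullptr;
}
```

Since each simple type has exactly one instance, two fields have the same simple type exactly when their type pointers compare equal, and no virtual functions or compiler RTTI are needed to tell the subclasses apart.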

2.5 Init (better name: FieldVal)

The Init class is used to represent TableGen values. The name derives from "initialization value". The Init class is the base class for a series of subclasses, one for each of the available value types.

class Init {
protected:
  /// Discriminator enum (for isa<>, dyn_cast<>, et al.)
  ///
  /// This enum is laid out by a preorder traversal of the inheritance
  /// hierarchy, and does not contain an entry for abstract classes, as per
  /// the recommendation in docs/HowToSetUpLLVMStyleRTTI.rst.
  ///
  /// We also explicitly include "first" and "last" values for each
  /// interior node of the inheritance tree, to make it easier to read the
  /// corresponding classof().
  ///
  /// We could pack these a bit tighter by not having the IK_FirstXXXInit
  /// and IK_LastXXXInit be their own values, but that would degrade
  /// readability for really no benefit.
  enum InitKind : uint8_t {
    IK_First, // unused; silence a spurious warning
    IK_FirstTypedInit,
    IK_BitInit,
    IK_BitsInit,
    IK_CodeInit,
    IK_DagInit,
    IK_DefInit,
    IK_FieldInit,
    IK_IntInit,
    IK_ListInit,
    IK_FirstOpInit,
    IK_BinOpInit,
    IK_TernOpInit,
    IK_UnOpInit,
    IK_LastOpInit,
    IK_CondOpInit,
    IK_FoldOpInit,
    IK_IsAOpInit,
    IK_StringInit,
    IK_VarInit,
    IK_VarListElementInit,
    IK_VarBitInit,
    IK_VarDefInit,
    IK_LastTypedInit,
    IK_UnsetInit
  };
  const InitKind Kind;
  uint8_t Opc; // Used by UnOpInit, BinOpInit, and TernOpInit
};
/// This is the common super-class of types that have a specific,
/// explicit, type.
class TypedInit : public Init {
  RecTy *Ty;

protected:
  explicit TypedInit(InitKind K, RecTy *T, uint8_t Opc = 0)
      : Init(K, Opc), Ty(T) {}
};

/// '?' - Represents an uninitialized value
class UnsetInit : public Init {
  UnsetInit() : Init(IK_UnsetInit) {}
};

/// "foo" - Represent an initialization by a string value.
class StringInit : public TypedInit {
  StringRef Value;

  explicit StringInit(StringRef V)
      : TypedInit(IK_StringInit, StringRecTy::get()), Value(V) {}
};
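The "first" and "last" enumerators exist so that `classof()` for an interior (abstract) class becomes a simple range check. Here is a reduced standalone sketch of that idea. The `Val` hierarchy and the `VK_` enumerators are hypothetical and much smaller than the real `Init` hierarchy, and `std::string` stands in for `StringRef`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

class Val {
public:
  // Laid out in preorder, with First/Last markers around each interior node.
  enum ValKind : std::uint8_t {
    VK_FirstTyped,
    VK_Int,
    VK_FirstOp,
    VK_BinOp,
    VK_UnOp,
    VK_LastOp,
    VK_String,
    VK_LastTyped,
    VK_Unset
  };
  ValKind getKind() const { return Kind; }

protected:
  explicit Val(ValKind K) : Kind(K) {}

private:
  const ValKind Kind;
};

// Interior node: matches every kind between its First/Last markers.
class TypedVal : public Val {
protected:
  explicit TypedVal(ValKind K) : Val(K) {}

public:
  static bool classof(const Val *V) {
    return V->getKind() >= VK_FirstTyped && V->getKind() <= VK_LastTyped;
  }
};

// Nested interior node with its own, narrower range.
class OpVal : public TypedVal {
protected:
  explicit OpVal(ValKind K) : TypedVal(K) {}

public:
  static bool classof(const Val *V) {
    return V->getKind() >= VK_FirstOp && V->getKind() <= VK_LastOp;
  }
};

// Leaf classes compare against exactly one enumerator.
class BinOpVal : public OpVal {
public:
  BinOpVal() : OpVal(VK_BinOp) {}
  static bool classof(const Val *V) { return V->getKind() == VK_BinOp; }
};

class StringVal : public TypedVal {
  std::string Value;

public:
  explicit StringVal(std::string V) : TypedVal(VK_String), Value(std::move(V)) {}
  static bool classof(const Val *V) { return V->getKind() == VK_String; }
  const std::string &getValue() const { return Value; }
};

// The untyped '?' value sits outside the Typed range.
class UnsetVal : public Val {
public:
  UnsetVal() : Val(VK_Unset) {}
  static bool classof(const Val *V) { return V->getKind() == VK_Unset; }
};
```

Adding a new typed leaf only requires placing its enumerator between the matching markers; the range checks of the interior classes then pick it up without any further change.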