LLVM SDNode

During DAG-based instruction selection in LLVM, each IR instruction is turned into one or more DAG nodes (SDNode, short for SelectionDAG node). Each node holds its operation code (opcode for short), its operands (or uses), and its results (or defs, values), together with the types of those operands and results.

Here is an example of the generated DAG image for an add IR instruction.

You can see that the add SDNode has two references to its operands, and that the value it defines is referred to by another SDNode. Its opcode is add nsw, its node id is 13, and the type of its value is i32. A more complete example can be seen in this post, LLVM Selection DAG Image Generation.

Here is the reduced SDNode class.

/* llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h */
/// Represents one node in the SelectionDAG.
class SDNode {
private:
  /// The operation that this node performs.
  int16_t NodeType;
private:
  /// Unique id per SDNode in the DAG.
  int NodeId = -1;
  /// The values that are used by this operation.
  SDUse *OperandList = nullptr;
  /// The types of the values this node defines.  SDNode's may
  /// define multiple values simultaneously.
  const EVT *ValueList;
  /// List of uses for this SDNode.
  SDUse *UseList = nullptr;
  /// The number of entries in the Operand/Value list.
  unsigned short NumOperands = 0;
  unsigned short NumValues;
  // The ordering of the SDNodes. It roughly corresponds to the ordering of the
  // original LLVM instructions.
  // This is used for turning off scheduling, because we'll forgo
  // the normal scheduling algorithms and output the instructions according to
  // this ordering.
  unsigned IROrder;
};

For some specific IR operations, there are corresponding derived SDNode classes that carry more information. For example, LoadSDNode derives from LSBaseSDNode, which derives from MemSDNode, which derives from SDNode.

MemSDNode adds the memory value type and the machine memory reference information. LoadSDNode in turn sets some values itself, for instance its opcode is always ISD::LOAD, because a more specific SDNode already determines them.
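The layering described above can be sketched in plain C++. This is a simplified model, not LLVM's actual code: the Toy* types, their fields, and the isLoad helper are hypothetical stand-ins that only mirror the idea that each level adds information and the most specific class pins down the opcode.

```cpp
#include <cassert>

// Hypothetical stand-ins for opcodes, extension kinds and addressing modes.
enum ToyOpcode { OP_ADD, OP_LOAD, OP_STORE };
enum ToyLoadExt { NON_EXTLOAD, EXTLOAD, SEXTLOAD, ZEXTLOAD };
enum ToyIndexMode { UNINDEXED, PRE_INC, POST_INC };

// Base node: only knows which operation it performs.
struct ToyNode {
  int opcode;
  explicit ToyNode(int opc) : opcode(opc) {}
  virtual ~ToyNode() = default;
};

// Memory node: adds the in-memory size and a (much reduced) memory reference.
struct ToyMemNode : ToyNode {
  unsigned memBits;
  unsigned alignBytes;
  ToyMemNode(int opc, unsigned bits, unsigned align)
      : ToyNode(opc), memBits(bits), alignBytes(align) {}
};

// Shared by loads and stores: adds the addressing mode.
struct ToyLSBaseNode : ToyMemNode {
  ToyIndexMode mode;
  ToyLSBaseNode(int opc, ToyIndexMode m, unsigned bits, unsigned align)
      : ToyMemNode(opc, bits, align), mode(m) {}
};

// Load node: the caller no longer chooses the opcode, it is fixed here.
struct ToyLoadNode : ToyLSBaseNode {
  ToyLoadExt ext;
  ToyLoadNode(ToyIndexMode m, ToyLoadExt e, unsigned bits, unsigned align)
      : ToyLSBaseNode(OP_LOAD, m, bits, align), ext(e) {}
};

// Because the opcode is fixed, it is enough to identify the class.
inline bool isLoad(const ToyNode &n) { return n.opcode == OP_LOAD; }
```

Constructing ToyLoadNode(UNINDEXED, SEXTLOAD, 16, 2) yields a node whose opcode is OP_LOAD without the caller ever passing it, which is the sense in which a more specific node "determines some values inside".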

Moreover, each operand is an SDUse, which the LLVM source describes as follows:

Represents a use of a SDNode. This class holds an SDValue, which records the SDNode being used and the result number, a pointer to the SDNode using the value, and Next and Prev pointers, which link together all the uses of an SDNode.

An SDValue is described as follows:

Unlike LLVM values, Selection DAG nodes may return multiple values as the result of a computation. Many nodes return multiple values, from loads (which define a token and a return value) to ADDC (which returns a result and a carry value), to calls (which may return an arbitrary number of values). As such, each use of a SelectionDAG computation must indicate the node that computes it as well as which return value to use from that node. This pair of information is represented with the SDValue value type, which contains the node defining the value we are using and the result number indicating which return value of that node we are using.
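To make the (node, result number) pair concrete, here is a minimal sketch in plain C++. ToyValue, ToyNode, addOperand and typeOf are hypothetical names for illustration; the real SDValue and SDUse are richer, and LLVM links uses in an intrusive list rather than a vector.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct ToyNode;

// The pair described above: which node, and which of its results.
struct ToyValue {
  ToyNode *node;
  unsigned resNo;
};

struct ToyNode {
  std::string opcode;
  std::vector<ToyValue> operands;       // values this node uses
  std::vector<std::string> valueTypes;  // one entry per result it defines
  std::vector<ToyNode *> users;         // nodes that use any of its results

  ToyNode(std::string opc, std::vector<std::string> vts)
      : opcode(std::move(opc)), valueTypes(std::move(vts)) {}

  // Record an operand and register this node as a user of the producer.
  void addOperand(ToyValue v) {
    assert(v.node != nullptr && v.resNo < v.node->valueTypes.size());
    operands.push_back(v);
    v.node->users.push_back(this);
  }
};

// The type of a value is the type of the selected result of its node.
inline const std::string &typeOf(ToyValue v) {
  return v.node->valueTypes[v.resNo];
}
```

With a load node created as ToyNode("load", {"i32", "ch"}), a consumer of the loaded data refers to result 0 while a node that must be ordered after the load refers to result 1, the chain. Both uses name the same node and differ only in the result number.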

Here is how a load SDNode is built for a SelectionDAG.

void SelectionDAGBuilder::visitLoad(const LoadInst &I) {
  ...
  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
  const Value *SV = I.getOperand(0);
  ...
  SDValue Ptr = getValue(SV);
  Type *Ty = I.getType();
  Align Alignment = I.getAlign();
  ...  
  SmallVector<EVT, 4> ValueVTs, MemVTs;
  SmallVector<uint64_t, 4> Offsets;
  ComputeValueVTs(TLI, DAG.getDataLayout(), Ty, ValueVTs, &MemVTs, &Offsets);
  unsigned NumValues = ValueVTs.size();
  if (NumValues == 0)
    return;
  bool isVolatile = I.isVolatile();
  ...
  SDLoc dl = getCurSDLoc();
  ...
  // An aggregate load cannot wrap around the address space, so offsets to its
  // parts don't wrap either.
  SDNodeFlags Flags;
  Flags.setNoUnsignedWrap(true);
  SmallVector<SDValue, 4> Values(NumValues);
  SmallVector<SDValue, 4> Chains(std::min(MaxParallelChains, NumValues));
  EVT PtrVT = Ptr.getValueType();
  MachineMemOperand::Flags MMOFlags
    = TLI.getLoadMemOperandFlags(I, DAG.getDataLayout());
  unsigned ChainI = 0;
  for (unsigned i = 0; i != NumValues; ++i, ++ChainI) {
    ...
    SDValue A = DAG.getNode(ISD::ADD, dl,
                            PtrVT, Ptr,
                            DAG.getConstant(Offsets[i], dl, PtrVT),
                            Flags);
    SDValue L = DAG.getLoad(MemVTs[i], dl, Root, A,
                            MachinePointerInfo(SV, Offsets[i]), Alignment,
                            MMOFlags, AAInfo, Ranges);
    ...
    Values[i] = L;
  }
  ...
}

That is, for each IR instruction in a BasicBlock, the SelectionDAG builder builds one or more SDNodes. Together, those nodes form the SelectionDAG for that BasicBlock.
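The per-block walk can be sketched with a toy builder. Everything here (ToyInst, ToyDAG, buildBlock, the "livein" opcode) is hypothetical and greatly simplified. It only illustrates the idea of visiting each instruction in order and looking up operands in a map from IR values to already built nodes, which plays a role similar to getValue in the excerpt above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A toy IR instruction: result name, opcode, operand names.
struct ToyInst {
  std::string result;
  std::string opcode;
  std::vector<std::string> operands;
};

struct ToyNode {
  int id;
  std::string opcode;
  std::vector<int> operandIds;
};

struct ToyDAG {
  std::vector<ToyNode> nodes;
  std::map<std::string, int> nodeMap;  // IR value name -> defining node id

  int newNode(const std::string &opc, std::vector<int> ops) {
    int id = static_cast<int>(nodes.size());
    nodes.push_back({id, opc, std::move(ops)});
    return id;
  }

  // Return the node for an IR value; unknown names are treated as
  // values that live into the block and get a placeholder node.
  int getValue(const std::string &name) {
    auto it = nodeMap.find(name);
    if (it != nodeMap.end())
      return it->second;
    int id = newNode("livein", {});
    nodeMap[name] = id;
    return id;
  }

  // Build one node for one instruction and remember what it defines.
  void visit(const ToyInst &inst) {
    std::vector<int> ops;
    for (const std::string &name : inst.operands)
      ops.push_back(getValue(name));
    int id = newNode(inst.opcode, std::move(ops));
    if (!inst.result.empty())
      nodeMap[inst.result] = id;
  }
};

// Visit every instruction of a block in order.
inline ToyDAG buildBlock(const std::vector<ToyInst> &block) {
  ToyDAG dag;
  for (const ToyInst &inst : block)
    dag.visit(inst);
  return dag;
}
```

Because later instructions find their operands through the map, the edges of the DAG fall out of the walk itself. No separate linking pass is needed in this sketch.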

Later, once the SelectionDAG has been built from the IR, instruction selection relies on the target description files. They describe the properties and constraints of each target-independent SDNode, and the patterns that map those SDNodes to target-specific instruction SDNodes. So to understand how a particular SDNode takes part in pattern matching, you can read its target description.

Here is the target description for the load SDNode.

// Selection DAG Node Type constraint.
class SDTypeConstraint<int opnum> {
  int OperandNum = opnum;
}
class SDTCisPtrTy<int OpNum> : SDTypeConstraint<OpNum>;

// Selection DAG Node Type Profile.
class SDTypeProfile<int numresults, int numoperands,
                    list<SDTypeConstraint> constraints> {
  int NumResults = numresults;
  int NumOperands = numoperands;
  list<SDTypeConstraint> Constraints = constraints;
}
def SDTLoad : SDTypeProfile<1, 1, [         // load
  SDTCisPtrTy<1>
]>;
// Selection DAG Node Property.
class SDNodeProperty;
// Selection DAG Pattern Operations
class SDPatternOperator {
  list<SDNodeProperty> Properties = [];
}
//===----------------------------------------------------------------------===//
// Selection DAG Node Properties.
//
// Note: These are hard coded into tblgen.
//
def SDNPCommutative : SDNodeProperty;   // X op Y == Y op X
def SDNPAssociative : SDNodeProperty;   // (X op Y) op Z == X op (Y op Z)
def SDNPHasChain    : SDNodeProperty;   // R/W chain operand and result
def SDNPOutGlue     : SDNodeProperty;   // Write a flag result
def SDNPInGlue      : SDNodeProperty;   // Read a flag operand
def SDNPOptInGlue   : SDNodeProperty;   // Optionally read a flag operand
def SDNPMayStore    : SDNodeProperty;   // May write to memory, sets 'mayStore'.
def SDNPMayLoad     : SDNodeProperty;   // May read memory, sets 'mayLoad'.
def SDNPSideEffect  : SDNodeProperty;   // Sets 'HasUnmodelledSideEffects'.
def SDNPMemOperand  : SDNodeProperty;   // Touches memory, has assoc MemOperand
def SDNPVariadic    : SDNodeProperty;   // Node has variable arguments.
def SDNPWantRoot    : SDNodeProperty;   // ComplexPattern gets the root of match
def SDNPWantParent  : SDNodeProperty;   // ComplexPattern gets the parent
// Selection DAG Node.
class SDNode<string opcode, SDTypeProfile typeprof,
             list<SDNodeProperty> props = [], string sdclass = "SDNode">
             : SDPatternOperator {
  string Opcode  = opcode;
  string SDClass = sdclass;
  let Properties = props;
  SDTypeProfile TypeProfile = typeprof;
}
// define the ld Selection DAG Node.
def ld         : SDNode<"ISD::LOAD"       , SDTLoad,
    [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
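One detail of the SDTLoad profile above is easy to misread: constraint indices count results first and operands after them, so with one result SDTCisPtrTy<1> refers to the first operand (the address), not to the result. The chain is not part of the profile; it comes from the SDNPHasChain property. The following plain C++ sketch models that numbering. ToyTypeProfile, ToyNodeTypes, typeAt and satisfies are hypothetical names, and only the "is pointer type" constraint is modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Reduced mirror of a type profile: counts plus pointer-typed indices.
struct ToyTypeProfile {
  int numResults;
  int numOperands;
  std::vector<int> ptrTypedIndices;  // plays the role of SDTCisPtrTy<N>
};

// The types of one concrete node, chain excluded.
struct ToyNodeTypes {
  std::vector<std::string> resultTypes;
  std::vector<std::string> operandTypes;
};

// Indices run over the results first, then over the operands.
inline const std::string &typeAt(const ToyNodeTypes &n, int index) {
  int numRes = static_cast<int>(n.resultTypes.size());
  int numOps = static_cast<int>(n.operandTypes.size());
  assert(index >= 0 && index < numRes + numOps);
  return index < numRes ? n.resultTypes[index]
                        : n.operandTypes[index - numRes];
}

// Check the counts and every pointer-type constraint of the profile.
inline bool satisfies(const ToyNodeTypes &n, const ToyTypeProfile &p,
                      const std::string &ptrTy) {
  if (static_cast<int>(n.resultTypes.size()) != p.numResults)
    return false;
  if (static_cast<int>(n.operandTypes.size()) != p.numOperands)
    return false;
  for (int idx : p.ptrTypedIndices)
    if (typeAt(n, idx) != ptrTy)
      return false;
  return true;
}
```

A profile built as ToyTypeProfile{1, 1, {1}} then accepts a node with result type i32 and operand type i64 on a target whose pointer type is i64, and rejects the same node if its operand were f32.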

The ld SDNode is usually not used directly in instruction patterns, though. Instead it is wrapped in a PatFrag, which then forms a small part of a pattern. Here is an example that uses ld through PatFrags to define an instruction.

// Selection DAG Pattern Fragments.
// Pattern fragments are reusable chunks of dags that match specific things.
// They can take arguments and have C++ predicates that control whether they
// match.  They are intended to make the patterns for common instructions more
// compact and readable.
/// PatFrags - Represents a set of pattern fragments.  Each single fragment
/// can match something on the DAG, from a single node to multiple nested other
/// fragments.   The whole set of fragments matches if any of the single
/// fragments match.  This allows e.g. matching and "add with overflow" and
/// a regular "add" with the same fragment set.
class PatFrags<dag ops, list<dag> frags, code pred = [{}],
               SDNodeXForm xform = NOOP_SDNodeXForm> : SDPatternOperator {
  dag Operands = ops;
  list<dag> Fragments = frags;
  code PredicateCode = pred;
  code GISelPredicateCode = [{}];
  code ImmediateCode = [{}];
  SDNodeXForm OperandTransform = xform;
  // When this is set, the PredicateCode may refer to a constant Operands
  // vector which contains the captured nodes of the DAG, in the order listed
  // by the Operands field above.
  // This is useful when Fragments involves associative / commutative
  // operators: a single piece of code can easily refer to all operands even
  // when re-associated / commuted variants of the fragment are matched.
  bit PredicateCodeUsesOperands = 0;
  // Define a few pre-packaged predicates. This helps GlobalISel import
  // existing rules from SelectionDAG for many common cases.
  // They will be tested prior to the code in pred and must not be used in
  // ImmLeaf and its subclasses.
  // Is the desired pre-packaged predicate for a load?
  bit IsLoad = ?;
  // Is the desired pre-packaged predicate for a store?
  bit IsStore = ?;
  // Is the desired pre-packaged predicate for an atomic?
  bit IsAtomic = ?;
  // cast<LoadSDNode>(N)->getAddressingMode() == ISD::UNINDEXED;
  // cast<StoreSDNode>(N)->getAddressingMode() == ISD::UNINDEXED;
  bit IsUnindexed = ?;
  // cast<LoadSDNode>(N)->getExtensionType() != ISD::NON_EXTLOAD
  bit IsNonExtLoad = ?;
  // cast<LoadSDNode>(N)->getExtensionType() == ISD::EXTLOAD;
  bit IsAnyExtLoad = ?;
  // cast<LoadSDNode>(N)->getExtensionType() == ISD::SEXTLOAD;
  bit IsSignExtLoad = ?;
  // cast<LoadSDNode>(N)->getExtensionType() == ISD::ZEXTLOAD;
  bit IsZeroExtLoad = ?;
  // !cast<StoreSDNode>(N)->isTruncatingStore();
  // cast<StoreSDNode>(N)->isTruncatingStore();
  bit IsTruncStore = ?;
  // cast<MemSDNode>(N)->getAddressSpace() ==
  // If this empty, accept any address space.
  list<int> AddressSpaces = ?;
  // cast<MemSDNode>(N)->getAlignment() >=
  // If this is empty, accept any alignment.
  int MinAlignment = ?;
  // cast<AtomicSDNode>(N)->getOrdering() == AtomicOrdering::Monotonic
  bit IsAtomicOrderingMonotonic = ?;
  // cast<AtomicSDNode>(N)->getOrdering() == AtomicOrdering::Acquire
  bit IsAtomicOrderingAcquire = ?;
  // cast<AtomicSDNode>(N)->getOrdering() == AtomicOrdering::Release
  bit IsAtomicOrderingRelease = ?;
  // cast<AtomicSDNode>(N)->getOrdering() == AtomicOrdering::AcquireRelease
  bit IsAtomicOrderingAcquireRelease = ?;
  // cast<AtomicSDNode>(N)->getOrdering() == AtomicOrdering::SequentiallyConsistent
  bit IsAtomicOrderingSequentiallyConsistent = ?;
  // isAcquireOrStronger(cast<AtomicSDNode>(N)->getOrdering())
  // !isAcquireOrStronger(cast<AtomicSDNode>(N)->getOrdering())
  bit IsAtomicOrderingAcquireOrStronger = ?;
  // isReleaseOrStronger(cast<AtomicSDNode>(N)->getOrdering())
  // !isReleaseOrStronger(cast<AtomicSDNode>(N)->getOrdering())
  bit IsAtomicOrderingReleaseOrStronger = ?;
  // cast<LoadSDNode>(N)->getMemoryVT() == MVT::<VT>;
  // cast<StoreSDNode>(N)->getMemoryVT() == MVT::<VT>;
  ValueType MemoryVT = ?;
  // cast<LoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::<VT>;
  // cast<StoreSDNode>(N)->getMemoryVT().getScalarType() == MVT::<VT>;
  ValueType ScalarMemoryVT = ?;
}
// PatFrag - A version of PatFrags matching only a single fragment.
class PatFrag<dag ops, dag frag, code pred = [{}],
              SDNodeXForm xform = NOOP_SDNodeXForm>
  : PatFrags<ops, [frag], pred, xform>;
// load fragments.
def unindexedload : PatFrag<(ops node:$ptr), (ld node:$ptr)> {
  let IsLoad = 1;
  let IsUnindexed = 1;
}
def load : PatFrag<(ops node:$ptr), (unindexedload node:$ptr)> {
  let IsLoad = 1;
  let IsNonExtLoad = 1;
}
class AlignedLoad<PatFrag Node> :
  PatFrag<(ops node:$ptr), (Node node:$ptr), [{
  LoadSDNode *LD = cast<LoadSDNode>(N);
  return LD->getMemoryVT().getSizeInBits()/16 <= LD->getAlignment();
}]>;
def load_a          : AlignedLoad<load>;
class FMem<bits<4> op, dag outs, dag ins, string asmstr, list<dag> pattern,
          InstrItinClass itin>: FC<op, outs, ins, asmstr, pattern, itin> {
  bits<9> addr;
  let imm9 = addr;
  let DecoderMethod = "DecodeMem";
}
class LoadM<bits<4> op, string instr_asm, PatFrag OpNode, RegisterClass RC,
            Operand MemOpnd, bit Pseudo>:
  FMem<op, (outs RC:$ra), (ins MemOpnd:$addr),
     !strconcat(instr_asm, "\t$ra, $addr"),
     [(set RC:$ra, (OpNode addr:$addr))], IILoad> {
  let isPseudo = Pseudo;
}
def LD: LoadM<0b0010, "ld", load_a, GPROut, mem, Pseudo>;

That is, a PatFrag can be used in a pattern just like an SDNode: both derive from SDPatternOperator, and each single fragment can match something on the DAG, from a single node to multiple nested fragments.
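The nesting ld, unindexedload, load, load_a can be read as predicates layered on top of one another. Below is a minimal sketch of that reading in plain C++. It is not how TableGen or the generated matcher is implemented: ToyLoad, Matcher, patFrag and makeLoadA are hypothetical, and the last predicate simply reuses the arithmetic from the AlignedLoad fragment above.

```cpp
#include <cassert>
#include <functional>

// The few facts about a load node that the fragments above inspect.
struct ToyLoad {
  bool indexed;
  bool extending;
  unsigned memBits;
  unsigned alignBytes;
};

// A pattern operator is anything that can say whether it matches a node.
using Matcher = std::function<bool(const ToyLoad &)>;

// Wrap an inner operator with one extra predicate, like a PatFrag
// that refines the operator it is built from.
inline Matcher patFrag(Matcher inner, Matcher pred) {
  return [inner, pred](const ToyLoad &n) { return inner(n) && pred(n); };
}

// Rebuild the chain ld -> unindexedload -> load -> load_a.
inline Matcher makeLoadA() {
  Matcher ld = [](const ToyLoad &) { return true; };  // any load node
  Matcher unindexedload =
      patFrag(ld, [](const ToyLoad &n) { return !n.indexed; });
  Matcher load =
      patFrag(unindexedload, [](const ToyLoad &n) { return !n.extending; });
  // Same comparison as the AlignedLoad predicate quoted above.
  return patFrag(load, [](const ToyLoad &n) {
    return n.memBits / 16 <= n.alignBytes;
  });
}
```

Since the result of patFrag has the same type as its inner operator, a fragment can stand wherever a plain node operator could. That mirrors how load_a is passed as the OpNode of LoadM in the definition of LD.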