WTSC 4: Compiler Driver

The Swift compiler does more than transform source code into executables or libraries. It needs to interact with users and do what they ask. The compiler driver is the component that manages the compilation process according to the users' demands.

In this post, we are going to see how the compiler driver does its work.

We know that the Swift compiler driver (swift-frontend) is an ordinary executable written in C++ and built by cmake. Therefore, somewhere in the cmake scripts (CMakeLists.txt) there should be a command that builds an executable named swift-frontend. So we can use git-grep to search the cmake scripts and find out which file contains this kind of command.

git -C /wtsc/swift grep --line-number "(swift-frontend"

This prints a bunch of results. Picking the most probable one (it should be a CMakeLists.txt file, not part of a dependency, not a symbolic link, and so on), we arrive at this file.

# /wtsc/swift/tools/driver/CMakeLists.txt

add_swift_host_tool(swift-frontend
  driver.cpp
  autolink_extract_main.cpp
  modulewrap_main.cpp
  swift_indent_main.cpp
  swift_symbolgraph_extract_main.cpp
  SWIFT_COMPONENT compiler
)
...
add_swift_tool_symlink(swift swift-frontend compiler)
add_swift_tool_symlink(swiftc swift-frontend compiler)
add_swift_tool_symlink(swift-symbolgraph-extract swift-frontend compiler)
add_swift_tool_symlink(swift-autolink-extract swift-frontend autolink-driver)
add_swift_tool_symlink(swift-indent swift-frontend editor-integration)

Looking at that file, we can guess that swift-frontend is built from the listed .cpp files by the add_swift_host_tool function, which eventually calls cmake's builtin add_executable function to make an executable. From the last five commands in the snippet above, we can see that swift, swiftc, etc. in the bin directory are symbolic links to swift-frontend, and that they are created here.

To confirm that add_swift_host_tool eventually calls add_executable, we can use git-grep again until we reach the add_executable call.

git -C /wtsc/swift grep --line-number "function(add_swift_host_tool"
cmake/modules/AddSwift.cmake:function(add_swift_host_tool executable)


# /wtsc/swift/cmake/modules/AddSwift.cmake
...
function(add_swift_host_tool executable)
  ...
  add_executable(${executable} ${ASHT_UNPARSED_ARGUMENTS})
  ...
endfunction()

So we know that /wtsc/swift/tools/driver/CMakeLists.txt contains a call to add_swift_host_tool, which in turn calls add_executable to create the swift-frontend executable.

Now we know roughly how swift-frontend is built. But before we dive deeper into how the logic is implemented in the source code, let us check what the compiler driver can do. To see this, we start from our helloworld program.

mkdir /wtsc/helloworld
cd /wtsc/helloworld
echo 'print("Hello world!\n")' > Helloworld.swift
PATH=/wtsc/usr/bin:$PATH
swiftc Helloworld.swift -o helloworld
./helloworld
Hello world!

Here we use swiftc to compile Helloworld.swift into an executable named helloworld. To see the series of jobs performed during this compilation, we can add the -driver-print-jobs argument to swiftc.

swiftc -driver-print-jobs Helloworld.swift -o helloworld
/wtsc/usr/bin/swift-frontend \
    -frontend -c \
    -primary-file Helloworld.swift \
    -target x86_64-unknown-linux-gnu \
    -disable-objc-interop -color-diagnostics \
    -module-name helloworld \
    -o /tmp/Helloworld-bc2603.o
/wtsc/usr/bin/swift-autolink-extract /tmp/Helloworld-bc2603.o \
    -o /tmp/Helloworld-a15069.autolink
/wtsc/usr/bin/clang \
    -target x86_64-unknown-linux-gnu \
    -fuse-ld=gold -pie \
    -Xlinker -rpath \
    -Xlinker /wtsc/usr/lib/swift/linux \
    /wtsc/usr/lib/swift/linux/x86_64/swiftrt.o \
    /tmp/Helloworld-bc2603.o \
    @/tmp/Helloworld-a15069.autolink \
    -L /wtsc/usr/lib/swift/linux \
    -lswiftCore \
    --target=x86_64-unknown-linux-gnu \
    -o helloworld

Because the driver deletes its temporary files in /tmp, we need to pass the -save-temps argument to swiftc if we want to see the intermediate results. The following code in libswiftDriver is responsible for this.

/* /wtsc/swift/lib/Driver/Compilation.cpp */
...
int Compilation::performJobs(std::unique_ptr<TaskQueue> &&TQ) {
 ...
  if (!SaveTemps) {
    for (const auto &pathPair : TempFilePaths) {
      if (!abnormalExit || pathPair.getValue() == PreserveOnSignal::No)
        (void)llvm::sys::fs::remove(pathPair.getKey());
    }
  }
...
}
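The cleanup rule in that snippet can be restated as a small standalone sketch. The names below (a plain `PreserveOnSignal` enum, `tempFilesToRemove`, and a `std::map` in place of the driver's own container) are simplifications assumed for illustration, not the driver's real types.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for the driver's PreserveOnSignal marker (assumption).
enum class PreserveOnSignal { No, Yes };

// Returns the temporary files that would be deleted once the jobs finish.
// It mirrors the condition quoted above: with -save-temps nothing is removed;
// otherwise a file survives only when the exit was abnormal and the file is
// marked to be preserved on signal.
std::vector<std::string>
tempFilesToRemove(bool saveTemps, bool abnormalExit,
                  const std::map<std::string, PreserveOnSignal> &tempFiles) {
  std::vector<std::string> toRemove;
  if (saveTemps)
    return toRemove;
  for (const auto &entry : tempFiles) {
    if (!abnormalExit || entry.second == PreserveOnSignal::No)
      toRemove.push_back(entry.first);
  }
  return toRemove;
}
```

Calling `tempFilesToRemove(true, false, files)` returns an empty list for any `files`, which is the effect -save-temps has on the real driver.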

Alternatively, we can execute the three commands printed by -driver-print-jobs ourselves in bash, replacing /tmp with another directory such as /wtsc/helloworld; the intermediate results are then placed in /wtsc/helloworld.

The first command compiles the source file into an object file, and the last one calls clang to link object files and libraries into an executable. What does the second one do? The Swift documentation gives this description.

Autolinking: 
Swift object files encode information about what libraries they depend on. 
On Apple platforms the linker can read this information directly; 
on other platforms it's extracted using the swift-autolink-extract helper tool. 
Of course the build system can also provide manual link commands too.

Moreover, we can check what is inside the .autolink file.

cat /tmp/Helloworld-a15069.autolink
-lswiftSwiftOnoneSupport
-lswiftCore

One more question about the third command: why is the autolink file prefixed with @? We can find the answer in the help output of ld.

ld --help
...
@FILE                       Read options from FILE
...

Now the purpose of each job is quite clear.

By using -driver-print-actions, we can also see the corresponding actions, which are carried out as the jobs above.

swiftc -driver-print-actions Helloworld.swift -o helloworld
0: input, "Helloworld.swift", swift
1: compile, {0}, object
2: swift-autolink-extract, {1}, autolink
3: link, {1, 2}, image
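The listing reads as a small dependency graph: each action names the actions it consumes by index. A minimal sketch of that shape follows; the `Action` struct and both functions are hypothetical and much simpler than the real swift::Action hierarchy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified action node; not the real swift::Action class.
struct Action {
  std::string kind;        // e.g. "compile" or "link"
  std::vector<int> inputs; // indices of the actions this one consumes
  std::string fileName;    // only used by "input" actions
  std::string outputType;  // e.g. "object" or "image"
};

// Formats the graph in the same shape as the listing above:
//   <index>: <kind>, <"file" or {inputs}>, <output type>
std::string formatActions(const std::vector<Action> &actions) {
  std::string out;
  for (std::size_t i = 0; i < actions.size(); ++i) {
    const Action &a = actions[i];
    out += std::to_string(i) + ": " + a.kind + ", ";
    if (a.inputs.empty()) {
      out += "\"" + a.fileName + "\"";
    } else {
      out += "{";
      for (std::size_t j = 0; j < a.inputs.size(); ++j) {
        if (j != 0)
          out += ", ";
        out += std::to_string(a.inputs[j]);
      }
      out += "}";
    }
    out += ", " + a.outputType + "\n";
  }
  return out;
}

// The helloworld build from the listing above, written out by hand.
std::vector<Action> helloworldActions() {
  std::vector<Action> actions;
  actions.push_back(Action{"input", {}, "Helloworld.swift", "swift"});
  actions.push_back(Action{"compile", {0}, "", "object"});
  actions.push_back(Action{"swift-autolink-extract", {1}, "", "autolink"});
  actions.push_back(Action{"link", {1, 2}, "", "image"});
  return actions;
}
```

Note how the link action depends on both the object file (1) and the autolink file (2), which matches the clang job seen earlier.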

Here we see clearly that the compiler driver splits a driver invocation into smaller jobs to fulfil the user's demand. But why do we invoke swift-frontend through its alias swiftc, rather than swift, another symbolic link, or swift-frontend directly? In other words, why do we need different aliases for swift-frontend? To answer this question, we can look at the libswiftDriver source code, Driver.h and Driver.cpp.

/* /wtsc/swift/include/swift/Driver/Driver.h */
...
class Driver {
public:
  /// DriverKind determines how later arguments are parsed, as well as the
  /// allowable OutputInfo::Mode values.
  enum class DriverKind {
    Interactive,     // swift
    Batch,           // swiftc
    AutolinkExtract, // swift-autolink-extract
    SwiftIndent,     // swift-indent
    SymbolGraph      // swift-symbolgraph
  };
....
}
....

/* /wtsc/swift/lib/Driver/Driver.cpp */
...
void Driver::parseDriverKind(ArrayRef<const char *> Args) {
  ...
  Optional<DriverKind> Kind =
  llvm::StringSwitch<Optional<DriverKind>>(DriverName)
  .Case("swift", DriverKind::Interactive)
  .Case("swiftc", DriverKind::Batch)
  .Case("swift-autolink-extract", DriverKind::AutolinkExtract)
  .Case("swift-indent", DriverKind::SwiftIndent)
  .Case("swift-symbolgraph-extract", DriverKind::SymbolGraph)
  .Default(None);
  ...
}
...

From these two code snippets, we know that swift-frontend determines its kind from the driver name (the name of the invoked program), and then accepts different arguments and performs different actions accordingly.
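The name-based dispatch can be sketched without LLVM's StringSwitch. The function names below are hypothetical, and the sketch only strips the directory part of argv[0]; it does not model anything else the real parseDriverKind does in its elided parts.

```cpp
#include <cstddef>
#include <string>

// Mirrors the DriverKind enum quoted above.
enum class DriverKind {
  Interactive,
  Batch,
  AutolinkExtract,
  SwiftIndent,
  SymbolGraph
};

// Strips any directory part from argv[0]. Simplification (assumption):
// unlike llvm::sys::path::stem it does not strip a file extension.
std::string programName(const std::string &argv0) {
  const std::size_t slash = argv0.find_last_of('/');
  return slash == std::string::npos ? argv0 : argv0.substr(slash + 1);
}

// Sets `kind` and returns true when the invoked name selects a driver
// kind; returns false for any other name, e.g. "swift-frontend".
bool driverKindFor(const std::string &argv0, DriverKind &kind) {
  const std::string name = programName(argv0);
  if (name == "swift")
    kind = DriverKind::Interactive;
  else if (name == "swiftc")
    kind = DriverKind::Batch;
  else if (name == "swift-autolink-extract")
    kind = DriverKind::AutolinkExtract;
  else if (name == "swift-indent")
    kind = DriverKind::SwiftIndent;
  else if (name == "swift-symbolgraph-extract")
    kind = DriverKind::SymbolGraph;
  else
    return false;
  return true;
}
```

So invoking the same binary through /wtsc/usr/bin/swiftc yields `DriverKind::Batch`, while /wtsc/usr/bin/swift yields `DriverKind::Interactive`.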

To sum up: first, the symbolic links to swift-frontend invoke it in different modes, each of which accepts different arguments and performs different actions. Second, the swift-frontend executable calls functions in the libswiftDriver library, and then libswiftFrontendTool and libswiftFrontend, to finish compilation. Third, swift-frontend as the compiler driver splits an invocation into small jobs to finish compilation.

Next, we can dive into the source code to see how the driver is implemented, guided by Driver Design & Internals from the Apple Swift open source project.

There are six main stages.

1. Checks whether it needs to be split up:

2. swift::Driver and swift::Compilation.

4. Splits up the work of a Compilation into a graph of swift::Action objects:

5. Instantiates a list of swift::Job based on those actions:

6. Executes each of the list of swift::Job:

Stage 1: Splitting up or not

First of all, we need to get to its entry point,

int main(int argc, char *argv[])

or its variants. We use git-grep again.

git -C /wtsc/swift grep --line-number "int main(int"

There are a bunch of results. Choosing among the files listed in the add_swift_host_tool arguments, we can see that /wtsc/swift/tools/driver/driver.cpp is the one we need.

/* /wtsc/swift/tools/driver/driver.cpp */
...
int main(int argc_, const char **argv_) {
...
// Check if this invocation should execute a subcommand.
  StringRef ExecName = llvm::sys::path::stem(argv[0]);
  SmallString<256> SubcommandName;
  bool isRepl = false;
  if (shouldRunAsSubcommand(ExecName, SubcommandName, argv, isRepl)) {
  ...
  // Execute the subcommand.
    subCommandArgs.push_back(nullptr);
    ExecuteInPlace(SubcommandPath.c_str(), subCommandArgs.data());
  ...
  }
...
  if (isRepl) {
    ...
    return run_driver(ExecName, replArgs);
  } else {
    return run_driver(ExecName, argv);
  }
...
}
...

The code snippet above shows that the swift-frontend program checks whether the command is a subcommand, e.g. swift build, swift test, etc., which should match the form "swift subcommand [argument...]".

If it is, the driver invokes a program named swift-subcommand, passing along all the arguments it received except the first two, with swift-subcommand prepended as the program name. This allows a small trick that exploits the Swift compiler's subcommand mechanism.

ln -s /wtsc/helloworld/helloworld /wtsc/usr/bin/swift-hello
swift hello
Hello world!
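The argument rewriting behind that trick can be sketched as follows. The function name is hypothetical, and the test for what counts as a subcommand is an assumption: the real shouldRunAsSubcommand applies checks that the excerpt above does not show, while this sketch accepts any first argument that does not start with '-'.

```cpp
#include <string>
#include <vector>

// Builds the argument vector for a subcommand invocation, e.g.
//   {"swift", "hello", "--loud"} -> {"swift-hello", "--loud"}
// Returns an empty vector when the invocation is not treated as a
// subcommand. Assumption: only "swift" dispatches subcommands, and the
// program name is used as-is instead of a resolved path.
std::vector<std::string>
subcommandArgs(const std::vector<std::string> &argv) {
  if (argv.size() < 2 || argv[0] != "swift")
    return std::vector<std::string>();
  const std::string &subcommand = argv[1];
  if (subcommand.empty() || subcommand[0] == '-')
    return std::vector<std::string>();
  std::vector<std::string> result;
  // The new program name replaces the first two original arguments.
  result.push_back(argv[0] + "-" + subcommand);
  result.insert(result.end(), argv.begin() + 2, argv.end());
  return result;
}
```

For `swift hello` this yields the single-element vector `{"swift-hello"}`, which is why a symlink named swift-hello on the PATH gets executed.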

If the command is not a subcommand, the program enters the run_driver function, which checks the command line further (the invoked program and its arguments) to decide whether the invocation can do its job directly, without being split up. The following code snippet shows that with -frontend, -modulewrap, etc., or when the driver kind is Driver::DriverKind::AutolinkExtract, Driver::DriverKind::SwiftIndent, or Driver::DriverKind::SymbolGraph, the driver performs its job without further separation.

/* /wtsc/swift/tools/driver/driver.cpp */
static int run_driver(StringRef ExecName,
                       const ArrayRef<const char *> argv) {
  // Handle integrated tools.
  if (argv.size() > 1) {
    StringRef FirstArg(argv[1]);
    if (FirstArg == "-frontend") {
      return performFrontend(llvm::makeArrayRef(argv.data()+2,
                                                argv.data()+argv.size()),
                             argv[0], (void *)(intptr_t)getExecutablePath);
    }
    if (FirstArg == "-modulewrap") {
      return modulewrap_main(llvm::makeArrayRef(argv.data()+2,
                                                argv.data()+argv.size()),
                             argv[0], (void *)(intptr_t)getExecutablePath);
    }

    // Run the integrated Swift frontend when called as "swift-frontend" but
    // without a leading "-frontend".
    if (!FirstArg.startswith("--driver-mode=")
        && ExecName == "swift-frontend") {
      return performFrontend(llvm::makeArrayRef(argv.data()+1,
                                                argv.data()+argv.size()),
                             argv[0], (void *)(intptr_t)getExecutablePath);
    }
  }
...

  Driver TheDriver(Path, ExecName, argv, Diags);
  switch (TheDriver.getDriverKind()) {
  case Driver::DriverKind::AutolinkExtract:
    return autolink_extract_main(
      TheDriver.getArgsWithoutProgramNameAndDriverMode(argv),
      argv[0], (void *)(intptr_t)getExecutablePath);
  case Driver::DriverKind::SwiftIndent:
    return swift_indent_main(
      TheDriver.getArgsWithoutProgramNameAndDriverMode(argv),
      argv[0], (void *)(intptr_t)getExecutablePath);
  case Driver::DriverKind::SymbolGraph:
    return swift_symbolgraph_extract_main(
      TheDriver.getArgsWithoutProgramNameAndDriverMode(argv),
      argv[0], (void *)(intptr_t)getExecutablePath);
  default:
    break;
  }
...
}
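The first half of run_driver, before any Driver object exists, boils down to a three-way decision. The sketch below restates it with standard library types only; `EarlyRoute` and `routeEarly` are hypothetical names, and the actual calls to performFrontend and modulewrap_main are replaced by enum values.

```cpp
#include <string>
#include <vector>

// Where run_driver sends an invocation before a Driver is constructed.
enum class EarlyRoute { Frontend, ModuleWrap, ConstructDriver };

EarlyRoute routeEarly(const std::string &execName,
                      const std::vector<std::string> &argv) {
  if (argv.size() > 1) {
    const std::string &firstArg = argv[1];
    if (firstArg == "-frontend")
      return EarlyRoute::Frontend;
    if (firstArg == "-modulewrap")
      return EarlyRoute::ModuleWrap;
    // "swift-frontend" without a leading "-frontend" still runs the
    // frontend, unless a driver mode is requested explicitly.
    const std::string driverMode = "--driver-mode=";
    const bool asksForDriverMode =
        firstArg.compare(0, driverMode.size(), driverMode) == 0;
    if (!asksForDriverMode && execName == "swift-frontend")
      return EarlyRoute::Frontend;
  }
  // Otherwise a Driver is constructed and its kind decides the rest.
  return EarlyRoute::ConstructDriver;
}
```

The first job printed by -driver-print-jobs earlier, which starts with `swift-frontend -frontend -c`, takes the `Frontend` route, while the original `swiftc Helloworld.swift -o helloworld` invocation takes `ConstructDriver`.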

Stage 2: swift::Driver and swift::Compilation

At the end of the previous stage, we have an instance of the driver. When the driver kind is not one of those that run without separation, the driver parses the arguments, determines which actions need to be performed, and assigns jobs to accomplish each action.

/* /wtsc/swift/tools/driver/driver.cpp */
static int run_driver(StringRef ExecName,
                       const ArrayRef<const char *> argv) {
...
std::unique_ptr<llvm::opt::InputArgList> ArgList =
    TheDriver.parseArgStrings(ArrayRef<const char*>(argv).slice(1));
  if (Diags.hadAnyError())
    return 1;

  std::unique_ptr<ToolChain> TC = TheDriver.buildToolChain(*ArgList);
  if (Diags.hadAnyError())
    return 1;

  std::unique_ptr<Compilation> C =
      TheDriver.buildCompilation(*TC, std::move(ArgList));

  if (Diags.hadAnyError())
    return 1;

  if (C) {
    std::unique_ptr<sys::TaskQueue> TQ = TheDriver.buildTaskQueue(*C);
    if (!TQ)
        return 1;
    return C->performJobs(std::move(TQ));
  }

  return 0;
}

The above code snippet does the following work.

1. Parses the string arguments into an InputArgList instance and validates them for the given command, via the swift::Driver::parseArgStrings method.

2. Makes a tool chain based on the target platform.

3. Makes a Compilation representing the compilation process.

4. Performs the jobs on a task queue.

Conceptually, however, more happens behind the scenes, particularly inside the swift::Driver::buildCompilation method.

1. Parse: Option parsing

2. Pipeline: Converting Args into Actions

3. Build: Translating Actions into Jobs using a Toolchain

4. Schedule: Ordering and skipping jobs by dependency analysis

5. Batch: Optionally combine similar jobs

6. Execute: Running the Jobs in a Compilation using TaskQueue.

The Swift document, Driver Internals, describes this work clearly at the conceptual level. Here I want to show the code corresponding to each part of the whole picture. I will use six more posts to show the above stages in detail.

Part 1. Parse: Option parsing

We need some terms before moving on to option parsing. What are a command line, a command, an option, and an argument?

As we know, when we tell a shell on Linux what to do, we type a string of characters and then press the Enter key. The shell interprets Enter as the newline character, '\n', which means a line is finished; the shell accepts input line by line. If you want to spread an input string over multiple lines, you need to add a backslash, '\', before you press Enter, to escape the normal meaning of the Enter key. The shell then knows that you want to continue the same line but display it on the next one; in other words, the string you input is still one line, and the shell does not finish accepting input. That is a command line: a whole line of text that causes the shell to finish accepting input and do what it says.

After the shell accepts a command line, it separates the whole string at whitespace (spaces, tabs, etc.) into smaller strings, except that a string enclosed in double or single quotes is treated as one part.

The first substring is the command, and all substrings are arguments. That means the first argument is the string representing the command. A command can be an executable binary, an executable script, a shell builtin, or an alias/symlink to another executable. The shell passes the arguments from the command line to those executables.

An option is a type of argument that modifies the behavior of a command. As the name suggests, options are usually optional. Some arguments are options, usually prefixed by '-' or '--'; other arguments belong to an option rather than to the command, and usually follow that option. Some options have the form "-option=value", which sets a value for that option. If an option takes no value, it is a flag, which is either on (present) or off (absent). In the following example,

swiftc -o helloworld Helloworld.swift

The command line is the whole string.

The command is swiftc, which is a symbolic link to swift-frontend. By the way, the shell searches for the requested command, swiftc, in its PATH variable, a colon-separated string that lists the directories to search, in order.

The arguments are 'swiftc', '-o', 'helloworld' and 'Helloworld.swift'.

The option is '-o'.

The argument for option '-o' is 'helloworld'.

Besides options and their arguments, the remaining argument, 'Helloworld.swift', is for the command itself.
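The classification above can be written down as a toy parser. This is only a sketch of the terminology, not how swiftc parses its command line: `ParsedCommandLine`, `parseCommandLine` and the `optionsWithValue` table are hypothetical names, and the table has to be supplied by the caller because nothing in the string itself says that '-o' consumes the next argument.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ParsedCommandLine {
  std::string command;                        // argv[0]
  std::map<std::string, std::string> options; // option -> value ("" for flags)
  std::vector<std::string> inputs;            // arguments for the command
};

// optionsWithValue: the options that consume the following argument.
ParsedCommandLine
parseCommandLine(const std::vector<std::string> &argv,
                 const std::set<std::string> &optionsWithValue) {
  ParsedCommandLine parsed;
  if (argv.empty())
    return parsed;
  parsed.command = argv[0];
  for (std::size_t i = 1; i < argv.size(); ++i) {
    const std::string &arg = argv[i];
    if (arg.size() > 1 && arg[0] == '-') {
      const std::size_t eq = arg.find('=');
      if (eq != std::string::npos) {
        // "-option=value" form
        parsed.options[arg.substr(0, eq)] = arg.substr(eq + 1);
      } else if (optionsWithValue.count(arg) != 0 && i + 1 < argv.size()) {
        // "-option value" form: the next argument belongs to the option
        parsed.options[arg] = argv[++i];
      } else {
        // a flag: present or absent, no value
        parsed.options[arg] = "";
      }
    } else {
      parsed.inputs.push_back(arg);
    }
  }
  return parsed;
}
```

Parsing `swiftc -o helloworld Helloworld.swift` with `{"-o"}` as the value-taking set gives the command `swiftc`, the option `-o` with value `helloworld`, and one input, `Helloworld.swift`.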

Based on the above knowledge, the LLVM Option library has some classes that represent those concepts and their relationships.

In particular: class Arg, class ArgList, class InputArgList, class Option, and class OptTable.

/// A concrete instance of a particular driver option.
///
/// The Arg class encodes just enough information to be able to
/// derive the argument values efficiently.
class Arg

/// ArgList - Ordered collection of driver arguments.
///
/// The ArgList class manages a list of Arg instances as well as
/// auxiliary data and convenience methods to allow Tools to quickly
/// check for the presence of Arg instances for a particular Option
/// and to iterate over groups of arguments.
class ArgList
class InputArgList final : public ArgList

/// Option - Abstract representation for a single form of driver
/// argument.
///
/// An Option class represents a form of option that the driver
/// takes, for example how many arguments the option has and how
/// they can be provided. Individual option instances store
/// additional information about what group the option is a member
/// of (if any), if the option is an alias, and a number of
/// flags. At runtime the driver parses the command line into
/// concrete Arg instances, each of which corresponds to a
/// particular Option instance.
class Option

/// Provide access to the Option info table.
///
/// The OptTable class provides a layer of indirection which allows Option
/// instance to be created lazily. In the common case, only a few options will
/// be needed at runtime; the OptTable class maintains enough information to
/// parse command lines without instantiating Options, while letting other
/// parts of the driver still use Option instances where convenient.
class OptTable {
...
/// Entry for a single option instance in the option data table.
  struct Info
...
private:
  /// The option information table.
  std::vector<Info> OptionInfos;
...
}

However, different commands require different options and arguments. How to specify the entire set of options and arguments for a command is up to the command's author, because on the shell those options and arguments are just one whitespace-separated line of text; parsing it into options and arguments is the job of the command. LLVM offers a framework for argument parsing in order to spare command writers this repetitive work. The classes above are the common parts, which need not change when writing a new program with the LLVM argument parsing framework. To specify your own set of options and arguments, you write a td (target description) file that describes them. The LLVM infrastructure then calls the tablegen program to parse the td files and convert them into cpp files, which are integrated into your program's source code and compiled together into the whole program.

We will see what a td file looks like, how tablegen parses and converts it, and how the generated cpp file is integrated.

There are some documents on llvm.org (TableGen Overview, TableGen Programmer's Reference, TableGen Manual) that show us how tablegen works.

Overall, a td file uses a compact syntax to define records, each of which collects the information describing an instance of a type. Tablegen parses the file and uses that information to generate whatever format or files are required.

For example, a car.  

/* /wtsc/car/car.td */
class Car <string b, string c, int yyyy> {
      string brand = b;
      string country = c;
      int productionYear = yyyy;
}

def Ford: Car<"Ford", "USA", 1960>;

/wtsc/build/Ninja-ReleaseAssert+swift-DebugAssert/llvm-linux-x86_64/bin/llvm-tblgen -print-records car.td
------------- Classes -----------------
class Car<string Car:b = ?, string Car:c = ?, int Car:yyyy = ?> {
  string brand = Car:b;
  string country = Car:c;
  int productionYear = Car:yyyy;
}
------------- Defs -----------------
def Ford {	// Car
  string brand = "Ford";
  string country = "USA";
  int productionYear = 1960;
}

Because tablegen does not know how to generate anything from our class Car, we can't generate a cpp file from our car td file.

So let us have a look at the Swift compiler's td files for options, and at how tablegen converts td files into cpp files.

Here are Swift's option files (.td files):

ll /wtsc/swift/include/swift/Option/*.td
 inode Permissions Links Size User Date Modified Name
272809 .rw-rw-r--      1  34k k    13 Oct 18:55  /wtsc/swift/include/swift/Option/FrontendOptions.td
272811 .rw-rw-r--      1  54k k    13 Oct 18:55  /wtsc/swift/include/swift/Option/Options.td

The -driver-print-jobs option, which we use to list jobs needed to be done, is defined in Options.td:

def driver_print_jobs : Flag<["-"], "driver-print-jobs">, InternalDebugOpt,
  HelpText<"Dump list of jobs to execute">;

We can use llvm-tblgen to transform the file in which it's defined.

/wtsc/build/Ninja-ReleaseAssert+swift-DebugAssert/llvm-linux-x86_64/bin/llvm-tblgen \
 -I /wtsc/llvm-project/llvm/include \
 -I /wtsc/swift/include/swift/Option \
 /wtsc/swift/include/swift/Option/Options.td \
 -gen-opt-parser-defs
...
OPTION(prefix_1,                                    //PREFIX
       &"-driver-print-jobs"[1],                    //NAME
       driver_print_jobs,                           //ID
       Flag,                                        //KIND
       internal_debug_Group,                        //GROUP
       INVALID,                                     //ALIAS
       nullptr,                                     //ALIASARGS
       HelpHidden | DoesNotAffectIncrementalBuild,  //FLAGS
       0,                                           //PARAM
       "Dump list of jobs to execute",              //HELPTEXT
       nullptr,                                     //METAVAR
       nullptr)                                     //VALUES
...

By checking the Options.h file, we can see the meaning of each item in an OPTION entry.

/* /wtsc/swift/include/swift/Option/Options.h */
...
#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
               HELPTEXT, METAVAR, VALUES)                                      \
...

If you want to be more certain, you can check llvm-tblgen's source code to see what each item means.

/* /wtsc/llvm-project/llvm/utils/TableGen/OptParserEmitter.cpp */
...
/// OptParserEmitter - This tablegen backend takes an input .td file
/// describing a list of options and emits a data structure for parsing and
/// working with those options when given an input command line.
namespace llvm {
void EmitOptParser(RecordKeeper &Records, raw_ostream &OS) {
...
OS << "//////////\n";
  OS << "// Options\n\n";
  auto WriteOptRecordFields = [&](raw_ostream &OS, const Record &R) {
    // The option prefix;
    std::vector<StringRef> prf = R.getValueAsListOfStrings("Prefixes");
    OS << Prefixes[PrefixKeyT(prf.begin(), prf.end())] << ", ";
    // The option string.
    emitNameUsingSpelling(OS, R);
    // The option identifier name.
    OS << ", " << getOptionName(R);
    // The option kind.
    OS << ", " << R.getValueAsDef("Kind")->getValueAsString("Name");
    // The containing option group (if any).
    OS << ", ";
    const ListInit *GroupFlags = nullptr;
    if (const DefInit *DI = dyn_cast<DefInit>(R.getValueInit("Group"))) {
      GroupFlags = DI->getDef()->getValueAsListInit("Flags");
      OS << getOptionName(*DI->getDef());
    } else
      OS << "INVALID";
    // The option alias (if any).
    OS << ", ";
    if (const DefInit *DI = dyn_cast<DefInit>(R.getValueInit("Alias")))
      OS << getOptionName(*DI->getDef());
    else
      OS << "INVALID";
    // The option alias arguments (if any).
    // Emitted as a \0 separated list in a string, e.g. ["foo", "bar"]
    // would become "foo\0bar\0". Note that the compiler adds an implicit
    // terminating \0 at the end.
    OS << ", ";
    std::vector<StringRef> AliasArgs = R.getValueAsListOfStrings("AliasArgs");
    if (AliasArgs.size() == 0) {
      OS << "nullptr";
    } else {
      OS << "\"";
      for (size_t i = 0, e = AliasArgs.size(); i != e; ++i)
        OS << AliasArgs[i] << "\\0";
      OS << "\"";
    }
    // The option flags.
    OS << ", ";
    int NumFlags = 0;
    const ListInit *LI = R.getValueAsListInit("Flags");
    for (Init *I : *LI)
      OS << (NumFlags++ ? " | " : "") << cast<DefInit>(I)->getDef()->getName();
    if (GroupFlags) {
      for (Init *I : *GroupFlags)
        OS << (NumFlags++ ? " | " : "")
           << cast<DefInit>(I)->getDef()->getName();
    }
    if (NumFlags == 0)
      OS << '0';
    // The option parameter field.
    OS << ", " << R.getValueAsInt("NumArgs");
    // The option help text.
    if (!isa<UnsetInit>(R.getValueInit("HelpText"))) {
      OS << ",\n";
      OS << "       ";
      write_cstring(OS, R.getValueAsString("HelpText"));
    } else
      OS << ", nullptr";
    // The option meta-variable name.
    OS << ", ";
    if (!isa<UnsetInit>(R.getValueInit("MetaVarName")))
      write_cstring(OS, R.getValueAsString("MetaVarName"));
    else
      OS << "nullptr";
    // The option Values. Used for shell autocompletion.
    OS << ", ";
    if (!isa<UnsetInit>(R.getValueInit("Values")))
      write_cstring(OS, R.getValueAsString("Values"));
    else
      OS << "nullptr";
  };
  std::vector<std::unique_ptr<MarshallingKindInfo>> OptsWithMarshalling;
  for (unsigned I = 0, E = Opts.size(); I != E; ++I) {
    const Record &R = *Opts[I];
    // Start a single option entry.
    OS << "OPTION(";
    WriteOptRecordFields(OS, R);
    OS << ")\n";
    if (!isa<UnsetInit>(R.getValueInit("MarshallingKind")))
      OptsWithMarshalling.push_back(MarshallingKindInfo::create(R));
  }
  OS << "#endif // OPTION\n";
...
}
...

Overall, tablegen takes all option records (of class Option) in Options.td and outputs them as calls to a C macro named OPTION.

Take our -driver-print-jobs as an example: it is defined as an instance of Flag, which is a subclass of Option.

class Flag<list<string> prefixes, string name>
  : Option<prefixes, name, KIND_FLAG>;

If we pass the -print-records argument to llvm-tblgen instead of -gen-opt-parser-defs, we can see that driver_print_jobs is of class Option, Flag, Group, etc., which are shown after the double slash, '//'.

/wtsc/build/Ninja-ReleaseAssert+swift-DebugAssert/llvm-linux-x86_64/bin/llvm-tblgen \
 -I /wtsc/llvm-project/llvm/include \
 -I /wtsc/swift/include/swift/Option \
 /wtsc/swift/include/swift/Option/Options.td \
 -print-records
...
def driver_print_jobs {	// Option Flag Group Flags InternalDebugOpt HelpText
  string EnumName = ?;
  list<string> Prefixes = ["-"];
  string Name = "driver-print-jobs";
  OptionKind Kind = KIND_FLAG;
  int NumArgs = 0;
  string HelpText = "Dump list of jobs to execute";
  string MetaVarName = ?;
  string Values = ?;
  code ValuesCode = ?;
  list<OptionFlag> Flags = [HelpHidden, DoesNotAffectIncrementalBuild];
  OptionGroup Group = internal_debug_Group;
  Option Alias = ?;
  list<string> AliasArgs = [];
  string MarshallingKind = ?;
  code KeyPath = ?;
  code DefaultValue = ?;
  bit ShouldAlwaysEmit = 0;
  bit IsPositive = 1;
  code NormalizerRetTy = ?;
  code NormalizedValuesScope = [{}];
  code Normalizer = [{}];
  code Denormalizer = [{}];
  list<code> NormalizedValues = ?;
}
...

The C macro line for -driver-print-jobs has been shown above.

Next, we are going to see how Swift uses this mechanism, with td files specifying the options and arguments it accepts.

In the swift driver executable, the main function calls run_driver, which instantiates a swift::Driver from the libswiftDriver library and calls its swift::Driver::parseArgStrings method to parse the string arguments into an llvm::opt::InputArgList. That method calls llvm::opt::OptTable::ParseArgs on the Driver's data member opts, a std::unique_ptr<llvm::opt::OptTable> from the libllvmOption library. The opts member is created in the constructor of swift::Driver by createSwiftOptTable(), which comes from the libswiftOption library.

Therefore we can see how libswiftOption is built in its CMakeLists.txt file.

/* /wtsc/swift/lib/Option/CMakeLists.txt */
add_swift_host_library(swiftOption STATIC
  Options.cpp
  SanitizerOptions.cpp)
add_dependencies(swiftOption
  SwiftOptions)
target_link_libraries(swiftOption PRIVATE
  swiftBasic)

It shows that libswiftOption depends on SwiftOptions, so let us find out what SwiftOptions is with git-grep.

git -C /wtsc/swift grep -n "(SwiftOptions"
include/swift/Option/CMakeLists.txt:3:swift_add_public_tablegen_target(SwiftOptions)

/* /wtsc/swift/include/swift/Option/CMakeLists.txt */
set(LLVM_TARGET_DEFINITIONS Options.td)
swift_tablegen(Options.inc -gen-opt-parser-defs)
swift_add_public_tablegen_target(SwiftOptions)

Cool, we have arrived at the Options.td file, which defines all the information about Swift's options and arguments.

So let us find out what these three cmake commands do. Using git-grep and following the definitions of swift_tablegen and swift_add_public_tablegen_target, we arrive at the following code snippet, which serves our purpose.

# /wtsc/llvm-project/llvm/cmake/modules/TableGen.cmake 
...
# ofn means output file name
function(tablegen project ofn)
...
  add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${ofn}
    COMMAND ${${project}_TABLEGEN_EXE} ${ARGN} -I ${CMAKE_CURRENT_SOURCE_DIR}
    ${tblgen_includes}
    ${LLVM_TABLEGEN_FLAGS}
    ${LLVM_TARGET_DEFINITIONS_ABSOLUTE}
    ${tblgen_change_flag}
    ${additional_cmdline}
    # The file in LLVM_TARGET_DEFINITIONS may be not in the current
    # directory and local_tds may not contain it, so we must
    # explicitly list it here:
    DEPENDS ${${project}_TABLEGEN_TARGET} ${${project}_TABLEGEN_EXE}
      ${local_tds} ${global_tds}
    ${LLVM_TARGET_DEFINITIONS_ABSOLUTE}
    COMMENT "Building ${ofn}..."
    )

  # `make clean' must remove all those generated files:
  set_property(DIRECTORY APPEND PROPERTY ADDITIONAL_MAKE_CLEAN_FILES ${ofn})

  set(TABLEGEN_OUTPUT ${TABLEGEN_OUTPUT} ${CMAKE_CURRENT_BINARY_DIR}/${ofn} PARENT_SCOPE)
  set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/${ofn} PROPERTIES
    GENERATED 1)
endfunction()

# Creates a target for publicly exporting tablegen dependencies.
function(add_public_tablegen_target target)
...
  add_custom_target(${target}
    DEPENDS ${TABLEGEN_OUTPUT})
  if(LLVM_COMMON_DEPENDS)
    add_dependencies(${target} ${LLVM_COMMON_DEPENDS})
  endif()
  set_target_properties(${target} PROPERTIES FOLDER "Tablegenning")
  set(LLVM_COMMON_DEPENDS ${LLVM_COMMON_DEPENDS} ${target} PARENT_SCOPE)
endfunction()
...

The above code snippet creates a custom command that generates a cpp file, and a custom target that other targets can depend on. When target A depends on target B, B is built before A, and A can use the output of B. That is what we need. Our libswiftOption target depends on the SwiftOptions custom target, which depends on TABLEGEN_OUTPUT, the output files generated by the tablegen custom command. It therefore transitively depends on the tablegen custom command, which generates cpp files from td files according to our needs.
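The rule that a dependency is built before its dependents can be illustrated with a depth-first walk over a dependency map. This is a generic sketch, not how cmake or ninja schedule work; the graph type and function names are hypothetical, and the graph is assumed to contain no cycles.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// target -> the targets or files it depends on
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Appends `target` to `order` after everything it depends on.
// Assumes the graph has no cycles.
void visitTarget(const std::string &target, const DependencyGraph &deps,
                 std::set<std::string> &done,
                 std::vector<std::string> &order) {
  if (!done.insert(target).second)
    return; // already scheduled
  const DependencyGraph::const_iterator it = deps.find(target);
  if (it != deps.end()) {
    for (const std::string &dep : it->second)
      visitTarget(dep, deps, done, order);
  }
  order.push_back(target);
}

// Returns one valid build order for `target` and its dependencies.
std::vector<std::string> buildOrder(const std::string &target,
                                    const DependencyGraph &deps) {
  std::set<std::string> done;
  std::vector<std::string> order;
  visitTarget(target, deps, done, order);
  return order;
}
```

With swiftOption depending on SwiftOptions, SwiftOptions on Options.inc, and Options.inc on Options.td, asking for swiftOption yields the order Options.td, Options.inc, SwiftOptions, swiftOption.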

By the way, if we use cmake directly to build SwiftOption target, the cmake will do what we have done above to call tablegen with -gen-opt-parser-defs.

cmake --build \
    /wtsc/build/Ninja-ReleaseAssert+swift-DebugAssert/swift-linux-x86_64 \
    --target SwiftOptions

As a result, building libswiftOption causes SwiftOptions to be built first. That means llvm-tblgen is run with the -gen-opt-parser-defs argument on /wtsc/swift/include/swift/Option/Options.td to produce the file /wtsc/build/Ninja-ReleaseAssert+swift-DebugAssert/swift-linux-x86_64/include/swift/Option/Options.inc. This file contains one call to the OPTION macro for each option, and it is then compiled into the libswiftOption library.

Two files in the libswiftOption library include Options.inc, three times in total. We can find them with git-grep.

git -C /wtsc/swift grep -n '#include "swift/Option/Options.inc"'
include/swift/Option/Options.h:47:#include "swift/Option/Options.inc"
lib/Option/Options.cpp:23:#include "swift/Option/Options.inc"
lib/Option/Options.cpp:31:#include "swift/Option/Options.inc"

The include in Options.h populates the swift::options::ID enum: each OPTION entry expands to one OPT_ enumerator.

/* /wtsc/swift/include/swift/Option/Options.h */
...
namespace swift {
namespace options {
...
  enum ID {
    OPT_INVALID = 0, // This is not an option ID.
#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
               HELPTEXT, METAVAR, VALUES)                                      \
  OPT_##ID,
#include "swift/Option/Options.inc"
    LastOption
#undef OPTION
  };
} //end namespace options
...
} // end namespace swift

The two includes in Options.cpp define the prefix arrays and then InfoTable. InfoTable is used by SwiftOptTable, which is created by the swift::createSwiftOptTable function mentioned above.

/* /wtsc/swift/lib/Option/Options.cpp */
...
#define PREFIX(NAME, VALUE) static const char *const NAME[] = VALUE;
#include "swift/Option/Options.inc"
#undef PREFIX
static const OptTable::Info InfoTable[] = {
#define OPTION(PREFIX, NAME, ID, KIND, GROUP, ALIAS, ALIASARGS, FLAGS, PARAM,  \
               HELPTEXT, METAVAR, VALUES)                                      \
  {PREFIX, NAME,  HELPTEXT,    METAVAR,     OPT_##ID,  Option::KIND##Class,    \
   PARAM,  FLAGS, OPT_##GROUP, OPT_##ALIAS, ALIASARGS, VALUES},
#include "swift/Option/Options.inc"
#undef OPTION
};
namespace {
class SwiftOptTable : public OptTable {
public:
  SwiftOptTable() : OptTable(InfoTable) {}
};
} // end anonymous namespace
std::unique_ptr<OptTable> swift::createSwiftOptTable() {
  return std::unique_ptr<OptTable>(new SwiftOptTable());
}
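
The pattern used by both files (define a macro, include the same list, undefine the macro) is often called an X-macro. Below is a minimal, self-contained sketch of it. It is not the real generated file: DEMO_OPTIONS is a hypothetical stand-in for Options.inc, and DemoInfo, getInfo, lookupID and the three sample options are invented for illustration with far fewer fields than OptTable::Info.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for Options.inc: one X(...) call per option.
// The real file is generated by llvm-tblgen and has many more fields.
#define DEMO_OPTIONS(X)                                      \
  X("-", "emit-object", emit_object, "Emit an object file")  \
  X("-", "o", o, "Write output to <file>")                   \
  X("--", "version", version, "Print version information")

// First expansion: one enumerator per option.
enum DemoID {
  OPT_INVALID = 0, // This is not an option ID.
#define X(PREFIX, NAME, ID, HELPTEXT) OPT_##ID,
  DEMO_OPTIONS(X)
#undef X
  LastOption
};

// Simplified row type (an assumption, not llvm::opt::OptTable::Info).
struct DemoInfo {
  const char *Prefix;
  const char *Name;
  unsigned ID;
  const char *HelpText;
};

// Second expansion: one table row per option, in the same order as the enum.
static const DemoInfo InfoTable[] = {
#define X(PREFIX, NAME, ID, HELPTEXT) {PREFIX, NAME, OPT_##ID, HELPTEXT},
  DEMO_OPTIONS(X)
#undef X
};

// IDs start at 1, so row (ID - 1) describes option ID.
const DemoInfo &getInfo(unsigned ID) {
  assert(ID > OPT_INVALID && ID < LastOption && "invalid option ID");
  return InfoTable[ID - 1];
}

// Linear search by name; returns OPT_INVALID when nothing matches.
unsigned lookupID(const char *Name) {
  for (const DemoInfo &Info : InfoTable)
    if (std::strcmp(Info.Name, Name) == 0)
      return Info.ID;
  return OPT_INVALID;
}
```

Because both expansions walk the same list, the enum and the table cannot drift apart: adding one line to the list adds an enumerator and a table row at the same time.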


/* /wtsc/llvm-project/llvm/include/llvm/Option/OptTable.h */
...
/// Provide access to the Option info table.
///
/// The OptTable class provides a layer of indirection which allows Option
/// instance to be created lazily. In the common case, only a few options will
/// be needed at runtime; the OptTable class maintains enough information to
/// parse command lines without instantiating Options, while letting other
/// parts of the driver still use Option instances where convenient.
class OptTable {
public:
  /// Entry for a single option instance in the option data table.
  struct Info {
    /// A null terminated array of prefix strings to apply to name while
    /// matching.
    const char *const *Prefixes;
    const char *Name;
    const char *HelpText;
    const char *MetaVar;
    unsigned ID;
    unsigned char Kind;
    unsigned char Param;
    unsigned short Flags;
    unsigned short GroupID;
    unsigned short AliasID;
    const char *AliasArgs;
    const char *Values;
  };
protected:
  OptTable(ArrayRef<Info> OptionInfos, bool IgnoreCase = false);
...
public:
...
  /// Parse an list of arguments into an InputArgList.
  ///
  /// The resulting InputArgList will reference the strings in [\p ArgBegin,
  /// \p ArgEnd), and their lifetime should extend past that of the returned
  /// InputArgList.
  ///
  /// The only error that can occur in this routine is if an argument is
  /// missing values; in this case \p MissingArgCount will be non-zero.
  ///
  /// \param MissingArgIndex - On error, the index of the option which could
  /// not be parsed.
  /// \param MissingArgCount - On error, the number of missing options.
  /// \param FlagsToInclude - Only parse options with any of these flags.
  /// Zero is the default which includes all flags.
  /// \param FlagsToExclude - Don't parse options with this flag.  Zero
  /// is the default and means exclude nothing.
  /// \return An InputArgList; on error this will contain all the options
  /// which could be parsed.
  InputArgList ParseArgs(ArrayRef<const char *> Args, unsigned &MissingArgIndex,
                         unsigned &MissingArgCount, unsigned FlagsToInclude = 0,
                         unsigned FlagsToExclude = 0) const;
...
};
...

The swift driver eventually calls OptTable::ParseArgs to parse its arguments. That is the whole logic the driver uses to parse arguments.
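
To make the idea of table-driven parsing concrete, here is a minimal sketch of a parser in the spirit of ParseArgs. It is not LLVM's implementation: DemoKind, DemoInfo, DemoArg, DemoTable and parseArgs are all hypothetical names, the table is written by hand, and anything that does not match a table entry is simply treated as an input. Only the reporting of a missing value through MissingArgIndex and MissingArgCount mirrors the documented behaviour.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified option kinds (not LLVM's Option classes).
enum DemoKind { Flag, Separate }; // Separate: the value is the next argument

enum DemoID { OPT_INVALID = 0, OPT_INPUT, OPT_c, OPT_o };

struct DemoInfo {
  const char *Prefix;
  const char *Name;
  unsigned ID;
  DemoKind Kind;
};

// Hand-written table; in the real driver the rows come from Options.inc.
static const DemoInfo DemoTable[] = {
    {"-", "c", OPT_c, Flag},
    {"-", "o", OPT_o, Separate},
};

struct DemoArg {
  unsigned ID;
  std::string Value;
};

// Parses Args against DemoTable. If an option is missing its value, parsing
// stops, MissingArgCount is set to 1 and MissingArgIndex to the position of
// that option; the arguments parsed so far are still returned.
std::vector<DemoArg> parseArgs(const std::vector<std::string> &Args,
                               unsigned &MissingArgIndex,
                               unsigned &MissingArgCount) {
  MissingArgIndex = MissingArgCount = 0;
  std::vector<DemoArg> Result;
  for (unsigned I = 0; I < Args.size(); ++I) {
    const DemoInfo *Match = nullptr;
    for (const DemoInfo &Info : DemoTable)
      if (Args[I] == std::string(Info.Prefix) + Info.Name)
        Match = &Info;
    if (!Match) { // Simplification: every unmatched argument is an input.
      Result.push_back({OPT_INPUT, Args[I]});
      continue;
    }
    if (Match->Kind == Flag) {
      Result.push_back({Match->ID, ""});
      continue;
    }
    if (I + 1 == Args.size()) { // Separate option without a value.
      MissingArgIndex = I;
      MissingArgCount = 1;
      break;
    }
    Result.push_back({Match->ID, Args[++I]});
  }
  return Result;
}
```

With this sketch, parsing {"-c", "main.swift", "-o", "main.o"} yields three arguments (a flag, an input, and -o with its value), while {"main.swift", "-o"} returns only the input and reports one missing value at index 1.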

Part2. Pipeline: Converting Args into Actions

Part3. Build: Translating Actions into Jobs using a Toolchain

Part4. Schedule: Ordering and skipping jobs by dependency analysis

Part5. Batch: Optionally combine similar jobs

Part6. Execute: Running the Jobs in a Compilation using TaskQueue