WTSC 7: Semantic Analysis

In the previous Parser post, we got the following AST for our hello world example:

(lldb) expr Result.get()->dump()
(call_expr type='<null>' arg_labels=_:
  (unresolved_decl_ref_expr type='<null>' name=print function_ref=unapplied)
  (paren_expr type='<null>'
    (string_literal_expr type='<null>' encoding=utf8 value="Hello world!" builtin_initializer=**NULL** initializer=**NULL**)))

Our hello world example looks like this:

/* /wtsc/helloworld/Helloworld.swift */
print("Hello world!")

Before we continue, we should know where we are and where we want to go. The following code shows where we stopped in the previous post.

/* /wtsc/swift/lib/Frontend/Frontend.cpp */
void CompilerInstance::performSema() {
  performParseAndResolveImportsOnly();
...
  forEachFileToTypeCheck([&](SourceFile &SF) { performTypeChecking(SF);});
...
}

The performParseAndResolveImportsOnly() method parses the source code into an AST without type information, and the subsequent performTypeChecking() call fills type information into that AST. In between the two, we can inject code to show what the AST looks like at that point.

void CompilerInstance::performSema() {
  performParseAndResolveImportsOnly();
...
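  // Debug only: dump the still-untyped AST of every file before type checking.
  forEachFileToTypeCheck([&](SourceFile &SF) {
    for (auto *D : SF.getTopLevelDecls()) { D->dump(); }
  });
  // Separator between the untyped dump above and the type-checked dump that follows.
  llvm::errs() << "---------------------------------------------------\n";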

  forEachFileToTypeCheck([&](SourceFile &SF) { performTypeChecking(SF); });
...
}

After adding the AST-dumping code, we need to rebuild and reinstall swift-frontend for the change to take effect, using the following commands.

cmake --build /wtsc/build/Ninja-ReleaseAssert+swift-DebugAssert/swift-linux-x86_64 --target swift-frontend
cmake --install /wtsc/build/Ninja-ReleaseAssert+swift-DebugAssert/swift-linux-x86_64 --prefix /wtsc/usr

Afterwards we can see the following result:

swiftc -dump-ast Helloworld.swift
(top_level_code_decl range=[Helloworld.swift:1:1 - line:1:21]
  (brace_stmt implicit range=[Helloworld.swift:1:1 - line:1:21]
    (call_expr type='<null>' arg_labels=_:
      (unresolved_decl_ref_expr type='<null>' name=print function_ref=unapplied)
      (paren_expr type='<null>'
        (string_literal_expr type='<null>' encoding=utf8 value="Hello world!" builtin_initializer=**NULL** initializer=**NULL**)))))
---------------------------------------------------
(source_file "Helloworld.swift"
  (top_level_code_decl range=[Helloworld.swift:1:1 - line:1:21]
    (brace_stmt implicit range=[Helloworld.swift:1:1 - line:1:21]
      (call_expr type='()' location=Helloworld.swift:1:1 range=[Helloworld.swift:1:1 - line:1:21] nothrow arg_labels=_:
        (declref_expr type='(Any..., String, String) -> ()' location=Helloworld.swift:1:1 range=[Helloworld.swift:1:1 - line:1:1] decl=Swift.(file).print(_:separator:terminator:) function_ref=single)
        (tuple_expr type='(Any..., separator: String, terminator: String)' location=Helloworld.swift:1:6 range=[Helloworld.swift:1:6 - line:1:21] names='',separator,terminator
          (vararg_expansion_expr implicit type='[Any]' location=Helloworld.swift:1:7 range=[Helloworld.swift:1:7 - line:1:7]
            (array_expr implicit type='[Any]' location=Helloworld.swift:1:7 range=[Helloworld.swift:1:7 - line:1:7] initializer=**NULL**
              (erasure_expr implicit type='Any' location=Helloworld.swift:1:7 range=[Helloworld.swift:1:7 - line:1:7]
                (string_literal_expr type='String' location=Helloworld.swift:1:7 range=[Helloworld.swift:1:7 - line:1:7] encoding=utf8 value="Hello world!" builtin_initializer=Swift.(file).String extension.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:) initializer=**NULL**))))
          (default_argument_expr implicit type='String' location=Helloworld.swift:1:6 range=[Helloworld.swift:1:6 - line:1:6] default_args_owner=Swift.(file).print(_:separator:terminator:) param=1)
          (default_argument_expr implicit type='String' location=Helloworld.swift:1:6 range=[Helloworld.swift:1:6 - line:1:6] default_args_owner=Swift.(file).print(_:separator:terminator:) param=2))))))

Now the situation is clear: the output above the dashed line is where we are, and the output below it is where we want to go.

So let us dive into the Swift source code to find out how this transformation is performed. It happens in performTypeChecking(), which leads through the following chain of calls, each one calling the next:

evaluator::SideEffect TypeCheckSourceFileRequest::evaluate(Evaluator &eval, SourceFile *SF) const
void TypeChecker::typeCheckTopLevelCodeDecl(TopLevelCodeDecl *TLCD)
template<typename StmtTy> bool typeCheckStmt(StmtTy *&S)
StmtRetTy visit(Stmt *S, Args... AA)

The last one is the visitor pattern, implemented by swift::ASTVisitor, which traverses the AST to type check each node.
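To make the visit() dispatch at the end of this chain more concrete, here is a minimal sketch of a CRTP-style statement visitor. It is not the compiler's code: the node classes, the two statement kinds, and ToyStmtChecker are hypothetical stand-ins, and the real list of statement kinds comes from StmtNodes.def.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified statement kinds (the real ones come from StmtNodes.def).
enum class StmtKind { Brace, Return };

struct Stmt {
  StmtKind kind;
  explicit Stmt(StmtKind k) : kind(k) {}
  virtual ~Stmt() = default;
};

struct ReturnStmt : Stmt {
  ReturnStmt() : Stmt(StmtKind::Return) {}
};

struct BraceStmt : Stmt {
  std::vector<Stmt *> elements;
  BraceStmt() : Stmt(StmtKind::Brace) {}
};

// CRTP base: visit() switches on the node kind and forwards to the
// matching visitXXX() method of the derived class.
template <typename Derived>
struct StmtVisitor {
  void visit(Stmt *s) {
    assert(s && "cannot visit a null statement");
    auto *self = static_cast<Derived *>(this);
    switch (s->kind) {
    case StmtKind::Brace:
      self->visitBraceStmt(static_cast<BraceStmt *>(s));
      break;
    case StmtKind::Return:
      self->visitReturnStmt(static_cast<ReturnStmt *>(s));
      break;
    }
  }
};

// A toy "checker" that only records the order in which nodes are visited.
struct ToyStmtChecker : StmtVisitor<ToyStmtChecker> {
  std::vector<std::string> log;

  void visitBraceStmt(BraceStmt *bs) {
    log.push_back("brace");
    for (Stmt *elem : bs->elements)
      visit(elem); // recurse into each element of the brace
  }

  void visitReturnStmt(ReturnStmt *) { log.push_back("return"); }
};
```

Visiting a BraceStmt that contains one ReturnStmt with a ToyStmtChecker records "brace" and then "return" in its log, which mirrors how a brace statement hands each of its elements back to the checker.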

We know from the target AST that the checker will visit a brace statement (brace_stmt), so we can set a breakpoint on the first line of Stmt *StmtChecker::visitBraceStmt(BraceStmt *BS) and inspect the call stack with the following commands.

lldb -- \
     /wtsc/usr/bin/swift-frontend \
     -frontend \
     -c \
     -primary-file /wtsc/helloworld/Helloworld.swift \
     -target x86_64-unknown-linux-gnu \
     -disable-objc-interop \
     -color-diagnostics \
     -module-name helloworld \
     -o /tmp/Helloworld-7ada2e.o
b TypeCheckStmt.cpp:1576
run
bt

The backtrace at that breakpoint is shown below, with some frames removed to keep the focus on the relevant calls. Readers who want the full output can run the commands themselves.

frame #0: swift-frontend`(anonymous namespace)::StmtChecker::visitBraceStmt(this=0x00007fffffff8708, BS=0x00000000094ada80) at TypeCheckStmt.cpp:1576:7
frame #1: swift-frontend`swift::ASTVisitor<(anonymous namespace)::StmtChecker, void, swift::Stmt*, void, void, void, void>::visit(this=0x00007fffffff8708, S=0x00000000094ada80) at StmtNodes.def:47:1
frame #2: swift-frontend`bool (anonymous namespace)::StmtChecker::typeCheckStmt<swift::BraceStmt>(this=0x00007fffffff8708, S=0x00007fffffff8720) at TypeCheckStmt.cpp:677:39
frame #3: swift-frontend`swift::TypeChecker::typeCheckTopLevelCodeDecl(TLCD=0x00000000094ad940) at TypeCheckStmt.cpp:2111:21
frame #4: swift-frontend`swift::TypeCheckSourceFileRequest::evaluate(this=0x00007fffffff8b60, eval=0x0000000009294260, SF=0x00000000094ad100) const at TypeChecker.cpp:312:9
frame #11: swift-frontend`swift::performTypeChecking(SF=0x00000000094ad100) at TypeChecker.cpp:277:16
frame #12: swift-frontend`swift::CompilerInstance::performSema(this=0x00007fffffff8cc0, SF=0x00000000094ad100)::$_6::operator()(swift::SourceFile&) const at Frontend.cpp:972:48
frame #15: swift-frontend`swift::CompilerInstance::forEachFileToTypeCheck(this=0x0000000009291c50, fn=function_ref<void (swift::SourceFile &)> @ 0x00007fffffff8c80)>) at Frontend.cpp:1042:7
frame #16: swift-frontend`swift::CompilerInstance::performSema(this=0x0000000009291c50) at Frontend.cpp:972:3
frame #17: swift-frontend`withSemanticAnalysis(Instance=0x0000000009291c50, observer=0x0000000000000000, cont=function_ref<bool (swift::CompilerInstance &)> @ 0x00007fffffff8d98)>) at FrontendTool.cpp:1029:12
frame #18: swift-frontend`performAction(Instance=0x0000000009291c50, ReturnValue=0x00007fffffff9050, observer=0x0000000000000000) at FrontendTool.cpp:1171:12
frame #19: swift-frontend`performCompile(Instance=0x0000000009291c50, ReturnValue=0x00007fffffff9050, observer=0x0000000000000000) at FrontendTool.cpp:1223:19
frame #20: swift-frontend`swift::performFrontend(Args=ArrayRef<const char *> @ 0x00007fffffffa288, Argv0="/wtsc/usr/bin/swift-frontend", MainAddr=0x00000000004831c0, observer=0x0000000000000000) at FrontendTool.cpp:2047:19
frame #21: swift-frontend`run_driver(ExecName=(Data = "swift-frontend", Length = 14), argv=const llvm::ArrayRef<const char *> @ 0x00007fffffffbc08) at driver.cpp:153:14
frame #22: swift-frontend`main(argc_=13, argv_=0x00007fffffffdbf8) at driver.cpp:348:12

The visitBraceStmt() method looks like this:

/* /wtsc/swift/lib/Sema/TypeCheckStmt.cpp */
...
Stmt *StmtChecker::visitBraceStmt(BraceStmt *BS) {
  ...
  for (auto &elem : BS->getElements())
    typeCheckASTNode(elem);
  return BS;
}
...
void StmtChecker::typeCheckASTNode(ASTNode &node) {
  // Type check the expression
  if (auto *E = node.dyn_cast<Expr *>()) {
  ...
    auto resultTy = TypeChecker::typeCheckExpression(E, DC, /*contextualInfo=*/{}, options);
    ...
  }

  // Type check the statement.
  if (auto *S = node.dyn_cast<Stmt *>()) {
    ...
    typeCheckStmt(S);
    ...
  }

  // Type check the declaration.
  if (auto *D = node.dyn_cast<Decl *>()) {
    TypeChecker::typeCheckDecl(D);
    ...
  }
...
}
...

That is, it type checks each element inside the braces, i.e. { xxx; xxx; }. So let us go on. print("Hello world!") is an expression that evaluates to a value, so we should look at the expression type-checking branch.
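The three-way dispatch in typeCheckASTNode() can be pictured with a small sketch. It assumes that an AST node holds exactly one of an expression, statement, or declaration pointer, as the dyn_cast calls above suggest. std::variant is my substitution for the compiler's own node type, and ToyASTNode and classify() are hypothetical names.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for the compiler's node classes.
struct Expr {};
struct Stmt {};
struct Decl {};

// A toy AST node holds exactly one of the three pointer kinds.
using ToyASTNode = std::variant<Expr *, Stmt *, Decl *>;

// Returns the name of the branch that would handle the node, testing the
// alternatives in the same order as the listing above.
std::string classify(const ToyASTNode &node) {
  if (std::holds_alternative<Expr *>(node))
    return "expression";
  if (std::holds_alternative<Stmt *>(node))
    return "statement";
  return "declaration";
}
```

In the real code the expression branch calls TypeChecker::typeCheckExpression(), whose definition follows.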

/* /wtsc/swift/lib/Sema/TypeCheckConstraints.cpp */
#pragma mark High-level entry points
Type TypeChecker::typeCheckExpression(Expr *&expr, DeclContext *dc,
                                      ContextualTypeInfo contextualInfo,
                                      TypeCheckExprOptions options) {
...
  auto resultTarget = typeCheckExpression(target, options);
...
}

Optional<SolutionApplicationTarget>
TypeChecker::typeCheckExpression(SolutionApplicationTarget &target,
    TypeCheckExprOptions options) {
...
  ConstraintSystem cs(dc, csOptions);
...
  // Attempt to solve the constraint system.
  auto viable = cs.solve(target, allowFreeTypeVariables);
  if (!viable) {
    target.setExpr(expr);
    return None;
  }
...
  // Apply the solution to the expression.
  auto resultTarget = cs.applySolution(solution, target);
...
  return *resultTarget;
}

The overload Optional<SolutionApplicationTarget> TypeChecker::typeCheckExpression(SolutionApplicationTarget &target, TypeCheckExprOptions options) implements what the Swift type checker documentation describes as three main stages: constraint generation, constraint solving, and solution application. The first two stages are implemented in Optional<std::vector<Solution>> ConstraintSystem::solve(), which in turn calls its implementation method. Constraint generation is performed there by calling bool ConstraintSystem::generateConstraints(). Once the constraints are generated, it calls bool ConstraintSystem::solve(SmallVectorImpl<Solution> &solutions, FreeTypeVariableBinding allowFreeTypeVariables) to solve them and return the solutions, as described in the documentation. The third stage, solution application, is the cs.applySolution(solution, target) call in the listing above.
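The three stages can be illustrated with a deliberately tiny model. This is only a sketch under strong assumptions, not how the Swift constraint solver works internally. It handles a single call with one literal argument, knows only equality constraints (the real print call involves a conversion to Any, which is ignored here), and every name in it (ToyCall, TypeVar, Constraint, typeCheck) is hypothetical.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// A toy call expression: one callee with a known signature, one literal argument.
struct ToyCall {
  std::string paramType;      // declared parameter type of the callee
  std::string resultType;     // declared result type of the callee
  std::string argLiteralType; // type the literal argument naturally has
  std::string argType;        // empty until solution application
  std::string callType;       // empty until solution application
};

// One type variable per node whose type is still unknown.
enum TypeVar { ArgVar, CallVar };

// Equality constraint: type variable `var` must be exactly `type`.
struct Constraint {
  TypeVar var;
  std::string type;
};

using Solution = std::map<TypeVar, std::string>;

// Stage 1: constraint generation. Walk the expression and record requirements.
std::vector<Constraint> generateConstraints(const ToyCall &call) {
  return {
      {ArgVar, call.argLiteralType}, // the literal has its natural type
      {ArgVar, call.paramType},      // the argument must match the parameter
      {CallVar, call.resultType},    // the call has the callee's result type
  };
}

// Stage 2: constraint solving. Bind each variable, fail on a conflict.
std::optional<Solution> solve(const std::vector<Constraint> &constraints) {
  Solution bindings;
  for (const Constraint &c : constraints) {
    auto [it, inserted] = bindings.emplace(c.var, c.type);
    if (!inserted && it->second != c.type)
      return std::nullopt; // two constraints disagree: no solution
  }
  return bindings;
}

// Stage 3: solution application. Write the solved types back into the AST.
void applySolution(const Solution &solution, ToyCall &call) {
  call.argType = solution.at(ArgVar);
  call.callType = solution.at(CallVar);
}

// Runs the three stages in order; the expression is left untouched on failure.
bool typeCheck(ToyCall &call) {
  auto solution = solve(generateConstraints(call));
  if (!solution)
    return false;
  applySolution(*solution, call);
  return true;
}
```

With a callee that takes a String and returns (), a String literal argument type checks and the call ends up with type (). Changing the parameter type to Int makes solve() return no solution, so the types stay empty, much like the type='<null>' fields in the untyped dump.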

One more thing to note is that type checking is performed at the level of a single expression or statement. As the documentation puts it:

Swift limits the scope of type inference to a single expression or statement, for purely practical reasons: we expect that we can provide better performance and vastly better diagnostics when the problem is limited in scope.

That is the main structure of semantic analysis in Swift. Next, we will look at SIL code generation based on the type-checked AST we obtained in this post.