Rust编译器原理-第7章 Trait 静态分发：零成本抽象的编译器实现第7章 Trait 静态分发：零成本抽象的编译器

《Rust 编译器原理》完整目录

前言

第1章编译管线全景：从源码到机器码的完整旅程

第2章所有权系统：编译期内存管理的核心机制

第3章借用检查器：编译器如何证明内存安全

第4章生命周期：编译器如何推断引用的有效范围

第5章内存布局：编译器如何排列数据

第6章单态化：泛型的编译期展开

第7章 Trait 静态分发：零成本抽象的编译器实现（当前）

第8章 Trait Object 与虚表：运行时多态的内存布局

第9章 async/await：状态机的编译器变换

第10章 Pin、Waker 与 Future：异步运行时的三大支柱

第11章闭包：匿名函数的编译器实现

第12章 unsafe：安全抽象的逃生舱

第13章 FFI：与 C 世界的桥梁

第14章宏系统：编译期的元编程引擎

第15章 MIR 优化：编译器的中间表示与优化管线

第16章 LLVM 代码生成：从 MIR 到机器码

第17章增量编译：让重编译只做必要的事

第18章设计哲学与架构决策

第7章 Trait 静态分发：零成本抽象的编译器实现

"零成本抽象不是没有代价——代价在编译期全部付清。理解编译器如何在编译期解析每一个 trait 方法调用，你就理解了 Rust 性能承诺的底层机制。"

:::tip 本章要点

静态分发的本质：编译器在编译期将 trait 方法调用解析为确定的函数地址，运行时零间接跳转
Trait 选择算法：SelectionContext 通过候选组装、求值筛选、确认三阶段找到唯一匹配的 impl
与单态化的衔接：第6章的单态化生成代码副本，本章的 trait 选择决定每份副本中调用哪个 impl
Impl 查找优化：TraitImpls 使用 SimplifiedType 索引实现快速筛选
一致性规则：编译器在 coherence 阶段证明全局不存在重叠 impl
特化：nightly 特性，允许更具体的 impl 覆盖更一般的 impl
关联类型解析：投影机制将 <T as Trait>::Assoc 在编译期规范化为具体类型
性能分析：静态分发生成的汇编与手写代码完全一致 :::

7.1 静态分发的本质：编译期方法解析

当你写下带有 trait bound 的泛型函数时，编译器面临核心问题：x.method() 到底调用哪个函数？

trait Shape { fn area(&self) -> f64; }

struct Circle { radius: f64 }
struct Rectangle { width: f64, height: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
}
impl Shape for Rectangle {
    fn area(&self) -> f64 { self.width * self.height }
}

fn print_area<T: Shape>(shape: &T) {
    println!("面积: {}", shape.area());  // 调用哪个 area？
}

fn main() {
    print_area(&Circle { radius: 5.0 });               // 编译器知道 T=Circle
    print_area(&Rectangle { width: 3.0, height: 4.0 }); // 编译器知道 T=Rectangle
}

在 C++ 虚函数中，shape.area() 在运行时通过 vtable 查找。而 Rust 的静态分发中，编译器在编译期就完成了全部决策——调用点 A 直接生成对 <Circle as Shape>::area 的调用，调用点 B 直接生成对 <Rectangle as Shape>::area 的调用。

graph TB
    A["trait 方法调用<br/>shape.area()"] --> B{"编译器判断分发方式"}

    B -->|"泛型 / impl Trait<br/>编译期已知具体类型"| C["静态分发路径"]
    B -->|"dyn Trait<br/>类型被擦除"| D["动态分发路径"]

    C --> C1["Trait 选择 → 候选组装 → 求值筛选 → 确认"]
    C1 --> C2["单态化：生成具体函数"]
    C2 --> C3["直接调用指令 call concrete_fn"]

    D --> D1["构建 fat pointer + 运行时查 vtable"]
    D1 --> D2["间接调用指令 call [vtable+offset]"]

    style C fill:#059669,color:#fff,stroke:none
    style C3 fill:#10b981,color:#fff,stroke:none
    style D fill:#d97706,color:#fff,stroke:none
    style D2 fill:#f59e0b,color:#fff,stroke:none

7.2 Trait 选择算法：三阶段流程

Trait 选择是静态分发的核心，回答根本问题：对于给定的类型 T 和 trait Trait，应该使用哪个 impl？ 在 rustc 中由 SelectionContext 驱动，分为三阶段。

7.2.1 SelectionContext 与候选种类

// compiler/rustc_trait_selection/src/traits/select/mod.rs
pub struct SelectionContext<'cx, 'tcx> {
    pub infcx: &'cx InferCtxt<'tcx>,          // 类型推断上下文
    freshener: TypeFreshener<'cx, 'tcx>,       // 递归检测用
    intercrate_ambiguity_causes: Option<FxIndexSet<IntercrateAmbiguityCause<'tcx>>>,
    query_mode: TraitQueryMode,
}

rustc 定义了丰富的候选种类体系：

// compiler/rustc_middle/src/traits/select.rs
pub enum SelectionCandidate<'tcx> {
    SizedCandidate,                                     // 内建 Sized
    BuiltinCandidate,                                   // 内建 Copy/Clone 等
    ParamCandidate(ty::PolyTraitPredicate<'tcx>),       // where 子句
    ImplCandidate(DefId),                               // 用户 impl
    AutoImplCandidate,                                  // Send/Sync 自动实现
    ProjectionCandidate { idx: usize, kind: AliasBoundKind }, // 投影 bound
    ClosureCandidate { is_const: bool },                // 闭包 Fn trait
    ObjectCandidate(usize),                             // dyn Trait supertrait
    // ... CoroutineCandidate, FutureCandidate 等
}

graph LR
    subgraph "阶段1: 候选组装"
        A1["收集 impl/where子句/内建"] --> A2["SelectionCandidateSet"]
    end
    subgraph "阶段2: 求值筛选"
        B1["evaluate_candidate"] --> B2["winnow_candidates"]
        B2 --> B3["选出唯一候选"]
    end
    subgraph "阶段3: 确认"
        C1["统一类型参数 + 检查嵌套义务"] --> C2["ImplSource"]
    end
    A2 --> B1
    B3 --> C1
    style A2 fill:#3b82f6,color:#fff,stroke:none
    style B3 fill:#8b5cf6,color:#fff,stroke:none
    style C2 fill:#059669,color:#fff,stroke:none

7.2.2 阶段一：候选组装

assemble_candidates 根据 trait 种类采用不同策略。对于 Sized/Copy 等语言级 trait 使用内建规则，对于一般 trait 则遍历 impl 和 where 子句：

// candidate_assembly.rs（简化）
pub(super) fn assemble_candidates<'o>(
    &mut self, stack: &TraitObligationStack<'o, 'tcx>,
) -> Result<SelectionCandidateSet<'tcx>, SelectionError<'tcx>> {
    // Self 是推断变量？直接返回歧义
    if obligation.predicate.skip_binder().self_ty().is_ty_var() {
        return Ok(SelectionCandidateSet { vec: vec![], ambiguous: true });
    }
    let mut candidates = SelectionCandidateSet { vec: Vec::new(), ambiguous: false };

    match tcx.as_lang_item(def_id) {
        Some(LangItem::Sized) => self.assemble_builtin_sized_candidate(..),
        Some(LangItem::Copy | LangItem::Clone) => {
            self.assemble_candidates_from_impls(obligation, &mut candidates);
            self.assemble_builtin_copy_clone_candidate(..);
        }
        _ => {
            self.assemble_candidates_from_impls(obligation, &mut candidates);
            self.assemble_candidates_from_caller_bounds(stack, &mut candidates)?;
        }
    }
    Ok(candidates)
}

优先级规则至关重要：where 子句候选优先于 impl 候选（保证泛型上下文可预测），内建候选优先于用户 impl，更具体 impl 优先于一般 impl（特化基础）。

7.2.3 阶段二：求值与筛选

编译器对每个候选求值，保留"可能适用"的，再从中选出最佳：

// select/mod.rs（简化关键逻辑）
fn candidate_from_obligation_no_cache(&mut self, stack: &TraitObligationStack<'o, 'tcx>) {
    let candidate_set = self.assemble_candidates(stack)?;
    let mut candidates = self.filter_impls(candidates, stack.obligation);

    if candidates.len() == 1 { return candidates.pop().unwrap(); }

    // 多个候选：逐一求值，过滤掉不适用的
    let evaluated: Vec<_> = candidates.into_iter()
        .filter_map(|c| {
            let eval = self.evaluate_candidate(stack, &c).ok()?;
            eval.may_apply().then(|| EvaluatedCandidate { candidate: c, evaluation: eval })
        }).collect();

    // winnow_candidates 从中选出唯一最佳候选
    self.winnow_candidates(has_non_region_infer, preference_mode, evaluated)
}

TraitObligationStack 维护调用栈检测递归。对于 auto trait（Send/Sync），coinductive 循环被允许——reached_depth 字段追踪循环深度。

7.2.4 阶段三：确认

选出唯一候选后，确认阶段统一类型参数并收集嵌套义务：

// select/confirmation.rs（关键分支）
pub(super) fn confirm_candidate(&mut self, obligation: &PolyTraitObligation<'tcx>,
    candidate: SelectionCandidate<'tcx>) -> Result<Selection<'tcx>, SelectionError<'tcx>> {
    Ok(match candidate {
        ParamCandidate(param) => ImplSource::Param(
            self.confirm_param_candidate(obligation, param.map_bound(|t| t.trait_ref))
        ),
        ImplCandidate(impl_def_id) => ImplSource::UserDefined(
            self.confirm_impl_candidate(obligation, impl_def_id)
        ),
        BuiltinCandidate => ImplSource::Builtin(BuiltinImplSource::Misc,
            self.confirm_builtin_candidate(obligation)
        ),
        // ... 闭包、协程等其他候选的确认
    })
}

7.3 Impl 查找：两级索引的高效设计

一个 trait 可能有成百上千个 impl（如 Display、From）。编译器不能逐一做完整统一检查。

7.3.1 TraitImpls 数据结构

rustc 使用精巧的两级索引：

// compiler/rustc_middle/src/ty/trait_def.rs
pub struct TraitImpls {
    blanket_impls: Vec<DefId>,                              // impl<T: Bound> Trait for T
    non_blanket_impls: FxIndexMap<SimplifiedType, Vec<DefId>>, // 按类型"指纹"索引
}

SimplifiedType 是类型的粗粒度指纹：i32 → Int(I32)，Vec<String> → Adt(vec_def_id)，类型参数 T → 无法简化（归入 blanket）。

7.3.2 for_each_relevant_impl：快速筛选

// compiler/rustc_middle/src/ty/trait_def.rs
pub fn for_each_relevant_impl(self, trait_def_id: DefId, self_ty: Ty<'tcx>, mut f: impl FnMut(DefId)) {
    let impls = self.trait_impls_of(trait_def_id);
    // blanket impls 总是潜在候选
    for &impl_def_id in impls.blanket_impls.iter() { f(impl_def_id); }
    // non-blanket impls：按 SimplifiedType 过滤
    if let Some(simp) = fast_reject::simplify_type(self, self_ty, TreatParams::AsRigid) {
        if let Some(impls) = impls.non_blanket_impls.get(&simp) {
            for &impl_def_id in impls { f(impl_def_id); }
        }
    } else {
        // 无法简化（推断变量等）：检查所有
        for &impl_def_id in impls.non_blanket_impls.values().flatten() { f(impl_def_id); }
    }
}

查找 Vec<i32>: Display 时，Vec<i32> 简化为 Adt(vec_def_id)，只检查 Self 类型同为 Vec 的 impls，impl Display for i32 等无关 impl 被直接跳过。

注意 TreatParams 的不对称：构建索引时用 InstantiateWithInfer（使 blanket impl 正确归类），查询时用 AsRigid（精确匹配）。如果把 impl<T> Trait for T 中的 T 也简化了，就会错误地放入某个具体类型的桶中，导致查找遗漏。

7.3.3 impl 索引的构建

impl 索引通过查询系统 trait_impls_of 懒加载并缓存。构建时遍历当前 crate 和所有依赖 crate 的 impl：

// compiler/rustc_middle/src/ty/trait_def.rs
pub(super) fn trait_impls_of_provider(tcx: TyCtxt<'_>, trait_id: DefId) -> TraitImpls {
    let mut impls = TraitImpls::default();
    // 从依赖 crate 的元数据收集
    if !trait_id.is_local() {
        for &cnum in tcx.crates(()).iter() {
            for &(impl_def_id, simplified_self_ty) in
                tcx.implementations_of_trait((cnum, trait_id)).iter() {
                match simplified_self_ty {
                    Some(st) => impls.non_blanket_impls.entry(st).or_default().push(impl_def_id),
                    None => impls.blanket_impls.push(impl_def_id),
                }
            }
        }
    }
    // 从当前 crate 收集，对 Self 类型做 simplify_type 分类
    for &impl_def_id in tcx.local_trait_impls(trait_id) {
        let impl_self_ty = tcx.type_of(impl_def_id.to_def_id()).instantiate_identity();
        match fast_reject::simplify_type(tcx, impl_self_ty, TreatParams::InstantiateWithInfer) {
            Some(st) => impls.non_blanket_impls.entry(st).or_default().push(impl_def_id.to_def_id()),
            None => impls.blanket_impls.push(impl_def_id.to_def_id()),
        }
    }
    impls
}

7.4 与单态化的衔接

第6章的单态化为每个具体类型生成函数副本。trait 选择决定副本中的方法调用指向哪里。衔接点在 resolve_associated_item：

// compiler/rustc_ty_utils/src/instance.rs
fn resolve_associated_item<'tcx>(tcx: TyCtxt<'tcx>, trait_item_id: DefId,
    typing_env: ty::TypingEnv<'tcx>, trait_id: DefId, rcvr_args: GenericArgsRef<'tcx>,
) -> Result<Option<Instance<'tcx>>, ErrorGuaranteed> {
    let trait_ref = ty::TraitRef::from_assoc(tcx, trait_id, rcvr_args);
    let vtbl = tcx.codegen_select_candidate(typing_env.as_query_input(trait_ref))?;

    match vtbl {
        traits::ImplSource::UserDefined(impl_data) => {
            let trait_def = tcx.trait_def(trait_def_id);
            // 通过特化图找到最具体的实现（叶节点）
            let leaf_def = trait_def.ancestors(tcx, impl_data.impl_def_id)?
                .leaf_def(tcx, trait_item_id).unwrap();

            // 非特化项总是可用；特化默认项需等到 PostAnalysis
            let eligible = leaf_def.is_final() || (
                matches!(typing_env.typing_mode(), ty::TypingMode::PostAnalysis)
                && !trait_ref.still_further_specializable()
            );
            if !eligible { return Ok(None); }

            // translate_args：将泛型参数从 trait 空间转换到 impl 空间
            let args = rcvr_args.rebase_onto(tcx, trait_def_id, impl_data.args);
            let args = translate_args(&infcx, param_env, impl_data.impl_def_id, args, leaf_def.defining_node);
            Some(ty::Instance::new_raw(leaf_def.item.def_id, args))
        }
        // ...
    }
}

translate_args 处理 trait 与 impl 之间不同的泛型参数结构。例如 trait Foo<A> + impl<T: Clone> Foo<T> for Vec<T>，解析 <Vec<String> as Foo<String>>::bar 时需将 trait 参数 (Self=Vec<String>, A=String) 映射到 impl 参数 T=String。

sequenceDiagram
    participant Mono as 单态化
    participant TSelect as Trait 选择
    participant Lookup as Impl 查找
    participant Codegen as 代码生成

    Mono->>TSelect: 解析 <String as Greet>::hello
    TSelect->>Lookup: 查找 String: Greet
    Lookup->>Lookup: SimplifiedType::Adt(String)
    Lookup-->>TSelect: ImplCandidate(impl_def_id)
    TSelect->>TSelect: confirm → ImplSource::UserDefined
    TSelect-->>Mono: {impl_def_id, args}
    Mono->>Codegen: resolve_associated_item
    Codegen->>Codegen: translate_args + Instance::new_raw
    Note over Codegen: 生成 call <String as Greet>::hello（直接调用）

7.5 一致性规则与孤儿规则

一致性保证关键不变量：对任意 (Type, Trait) 对，全局最多一个 impl。

7.5.1 孤儿规则

要写 impl Trait for Type，要么 Trait 定义在当前 crate，要么 Type 定义在当前 crate。#[fundamental] 类型（&T、Box<T>）可以"看穿"——你可以为 &MyType 实现外部 trait。

impl Display for MyType { ... }           // 合法：MyType 是本地类型
impl Display for MyWrapper<Vec<i32>> { }  // 合法：MyWrapper 是本地类型
// impl Display for Vec<i32> { ... }      // 非法！两者都不是本地类型

7.5.2 重叠检查

编译器必须证明不存在两个 impl 同时适用于同一类型：

// compiler/rustc_trait_selection/src/traits/coherence.rs
pub fn overlapping_trait_impls(tcx: TyCtxt<'_>, impl1_def_id: DefId, impl2_def_id: DefId,
    skip_leak_check: SkipLeakCheck, overlap_mode: OverlapMode,
) -> Option<OverlapResult<'_>> {
    // 快速排除：DeepRejectCtxt 比较类型结构，不创建推断上下文
    let impl1_args = tcx.impl_trait_ref(impl1_def_id).skip_binder().args;
    let impl2_args = tcx.impl_trait_ref(impl2_def_id).skip_binder().args;
    if !DeepRejectCtxt::relate_infer_infer(tcx).args_may_unify(impl1_args, impl2_args) {
        return None; // 类型结构不同，不可能重叠
    }
    // 完整检查：创建推断上下文，尝试统一两个 impl 的参数
    overlapping_impls(tcx, impl1_def_id, impl2_def_id, skip_leak_check, overlap_mode, true)
}

DeepRejectCtxt 是高效的预筛选器——impl Trait for Vec<i32> vs impl Trait for HashMap<K,V> 外层类型不同，直接排除；vs impl<T> Trait for Vec<T> 外层相同，需要完整检查。

ImplHeader 包含 impl 头部的全部信息（Self 类型、trait ref、谓词），一致性检查还要保守地考虑下游 crate 未来可能添加的 impl。

7.6 特化：更具体的 impl 覆盖更一般的

特化（#![feature(specialization)]/#![feature(min_specialization)]）允许重叠 impl，选择更具体的：

#![feature(specialization)]
trait Serialize { fn serialize(&self) -> Vec<u8>; }

impl<T: Debug> Serialize for T {
    default fn serialize(&self) -> Vec<u8> { format!("{:?}", self).into_bytes() }
}
impl Serialize for String {
    fn serialize(&self) -> Vec<u8> { self.as_bytes().to_vec() }  // 更高效
}

7.6.1 特化图与 specializes 检查

编译器通过特化图管理 impl 优先级——树状结构，根节点是 trait 定义，子节点是更具体的 impl：

// compiler/rustc_trait_selection/src/traits/specialize/specialization_graph.rs
enum Inserted<'tcx> {
    BecameNewSibling(Option<FutureCompatOverlapError<'tcx>>),
    ReplaceChildren(Vec<DefId>),    // 新 impl 特化了现有子节点
    ShouldRecurseOn(DefId),         // 新 impl 是某子节点的进一步特化
}

specializes 函数通过蕴含检查验证特化关系：在 specializing impl 的 where 子句环境下，证明 parent impl 也成立。直觉上，A 特化 B 意味着 A 适用的所有类型 B 也适用，但 B 适用的某些类型 A 不适用。

// compiler/rustc_middle/src/ty/trait_def.rs
pub enum TraitSpecializationKind {
    None,            // 默认：不允许特化
    Marker,          // 无方法的 marker trait，安全特化
    AlwaysApplicable, // 生命周期无关的"总是适用" trait
}

min_specialization 比 specialization 更保守——只允许对特定标记 trait 特化，避免健全性问题。标准库内部大量使用它优化性能关键路径。

7.7 关联类型解析

编译器需要将 <T as Iterator>::Item 解析为具体类型，称为投影规范化，代码在 project.rs：

// compiler/rustc_trait_selection/src/traits/project.rs
enum ProjectionCandidate<'tcx> {
    ParamEnv(ty::PolyProjectionPredicate<'tcx>), // where 子句
    TraitDef(ty::PolyProjectionPredicate<'tcx>), // trait 定义中的 bound
    Object(ty::PolyProjectionPredicate<'tcx>),   // dyn Trait bound
    Select(Selection<'tcx>),                      // 来自 impl
}

ParamEnv 候选始终优先于 impl 候选。这保证了泛型上下文正确性：

trait Foo { type Output; }
impl<T> Foo for T { type Output = T; }
fn bar<T: Foo<Output = String>>(x: T) {
    // <T as Foo>::Output 应该是 String（where 子句），不是 T（blanket impl）
}

规范化流程：先通过 trait 选择找到 impl → 从 impl 获取关联类型定义 → 与期望类型统一。

// project.rs（简化核心流程）
fn project_and_unify_term<'cx, 'tcx>(selcx: &mut SelectionContext<'cx, 'tcx>,
    obligation: &ProjectionObligation<'tcx>) -> ProjectAndUnifyResult<'tcx> {
    // 第1步：通过 trait 选择找到 impl，获取关联类型的具体定义
    let normalized = opt_normalize_projection_term(selcx, obligation.param_env,
        obligation.predicate.projection_term, ...)?;
    // 第2步：将规范化结果与期望类型统一
    match infcx.at(&obligation.cause, obligation.param_env)
        .eq(DefineOpaqueTypes::Yes, normalized, obligation.predicate.term) {
        Ok(InferOk { obligations, .. }) => ProjectAndUnifyResult::Holds(obligations),
        Err(err) => ProjectAndUnifyResult::MismatchedProjectionTypes(err),
    }
}

7.8 Where 子句与 Trait Bound 验证

调用带 trait bound 的函数时，编译器生成义务并通过 FulfillmentContext 验证：

fn process<T: Clone + Debug + Send>(item: T) { ... }
// process(Rc::new(42)) 生成三个义务：
// Rc<i32>: Clone → 成功 | Rc<i32>: Debug → 成功 | Rc<i32>: Send → 失败！

// compiler/rustc_trait_selection/src/traits/fulfill.rs
pub struct FulfillmentContext<'tcx, E: 'tcx> {
    predicates: ObligationForest<PendingPredicateObligation<'tcx>>,
    usable_in_snapshot: usize,
    _errors: PhantomData<E>,
}

ObligationForest 以森林形式组织义务，支持增量处理——遇歧义时挂起，获得更多类型信息后重试。

编译器的谓词体系区分 Predicate（广义）和 Clause（where 子句子集）：

pub enum ClauseKind<'tcx> {
    Trait(TraitPredicate<'tcx>),              // T: Trait
    Projection(ProjectionPredicate<'tcx>),     // <T as Trait>::Assoc = U
    TypeOutlives(TypeOutlivesPredicate<'tcx>), // T: 'a
    RegionOutlives(RegionOutlivesPredicate<'tcx>), // 'a: 'b
    // ...
}

7.9 Blanket 实现的编译器处理

impl<T: Display> ToString for T 这类 blanket impl 在 TraitImpls 中存储于 blanket_impls 向量——无法被 SimplifiedType 索引，任何查找都必须考虑。

查找 String: ToString 时：找到 blanket impl → 代入 T=String → 产生嵌套义务 String: Display → 解析成功。

Blanket impl 与单态化协同效果显著：

// 源码中的 blanket impl
impl<T: Display> ToString for T {
    fn to_string(&self) -> String {
        let mut buf = String::new();
        fmt::Write::write_fmt(&mut buf, format_args!("{}", self)).unwrap();
        buf
    }
}

// 单态化为 T=i32 后，编译器生成：
// fn <i32 as ToString>::to_string(self: &i32) -> String {
//     // Display::fmt 被静态解析为 <i32 as Display>::fmt —— 直接调用！
//     ...
// }

每个副本中的 Display::fmt 调用都被静态分发到对应类型的具体实现。这就是 blanket impl + 单态化 + 静态分发三者协作的结果。

7.10 性能分析：零成本的汇编级证明

考虑以下代码：

trait Doubler { fn double(&self) -> i64; }
impl Doubler for i64 {
    #[inline(never)]
    fn double(&self) -> i64 { self * 2 }
}
fn static_dispatch<T: Doubler>(x: &T) -> i64 { x.double() }
fn dynamic_dispatch(x: &dyn Doubler) -> i64 { x.double() }

static_dispatch::<i64> 的汇编（x86-64, -O2）：

static_dispatch:
    mov     rax, qword ptr [rdi]    ; 读取 *x
    shl     rax, 1                  ; 乘2优化为左移（内联后）
    ret

dynamic_dispatch 的汇编：

dynamic_dispatch:
    mov     rax, qword ptr [rsi + 24]  ; vtable 加载函数指针
    mov     rdi, qword ptr [rdi]       ; 加载数据指针
    call    rax                         ; 间接调用！
    ret

特征	静态分发	动态分发
调用指令	`call concrete_fn`（直接）	`call rax`（间接）
可否内联	可以	不可能
分支预测	100% 命中	可能 miss
额外内存访问	0 次	2 次

静态分发的真正威力在于内联后的优化级联：x.double() 内联为 x * 2 → 迭代器融合为单循环 → SIMD 向量化 → 循环展开。动态分发因无法内联，这些优化全部失效。

代价在编译期：N 个类型实例化同一泛型函数 → O(N) 倍编译时间和二进制大小。

7.11 TraitDef：Trait 的编译器表示

// compiler/rustc_middle/src/ty/trait_def.rs
pub struct TraitDef {
    pub def_id: DefId,
    pub safety: hir::Safety,                  // safe / unsafe trait
    pub constness: hir::Constness,            // const trait
    pub has_auto_impl: bool,                  // Send/Sync 等 auto trait
    pub is_marker: bool,                      // marker trait: impl 允许重叠
    pub is_coinductive: bool,                 // 允许求解器循环
    pub is_fundamental: bool,                 // 影响孤儿规则
    pub specialization_kind: TraitSpecializationKind,
    pub deny_explicit_impl: bool,             // 完全内建 trait
    pub must_implement_one_of: Option<Box<[Ident]>>,
    // ...
}

关键字段的影响：is_coinductive 允许 auto trait 的递归证明（struct Node { children: Vec<Node> } 中的 Node: Send 循环）；is_marker 允许 marker trait 的 impl 重叠（无方法体，无歧义）；is_fundamental 修改孤儿规则使 &MyType 可实现外部 trait。

7.12 完整流程总结

graph TB
    subgraph "1. 类型检查"
        A1["推断 T = Article"] --> A2["生成义务 Article: Summary"]
        A2 --> A3["FulfillmentContext 验证"]
    end
    subgraph "2. Trait 选择"
        B1["assemble_candidates"] --> B2["找到 ImplCandidate"]
        B2 --> B3["confirm → ImplSource::UserDefined"]
    end
    subgraph "3. 单态化 + 代码生成"
        C1["生成 notify::Article"] --> C2["resolve_associated_item"]
        C2 --> C3["translate_args → Instance"]
        C3 --> C4["emit call Article::summarize"]
    end
    A3 --> B1
    B3 --> C1
    style A3 fill:#3b82f6,color:#fff,stroke:none
    style B3 fill:#8b5cf6,color:#fff,stroke:none
    style C4 fill:#059669,color:#fff,stroke:none

7.13 边界情况

推断变量与歧义：let x = Default::default() 中 Self 是推断变量，编译器返回歧义，挂起义务等待类型信息。添加 let _: i32 = x 后推断解析，义务满足。ObligationForest 支持这种增量处理——义务在多个推断步骤间反复被尝试求解。

递归与 coinduction：考虑 struct Node { value: i32, children: Vec<Node> } 检查 Node: Send：需要 i32: Send（是）且 Vec<Node>: Send，而后者需要 Node: Send——形成循环。对于 coinductive auto trait，编译器允许此循环并视为成功。TraitObligationStack 的 reached_depth 追踪循环深度，参与循环的节点放弃缓存以避免错误结果。

负面推理：编译器可推断 Rc<i32>: !Send（因为 Rc 内部使用非原子引用计数）。在一致性检查中，coherence.rs 的 impl_intersection_has_negative_obligation 函数利用负面 trait bound 证明两个 impl 互斥：

// 理论上（需要 negative_impls 特性）：
impl<T: Send> Process for T { ... }
impl<T: !Send> Process for T { ... }
// 不重叠：T: Send 和 T: !Send 互斥

7.14 本章小结

本章深入 rustc 源码解析了 trait 静态分发的完整机制。SelectionContext 的三阶段选择算法是核心引擎；TraitImpls 的两级索引实现高效 impl 查找；一致性规则从制度层面保证选择确定性；特化图管理 impl 优先级；投影规范化处理关联类型；resolve_associated_item 连接 trait 选择与代码生成。

最终结果是：静态分发生成的机器码与手写等效代码完全一致，零间接调用、可内联、可触发后续优化级联。这就是"零成本抽象"的编译器实现。

下一章我们将深入 dyn Trait 的内部——fat pointer 内存布局、vtable 构建过程，以及动态分发的完整编译器实现。