C++ Object Memory Layout with Inheritance and Virtual Functions


Inheritance

class Base {  
 public:  
  int a;  
};

class Derived : public Base {  
 public:  
  int d;  
};

Memory layout:

*** Dumping AST Record Layout  
         0 | class Base  
         0 |   int a  
           | [sizeof=4, dsize=4, align=4,  
           |  nvsize=4, nvalign=4]  
  
*** Dumping AST Record Layout  
         0 | class Derived  
         0 |   class Base (base)  
         0 |     int a  
         4 |   int d  
           | [sizeof=8, dsize=8, align=4,  
           |  nvsize=8, nvalign=4]

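In other words, the member variables of the base class are placed at the very start of the derived object, and the derived class's own members follow them.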
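The layout above can also be checked from inside a program. The following is a minimal sketch, not part of the original write-up: the helper names `offset_in` and `check_simple_layout` are made up for illustration, and the numeric offsets assume a mainstream ABI where the base sub-object is placed first.

```cpp
#include <cassert>
#include <cstddef>

class Base {
 public:
  int a;
};

class Derived : public Base {
 public:
  int d;
};

// Byte offset of address `p` from the start of `obj` (hypothetical helper).
template <class T>
std::ptrdiff_t offset_in(const T& obj, const void* p) {
  return static_cast<const char*>(p) - reinterpret_cast<const char*>(&obj);
}

// Mirrors the record layout dump: Base at 0, a at 0, d right after a.
// Assumption: the ABI places the base sub-object at the start of the object.
void check_simple_layout() {
  Derived obj{};
  const Base* as_base = &obj;  // implicit derived-to-base conversion
  assert(offset_in(obj, as_base) == 0);
  assert(offset_in(obj, &obj.a) == 0);
  assert(offset_in(obj, &obj.d) == static_cast<std::ptrdiff_t>(sizeof(int)));
}
```

Calling `check_simple_layout()` from `main` is enough to exercise it; the pointer comparison is used instead of `offsetof` because `Derived` is not a standard-layout class.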

Virtual functions

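Sometimes we need polymorphism, and polymorphism relies on virtual functions. In short, polymorphism means pointing a base-class pointer at an instance of a derived class and then calling the derived class's member function through that base-class pointer.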
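As a small illustration of that definition, here is a sketch with made-up class names (`Animal`, `Dog`) and a made-up helper `speak`; none of them come from the original text.

```cpp
#include <cassert>
#include <string>

struct Animal {
  virtual ~Animal() = default;
  virtual std::string sound() const { return "..."; }
};

struct Dog : Animal {
  std::string sound() const override { return "Woof"; }
};

// The static type of `a` is Animal*, but the call is dispatched on the
// dynamic type of the object it points to.
std::string speak(const Animal* a) { return a->sound(); }

void check_polymorphism() {
  Dog dog;
  Animal plain;
  assert(speak(&dog) == "Woof");   // Dog's override runs
  assert(speak(&plain) == "...");  // Animal's own version runs
}
```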

Single inheritance

#include <iostream>

struct Base {
  virtual void f() { std::cout << "Base\n"; }  
};  
  
struct Derived : Base {  
  void f() override { std::cout << "Derived\n"; }  
};

Compiling with clang and adding -Xclang -fdump-record-layouts -Xclang -fdump-vtable-layouts prints the following structures:

*** Dumping AST Record Layout  
         0 | struct Base  
         0 |   (Base vtable pointer)  
           | [sizeof=8, dsize=8, align=8,  
           |  nvsize=8, nvalign=8]  
  
*** Dumping AST Record Layout  
         0 | struct Derived  
         0 |   struct Base (primary base)  
         0 |     (Base vtable pointer)  
           | [sizeof=8, dsize=8, align=8,  
           |  nvsize=8, nvalign=8]

Original map  
Vtable for 'Base' (3 entries).  
   0 | offset_to_top (0)  
   1 | Base RTTI  
       -- (Base, 0) vtable address --  
   2 | void Base::f()  
  
VTable indices for 'Base' (1 entries).  
   0 | void Base::f()  
  
Original map  
 void Derived::f() -> void Base::f()  
Vtable for 'Derived' (3 entries).  
   0 | offset_to_top (0)  
   1 | Derived RTTI  
       -- (Base, 0) vtable address --  
       -- (Derived, 0) vtable address --  
   2 | void Derived::f()

If Derived does not override f(), the vtable becomes:

Vtable for 'Derived' (3 entries).  
   0 | offset_to_top (0)  
   1 | Derived RTTI  
       -- (Base, 0) vtable address --  
       -- (Derived, 0) vtable address --  
   2 | void Base::f()

The Base struct contains nothing but a vptr, so its size is 8 bytes. The same goes for Derived.

In the virtual function table (Vtable for 'xxx'):

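  • offset_to_top (0): describes the memory layout. It is the offset from the place in the object where this vptr is stored back to the top of the object. (The object itself lives on the heap or the stack, whereas the vtable lives in read-only data such as .rodata.)
  • Derived RTTI: run-time type information, used by dynamic_cast and typeid.
  • void Derived::f(): the function entry.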
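The three kinds of entries listed above can be read back at run time. The sketch below is only an illustration and leans on two assumptions: the vptr is the first word of the object, and the vtable follows the Itanium C++ ABI layout shown in the dumps (offset_to_top two slots before the address point, the RTTI pointer one slot before). It will not work under the MSVC ABI. The helper names `load_vptr`, `offset_to_top` and `rtti_of` are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <typeinfo>

struct Base {
  virtual void f() {}
};

struct Derived : Base {
  void f() override {}
};

// Copies the vptr out of a polymorphic (sub-)object.
// Assumption: the vptr is stored in the first word of the object.
const void* const* load_vptr(const void* object) {
  const void* const* vptr;
  std::memcpy(&vptr, object, sizeof vptr);
  return vptr;
}

// Assumption (Itanium C++ ABI): offset_to_top is two slots before the
// address point that the vptr refers to.
std::ptrdiff_t offset_to_top(const void* object) {
  std::ptrdiff_t value;
  std::memcpy(&value, load_vptr(object) - 2, sizeof value);
  return value;
}

// Assumption (Itanium C++ ABI): the RTTI pointer is one slot before the
// address point.
const std::type_info& rtti_of(const void* object) {
  return *static_cast<const std::type_info*>(load_vptr(object)[-1]);
}

void check_vtable_header() {
  Derived d;
  Base* b = &d;
  assert(offset_to_top(b) == 0);          // vptr sits at the top of the object
  assert(rtti_of(b) == typeid(Derived));  // the slot describes the dynamic type
}
```

Poking at the vptr like this is outside what the C++ standard guarantees, so treat it as a debugging aid for confirming the dump, not as something to rely on in real code.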

Multiple inheritance

#include <iostream>
#include <typeinfo>
  
class Base {  
 public:  
  virtual void f() { std::cout << "Base\n"; }  
};  
  
class Base1 {  
 public:  
  virtual void f1() {  
    std::cout << "Base1" << std::endl;  
  }  
};  
  
  
class Derived : public Base, public Base1 {  
 public:  
  void f() override {  
    std::cout << "Derived" << std::endl;  
  }  
  
  void f1() override {  
    std::cout << "Derived1" << std::endl;  
  }  
};  
  
class Derived1 {  
  
};  
  
int main() {
  Derived* d = new Derived();
  std::cout << "--- 1. Memory addresses ---" << std::endl;
  std::cout << "Derived pointer (d):  " << d << std::endl;

  // Converting to the first base class leaves the address unchanged (offset 0).
  Base* b = d;
  std::cout << "Base pointer (b):     " << b << " (Offset 0)" << std::endl;

  // Converting to the second base class adds 8 bytes to the address (offset 8).
  Base1* b1 = d;
  std::cout << "Base1 pointer (b1):   " << b1 << " (Offset 8)" << std::endl;

  // Read the first word of each sub-object, which is where its vptr is stored.
  unsigned long vptr_Base = *(unsigned long*)b;
  unsigned long vptr_Base1 = *(unsigned long*)b1;

  std::cout << "vptr_Base  points to: 0x" << std::hex << vptr_Base << std::endl;
  std::cout << "vptr_Base1 points to: 0x" << std::hex << vptr_Base1 << std::endl;

  std::cout << "\n--- 2. Thunk and this-pointer adjustment ---" << std::endl;
  // The call goes through b1 (offset 8), but by the time the body of f1 runs,
  // this has been moved back by 8 bytes.
  b1->f1();

  std::cout << "\n--- 3. Using RTTI ---" << std::endl;
  // Use typeid to get the name of the dynamic type.
  std::cout << "Dynamic type behind b1: " << typeid(*b1).name() << std::endl;

  // Use dynamic_cast for a checked cast.
  // It consults the RTTI in the vtable, finds that b1 points to offset 8 of
  // what is really a Derived, and converts back to the start of the Derived.
  Derived* d2 = dynamic_cast<Derived*>(b1);
  if (d2) {
    std::cout << "dynamic_cast succeeded! Back at address: " << d2 << std::endl;
  }

  delete d;
}

The dumped layout:

*** Dumping AST Record Layout  
         0 | class Derived  
         0 |   class Base (primary base)  
         0 |     (Base vtable pointer)  
         8 |   class Base1 (base)  
         8 |     (Base1 vtable pointer)  
           | [sizeof=16, dsize=16, align=8,
           |  nvsize=16, nvalign=8]
           
Vtable for 'Derived' (7 entries).  
   0 | offset_to_top (0)  
   1 | Derived RTTI  
       -- (Base, 0) vtable address --  
       -- (Derived, 0) vtable address --  
   2 | void Derived::f()  
   3 | void Derived::f1()  
   4 | offset_to_top (-8)  
   5 | Derived RTTI  
       -- (Base1, 8) vtable address --  
   6 | void Derived::f1()  
       [this adjustment: -8 non-virtual] method: void Base1::f1()

Program output:

--- 1. Memory addresses ---
Derived pointer (d):  0x6000036d0020
Base pointer (b):     0x6000036d0020 (Offset 0)
Base1 pointer (b1):   0x6000036d0028 (Offset 8)
vptr_Base  points to: 0x1005cc128
vptr_Base1 points to: 0x1005cc148

--- 2. Thunk and this-pointer adjustment ---
Derived1

--- 3. Using RTTI ---
Dynamic type behind b1: 7Derived
dynamic_cast succeeded! Back at address: 0x6000036d0020

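The object layout is much the same as before. Derived inherits from both Base and Base1, so the object starts with a vptr for Base followed by a vptr for Base1. The vtable, however, now looks different: it is split into a Base part and a Base1 part.

The Base part of the vtable: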

   0 | offset_to_top (0)  
   1 | Derived RTTI  
       -- (Base, 0) vtable address --  
       -- (Derived, 0) vtable address --  
   2 | void Derived::f()  
   3 | void Derived::f1() 

The Base1 part of the vtable:

   4 | offset_to_top (-8)  
   5 | Derived RTTI  
       -- (Base1, 8) vtable address --  
   6 | void Derived::f1()  
       [this adjustment: -8 non-virtual] method: void Base1::f1()
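
  • offset_to_top (-8): if you hold a Base1* pointer, it actually points to offset 8 within the object. To get back to the start of the whole object, subtract 8.
  • this adjustment (pointer adjustment / thunk)
    • When you call f1() through a Base1* ptr, the function that actually runs is Derived::f1().
    • But Derived::f1() expects a this pointer that points to the start of Derived (offset 0).
    • So the compiler generates a small piece of code (called a thunk) that subtracts 8 from this before entering the function body.

So a call to b1->f1() first loads the vptr stored at b1. That vptr points at the (Base1, 8) address point, i.e. entry 6, which sits right after offset_to_top (-8) and the RTTI pointer. Entry 6 is the thunk: it moves this back by 8 bytes to the start of the object and then jumps to Derived::f1(). RTTI is not consulted for an ordinary virtual call.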
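The effect of the this adjustment can be observed without touching the vtable at all. In the sketch below (my own variation on the example: f1() returns this instead of printing, and `base1_offset` and `check_this_adjustment` are made-up helpers), the Base1 view of the object sits at a different address than the Derived view, yet the override still sees the address of the whole object.

```cpp
#include <cassert>
#include <cstddef>

struct Base {
  virtual ~Base() = default;
  virtual void f() {}
};

struct Base1 {
  virtual ~Base1() = default;
  virtual const void* f1() { return this; }
};

struct Derived : Base, Base1 {
  void f() override {}
  const void* f1() override { return this; }  // `this` is a Derived* here
};

// Byte distance between the Base1 view and the Derived view of one object.
std::ptrdiff_t base1_offset(Derived& d) {
  Base1* b1 = &d;  // the conversion itself shifts the pointer
  return reinterpret_cast<char*>(b1) - reinterpret_cast<char*>(&d);
}

void check_this_adjustment() {
  Derived d;
  Base1* b1 = &d;
  // Base has a vptr of its own, so Base1 cannot also start at offset 0.
  assert(base1_offset(d) > 0);
  // Called through the shifted pointer, yet `this` inside Derived::f1()
  // is the start of the Derived object again.
  assert(b1->f1() == static_cast<const void*>(&d));
}
```

An optimizing compiler may devirtualize the call in such a small example and skip the thunk, but the observable result is the same either way.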

For comparison, here is the vtable when Derived does not override any virtual function:

Original map  
Vtable for 'Derived' (6 entries).  
   0 | offset_to_top (0)  
   1 | Derived RTTI  
       -- (Base, 0) vtable address --  
       -- (Derived, 0) vtable address --  
   2 | void Base::f()  
   3 | offset_to_top (-8)  
   4 | Derived RTTI  
       -- (Base1, 8) vtable address --  
   5 | void Base1::f1()

Now add one more base class, Base2 (in this version Base also declares an extra virtual function b()):

Vtable for 'Derived' (12 entries).  
   0 | offset_to_top (0)  
   1 | Derived RTTI  
       -- (Base, 0) vtable address --  
       -- (Derived, 0) vtable address --  
   2 | void Derived::f()  
   3 | void Base::b()  
   4 | void Derived::f1()  
   5 | void Derived::f2()  
   6 | offset_to_top (-8)  
   7 | Derived RTTI  
       -- (Base1, 8) vtable address --  
   8 | void Derived::f1()  
       [this adjustment: -8 non-virtual] method: void Base1::f1()  
   9 | offset_to_top (-16)  
  10 | Derived RTTI  
       -- (Base2, 16) vtable address --  
  11 | void Derived::f2()  
       [this adjustment: -16 non-virtual] method: void Base2::f2()

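This example shows what offset_to_top really means. It reflects how many bytes of sub-objects (their member variables plus their vptrs) are stacked in front of a given base sub-object inside the object. It has nothing to do with the length of the vtable.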
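To see that the value tracks object size and not vtable length, one can give the first base some data members and measure the distance again. The sketch below does that portably with dynamic_cast<const void*>, which yields the address of the complete object; under the Itanium C++ ABI that cast is what offset_to_top exists to support. The `payload` member and the helper names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Base {
  virtual ~Base() = default;
  long payload[2] = {0, 0};  // extra data that pushes Base1 further down
};

struct Base1 {
  virtual ~Base1() = default;
};

struct Derived : Base, Base1 {};

// Signed distance from a Base1 sub-object back to the start of the
// complete object that contains it.
std::ptrdiff_t distance_to_top(const Base1* b1) {
  const void* top = dynamic_cast<const void*>(b1);
  return static_cast<const char*>(top) - reinterpret_cast<const char*>(b1);
}

void check_distance_to_top() {
  Derived d;
  const Base1* b1 = &d;
  // Assumption: Base has no tail padding, so Base1 starts right after it.
  // The distance therefore grows with the members of Base.
  assert(distance_to_top(b1) == -static_cast<std::ptrdiff_t>(sizeof(Base)));
}
```

On a typical 64-bit Itanium ABI target this gives -24 instead of the -8 seen earlier, even though the vtable has not gained a single virtual function.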

Diamond inheritance

Sometimes we end up writing code like the following, the so-called diamond inheritance:

class Base {  
 public:  
  int a;  
};

class Derived : public Base {  
 public:  
  int d;  
};  
  
class Derived1: public Base {  
 public:  
  int e;  
};  
  
class FinalDerived: public Derived, public Derived1 {};

Memory layout:

*** Dumping AST Record Layout  
         0 | class FinalDerived  
         0 |   class Derived (base)  
         0 |     class Base (base)  
         0 |       int a  
         4 |     int d  
         8 |   class Derived1 (base)  
         8 |     class Base (base)  
         8 |       int a  
        12 |     int e  
           | [sizeof=16, dsize=16, align=4,  
           |  nvsize=16, nvalign=4]
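The layout above holds two separate Base sub-objects, one inside Derived and one inside Derived1. A short sketch makes the consequence visible; the helper names `has_two_base_copies` and `check_duplicate_base` are made up for this example.

```cpp
#include <cassert>

class Base {
 public:
  int a;
};

class Derived : public Base {
 public:
  int d;
};

class Derived1 : public Base {
 public:
  int e;
};

class FinalDerived : public Derived, public Derived1 {};

// True when the two inheritance paths lead to different Base::a objects.
bool has_two_base_copies(FinalDerived& fd) {
  int* via_derived = &static_cast<Derived&>(fd).a;
  int* via_derived1 = &static_cast<Derived1&>(fd).a;
  return via_derived != via_derived1;
}

void check_duplicate_base() {
  FinalDerived fd{};
  // fd.a = 1;  // does not compile: the name a is ambiguous
  static_cast<Derived&>(fd).a = 1;
  static_cast<Derived1&>(fd).a = 2;
  assert(has_two_base_copies(fd));
  assert(static_cast<Derived&>(fd).a == 1);   // the two copies hold
  assert(static_cast<Derived1&>(fd).a == 2);  // independent values
}
```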

Notice that the object stores two copies of Base::a. Virtual inheritance solves this problem:

class Derived : virtual public Base {  
 public:  
  int d;  
};  
  
class Derived1: virtual public Base {  
 public:  
  int e;  
};  
  
class FinalDerived: public Derived, public Derived1 {};

The corresponding memory layout:

*** Dumping AST Record Layout  
         0 | class Derived  
         0 |   (Derived vtable pointer)  
         8 |   int d  
        12 |   class Base (virtual base)  
        12 |     int a  
           | [sizeof=16, dsize=16, align=8,  
           |  nvsize=12, nvalign=8]  
  
*** Dumping AST Record Layout  
         0 | class Derived1  
         0 |   (Derived1 vtable pointer)  
         8 |   int e  
        12 |   class Base (virtual base)  
        12 |     int a  
           | [sizeof=16, dsize=16, align=8,  
           |  nvsize=12, nvalign=8]  
  
*** Dumping AST Record Layout  
         0 | class FinalDerived  
         0 |   class Derived (primary base)  
         0 |     (Derived vtable pointer)  
         8 |     int d  
        16 |   class Derived1 (base)  
        16 |     (Derived1 vtable pointer)  
        24 |     int e  
        28 |   class Base (virtual base)  
        28 |     int a  
           | [sizeof=32, dsize=32, align=8,  
           |  nvsize=28, nvalign=8]

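Notice that a vptr and a vtable are generated even though there are no virtual functions. Notice also the order inside the object: each class's own member variables come first, and the member variables of the virtually inherited class come last.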
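The two observations above, a single shared Base and the extra bookkeeping pointer, can be confirmed with a few assertions. This is a sketch with invented helper names (`shares_one_base`, `check_shared_base`); the size comparison assumes an ABI that stores some per-object pointer for the virtual base, which is the case for both the Itanium and the MSVC ABI.

```cpp
#include <cassert>

class Base {
 public:
  int a = 0;
};

class Derived : virtual public Base {
 public:
  int d = 0;
};

class Derived1 : virtual public Base {
 public:
  int e = 0;
};

class FinalDerived : public Derived, public Derived1 {};

// True when both inheritance paths reach the same Base sub-object.
bool shares_one_base(FinalDerived& fd) {
  Base* via_derived = static_cast<Derived*>(&fd);
  Base* via_derived1 = static_cast<Derived1*>(&fd);
  return via_derived == via_derived1;
}

void check_shared_base() {
  FinalDerived fd;
  fd.a = 42;  // no longer ambiguous: there is only one Base::a
  assert(shares_one_base(fd));
  assert(static_cast<Derived1&>(fd).a == 42);
  // Two ints alone would need 2 * sizeof(int); the rest is bookkeeping
  // for locating the virtual base.
  assert(sizeof(Derived) > 2 * sizeof(int));
}
```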

The vtable for Derived:

Vtable for 'Derived' (3 entries).  
   0 | vbase_offset (12)  
   1 | offset_to_top (0)  
   2 | Derived RTTI  
       -- (Derived, 0) vtable address --

vbase_offset (12): a value stored in the vtable. It tells an object of type Derived that the virtual base Base starts 12 bytes past the location of the current vptr.
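The stored value can be compared against what the compiler computes when it converts a pointer to the virtual base. The sketch below is a rough experiment under explicit assumptions: an Itanium C++ ABI target, the vptr in the first word of each sub-object, and a vtable shaped like the dump above, so that vbase_offset sits three slots before the address point. The helpers `vbase_offset` and `actual_base_offset` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class Base {
 public:
  int a = 0;
};

class Derived : virtual public Base {
 public:
  int d = 0;
};

class Derived1 : virtual public Base {
 public:
  int e = 0;
};

class FinalDerived : public Derived, public Derived1 {};

// Assumption (Itanium C++ ABI, one virtual base, no virtual functions):
// the slots before the address point are vbase_offset, offset_to_top, RTTI.
std::ptrdiff_t vbase_offset(const void* subobject) {
  const std::ptrdiff_t* vptr;
  std::memcpy(&vptr, subobject, sizeof vptr);
  return vptr[-3];
}

// What the compiler itself computes for the conversion to Base*.
template <class T>
std::ptrdiff_t actual_base_offset(T* subobject) {
  Base* base = subobject;
  return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(subobject);
}

void check_vbase_offset() {
  Derived alone;
  assert(vbase_offset(&alone) == actual_base_offset(&alone));

  FinalDerived fd;
  Derived* first = &fd;
  Derived1* second = &fd;
  assert(vbase_offset(first) == actual_base_offset(first));
  assert(vbase_offset(second) == actual_base_offset(second));
  // Same Base, seen from two different starting points.
  assert(vbase_offset(first) != vbase_offset(second));
}
```

The last assertion is the interesting one: the same Derived code needs a different offset depending on which complete object it is embedded in, which is why the value lives in the vtable and is not hard-coded.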

The full dump:

Vtable for 'FinalDerived' (6 entries).
   0 | vbase_offset (28)
   1 | offset_to_top (0)
   2 | FinalDerived RTTI
       -- (Derived, 0) vtable address --
       -- (FinalDerived, 0) vtable address --
   3 | vbase_offset (12)
   4 | offset_to_top (-16)
   5 | FinalDerived RTTI
       -- (Derived1, 16) vtable address --

Virtual base offset offsets for 'FinalDerived' (1 entry).
   Base | -24


Original map
Construction vtable for ('Derived', 0) in 'FinalDerived' (3 entries).
   0 | vbase_offset (28)
   1 | offset_to_top (0)
   2 | Derived RTTI
       -- (Derived, 0) vtable address --

Original map
Construction vtable for ('Derived1', 16) in 'FinalDerived' (3 entries).
   0 | vbase_offset (12)
   1 | offset_to_top (0)
   2 | Derived1 RTTI
       -- (Derived1, 16) vtable address --
  
Original map  
Vtable for 'Derived' (3 entries).  
   0 | vbase_offset (12)  
   1 | offset_to_top (0)  
   2 | Derived RTTI  
       -- (Derived, 0) vtable address --  
  
Virtual base offset offsets for 'Derived' (1 entry).  
   Base | -24

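Something new shows up here: the construction vtable. A construction vtable is a vtable that the compiler uses temporarily while a base-class constructor is running.

Why is it needed? In C++, the constructors here run in the order Base -> Derived -> Derived1 -> FinalDerived.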

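  • While Derived's constructor is running, the object is not yet a complete FinalDerived.
  • If a virtual function is called at that point, it has to resolve to Derived's version, not FinalDerived's.
  • So the compiler generates a temporary vtable (the construction vtable) just for the construction phase. It also carries the vbase_offset that is valid inside the enclosing FinalDerived object and not the one for a standalone Derived, which is why Derived's ordinary vtable cannot simply be reused.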
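The dispatch rule in the list above is guaranteed by the language, so it can be demonstrated directly. In this sketch the who() function and the `seen_in_ctor` member are additions of mine for illustration; the class names follow the article.

```cpp
#include <cassert>
#include <string>

struct Base {
  virtual ~Base() = default;
  virtual std::string who() const { return "Base"; }
};

struct Derived : virtual Base {
  // Virtual call made while only the Derived part has been constructed.
  Derived() : seen_in_ctor(who()) {}
  std::string who() const override { return "Derived"; }
  std::string seen_in_ctor;
};

struct FinalDerived : Derived {
  std::string who() const override { return "FinalDerived"; }
};

void check_construction_dispatch() {
  FinalDerived fd;
  const Base& as_base = fd;
  // During Derived's constructor the call resolved to Derived::who().
  assert(fd.seen_in_ctor == "Derived");
  // Once construction has finished, the final overrider is used.
  assert(as_base.who() == "FinalDerived");
}
```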

What happens if we add a virtual function? For example:

class Base {  
 public:  
  int a;  
  virtual void f() {  
    std::cout << "Base\n";  
  }  
};

class Derived : virtual public Base {  
 public:  
  int d;  
  virtual void f() {  
    std::cout << "Derived\n";  
  }  
};

The vtable:

Vtable for 'Derived' (8 entries).  
   0 | vbase_offset (16)  
   1 | offset_to_top (0)  
   2 | Derived RTTI  
       -- (Derived, 0) vtable address --  
   3 | void Derived::f()  
   4 | vcall_offset (-16)  
   5 | offset_to_top (-16)  
   6 | Derived RTTI  
       -- (Base, 16) vtable address --  
   7 | void Derived::f()  
       [this adjustment: 0 non-virtual, -24 vcall offset offset] method: void Base::f()

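A vcall_offset entry has appeared. With virtual inheritance, the distance between the Base sub-object and the Derived sub-object is not a compile-time constant, because it depends on the most-derived type of the object. The thunk for Derived::f() therefore reads vcall_offset from the vtable to adjust this when f() is called through a Base pointer.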
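That the distance really varies is easy to show by embedding Derived in a larger class. The sketch below adds a hypothetical class `Wider` and changes f() to return this; both are my own choices for illustration. The inequality holds on the ABIs I am aware of because the extra members sit between the Derived part and the virtual base, but it is an observation about typical layouts, not a language guarantee.

```cpp
#include <cassert>
#include <cstddef>

class Base {
 public:
  virtual ~Base() = default;
  virtual const void* f() { return this; }
  int a = 0;
};

class Derived : virtual public Base {
 public:
  const void* f() override { return this; }  // `this` is a Derived* here
  int d = 0;
};

// Hypothetical wider class: its members push the virtual base further away.
class Wider : public Derived {
 public:
  long extra[4] = {0, 0, 0, 0};
};

// Byte distance from a Derived sub-object to its virtual Base. The dynamic
// type is unknown here, so the offset has to be looked up at run time.
std::ptrdiff_t base_from_derived(Derived& d) {
  Base* base = &d;
  return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&d);
}

void check_dynamic_offset() {
  Derived plain;
  Wider wide;
  // Same Derived code, different distance to Base.
  assert(base_from_derived(plain) != base_from_derived(wide));
  // A call through Base* still reaches Derived::f() with the right this.
  Base* b = &wide;
  assert(b->f() == static_cast<const void*>(static_cast<Derived*>(&wide)));
}
```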

A few variations on virtual inheritance:

  1. The code:
class Base {  
 public:  
  int a;  
};

class Derived : virtual public Base {  
 public:  
  int d;  
};

class FinalDerived: virtual public Base, public Derived {  
 public:  
  int f;  
};
*** Dumping AST Record Layout  
         0 | class Derived  
         0 |   (Derived vtable pointer)  
         8 |   int d  
        12 |   class Base (virtual base)  
        12 |     int a  
           | [sizeof=16, dsize=16, align=8,  
           |  nvsize=12, nvalign=8]

*** Dumping AST Record Layout  
         0 | class FinalDerived  
         0 |   class Derived (primary base)  
         0 |     (Derived vtable pointer)  
         8 |     int d  
        12 |   int f  
        16 |   class Base (virtual base)  
        16 |     int a  
           | [sizeof=24, dsize=20, align=8,  
           |  nvsize=16, nvalign=8]

  2. The code:
class Base {  
 public:  
  int a;  
};  
  
class Base1 {  
 public:  
  int b;  
};

class Derived : virtual public Base, Base1 {  
 public:  
  int d;  
};
*** Dumping AST Record Layout  
         0 | class Derived  
         0 |   (Derived vtable pointer)  
         8 |   class Base1 (base)  
         8 |     int b  
        12 |   int d  
        16 |   class Base (virtual base)  
        16 |     int a  
           | [sizeof=24, dsize=20, align=8,  
           |  nvsize=16, nvalign=8]

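In all of these cases the vptr sits at offset 0. That is what I see on macOS, which follows the Itanium C++ ABI; Windows seems to differ, since the MSVC ABI uses a separate virtual base table pointer (vbptr).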