How the JVM's dispatch_next executes the bytecode instructions of a Java method


The previous article described how generate_fixed_frame builds the fixed part of the stack frame. The next step is to execute the bytecode.

InterpreterMacroAssembler's dispatch_next executes the bytecode instruction
void InterpreterMacroAssembler::dispatch_next(TosState state, int step) {
  // load next bytecode (load before advancing rsi to prevent AGI)
  load_unsigned_byte(rbx, Address(rsi, step));
  // advance rsi
  increment(rsi, step);
  dispatch_base(state, Interpreter::dispatch_table(state));
}
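  1. As the earlier article on generate_fixed_frame showed, the rsi register holds the bytecode pointer (bcp), that is, the address of the current bytecode instruction. On the first call step is 0, so the line below loads a single byte from the start of the method's bytecode into rbx. One byte is enough to identify the instruction because an opcode occupies 8 bits, so there are at most 256 of them.
  load_unsigned_byte(rbx, Address(rsi, step));
  2. Add step (0 on the first call) to rsi.
  increment(rsi, step);
  3. Interpreter::dispatch_table(state) looks up, in TemplateInterpreter's _active_table (of type DispatchTable), the one-dimensional array of assembly entry addresses for the given top-of-stack cache state, and that array is passed to dispatch_base.
dispatch_base(state, Interpreter::dispatch_table(state));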
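
The three steps above can be pictured with a small stand-alone C++ sketch. It is not HotSpot code: ToyInterpreter, ToyTosState, ToyHandler and toy_dispatch_next are hypothetical names invented here, the table is cut down to two states, and a handler is returned to the caller instead of being jumped to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-ins; these are not HotSpot types.
enum ToyTosState { toy_itos = 0, toy_vtos = 1, toy_number_of_states = 2 };
using ToyHandler = int (*)();                 // stands in for a generated code entry
constexpr std::size_t kToyTableLength = 256;  // one slot per possible opcode byte

struct ToyInterpreter {
  const std::uint8_t* bcp;  // plays the role of rsi (bytecode pointer)
  ToyHandler table[toy_number_of_states][kToyTableLength];
};

// Mirrors the three steps of dispatch_next: load the opcode at bcp + step,
// advance bcp by step, then select the handler for (state, opcode).
inline ToyHandler toy_dispatch_next(ToyInterpreter& interp, ToyTosState state, int step) {
  std::uint8_t opcode = interp.bcp[step];           // load_unsigned_byte(rbx, Address(rsi, step))
  interp.bcp += step;                               // increment(rsi, step)
  ToyHandler handler = interp.table[state][opcode]; // lookup done by dispatch_base
  assert(handler != nullptr);
  return handler;
}

// Toy handlers; the return values are arbitrary markers (the opcode numbers).
inline int toy_handle_iconst_0() { return 0x03; }  // 0x03 is iconst_0
inline int toy_handle_return()   { return 0xb1; }  // 0xb1 is return
```

With a two-byte method body {0x03, 0xb1}, a first call with step 0 selects the iconst_0 handler and leaves bcp unchanged; a second call with step 1 moves bcp to the next byte and selects the return handler.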

dispatch_base dispatches the bytecode instruction

void InterpreterMacroAssembler::dispatch_base(TosState state, address* table, bool verifyoop) {
  // verification code omitted
  Address index(noreg, rbx, Address::times_ptr);
  ExternalAddress tbl((address)table);
  ArrayAddress dispatch(tbl, index);
  jump(dispatch);
}
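  1. Construct the Address object index. Of its three arguments, the first is the base register (noreg here, meaning there is no base), the second is the index register, and the third, times_ptr, is the scale factor that matches the pointer size. On a 32-bit platform it is 2, so the index is shifted left by 2 bits and each table slot occupies 4 bytes.
Address index(noreg, rbx, Address::times_ptr);
  2. Construct an ExternalAddress from the dispatch table, table. This is the one-dimensional array of assembly entry addresses that dispatch_next passed in for the current top-of-stack cache state.
  ExternalAddress tbl((address)table);
  3. Create an ArrayAddress from two arguments: tbl (the wrapper around the dispatch table) and index.
  ArrayAddress dispatch(tbl, index);
  4. Jump indirectly through the ArrayAddress. The target is the entry address stored in the table slot selected by rbx, which is where the assembly code for that bytecode begins.
  jump(dispatch);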
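
As a rough illustration of what the scaled index means, the sketch below computes the address of a table slot as table base + (opcode << shift) and can be checked against ordinary array indexing. The names toy_pointer_shift and toy_slot_address are hypothetical, and the code assumes a flat address space in which pointer-to-integer conversion preserves byte offsets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: log2 of the pointer size, i.e. the shift that a
// pointer-sized scale factor stands for (2 for 4-byte, 3 for 8-byte pointers).
constexpr unsigned toy_pointer_shift() {
  return sizeof(void*) == 8 ? 3u : 2u;
}

// Computes the address of slot `opcode` the way a scaled-index operand does:
// table base + (index << shift).
inline std::uintptr_t toy_slot_address(void* const* table, std::uint8_t opcode) {
  static_assert(sizeof(void*) == 4 || sizeof(void*) == 8, "unexpected pointer size");
  assert(table != nullptr);
  return reinterpret_cast<std::uintptr_t>(table) +
         (static_cast<std::uintptr_t>(opcode) << toy_pointer_shift());
}
```

For a 256-entry table of pointers, toy_slot_address(table, 0xb1) yields the same address as &table[0xb1], which is the slot the indirect jump reads its target from.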

Before emitting jmp, jump calls as_Address to convert the ArrayAddress object into an Address object.

// macroAssembler_x86.cpp
void MacroAssembler::jump(ArrayAddress entry) {
  jmp(as_Address(entry));
}
Address MacroAssembler::as_Address(ArrayAddress adr) {
  return Address::make_array(adr);
}

as_Address in turn passes the ArrayAddress object to make_array, which converts it into an Address object.

//assembler_x86.cpp
Address Address::make_array(ArrayAddress adr) {
  AddressLiteral base = adr.base();
  Address index = adr.index();
  assert(index._disp == 0, "must not have disp"); // maybe it can?
  Address array(index._base, index._index, index._scale, (intptr_t) base.target());
  array._rspec = base._rspec;
  return array;
}
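
The effect of make_array can be sketched with stripped-down models. ToyAddress, ToyArrayAddress, ToyRegister and toy_make_array are hypothetical names used only for illustration, and relocation information (_rspec) is left out entirely.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, stripped-down models; not the HotSpot classes.
enum ToyRegister { toy_noreg = -1, toy_rbx = 3 };  // 3 is the x86 number of rbx/ebx

struct ToyAddress {
  ToyRegister   base;
  ToyRegister   index;
  int           scale;  // log2 of the multiplier
  std::intptr_t disp;
};

struct ToyArrayAddress {
  std::intptr_t table;  // absolute address of the table (the literal base)
  ToyAddress    index;  // register part: base/index/scale, disp must be 0
};

// Same idea as make_array: keep the register part and move the table's
// absolute address into the displacement field.
inline ToyAddress toy_make_array(const ToyArrayAddress& adr) {
  assert(adr.index.disp == 0);
  return ToyAddress{adr.index.base, adr.index.index, adr.index.scale, adr.table};
}
```

Fed with the values used by dispatch_base (no base, rbx as index), the result still has no base register; the table address only shows up as the displacement.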
Finally the jmp instruction is emitted; it jumps to the address described by adr.
void Assembler::jmp(Address adr) {
  InstructionMark im(this);
  prefix(adr);
  emit_int8((unsigned char)0xFF);
  emit_operand(rsp, adr);
}

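According to the Intel developer's manual, the opcode of this jmp is 0xFF, which is what the line emit_int8((unsigned char)0xFF) emits. The rsp passed to emit_operand is not a real operand: its register number, 4, fills the reg field of the ModRM byte, and FF /4 is the near indirect jmp.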
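
A minimal sketch of how the ModRM byte that follows 0xFF is assembled. The helper names toy_modrm and toy_jmp_modrm are hypothetical; the field layout (mod in bits 7..6, reg in bits 5..3, rm in bits 2..0) is the standard x86 one.

```cpp
#include <cassert>
#include <cstdint>

// Builds a ModRM byte from its three fields: mod (2 bits), reg (3 bits), rm (3 bits).
inline std::uint8_t toy_modrm(unsigned mod, unsigned reg, unsigned rm) {
  assert(mod < 4 && reg < 8 && rm < 8);
  return static_cast<std::uint8_t>((mod << 6) | (reg << 3) | rm);
}

constexpr unsigned kToyRspEncoding = 4;  // rsp/esp is register number 4 on x86

// For opcode 0xFF the reg field acts as an opcode extension; the value 4
// (the number of rsp) selects the near indirect jmp.
inline std::uint8_t toy_jmp_modrm(unsigned mod, unsigned rm) {
  return toy_modrm(mod, kToyRspEncoding, rm);
}
```

For instance, toy_jmp_modrm(3, 0) gives 0xE0, matching the familiar FF E0 encoding of jmp eax, and toy_jmp_modrm(0, 4) gives 0x24, the ModRM byte used when a SIB byte follows.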

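Below is the x86 code that emits the operand of jmp, that is, the address to jump through. The listing shows the emit_operand32 variant, which forwards to the emit_operand overload that takes the individual address components.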

// assembler_x86.cpp
void Assembler::emit_operand32(Register reg, Address adr) {
  assert(reg->encoding() < 8, "no extended registers");
  assert(!adr.base_needs_rex() && !adr.index_needs_rex(), "no extended registers");
  emit_operand(reg, adr._base, adr._index, adr._scale, adr._disp,adr._rspec);
}

The emit_operand overload is as follows:

void Assembler::emit_operand(Register reg,Register base, Register index,Address::ScaleFactor scale, int disp,
RelocationHolder const& rspec,int rip_relative_correction) {
  relocInfo::relocType rtype = (relocInfo::relocType) rspec.type();
  int regenc = encode(reg) << 3;
  int indexenc = index->is_valid() ? encode(index) << 3 : 0;
  int baseenc = base->is_valid() ? encode(base) : 0;

  if (base->is_valid()) {
    if (index->is_valid()) {
      assert(scale != Address::no_scale, "inconsistent address");
      // [base + index*scale + disp]
      if (disp == 0 && rtype == relocInfo::none  &&
          base != rbp LP64_ONLY(&& base != r13)) {
        // [base + index*scale]
        // [00 reg 100][ss index base]
        assert(index != rsp, "illegal addressing mode");
        emit_int8(0x04 | regenc);
        emit_int8(scale << 6 | indexenc | baseenc);
      } else if (is8bit(disp) && rtype == relocInfo::none) {
        // [base + index*scale + imm8]
        // [01 reg 100][ss index base] imm8
        assert(index != rsp, "illegal addressing mode");
        emit_int8(0x44 | regenc);
        emit_int8(scale << 6 | indexenc | baseenc);
        emit_int8(disp & 0xFF);
      } else {
        // [base + index*scale + disp32]
        // [10 reg 100][ss index base] disp32
        assert(index != rsp, "illegal addressing mode");
        emit_int8(0x84 | regenc);
        emit_int8(scale << 6 | indexenc | baseenc);
        emit_data(disp, rspec, disp32_operand);
      }
    } else if (base == rsp LP64_ONLY(|| base == r12)) {
      // [rsp + disp]
      if (disp == 0 && rtype == relocInfo::none) {
        // [rsp]
        // [00 reg 100][00 100 100]
        emit_int8(0x04 | regenc);
        emit_int8(0x24);
      } else if (is8bit(disp) && rtype == relocInfo::none) {
        // [rsp + imm8]
        // [01 reg 100][00 100 100] disp8
        emit_int8(0x44 | regenc);
        emit_int8(0x24);
        emit_int8(disp & 0xFF);
      } else {
        // [rsp + imm32]
        // [10 reg 100][00 100 100] disp32
        emit_int8(0x84 | regenc);
        emit_int8(0x24);
        emit_data(disp, rspec, disp32_operand);
      }
    } else {
      // [base + disp]
      assert(base != rsp LP64_ONLY(&& base != r12), "illegal addressing mode");
      if (disp == 0 && rtype == relocInfo::none &&
          base != rbp LP64_ONLY(&& base != r13)) {
        // [base]
        // [00 reg base]
        emit_int8(0x00 | regenc | baseenc);
      } else if (is8bit(disp) && rtype == relocInfo::none) {
        // [base + disp8]
        // [01 reg base] disp8
        emit_int8(0x40 | regenc | baseenc);
        emit_int8(disp & 0xFF);
      } else {
        // [base + disp32]
        // [10 reg base] disp32
        emit_int8(0x80 | regenc | baseenc);
        emit_data(disp, rspec, disp32_operand);
      }
    }
  } else {
    if (index->is_valid()) {
      assert(scale != Address::no_scale, "inconsistent address");
      // [index*scale + disp]
      // [00 reg 100][ss index 101] disp32
      assert(index != rsp, "illegal addressing mode");
      emit_int8(0x04 | regenc);
      emit_int8(scale << 6 | indexenc | 0x05);
      emit_data(disp, rspec, disp32_operand);
    } else if (rtype != relocInfo::none ) {
      // [disp] (64bit) RIP-RELATIVE (32bit) abs
      // [00 000 101] disp32
      emit_int8(0x05 | regenc);
      // Note that the RIP-rel. correction applies to the generated
      // disp field, but _not_ to the target address in the rspec.
      // disp was created by converting the target address minus the pc
      // at the start of the instruction. That needs more correction here.
      // intptr_t disp = target - next_ip;
      assert(inst_mark() != NULL, "must be inside InstructionMark");
      address next_ip = pc() + sizeof(int32_t) + rip_relative_correction;
      int64_t adjusted = disp;
      // Do rip-rel adjustment for 64bit
      LP64_ONLY(adjusted -=  (next_ip - inst_mark()));
      assert(is_simm32(adjusted),
             "must be 32bit offset (RIP relative address)");
      emit_data((int32_t) adjusted, rspec, disp32_operand);

    } else {
      // 32bit never did this, did everything as the rip-rel/disp code above
      // [disp] ABSOLUTE
      // [00 reg 100][00 100 101] disp32
      emit_int8(0x04 | regenc);
      emit_int8(0x25);
      emit_data(disp, rspec, disp32_operand);
    }
  }
}

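Take the 32-bit platform as an example and follow the instruction format described in the Intel developer's manual. The Address built in dispatch_base has noreg as its base and rbx as its index, and make_array stores the table address in the displacement. base->is_valid() is therefore false and the branch below is taken, so the effective address is index*scale + disp32, where disp32 is the address of the dispatch table:

      // [index*scale + disp]
      // [00 reg 100][ss index 101] disp32
      assert(index != rsp, "illegal addressing mode");
      emit_int8(0x04 | regenc);
      emit_int8(scale << 6 | indexenc | 0x05);
      emit_data(disp, rspec, disp32_operand);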
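
Because the Address built in dispatch_base uses noreg as its base, the operand has the index-only form [index*scale + disp32]. The sketch below assembles the bytes such a 32-bit jmp would consist of. toy_encode_jmp_indexed is a hypothetical helper, relocation handling is ignored, and the displacement value in the usage note is an arbitrary example rather than a real table address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes `jmp [index * 2^scale + disp32]` for 32-bit x86, following the
// "[00 reg 100][ss index 101] disp32" layout.
// index_enc is the 3-bit register number (3 for ebx); scale is 0..3.
inline std::vector<std::uint8_t> toy_encode_jmp_indexed(unsigned index_enc,
                                                        unsigned scale,
                                                        std::uint32_t disp) {
  assert(index_enc < 8 && index_enc != 4);  // esp (4) cannot be an index
  assert(scale < 4);
  const unsigned regenc = 4u << 3;          // reg field = 4, the /4 opcode extension
  std::vector<std::uint8_t> out;
  out.push_back(0xFF);                                           // opcode
  out.push_back(static_cast<std::uint8_t>(0x04 | regenc));       // ModRM: mod=00, rm=100
  out.push_back(static_cast<std::uint8_t>((scale << 6) | (index_enc << 3) | 0x05));  // SIB
  for (int i = 0; i < 4; ++i) {                                  // disp32, little-endian
    out.push_back(static_cast<std::uint8_t>((disp >> (8 * i)) & 0xFF));
  }
  return out;
}
```

Calling toy_encode_jmp_indexed(3, 2, 0x12345678) produces FF 24 9D 78 56 34 12, which a disassembler would show as jmp dword ptr [ebx*4 + 0x12345678].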

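Summary
dispatch_next loads the opcode at the bcp (bytecode pointer) held in rsi into rbx, uses the tos (top-of-stack cache state) and the opcode to find the matching assembly entry address in the template interpreter's DispatchTable, and then jumps there with a jmp instruction. On the first call this runs the assembly code for the first bytecode instruction of the method.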