Android Bytecode Processing: Bytecode Fundamentals

The Java Virtual Machine

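Java's slogan is "Write once, run anywhere". How does Java achieve this? It adds an intermediate layer, the Java virtual machine (Java VM), which adapts to the characteristics of each platform and hides the differences between the underlying systems. As the saying goes, any problem in computer science can be solved by adding another level of indirection.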

Class Names

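A class's name can be obtained through methods on its Class object. Two kinds of name matter most: the fully qualified name (getName()) and the simple name (getSimpleName()). Put simply, the fully qualified name is the full name including the package, while the simple name drops the package and keeps only the class name. To use a historical figure as the class: "Zhao Zilong of Changshan" is the fully qualified name, and "Zhao Zilong" is the simple name. For an anonymous class, getName() returns the enclosing class name + $ + a number (for example, Test$1). When a class contains several anonymous inner classes, the number starts at 1 and increases by one for each; getSimpleName() returns an empty string. Anonymous classes are like the insignificant bit players in a martial arts novel: passerby A, B, C, and D.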

String.class.getName()
  returns "java.lang.String"
byte.class.getName()
  returns "byte"
(new Object[3]).getClass().getName()
  returns "[Ljava.lang.Object;"
(new int[3][4][5][6][7][8][9]).getClass().getName()
  returns "[[[[[[[I"

The mapping between element types and their encodings:

Element type          Encoding
boolean               Z
byte                  B
char                  C
class or interface    Lclassname;
double                D
float                 F
int                   I
long                  J
short                 S
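
In this mapping, boolean is encoded as Z because B is already taken by byte. long is encoded as J rather than L because L is already used for classes and interfaces, in the form Lclassname;.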
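
To make the table concrete, here is a minimal sketch that prints the encoded names for a few array types. The class name ArrayNameDemo and the helper nameOf are made up for illustration; the only API used is Class.getName().

```java
public class ArrayNameDemo {
    // Hypothetical helper: returns whatever Class.getName() reports for the type.
    static String nameOf(Class<?> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        // One '[' per array dimension, followed by the element type code.
        System.out.println(nameOf(boolean[].class)); // [Z
        System.out.println(nameOf(long[][].class));  // [[J
        System.out.println(nameOf(String[].class));  // [Ljava.lang.String;
        // A non-array primitive just reports its keyword.
        System.out.println(nameOf(int.class));       // int
    }
}
```

Note that getName() separates package segments with dots, while descriptors stored inside a class file use slashes (java/lang/String).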

Structure of a class File

The sample code is as follows:

public class Test {
    public static void main(String[] args) {
        System.out.println(java.lang.String.class.getName());
        System.out.println(java.lang.String.class.getSimpleName());
        System.out.println(java.lang.String.class.getCanonicalName());
        System.out.println(java.lang.String.class.getTypeName());
        System.out.println((new Object[3]).getClass().getName());
        System.out.println(new Runnable() {
            @Override
            public void run() {

            }
        }.getClass().getName());
        System.out.println(new Runnable() {
            @Override
            public void run() {

            }
        }.getClass().getSimpleName());
    }
}

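A .class file is just an ordinary binary file, generated according to a fixed set of rules. The JDK's javap command can display its contents. Running javap -v Test.class prints the structure of the class file: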

Classfile /Users/caicai/IdeaProjects/KotlinFirst/build/classes/java/main/Test.class
  Last modified 2025年3月15日; size 935 bytes
  SHA-256 checksum 57df8d686df7200817bf9439f597a072909d88c7afb75e9c04550cfffd32b7b2
  Compiled from "Test.java"
public class Test
  minor version: 0
  major version: 55
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #15                         // Test
  super_class: #9                         // java/lang/Object
  interfaces: 0, fields: 0, methods: 2, attributes: 3
Constant pool:
   #1 = Methodref          #9.#31         // java/lang/Object."<init>":()V
   #2 = Fieldref           #32.#33        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = Class              #34            // java/lang/String
   #4 = Methodref          #35.#36        // java/lang/Class.getName:()Ljava/lang/String;
   #5 = Methodref          #37.#38        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #6 = Methodref          #35.#39        // java/lang/Class.getSimpleName:()Ljava/lang/String;
   #7 = Methodref          #35.#40        // java/lang/Class.getCanonicalName:()Ljava/lang/String;
   #8 = Methodref          #35.#41        // java/lang/Class.getTypeName:()Ljava/lang/String;
   #9 = Class              #42            // java/lang/Object
  #10 = Methodref          #9.#43         // java/lang/Object.getClass:()Ljava/lang/Class;
  #11 = Class              #44            // Test$1
  #12 = Methodref          #11.#31        // Test$1."<init>":()V
  #13 = Class              #45            // Test$2
  #14 = Methodref          #13.#31        // Test$2."<init>":()V
  #15 = Class              #46            // Test
  #16 = Utf8               InnerClasses
  #17 = Utf8               <init>
  #18 = Utf8               ()V
  #19 = Utf8               Code
  #20 = Utf8               LineNumberTable
  #21 = Utf8               LocalVariableTable
  #22 = Utf8               this
  #23 = Utf8               LTest;
  #24 = Utf8               main
  #25 = Utf8               ([Ljava/lang/String;)V
  #26 = Utf8               args
  #27 = Utf8               [Ljava/lang/String;
  #28 = Utf8               SourceFile
  #29 = Utf8               Test.java
  #30 = Utf8               NestMembers
  #31 = NameAndType        #17:#18        // "<init>":()V
  #32 = Class              #47            // java/lang/System
  #33 = NameAndType        #48:#49        // out:Ljava/io/PrintStream;
  #34 = Utf8               java/lang/String
  #35 = Class              #50            // java/lang/Class
  #36 = NameAndType        #51:#52        // getName:()Ljava/lang/String;
  #37 = Class              #53            // java/io/PrintStream
  #38 = NameAndType        #54:#55        // println:(Ljava/lang/String;)V
  #39 = NameAndType        #56:#52        // getSimpleName:()Ljava/lang/String;
  #40 = NameAndType        #57:#52        // getCanonicalName:()Ljava/lang/String;
  #41 = NameAndType        #58:#52        // getTypeName:()Ljava/lang/String;
  #42 = Utf8               java/lang/Object
  #43 = NameAndType        #59:#60        // getClass:()Ljava/lang/Class;
  #44 = Utf8               Test$1
  #45 = Utf8               Test$2
  #46 = Utf8               Test
  #47 = Utf8               java/lang/System
  #48 = Utf8               out
  #49 = Utf8               Ljava/io/PrintStream;
  #50 = Utf8               java/lang/Class
  #51 = Utf8               getName
  #52 = Utf8               ()Ljava/lang/String;
  #53 = Utf8               java/io/PrintStream
  #54 = Utf8               println
  #55 = Utf8               (Ljava/lang/String;)V
  #56 = Utf8               getSimpleName
  #57 = Utf8               getCanonicalName
  #58 = Utf8               getTypeName
  #59 = Utf8               getClass
  #60 = Utf8               ()Ljava/lang/Class;
{
  public Test();
    descriptor: ()V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   LTest;

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=1, args_size=1
         0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: ldc           #3                  // class java/lang/String
         5: invokevirtual #4                  // Method java/lang/Class.getName:()Ljava/lang/String;
         8: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        11: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        14: ldc           #3                  // class java/lang/String
        16: invokevirtual #6                  // Method java/lang/Class.getSimpleName:()Ljava/lang/String;
        19: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        22: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        25: ldc           #3                  // class java/lang/String
        27: invokevirtual #7                  // Method java/lang/Class.getCanonicalName:()Ljava/lang/String;
        30: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        33: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        36: ldc           #3                  // class java/lang/String
        38: invokevirtual #8                  // Method java/lang/Class.getTypeName:()Ljava/lang/String;
        41: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        44: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        47: iconst_3
        48: anewarray     #9                  // class java/lang/Object
        51: invokevirtual #10                 // Method java/lang/Object.getClass:()Ljava/lang/Class;
        54: invokevirtual #4                  // Method java/lang/Class.getName:()Ljava/lang/String;
        57: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        60: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        63: new           #11                 // class Test$1
        66: dup
        67: invokespecial #12                 // Method Test$1."<init>":()V
        70: invokevirtual #10                 // Method java/lang/Object.getClass:()Ljava/lang/Class;
        73: invokevirtual #4                  // Method java/lang/Class.getName:()Ljava/lang/String;
        76: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        79: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
        82: new           #13                 // class Test$2
        85: dup
        86: invokespecial #14                 // Method Test$2."<init>":()V
        89: invokevirtual #10                 // Method java/lang/Object.getClass:()Ljava/lang/Class;
        92: invokevirtual #6                  // Method java/lang/Class.getSimpleName:()Ljava/lang/String;
        95: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        98: return
      LineNumberTable:
        line 3: 0
        line 4: 11
        line 5: 22
        line 6: 33
        line 7: 44
        line 8: 60
        line 13: 70
        line 8: 76
        line 14: 79
        line 19: 89
        line 14: 95
        line 20: 98
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      99     0  args   [Ljava/lang/String;
}
SourceFile: "Test.java"
NestMembers:
  Test$2
  Test$1
InnerClasses:
  #13;                                    // class Test$2
  #11;                                    // class Test$1
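
As a rough cross-check of the version fields javap reports, the first eight bytes of a class file can also be decoded by hand: a 4-byte magic number (0xCAFEBABE), then a 2-byte minor version and a 2-byte major version, all big-endian. The sketch below assumes that layout; the class name ClassHeaderDemo and the helper parseHeader are made up for illustration, and the built-in sample bytes simply mirror the minor version 0 and major version 55 shown above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ClassHeaderDemo {
    // Hypothetical helper: decodes {magic, minor, major} from the start of a class file.
    static int[] parseHeader(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        int magic = buf.getInt();
        int minor = buf.getShort() & 0xFFFF;          // treat as unsigned 16-bit
        int major = buf.getShort() & 0xFFFF;
        return new int[] {magic, minor, major};
    }

    public static void main(String[] args) throws IOException {
        // Sample header bytes (an assumption for the demo): magic, minor 0, major 55.
        byte[] bytes = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 55};
        if (args.length > 0) {
            // Optionally read a real file, e.g. java ClassHeaderDemo Test.class
            bytes = Files.readAllBytes(Paths.get(args[0]));
        }
        int[] header = parseHeader(bytes);
        System.out.println(String.format("magic=0x%08X minor=%d major=%d",
                header[0], header[1], header[2]));
    }
}
```

Major version 55 corresponds to Java 11, so a Test.class compiled with that JDK should report the same numbers as the javap output.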

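The output above shows that a class file contains the minor and major version, this class, the superclass, the counts of interfaces, fields, methods, and attributes, the constant pool, the attributes themselves, the constructor, and the methods defined in the class. One point deserves special mention: the descriptor of the static main method is ([Ljava/lang/String;)V, meaning the parameter is String[] and the return type is void. When the superclass is Object, the class's default no-argument constructor calls the superclass's no-argument constructor with invokespecial #1, where 1 is an index into the constant pool and resolves to java/lang/Object."<init>":()V.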
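
Method descriptors like the one for main can be reproduced from Java code, which is a handy way to check your reading of them. The following is a minimal sketch using java.lang.invoke.MethodType from the JDK; the class name DescriptorDemo and the helper descriptor are made up for illustration.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    // Hypothetical helper: builds the JVM method descriptor for the given signature.
    static String descriptor(Class<?> returnType, Class<?>... paramTypes) {
        return MethodType.methodType(returnType, paramTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // public static void main(String[] args)
        System.out.println(descriptor(void.class, String[].class));           // ([Ljava/lang/String;)V
        // A no-argument constructor, "<init>"
        System.out.println(descriptor(void.class));                           // ()V
        // Class.getName()
        System.out.println(descriptor(String.class));                         // ()Ljava/lang/String;
        // A made-up signature: long f(int, boolean)
        System.out.println(descriptor(long.class, int.class, boolean.class)); // (IZ)J
    }
}
```

The parameter types go inside the parentheses in order, with no separators, and the return type follows the closing parenthesis.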

ASM Bytecode Viewer

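Here is an ASM-related tool worth recommending. Find the ASM Bytecode Viewer plugin in the IntelliJ IDEA plugin marketplace, install it, and restart the IDE. Once the plugin is installed, right-clicking a .java file shows an ASM Bytecode Viewer menu item. Selecting it opens the ASM Bytecode Viewer view, which has three tabs: Bytecode, ASMified, and Groovified. They show, respectively, the bytecode, how to produce it with ASM, and how to produce it with Groovy. ASMified is the tab used most often. ASM Bytecode Viewer has had compatibility problems in Android Studio after version upgrades, but it works well in IntelliJ IDEA, where no compatibility problems have shown up.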

Summary

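Android bytecode processing is hard to learn and has a steep learning curve. It requires knowledge of the Java virtual machine, of bytecode itself, and of ASM for manipulating bytecode. Learning the related tools and plugins, such as the javap command and the ASM Bytecode Viewer plugin, makes the effort far more productive and helps you understand not only what happens but why. Once you have learned bytecode processing, it can help you optimize the build, both in compilation speed and in performance, and even customize the build process by developing compiler plugins that help others. I hope this article is useful to you. If you find any mistakes, please point them out.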