Analysis of the “Maybe bug 77342775” exception
[TOC]
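1. Background
1.1 Recap
- The host app is a release build.
- The plugin bundles the classes from org.apache.http.legacy.jar, but every class in that jar is an empty stub. For example, HttpRequestBase and its superclass AbstractHttpMessage: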
package org.apache.http.message;
public abstract class AbstractHttpMessage implements HttpMessage {
// !!! The stub class declares 2 fields here, the same as the system implementation (see section 1.3)
protected HeaderGroup headergroup;
protected HttpParams params;
protected AbstractHttpMessage(HttpParams params) {
throw new RuntimeException("Stub!");
}
......
}
package org.apache.http.client.methods;
public abstract class HttpRequestBase extends AbstractHttpMessage implements HttpUriRequest, AbortableHttpRequest, Cloneable {
// The stub class declares no fields here, whereas the system implementation declares 5
@Deprecated
public HttpRequestBase() {
throw new RuntimeException("Stub!");
}
@Deprecated
public void setConnectionRequest(ClientConnectionRequest connRequest) throws IOException {
throw new RuntimeException("Stub!");
}
......
}
- The plugin's own code implements the BodyParamsEntity and HttpRequest classes, both of which extend stub classes from org.apache.http.legacy.jar:
package com.xosp.android.framework.http.client.entity;
public class BodyParamsEntity extends AbstractHttpEntity {
// Implementation details omitted
.......
}
package com.xosp.android.framework.http.client;
public class HttpRequest extends HttpRequestBase implements HttpEntityEnclosingRequest {
// !!! Four new fields are declared here, including entity
private HttpEntity entity;
private HttpMethod method;
private URIBuilder uriBuilder;
private Charset uriCharset;
@Override
public HttpEntity getEntity() {
return this.entity;
}
@Override
public void setEntity(HttpEntity httpEntity) {
this.entity = httpEntity;
}
......
}
1.2 Exception messages
1.2.1 Class 'x.x.x' does not implement interface 'y.y.y'
Class 'com.xosp.android.framework.http.client.entity.BodyParamsEntity' does not implement interface 'java.util.concurrent.locks.Lock' in call to 'void java.util.concurrent.locks.Lock.lock()' (declaration of 'org.apache.http.client.methods.HttpRequestBase' appears in /system/framework/org.apache.http.legacy.boot.jar)
1.2.2 Maybe bug 77342775, looking for ...
04-28 14:27:08.705 10893 11010 E com.xosp.example: Maybe bug 77342775, looking for Lorg/apache/http/HttpEntity; 0x137f34f0[continuous;main space (region space)] defined in /system/framework/org.apache.http.legacy.boot.jar/0x7884cd4e40
04-28 14:27:08.705 10893 11010 E com.xosp.example: with loader: dalvik.system.PathClassLoader/0x787c2f7400[hit:continuous;main space (region space)](/system/framework/org.apache.http.legacy.boot.jar/0x7884cd4e40:/data/app/com.xosp.example-eLNLa1Ch8IMzsPBMytsZrw==/base.apk/0x7884df81c0:+!classes2.dex/0x7884df8540:+!classes3.dex/0x7884df87e0);java.lang.BootClassLoader/0x7884cbb780
04-28 14:27:08.705 10893 11010 E com.xosp.example: in interface table for Ljava/util/concurrent/locks/ReentrantLock; 0x6f7af018[image;/data/dalvik-cache/arm64/system@framework@boot-core-oj.art;+;0x6f713000] defined in /system/framework/core-oj.jar/0x7884cd0140 ifcount=2
04-28 14:27:08.706 10893 11010 E com.xosp.example: with loader BootClassLoader
04-28 14:27:08.706 10893 11010 E com.xosp.example: iface #0: java.util.concurrent.locks.Lock
04-28 14:27:08.706 10893 11010 E com.xosp.example: iface #1: java.io.Serializable
1.3 HttpRequestBase (system implementation)
package org.apache.http.message;
public abstract class AbstractHttpMessage implements HttpMessage {
// !!! The system implementation declares 2 fields here, the same as the stub class (see section 1.1)
protected HeaderGroup headergroup;
protected HttpParams params;
protected AbstractHttpMessage(HttpParams params) {
this.headergroup = new HeaderGroup();
this.params = params;
}
......
}
package org.apache.http.client.methods;
public abstract class HttpRequestBase extends AbstractHttpMessage implements HttpUriRequest, AbortableHttpRequest, Cloneable {
// The system implementation declares 5 fields here; the stub class declares none
private Lock abortLock = new ReentrantLock();
private boolean aborted;
private ClientConnectionRequest connRequest;
private ConnectionReleaseTrigger releaseTrigger;
private URI uri;
@Override
public void setConnectionRequest(ClientConnectionRequest connRequest) throws IOException {
// The log shows that this is where the exception is thrown
this.abortLock.lock();
try {
if (this.aborted) {
throw new IOException("Request already aborted");
}
this.releaseTrigger = null;
this.connRequest = connRequest;
} finally {
this.abortLock.unlock();
}
}
......
}
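1.4 About inlining
- Many people who see this problem will first think of the inlining issue that plugin frameworks hit on Android 9. But if inlining were the cause, then because the classes in /system/framework/org.apache.http.legacy.boot.jar live in a separate dex file, the first error reported would be "Inlined method resolution crossed dex file boundary". We went through all of our logs (and the related reports online) and found no such message.
- In addition, an inlining problem should produce native crashes of the form "Fatal signal xx". The logs from all of our Android 9 devices show IncompatibleClassChangeError instead, and we found no native crash cases: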
java.lang.IncompatibleClassChangeError: Class 'com.xosp.android.framework.http.client.entity.BodyParamsEntity' does not implement interface 'java.util.concurrent.locks.Lock' in call to 'void java.util.concurrent.locks.Lock.lock()' (declaration of 'org.apache.http.client.methods.HttpRequestBase' appears in /system/framework/org.apache.http.legacy.boot.jar)
- So we began to suspect that the code produced by dex2oat's optimization was at fault, and that turned out to be the case:
2. dex2oat compilation modes
2.1 Reference links
2.2 The four compilation modes
- verify: only run DEX code verification (no AOT compilation).
- quicken: (until Android 11) run DEX code verification and optimize some DEX instructions to get better interpreter performance.
- speed: Run DEX code verification and AOT-compile all methods.
- speed-profile: Run DEX code verification and AOT-compile methods listed in a profile file.
In quicken mode, besides running DEX code verification, dex2oat also optimizes some DEX instructions. What exactly does that optimization do? To find out, we manually trigger dex2oat in each mode on an emulator and compare the results.
2.2.1 verify compilation
- adb shell cmd package compile -m verify -f com.xosp.example
11: void com.xosp.android.framework.http.client.HttpRequest.setEntity(org.apache.http.HttpEntity) (dex_method_idx=34256)
DEX CODE:
0x0000: 5b01 7041 | iput-object v1, v0, Lorg/apache/http/HttpEntity; com.xosp.android.framework.http.client.HttpRequest.entity // field@16752
0x0002: 7300 | return-void-no-barrier
OatMethodOffsets (offset=0x00000000)
code_offset: 0x00000000
OatQuickMethodHeader (offset=0x00000000)
vmap_table: (offset=0x00000000)
QuickMethodFrameInfo
frame_size_in_bytes: 0
core_spill_mask: 0x00000000
fp_spill_mask: 0x00000000
vr_stack_locations:
ins: v0[sp + #8] v1[sp + #12]
method*: v2[sp + #0]
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
- The oatdump output shows that in verify mode the DEX instruction is not optimized. It is still the plain iput-object v1, v0.
| Opcode (hex) | Opcode name | Explanation | Example |
|---|---|---|---|
| 5B | iput-object vx,vy,field_id | Puts the object reference in vx into an instance field. The instance is referenced by vy. | 5B20 0000 - iput-object v0, v2, LineReader.bis:Ljava/io/BufferedInputStream; // field@0000 Stores the object reference in v0 into field@0000 (entry #0 in the field table). The instance is referenced by v2. |
2.2.2 quicken compilation
- adb shell cmd package compile -m quicken -f com.xosp.example
11: void com.xosp.android.framework.http.client.HttpRequest.setEntity(org.apache.http.HttpEntity) (dex_method_idx=34256)
DEX CODE:
0x0000: e801 2400 | iput-object-quick v1, v0, // offset@36
0x0002: 7300 | return-void-no-barrier
OatMethodOffsets (offset=0x00000000)
code_offset: 0x00000000
OatQuickMethodHeader (offset=0x00000000)
vmap_table: (offset=0x00000000)
QuickMethodFrameInfo
frame_size_in_bytes: 0
core_spill_mask: 0x00000000
fp_spill_mask: 0x00000000
vr_stack_locations:
ins: v0[sp + #8] v1[sp + #12]
method*: v2[sp + #0]
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
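- In quicken mode, the instruction has been replaced with iput-object-quick v1, v0, // offset@36.
- iput-object-quick stores the field by writing directly to memory at the start address of the object's data area plus a fixed offset, instead of resolving the field by its field id at runtime.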
| Opcode (hex) | Opcode name | Explanation | Example |
|---|---|---|---|
| F7 | iput-object-quick vx,vy,offset | Puts the object reference value stored in vx into the data area of the instance referenced by vy, at the given offset. | F701 4C00 - iput-object-quick v1, v0, [obj+004c] Puts the object reference value in v1 at offset 4CH of the instance pointed to by v0. |
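To make the encoding concrete, here is a minimal Java sketch that decodes the four bytes oatdump prints for the quickened instruction (`e801 2400`). It assumes the bytes are printed in memory order and that the instruction uses the same layout as the regular iput-object (opcode byte, then a byte holding the two 4-bit registers, then a 16-bit little-endian operand). The class name `QuickInsnDecoder` is hypothetical and is not part of ART or oatdump. Note that the opcode byte in the dump is e8 rather than the F7 listed in the table; the decoder does not depend on the opcode value.

```java
/** Decodes a 4-byte quickened field access as printed by oatdump (illustrative sketch). */
final class QuickInsnDecoder {

    /** Returns {opcode, vA, vB, offset} for a 4-byte instruction. */
    static int[] decode(byte[] insn) {
        if (insn.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + insn.length);
        }
        int opcode = insn[0] & 0xff;            // first byte: opcode
        int vA = insn[1] & 0x0f;                // low nibble: register holding the value
        int vB = (insn[1] >> 4) & 0x0f;         // high nibble: register holding the object
        int offset = (insn[2] & 0xff) | ((insn[3] & 0xff) << 8); // 16-bit little-endian
        return new int[] {opcode, vA, vB, offset};
    }

    public static void main(String[] args) {
        byte[] setEntity = {(byte) 0xe8, 0x01, 0x24, 0x00}; // "e801 2400" from the dump
        int[] d = decode(setEntity);
        System.out.printf("op=0x%02x v%d, v%d, offset@%d%n", d[0], d[1], d[2], d[3]);
    }
}
```

The operand 0x0024 is 36 in decimal, which is the `offset@36` shown in the dump: the byte offset is baked directly into the instruction stream.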
2.2.3 speed compilation
- adb shell cmd package compile -m speed -f com.xosp.example
11: void com.xosp.android.framework.http.client.HttpRequest.setEntity(org.apache.http.HttpEntity) (dex_method_idx=34256)
DEX CODE:
0x0000: 5b01 7041 | iput-object v1, v0, Lorg/apache/http/HttpEntity; com.xosp.android.framework.http.client.HttpRequest.entity // field@16752
0x0002: 7300 | return-void-no-barrier
OatMethodOffsets (offset=0x0003c7c0)
code_offset: 0x004abbd0
OatQuickMethodHeader (offset=0x004abbb8)
vmap_table: (offset=0x0044d5b4)
Optimized CodeInfo (number_of_dex_registers=2, number_of_stack_maps=0)
StackMapEncoding (native_pc_bit_offset=0, dex_pc_bit_offset=0, dex_register_map_bit_offset=0, inline_info_bit_offset=0, register_mask_bit_offset=0, stack_mask_index_bit_offset=0, total_bit_size=0)
DexRegisterLocationCatalog (number_of_entries=0, size_in_bytes=0)
QuickMethodFrameInfo
frame_size_in_bytes: 0
core_spill_mask: 0x40000000 (r30)
fp_spill_mask: 0x00000000
vr_stack_locations:
ins: v0[sp + #8] v1[sp + #12]
method*: v2[sp + #0]
CODE: (code_offset=0x004abbd0 size_offset=0x004abbcc size=24)...
0x004abbd0: b9002422 str w2, [x1, #36]
0x004abbd4: 34000082 cbz w2, #+0x10 (addr 0x4abbe4)
0x004abbd8: f9404670 ldr x16, [tr, #136] ; card_table
0x004abbdc: 530a7c31 lsr w17, w1, #10
0x004abbe0: 38316a10 strb w16, [x16, x17]
0x004abbe4: d65f03c0 ret
- In speed mode, the CODE section contains compiled machine code. We will not go deeper here; interested readers can analyze it themselves.
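3. Problem analysis
- As the dex2oat modes above show, quicken rewrites DEX instructions while verify does not. As it happens, quicken is exactly the default compilation mode on Android 9. So we decided to dump the plugin's odex code and compare:
3.1 Android 8.0, loaded as a plugin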
!!! The classpath passed here is '&'
classpath = &
!!! The compilation mode here is quicken
compiler-filter = quicken
concurrent-copying = true
debuggable = false
dex2oat-cmdline = --instruction-set=arm64 --instruction-set-features=a53 --runtime-arg -Xrelocate --boot-image=/system/framework/boot.art --runtime-arg -Xms64m --runtime-arg -Xmx512m --instruction-set-variant=generic --instruction-set-features=default --dex-file=/data/data/com.xosp.example/files/plugin/0/base.apk --output-vdex-fd=196 --oat-fd=197 --oat-location=/data/data/com.xosp.example/files/plugin/0/oat/arm64/base.odex --compiler-filter=quicken --class-loader-context=&
dex2oat-host = Arm64
image-location = /data/dalvik-cache/arm64/system@framework@boot.art:/data/dalvik-cache/arm64/system@framework@boot-core-libart.art:/data/dalvik-cache/arm64/system@framework@boot-conscrypt.art:/data/dalvik-cache/arm64/system@framework@boot-okhttp.art:/data/dalvik-cache/arm64/system@framework@boot-bouncycastle.art:/data/dalvik-cache/arm64/system@framework@boot-apache-xml.art:/data/dalvik-cache/arm64/system@framework@boot-legacy-test.art:/data/dalvik-cache/arm64/system@framework@boot-ext.art:/data/dalvik-cache/arm64/system@framework@boot-framework.art:/data/dalvik-cache/arm64/system@framework@boot-telephony-common.art:/data/dalvik-cache/arm64/system@framework@boot-voip-common.art:/data/dalvik-cache/arm64/system@framework@boot-ims-common.art:/data/dalvik-cache/arm64/
!!! On Android 8.0, org.apache.http.legacy.boot.art is part of the image-location parameter
system@framework@boot-org.apache.http.legacy.boot.art:/data/dalvik-cache/arm64/system@framework@boot-android.hidl.base-V1.0-java.art:/data/dalvik-cache/arm64/system@framework@boot-android.hidl.manager-V1.0-java.art
native-debuggable = false
pic = false
11: void com.xosp.android.framework.http.client.HttpRequest.setEntity(org.apache.http.HttpEntity) (dex_method_idx=34256)
DEX CODE:
0x0000: e801 2400 | iput-object-quick v1, v0, // offset@36
0x0002: 7300 | return-void-no-barrier
......
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
8: org.apache.http.HttpEntity com.xosp.android.framework.http.client.HttpRequest.getEntity() (dex_method_idx=34252)
DEX CODE:
0x0000: e510 2400 | iget-object-quick v0, v1, // offset@36
0x0002: 1100 | return-object v0
......
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
- When the code is loaded as a plugin on an Android 8 device, setEntity compiles under dex2oat to iput-object-quick v1, v0, // offset@36. Note in particular that the offset is 36.
3.2 Android 9.0, loaded as a plugin
!!! The classpath passed here is '&'
classpath = &
!!! The compilation mode here is quicken
compiler-filter = quicken
concurrent-copying = true
debuggable = false
dex2oat-cmdline = /system/bin/dex2oat --instruction-set=arm64 --instruction-set-features=a53 --runtime-arg -Xhidden-api-checks --runtime-arg -Xrelocate --boot-image=/system/framework/boot.art --runtime-arg -Xms64m --runtime-arg -Xmx512m --instruction-set-variant=generic --instruction-set-features=default --dex-file=/./data/user/0/com.xosp.example/files/plugin/0/base.apk --output-vdex-fd=146 --oat-fd=152 --oat-location=/./data/user/0/com.xosp.example/files/plugin/0/oat/arm64/base.odex --compiler-filter=quicken --class-loader-context=&
dex2oat-host = Arm64
image-location = /data/dalvik-cache/arm64/system@framework@boot.art:/data/dalvik-cache/arm64/system@framework@boot-core-libart.art:/data/dalvik-cache/arm64/system@framework@boot-conscrypt.art:/data/dalvik-cache/arm64/system@framework@boot-okhttp.art:/data/dalvik-cache/arm64/system@framework@boot-bouncycastle.art:/data/dalvik-cache/arm64/system@framework@boot-apache-xml.art:/data/dalvik-cache/arm64/system@framework@boot-ext.art:/data/dalvik-cache/arm64/system@framework@boot-framework.art:/data/dalvik-cache/arm64/system@framework@boot-telephony-common.art:/data/dalvik-cache/arm64/system@framework@boot-voip-common.art:/data/dalvik-cache/arm64/system@framework@boot-ims-common.art:/data/dalvik-cache/arm64/system@framework@boot-android.hidl.base-V1.0-java.art:/data/dalvik-cache/arm64/system@framework@boot-android.hidl.manager-V1.0-java.art:/data/dalvik-cache/arm64/system@framework@boot-framework-oahl-backward-compatibility.art:/data/dalvik-cache/arm64/system@framework@boot-android.test.base.art
native-debuggable = false
pic = false
11: void com.xosp.android.framework.http.client.HttpRequest.setEntity(org.apache.http.HttpEntity) (dex_method_idx=34256)
DEX CODE:
0x0000: e801 1000 | iput-object-quick v1, v0, // offset@16
0x0002: 7300 | return-void-no-barrier
......
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
8: org.apache.http.HttpEntity com.xosp.android.framework.http.client.HttpRequest.getEntity() (dex_method_idx=34252)
DEX CODE:
0x0000: e510 1000 | iget-object-quick v0, v1, // offset@16
0x0002: 1100 | return-object v0
......
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
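- However, when the code is loaded as a plugin on an Android 9 device, setEntity compiles under dex2oat to iput-object-quick v1, v0, // offset@16. The offset has become 16. On Android 9 the image-location no longer contains org.apache.http.legacy.boot.art and the classpath passed to dex2oat is '&', so at compile time HttpRequestBase apparently resolves to the stub class bundled in the plugin.
- The offsets 16 and 36 differ by 20, which matches the 5 fields that the system HttpRequestBase declares and the stub does not: four 4-byte object references plus one boolean (17 bytes), with the subclass's first reference field aligned up to the next 4-byte boundary. See HttpRequestBase (system implementation) above.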
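The arithmetic behind the two offsets can be sketched in a few lines of Java. This is a simplified model, not ART's actual algorithm: it assumes an 8-byte object header, 4-byte heap references, reference fields laid out before primitives, and a subclass starting at its superclass's object size (rounded up to 4 bytes when it places a reference). It ignores the gap-filling step and every primitive type other than boolean. The class and method names are hypothetical.

```java
/** Simplified model of instance field layout, under the assumptions stated above. */
final class FieldOffsetSketch {
    static final int OBJECT_HEADER_SIZE = 8; // assumption: class pointer + lock word
    static final int REFERENCE_SIZE = 4;     // assumption: compressed heap references

    static int roundUp(int value, int alignment) {
        return (value + alignment - 1) / alignment * alignment;
    }

    /** Object size after appending one class's reference and boolean fields. */
    static int sizeAfter(int superSize, int referenceFields, int booleanFields) {
        int offset = superSize;
        if (referenceFields > 0) {
            offset = roundUp(offset, REFERENCE_SIZE) + referenceFields * REFERENCE_SIZE;
        }
        return offset + booleanFields; // booleans take one byte each in this model
    }

    /** Offset of the first reference field of a class whose superclass has the given size. */
    static int firstReferenceOffset(int superSize) {
        return roundUp(superSize, REFERENCE_SIZE);
    }

    public static void main(String[] args) {
        int message = sizeAfter(OBJECT_HEADER_SIZE, 2, 0); // headergroup, params
        int stubBase = sizeAfter(message, 0, 0);           // stub HttpRequestBase: no fields
        int systemBase = sizeAfter(message, 4, 1);         // 4 references + boolean aborted
        System.out.println("entity offset against stub:   " + firstReferenceOffset(stubBase));
        System.out.println("entity offset against system: " + firstReferenceOffset(systemBase));
    }
}
```

Under these assumptions the model reproduces both numbers from the dumps: 16 when HttpRequest is laid out on top of the stub, and 36 when it is laid out on top of the system class.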
3.3 Android 9.0, installed directly
!!! Here the classpath passes org.apache.http.legacy.boot.jar along
classpath = PCL[/system/framework/org.apache.http.legacy.boot.jar*796383208]
compilation-reason = install
compiler-filter = speed-profile
concurrent-copying = true
debuggable = false
dex2oat-cmdline = /system/bin/dex2oat --zip-fd=6 --zip-location=base.apk --input-vdex-fd=-1 --output-vdex-fd=8 --oat-fd=7 --oat-location=/data/app/com.xosp.example/oat/arm64/base.odex --instruction-set=arm64 --instruction-set-variant=generic --instruction-set-features=default --runtime-arg -Xms64m --runtime-arg -Xmx512m --compiler-filter=speed-profile --swap-fd=9 --app-image-fd=10 --image-format=lz4 --classpath-dir=/data/app/com.xosp.example-uVSqKqKftrQRujUYmb4Qwg== --class-loader-context=PCL[/system/framework/org.apache.http.legacy.boot.jar] --generate-mini-debug-info --compact-dex-level=none --runtime-arg -Xtarget-sdk-version:30 --runtime-arg -Xhidden-api-checks --compilation-reason=install
dex2oat-host = Arm64
image-location = /data/dalvik-cache/arm64/system@framework@boot.art:/data/dalvik-cache/arm64/system@framework@boot-core-libart.art:/data/dalvik-cache/arm64/system@framework@boot-conscrypt.art:/data/dalvik-cache/arm64/system@framework@boot-okhttp.art:/data/dalvik-cache/arm64/system@framework@boot-bouncycastle.art:/data/dalvik-cache/arm64/system@framework@boot-apache-xml.art:/data/dalvik-cache/arm64/system@framework@boot-ext.art:/data/dalvik-cache/arm64/system@framework@boot-framework.art:/data/dalvik-cache/arm64/system@framework@boot-telephony-common.art:/data/dalvik-cache/arm64/system@framework@boot-voip-common.art:/data/dalvik-cache/arm64/system@framework@boot-ims-common.art:/data/dalvik-cache/arm64/system@framework@boot-android.hidl.base-V1.0-java.art:/data/dalvik-cache/arm64/system@framework@boot-android.hidl.manager-V1.0-java.art:/data/dalvik-cache/arm64/system@framework@boot-framework-oahl-backward-compatibility.art:/data/dalvik-cache/arm64/system@framework@boot-android.test.base.art
native-debuggable = false
pic = false
11: void com.xosp.android.framework.http.client.HttpRequest.setEntity(org.apache.http.HttpEntity) (dex_method_idx=34256)
DEX CODE:
0x0000: e801 2400 | iput-object-quick v1, v0, // offset@36
0x0002: 7300 | return-void-no-barrier
......
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
8: org.apache.http.HttpEntity com.xosp.android.framework.http.client.HttpRequest.getEntity() (dex_method_idx=34252)
DEX CODE:
0x0000: e510 2400 | iget-object-quick v0, v1, // offset@36
0x0002: 1100 | return-object v0
......
CODE: (code_offset=0x00000000 size_offset=0x00000000 size=0)
NO CODE!
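- Finally, when the app is installed directly on an Android 9 device, setEntity compiles under dex2oat to iput-object-quick v1, v0, // offset@36. The offset is 36, the same as in the optimized code on Android 8.
4. Summary
- The analysis above shows that the root cause is exactly what the exception says: IncompatibleClassChangeError. Put simply, the code seen at compile time and the code present at runtime do not match.
- org.apache.http.legacy.jar keeps showing up because the plugin was packaged with the stub classes, while at runtime the classes provided by the system are loaded instead. The two differ in class layout, which leads to corrupted memory or mixed-up field references at runtime.
- In theory, since the offsets baked in by quicken depend on how the compile-time class differs from the runtime class, other exceptions may be triggered as well, not only IncompatibleClassChangeError.
- In a plugin setup, this situation is actually very easy to construct.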
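The chain from a stale offset to the exact error message can be illustrated with a toy model in Java. It is only a sketch: a Map keyed by byte offset stands in for the object's data area, the offsets 16 and 36 are taken from the dumps above, and the assumption that abortLock sits at offset 16 in the system class follows from it being the first reference field there. All names are hypothetical, and real ART raises IncompatibleClassChangeError at the interface call rather than returning a string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model: a quickened store with a stale offset overwrites abortLock. */
final class StaleOffsetDemo {
    static final int ABORT_LOCK_OFFSET = 16;     // assumed slot of abortLock in the system class
    static final int STALE_ENTITY_OFFSET = 16;   // offset baked in against the stub class
    static final int CORRECT_ENTITY_OFFSET = 36; // offset computed against the system class

    /** Stand-in for the plugin's entity class. */
    static final class BodyParamsEntity { }

    /** Simulated data area of a freshly constructed system HttpRequestBase. */
    static Map<Integer, Object> newRequest() {
        Map<Integer, Object> slots = new HashMap<>();
        slots.put(ABORT_LOCK_OFFSET, new ReentrantLock());
        return slots;
    }

    /** Models iput-object-quick: a blind write at a fixed offset. */
    static void setEntityQuick(Map<Integer, Object> request, int offset, Object entity) {
        request.put(offset, entity);
    }

    /** Models the system setConnectionRequest(), which calls abortLock.lock(). */
    static String setConnectionRequest(Map<Integer, Object> request) {
        Object slot = request.get(ABORT_LOCK_OFFSET);
        if (!(slot instanceof Lock)) {
            return "Class '" + slot.getClass().getSimpleName()
                    + "' does not implement interface 'Lock'";
        }
        Lock lock = (Lock) slot;
        lock.lock();
        try {
            return "ok";
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        Map<Integer, Object> bad = newRequest();
        setEntityQuick(bad, STALE_ENTITY_OFFSET, new BodyParamsEntity());
        System.out.println(setConnectionRequest(bad));

        Map<Integer, Object> good = newRequest();
        setEntityQuick(good, CORRECT_ENTITY_OFFSET, new BodyParamsEntity());
        System.out.println(setConnectionRequest(good));
    }
}
```

In the stale case the slot that the system code treats as abortLock holds a BodyParamsEntity, which is the same shape as the message in section 1.2.1.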
5. Related links
Add extra logging for bug 77342775.
6. More
6.1 How field offsets are computed
bool ClassLinker::LinkFields(Thread* self,
Handle<mirror::Class> klass,
bool is_static,
size_t* class_size) {
self->AllowThreadSuspension();
const size_t num_fields = is_static ? klass->NumStaticFields() : klass->NumInstanceFields();
LengthPrefixedArray<ArtField>* const fields = is_static ? klass->GetSFieldsPtr() :
klass->GetIFieldsPtr();
// Initialize field_offset
MemberOffset field_offset(0);
if (is_static) {
field_offset = klass->GetFirstReferenceStaticFieldOffsetDuringLinking(image_pointer_size_);
} else {
ObjPtr<mirror::Class> super_class = klass->GetSuperClass();
if (super_class != nullptr) {
CHECK(super_class->IsResolved())
<< klass->PrettyClass() << " " << super_class->PrettyClass();
field_offset = MemberOffset(super_class->GetObjectSize());
}
}
CHECK_EQ(num_fields == 0, fields == nullptr) << klass->PrettyClass();
// we want a relatively stable order so that adding new fields
// minimizes disruption of C++ version such as Class and Method.
//
// The overall sort order order is:
// 1) All object reference fields, sorted alphabetically.
// 2) All java long (64-bit) integer fields, sorted alphabetically.
// 3) All java double (64-bit) floating point fields, sorted alphabetically.
// 4) All java int (32-bit) integer fields, sorted alphabetically.
// 5) All java float (32-bit) floating point fields, sorted alphabetically.
// 6) All java char (16-bit) integer fields, sorted alphabetically.
// 7) All java short (16-bit) integer fields, sorted alphabetically.
// 8) All java boolean (8-bit) integer fields, sorted alphabetically.
// 9) All java byte (8-bit) integer fields, sorted alphabetically.
//
// Once the fields are sorted in this order we will attempt to fill any gaps that might be present
// in the memory layout of the structure. See ShuffleForward for how this is done.
......
......
......
}
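The sort order described in the comment above can be sketched in Java as follows. This only models the ordering rule as the comment states it (references first, then the primitive categories in the listed order, alphabetical within a category); it does not model the gap filling or the offset assignment that follow. The class name and the {type, name} representation are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Orders fields by the categories listed in the LinkFields comment (illustrative sketch). */
final class FieldOrderSketch {
    // Primitive categories in the order the comment lists them; references come before all.
    static final List<String> PRIMITIVE_ORDER = Arrays.asList(
            "long", "double", "int", "float", "char", "short", "boolean", "byte");

    /** Rank 0 for anything that is not a listed primitive, i.e. an object reference. */
    static int rank(String type) {
        int index = PRIMITIVE_ORDER.indexOf(type);
        return index < 0 ? 0 : index + 1;
    }

    /** Each field is a {type, name} pair; returns the field names in layout order. */
    static List<String> layoutOrder(List<String[]> fields) {
        List<String[]> sorted = new ArrayList<>(fields);
        sorted.sort((a, b) -> {
            int byRank = Integer.compare(rank(a[0]), rank(b[0]));
            return byRank != 0 ? byRank : a[1].compareTo(b[1]);
        });
        List<String> names = new ArrayList<>();
        for (String[] field : sorted) {
            names.add(field[1]);
        }
        return names;
    }

    public static void main(String[] args) {
        // Fields of the system HttpRequestBase, in declaration order.
        List<String[]> fields = Arrays.asList(
                new String[] {"Lock", "abortLock"},
                new String[] {"boolean", "aborted"},
                new String[] {"ClientConnectionRequest", "connRequest"},
                new String[] {"ConnectionReleaseTrigger", "releaseTrigger"},
                new String[] {"URI", "uri"});
        System.out.println(layoutOrder(fields));
    }
}
```

Applied to the system HttpRequestBase, the four references come first and the boolean last. Applied to the plugin's HttpRequest, entity sorts ahead of method, uriBuilder and uriCharset, which is consistent with entity landing at the first offset after the superclass.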