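Background
During a round of random testing, something strange showed up: every program that called functions in one particular library file hit an ILL_ILLOPC illegal-instruction exception, and the memory around the PC address was all zeros. The file lives on a read-only system partition and the file on disk was not corrupted, yet part of its contents was missing in memory. Fortunately, this very low-probability issue happened to hit a scenario that saved a Ramdump file.
Error signature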
Cmdline: /vendor/bin/hw/vendor.qti.hardware.display.composer-service
pid: 1967, tid: 1967, name: binder:1967_2 >>> /vendor/bin/hw/vendor.qti.hardware.display.composer-service <<<
uid: 1000
tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)
signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0x000000772cb25cf4
x0 0000000000000000 x1 0000007fee454720 x2 0000005dc0bc044c x3 0000007fee454670
x4 0000007fee454664 x5 0000000000000001 x6 000000000000003f x7 0000000000000000
x8 0000000000000000 x9 0000000000000001 x10 0000000000000044 x11 0000000000000003
x12 0000000000000028 x13 0000000000000000 x14 b40000748b0788d0 x15 0000000000000000
x16 00000074864df850 x17 000000772cb25cf4 x18 000000772d22c000 x19 0000007fee454720
x20 b40000760b010910 x21 0000005dc0bc044c x22 0000007fee454720 x23 000000772d152f00
x24 0000007fee454668 x25 00000000fffffc0c x26 0000000000000000 x27 0000000000000000
x28 0000007fee454720 x29 0000007fee4545e0
lr 00000074864d3380 sp 0000007fee454590 pc 000000772cb25cf4 pst 0000000060001400
28 total frames
backtrace:
#00 pc 0000000000023cf4 /apex/com.android.runtime/lib64/bionic/libm.so
...
memory near pc (/apex/com.android.runtime/lib64/bionic/libm.so):
000000772cb25cd0 0000000000000000 0000000000000000 ................
000000772cb25ce0 0000000000000000 0000000000000000 ................
000000772cb25cf0 0000000000000000 0000000000000000 ................
000000772cb25d00 0000000000000000 0000000000000000 ................
000000772cb25d10 0000000000000000 0000000000000000 ................
000000772cb25d20 0000000000000000 0000000000000000 ................
000000772cb25d30 0000000000000000 0000000000000000 ................
000000772cb25d40 0000000000000000 0000000000000000 ................
000000772cb25d50 0000000000000000 0000000000000000 ................
000000772cb25d60 0000000000000000 0000000000000000 ................
000000772cb25d70 0000000000000000 0000000000000000 ................
000000772cb25d80 0000000000000000 0000000000000000 ................
000000772cb25d90 0000000000000000 0000000000000000 ................
000000772cb25da0 0000000000000000 0000000000000000 ................
000000772cb25db0 0000000000000000 0000000000000000 ................
000000772cb25dc0 0000000000000000 0000000000000000 ................
Ramdump analysis
crash-android> ps | grep logd
1082 1 2 ffffff88109b8000 IN 0.2 11208532 72896 logd
1108 1 2 ffffff88129a0000 IN 0.2 11208532 72896 logd.reader
1109 1 0 ffffff88129a1640 IN 0.2 11208532 72896 logd.writer
1110 1 0 ffffff881556ac80 IN 0.2 11208532 72896 logd.control
1124 1 0 ffffff8812b4ac80 IN 0.2 11208532 72896 logd.klogd
1125 1 0 ffffff8812b4d900 IN 0.2 11208532 72896 logd.auditd
crash-android> lp core -p 1082 --zram
Saved [1082.core].
core-parser -c 1082.core
core-parser> logcat -b crash
...
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: pid: 8754, tid: 8760, name: surfaceflinger >>> /system/bin/surfaceflinger <<<
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: uid: 1000
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0x0000007d9ff25dc0
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x0 0000000000000001 x1 0000000000000001 x2 0000000000000000 x3 0000000000000000
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x4 b400007c856301d0 x5 0000007da18e9580 x6 b400007c15623fd0 x7 b400007c15623fd0
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x8 0000007da18e9580 x9 0000007da13f4304 x10 0000007da11c66c8 x11 0000007da18e9054
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x12 00000000000002b7 x13 0000000000000000 x14 b400007cf56122fd x15 0000000000000096
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x16 0000007da1a464b0 x17 0000007d9ff25dc0 x18 0000007ae4488000 x19 0000000000000001
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x20 0000000000000001 x21 3fd0c15240000000 x22 0000007ae4ff6ad0 x23 0000000000000000
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x24 b400007c15623fd0 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: x28 0000000000000000 x29 0000007ae4ff6b80
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: lr 0000007da18e9abc sp 0000007ae4ff6aa0 pc 0000007d9ff25dc0 pst 0000000060001400
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: 32 total frames
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: backtrace:
2026-02-09 02:28:17.422 1000 8765 8765 F DEBUG: #00 pc 0000000000024dc0 /apex/com.android.runtime/lib64/bionic/libm.so (tan+0) (BuildId: a985a539ac1f4bfe3de003f47a1575ed)
...
2026-02-09 02:28:17.863 1000 8804 8804 F DEBUG: Cmdline: /vendor/bin/hw/vendor.qti.hardware.display.composer-service
2026-02-09 02:28:17.863 1000 8804 8804 F DEBUG: pid: 8769, tid: 8769, name: vendor.qti.hard >>> /vendor/bin/hw/vendor.qti.hardware.display.composer-service <<<
2026-02-09 02:28:17.863 1000 8804 8804 F DEBUG: uid: 1000
2026-02-09 02:28:17.863 1000 8804 8804 F DEBUG: tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
2026-02-09 02:28:17.863 1000 8804 8804 F DEBUG: pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0x000000784fa24cf4
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x0 0000000000000000 x1 0000000000000000 x2 ffffffffffffffc0 x3 0000000000000010
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x4 0000000000000000 x5 0000000000000040 x6 000000000000003f x7 0000000000000000
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x8 00000000fffffffb x9 0000000000000ad0 x10 0000000000000001 x11 0000000000000a30
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x12 0000000000000000 x13 0000000000000000 x14 0000000000000000 x15 7d0000003c8c0000
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x16 00000075b39f9850 x17 000000784fa24cf4 x18 0000007854880000 x19 0000000000000001
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x20 0000000000000000 x21 0000000000000000 x22 0000007fdd1647e0 x23 b40000778400a510
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x24 b400007644008eb0 x25 00000000000000a4 x26 b400007724008ed0 x27 0000000000000000
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: x28 b40000778400af74 x29 0000007fdd164590
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: lr 00000075b39ecf20 sp 0000007fdd164560 pc 000000784fa24cf4 pst 0000000080001400
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: 21 total frames
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: backtrace:
2026-02-09 02:28:17.864 1000 8804 8804 F DEBUG: #00 pc 0000000000023cf4 /apex/com.android.runtime/lib64/bionic/libm.so (scalbnf+0) (BuildId: a985a539ac1f4bfe3de003f47a1575ed)
...
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: Cmdline: media.swcodec oid.media.swcodec/bin/mediaswcodec
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: pid: 8733, tid: 8735, name: binder:8733_3 >>> media.swcodec <<<
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: uid: 1046
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr --------
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: x0 9117dd6f103851fa x1 0000007e81ce0738 x2 0000000000000000 x3 0000000000000030
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: x4 0000000000000000 x5 b400007d70e0419c x6 d20020000000000c x7 0000000000000000
2026-02-09 02:28:18.037 1046 8733 8735 F DEBUG: x8 000000007f7fffff x9 0000007e81cc0000 x10 0000000000000004 x11 0200007c50e11eb0
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: x12 0000000000040004 x13 657461722d656d61 x14 0a1b350e0000102c x15 0000000000000000
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: x16 0000007bcc8733d8 x17 0000007e7a923498 x18 0000007bd25b4000 x19 b400007be0e0a838
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: x20 214557310473bf25 x21 b400007be0e0ab68 x22 b400007c10e132e0 x23 0000007bcc8728d8
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: x24 9117dd6f103851fa x25 0000000000000000 x26 0000000000000000 x27 0000007bd36f9bd0
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: x28 0000007bd36f9bc0 x29 0000007bd36f9e10
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: lr 0000007bcc858b8c sp 0000007bd36f9ad0 pc 0000007e7a923498 pst 0000000080001400
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: 18 total frames
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: backtrace:
2026-02-09 02:28:18.038 1046 8733 8735 F DEBUG: #00 pc 0000000000023498 /apex/com.android.runtime/lib64/bionic/libm.so (nextafterf+0) (BuildId: a985a539ac1f4bfe3de003f47a1575ed)
...
core-parser>
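Pulling the last Android logs out of the Ramdump shows that multiple processes calling functions in libm.so all hit ILL_ILLOPC. The faulting addresses are concentrated in the file pages at OFFSET: 0x23000 and OFFSET: 0x24000.
| Frame offset | File path (symbol) |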
|---|---|
| #00 pc 0000000000023498 | /apex/com.android.runtime/lib64/bionic/libm.so (nextafterf+0) |
| #00 pc 0000000000024dc0 | /apex/com.android.runtime/lib64/bionic/libm.so (tan+0) |
| #00 pc 0000000000023cf4 | /apex/com.android.runtime/lib64/bionic/libm.so (scalbnf+0) |
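Mapping the frame offsets to file pages is plain arithmetic. A minimal sketch, assuming 4 KiB pages (consistent with the 0x1000 steps in the dumps); `page_of` is a hypothetical helper, not part of any tool used here:

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def page_of(offset: int) -> int:
    """Return the page-aligned file offset that contains `offset`."""
    return offset & ~(PAGE_SIZE - 1)


# Frame #00 offsets copied from the three tombstones in the log
frames = {"nextafterf": 0x23498, "tan": 0x24dc0, "scalbnf": 0x23cf4}
pages = {name: page_of(pc) for name, pc in frames.items()}

for name, page in pages.items():
    print(f"{name}: pc {frames[name]:#x} -> file page {page:#x}")
```

All three crash sites fall into just two file pages, which is why the rest of the analysis concentrates on those offsets.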
Processes still alive at the end
crash-android> lp cmdline -a | grep surfaceflinger
crash-android> lp cmdline -a | grep composer
crash-android> lp cmdline -a | grep mediaswcodec
PID: 8905 media.swcodec oid.media.swcodec/bin/mediaswcodec
crash-android>
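Searching memory for the three repeatedly crashing processes, only mediaswcodec (PID 8905) is still present: an instance that had just been restarted and had not yet reached the crash location.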
Process memory analysis
crash-android> set 8905
PID: 8905
COMMAND: "binder:8905_2"
TASK: ffffff88575e2c80 [THREAD_INFO: ffffff88575e2c80]
CPU: 6
STATE: TASK_INTERRUPTIBLE
crash-android>
crash-android> vm -p
...
VMA START END FLAGS FILE
ffffff8921ff7100 7e86c80000 7e86c94000 800000000000071 /apex/com.android.runtime/lib64/bionic/libm.so
VIRTUAL PHYSICAL
7e86c80000 9d9fe3000
7e86c81000 a07334000
7e86c82000 881549000
7e86c83000 abe5e9000
7e86c84000 f00e7000
7e86c85000 ad3ed2000
7e86c86000 9309c5000
7e86c87000 89532e000
7e86c88000 895337000
7e86c89000 9a5135000
7e86c8a000 a93e59000
7e86c8b000 ad1293000
7e86c8c000 ab88a8000
7e86c8d000 d3612000
7e86c8e000 a3d844000
7e86c8f000 a53bdf000
7e86c90000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 10000
7e86c91000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 11000
7e86c92000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 12000
7e86c93000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 13000
VMA START END FLAGS FILE
ffffff8921ff7500 7e86c94000 7e86cb8000 1000075 /apex/com.android.runtime/lib64/bionic/libm.so
VIRTUAL PHYSICAL
7e86c94000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 14000
7e86c95000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 15000
7e86c96000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 16000
7e86c97000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 17000
7e86c98000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 18000
7e86c99000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 19000
7e86c9a000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 1a000
7e86c9b000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 1b000
7e86c9c000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 1c000
7e86c9d000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 1d000
7e86c9e000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 1e000
7e86c9f000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 1f000
7e86ca0000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 20000
7e86ca1000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 21000
7e86ca2000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 22000
7e86ca3000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 23000
7e86ca4000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 24000
7e86ca5000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 25000
7e86ca6000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 26000
7e86ca7000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 27000
7e86ca8000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 28000
7e86ca9000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 29000
7e86caa000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 2a000
7e86cab000 FILE: /apex/com.android.runtime/lib64/bionic/libm.so OFFSET: 2b000
...
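Judging from the current state of the process, the problem occurs when execution reaches a math library function and triggers a page fault: the page that gets mapped in is abnormally all zeros.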
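The relation between file offsets and virtual addresses in the vm -p listing can be sketched as follows. This is a minimal illustration; `vaddr_for_file_offset` is a hypothetical helper, and the VMA start and file offset are taken from the executable (flags 1000075) mapping of libm.so in PID 8905:

```python
def vaddr_for_file_offset(vma_start: int, vma_file_offset: int, file_offset: int) -> int:
    """Translate a file offset into the virtual address inside a file-backed VMA."""
    return vma_start + (file_offset - vma_file_offset)


# Executable VMA of libm.so: starts at 0x7e86c94000 and maps file offset 0x14000
RX_START, RX_FILE_OFF = 0x7e86c94000, 0x14000

page_23 = vaddr_for_file_offset(RX_START, RX_FILE_OFF, 0x23000)
nextafterf = vaddr_for_file_offset(RX_START, RX_FILE_OFF, 0x23498)
print(hex(page_23), hex(nextafterf))
```

The first result matches the 7e86ca3000 row of the listing, which is still shown as FILE/OFFSET, meaning this process has not faulted that page in yet.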
Page memory verification
crash-android> lp core -p 8905 --zram
Saved [8905.core].
core-parser -c 8905.core
// sysroot was not used to load pages of the original file into core-parser, so this is the raw memory extracted from the ramdump
core-parser> map | grep libm.so
64 0x7ebcfa1910 [7e86c80000, 7e86c94000) r-- 7e86c80000 /apex/com.android.runtime/lib64/bionic/libm.so [*]
core-parser> map -s 64
VADDR SIZE INFO NAME
0000007e86ca603c 000000000000000c 0000000000000012 ceil
0000007e86ca630c 0000000000000018 0000000000000012 fetestexcept
0000007e86cb2ccc 0000000000000310 0000000000000012 expl
0000007e86ca5c4c 0000000000000098 0000000000000012 cexpl
0000007e86c9fc6c 000000000000008c 0000000000000012 cprojf
0000007e86ca60cc 000000000000000c 0000000000000012 floor
0000007e86c9e218 0000000000000280 0000000000000012 ccosh
0000007e86ca6098 000000000000000c 0000000000000012 fabs
0000007e86ca3278 000000000000007c 0000000000000012 nearbyintf
0000007e86cb4808 0000000000000238 0000000000000012 cosf
0000007e86cac808 0000000000000114 0000000000000012 roundl
0000007e86ca8600 000000000000026c 0000000000000012 fmodl
0000007e86c87dbc 0000000000000008 0000000000000011 __fe_dfl_env
0000007e86ca616c 000000000000000c 0000000000000012 lround
0000007e86ca60d8 000000000000000c 0000000000000012 floorf
0000007e86c9e754 000000000000004c 0000000000000012 ccosf
0000007e86ca7e98 00000000000002d8 0000000000000012 asinl
...
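core-parser can parse the dynamic symbol information from the existing ramdump normally, which suggests that, apart from the code segment, the pages this process has already loaded are basically correct.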
core-parser> env core --load | grep libm.so
287 [7e86c80000, 7e86c94000) r-- 0000014000 0000014000 /apex/com.android.runtime/lib64/bionic/libm.so [*]
288 [7e86c94000, 7e86cb8000) r-x 0000024000 0000024000 /apex/com.android.runtime/lib64/bionic/libm.so [*]
289 [7e86cb8000, 7e86cb9000) r-- 0000001000 0000001000 /apex/com.android.runtime/lib64/bionic/libm.so [*]
291 [7e86cbc000, 7e86cbd000) rw- 0000001000 0000001000 /apex/com.android.runtime/lib64/bionic/libm.so [*]
core-parser> rd 7e86c80000 -e 7e86c94000 -f 7e86c80000.bin
core-parser> rd 7e86c94000 -e 7e86cb8000 -f 7e86c94000.bin
core-parser> rd 7e86cb8000 -e 7e86cb9000 -f 7e86cb8000.bin
core-parser> rd 7e86cbc000 -e 7e86cbd000 -f 7e86cbc000.bin
readelf -l 7e86c80000.bin
readelf: Error: Reading 1856 bytes extends past end of file for section headers
Elf file type is DYN (Shared object file)
Entry point 0x0
There are 12 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x00000000000002a0 0x00000000000002a0 R 0x8
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000012834 0x0000000000012834 R 0x4000
LOAD 0x0000000000014000 0x0000000000014000 0x0000000000014000
0x0000000000023ae8 0x0000000000023ae8 R E 0x4000
LOAD 0x0000000000038000 0x0000000000038000 0x0000000000038000
0x00000000000002e0 0x0000000000001000 RW 0x4000
LOAD 0x000000000003c000 0x000000000003c000 0x000000000003c000
0x0000000000000080 0x00000000000000a0 RW 0x4000
DYNAMIC 0x0000000000038018 0x0000000000038018 0x0000000000038018
0x00000000000001c0 0x00000000000001c0 RW 0x8
readelf: Error: the dynamic segment offset + size exceeds the size of the file
GNU_RELRO 0x0000000000038000 0x0000000000038000 0x0000000000038000
0x00000000000002e0 0x0000000000001000 R 0x1
GNU_EH_FRAME 0x000000000000dbc0 0x000000000000dbc0 0x000000000000dbc0
0x0000000000000b04 0x0000000000000b04 R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x0
GNU_PROPERTY 0x0000000000000330 0x0000000000000330 0x0000000000000330
0x0000000000000020 0x0000000000000020 R 0x8
NOTE 0x00000000000002e0 0x00000000000002e0 0x00000000000002e0
0x0000000000000050 0x0000000000000050 R 0x4
NOTE 0x0000000000000330 0x0000000000000330 0x0000000000000330
0x0000000000000020 0x0000000000000020 R 0x8
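Comparing 7e86c80000.bin and 7e86cb8000.bin against the original file confirms that they are correct. This also explains why the program could load and link the dynamic library normally, with the error only appearing when the affected code runs.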
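The original does not say how the comparison was done; one possible approach is a page-by-page diff that also flags all-zero pages. The sketch below is an assumption-laden illustration: `diff_pages` is a hypothetical helper, pages are assumed to be 4 KiB, and the data is a synthetic stand-in rather than real libm.so contents.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def diff_pages(dumped: bytes, original: bytes, base_offset: int = 0):
    """Yield (file_offset, is_all_zero) for every page where the dump differs."""
    for off in range(0, min(len(dumped), len(original)), PAGE_SIZE):
        page = dumped[off:off + PAGE_SIZE]
        if page != original[off:off + PAGE_SIZE]:
            yield base_offset + off, page.count(0) == len(page)


# Synthetic stand-in data: three pages of a repeating non-zero pattern
original = bytes(range(256)) * 48
dumped = bytearray(original)
dumped[PAGE_SIZE:2 * PAGE_SIZE] = bytes(PAGE_SIZE)  # zero the middle page

result = list(diff_pages(bytes(dumped), original, base_offset=0x14000))
print([(hex(off), zero) for off, zero in result])
```

With real data, `dumped` would be the contents of a segment file such as 7e86c94000.bin and `original` the bytes of libm.so starting at the segment's file offset.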
core-parser> disas nextafterf
LIB: /apex/com.android.runtime/lib64/bionic/libm.so
nextafterf: [7e86ca3498, 7e86ca3570]
0x7e86ca3498: 00000000 | udf #0
0x7e86ca349c: 00000000 | udf #0
0x7e86ca34a0: 00000000 | udf #0
0x7e86ca34a4: 00000000 | udf #0
0x7e86ca34a8: 00000000 | udf #0
0x7e86ca34ac: 00000000 | udf #0
0x7e86ca34b0: 00000000 | udf #0
0x7e86ca34b4: 00000000 | udf #0
0x7e86ca34b8: 00000000 | udf #0
0x7e86ca34bc: 00000000 | udf #0
0x7e86ca34c0: 00000000 | udf #0
0x7e86ca34c4: 00000000 | udf #0
0x7e86ca34c8: 00000000 | udf #0
...
core-parser> disas scalbnf
LIB: /apex/com.android.runtime/lib64/bionic/libm.so
scalbnf: [7e86ca3cf4, 7e86ca3d80]
0x7e86ca3cf4: 00000000 | udf #0
0x7e86ca3cf8: 00000000 | udf #0
0x7e86ca3cfc: 00000000 | udf #0
0x7e86ca3d00: 00000000 | udf #0
0x7e86ca3d04: 00000000 | udf #0
0x7e86ca3d08: 00000000 | udf #0
0x7e86ca3d0c: 00000000 | udf #0
0x7e86ca3d10: 00000000 | udf #0
0x7e86ca3d14: 00000000 | udf #0
0x7e86ca3d18: 00000000 | udf #0
...
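At this point the process has not yet mapped the file page at 0x23000 into its address space. So the problem occurs when that code is executed: a page fault is triggered and the corresponding page of the original file is brought in. Because this instance had not yet called into the math library at the moment of the dump, the process is still alive and can be found in the Ramdump. What if the process cannot be found? (There are many ways.)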
Page cache
crash-android> struct vm_area_struct ffffff8921ff7100 -x
struct vm_area_struct {
{
{
vm_start = 0x7e86c80000,
vm_end = 0x7e86c94000
},
vm_freeptr = {
v = 0x7e86c80000
}
},
vm_mm = 0xffffff80278b9400,
vm_page_prot = {
pgprot = 0x60000000000fc3
},
{
vm_flags = 0x800000000000071,
__vm_flags = 0x800000000000071
},
vm_lock_seq = 0x1d73,
anon_vma_chain = {
next = 0xffffff8921ff7130,
prev = 0xffffff8921ff7130
},
anon_vma = 0x0,
vm_ops = 0xffffffebed8177f0 <generic_file_vm_ops>,
vm_pgoff = 0x0,
vm_file = 0xffffff89deb52c00,
vm_private_data = 0x0,
...
crash-android>
crash-android> struct file 0xffffff89deb52c00
struct file {
f_count = {
counter = 4
},
...
f_mode = 688157,
f_op = 0xffffffebed8e82c0 <erofs_file_fops>,
f_mapping = 0xffffff8845d11d60,
private_data = 0x0,
f_inode = 0xffffff8845d11bd8,
f_flags = 131072,
f_iocb_flags = 0,
f_cred = 0xffffff80424a9780,
f_path = {
mnt = 0xffffff880f8c6920,
dentry = 0xffffff88366e3380
},
...
crash-android> struct inode 0xffffff8845d11bd8
struct inode {
i_mode = 33188,
i_opflags = 13,
i_uid = {
val = 1000
},
i_gid = {
val = 1000
},
i_flags = 0,
i_acl = 0x0,
i_default_acl = 0xffffffffffffffff,
i_op = 0xffffffebed8e7f00 <erofs_generic_iops>,
i_sb = 0xffffff801e149000,
i_mapping = 0xffffff8845d11d60,
i_security = 0xffffff8818ac01c0,
i_ino = 316544,
{
i_nlink = 1,
__i_nlink = 1
},
...
crash-android> files -p 0xffffff8845d11bd8
INODE NRPAGES
ffffff8845d11bd8 61
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffffee567f8c0 9d9fe3000 ffffff8845d11d60 0 11 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee61ccd00 a07334000 ffffff8845d11d60 1 10 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee0055240 881549000 ffffff8845d11d60 2 6 400000000000012c referenced,uptodate,lru,active
fffffffee8f97a40 abe5e9000 ffffff8845d11d60 3 14 400000000000032c referenced,uptodate,lru,active,workingset
fffffffec1c039c0 f00e7000 ffffff8845d11d60 4 6 400000000000012c referenced,uptodate,lru,active
fffffffee94fb480 ad3ed2000 ffffff8845d11d60 5 14 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee2c27140 9309c5000 ffffff8845d11d60 6 6 400000000000012c referenced,uptodate,lru,active
fffffffee054cb80 89532e000 ffffff8845d11d60 7 6 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee054cdc0 895337000 ffffff8845d11d60 8 6 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee4944d40 9a5135000 ffffff8845d11d60 9 9 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee84f9640 a93e59000 ffffff8845d11d60 a 6 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee944a4c0 ad1293000 ffffff8845d11d60 b 6 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee8e22a00 ab88a8000 ffffff8845d11d60 c 6 400000000000032c referenced,uptodate,lru,active,workingset
fffffffec14d8480 d3612000 ffffff8845d11d60 d 6 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee6f61100 a3d844000 ffffff8845d11d60 e 6 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee74ef7c0 a53bdf000 ffffff8845d11d60 f 15 400000000000032c referenced,uptodate,lru,active,workingset
fffffffee7b608c0 a6d823000 ffffff8845d11d60 10 1 400000000000032c referenced,uptodate,lru,active,workingset
fffffffec1553b80 d54ee000 ffffff8845d11d60 11 1 400000000000032c referenced,uptodate,lru,active,workingset
fffffffec1c32d40 f0cb5000 ffffff8845d11d60 12 1 400000000000032c referenced,uptodate,lru,active,workingset
fffffffec19e5880 e7962000 ffffff8845d11d60 13 1 400000000000032c referenced,uptodate,lru,active,workingset
fffffffec1ac2c80 eb0b2000 ffffff8845d11d60 14 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee6f95780 a3e55e000 ffffff8845d11d60 15 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee8c5e500 ab1794000 ffffff8845d11d60 16 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee831c640 a8c719000 ffffff8845d11d60 17 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffec1ad0240 eb409000 ffffff8845d11d60 18 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffec0af8680 abe1a000 ffffff8845d11d60 19 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee7dc1b00 a7706c000 ffffff8845d11d60 1a 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee808d000 a82340000 ffffff8845d11d60 1b 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee90c2e00 ac30b8000 ffffff8845d11d60 1c 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee4987ec0 9a61fb000 ffffff8845d11d60 1d 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee8972740 aa5c9d000 ffffff8845d11d60 1e 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee6bb7900 a2ede4000 ffffff8845d11d60 1f 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee7916f80 a645be000 ffffff8845d11d60 20 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee9435b40 ad0d6d000 ffffff8845d11d60 21 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee7e03680 a780da000 ffffff8845d11d60 22 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee7f7ce80 a7df3a000 ffffff8845d11d60 23 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee60f7580 a03dd6000 ffffff8845d11d60 24 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee7e22b40 a788ad000 ffffff8845d11d60 25 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee634d780 a0d35e000 ffffff8845d11d60 26 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee2389080 90e242000 ffffff8845d11d60 27 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee7142e80 a450ba000 ffffff8845d11d60 28 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee41dd600 987758000 ffffff8845d11d60 29 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee6e0a1c0 a38287000 ffffff8845d11d60 2a 3 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee4fc3d80 9bf0f6000 ffffff8845d11d60 2b 3 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee7f03600 a7c0d8000 ffffff8845d11d60 2c 1 400000000000212c referenced,uptodate,lru,active,arch_1
fffffffee6e0fec0 a383fb000 ffffff8845d11d60 2d 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee759f3c0 a567cf000 ffffff8845d11d60 2e 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee40dbac0 9836eb000 ffffff8845d11d60 2f 1 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee6a9ed00 a2a7b4000 ffffff8845d11d60 30 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee55be9c0 9d6fa7000 ffffff8845d11d60 31 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee7e50440 a79411000 ffffff8845d11d60 32 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee6e8f140 a3a3c5000 ffffff8845d11d60 33 6 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffec00c99c0 83267000 ffffff8845d11d60 34 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffec00c9980 83266000 ffffff8845d11d60 35 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee664c200 a19308000 ffffff8845d11d60 36 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee8a8a440 aaa291000 ffffff8845d11d60 37 2 400000000000232c referenced,uptodate,lru,active,workingset,arch_1
fffffffee87c0440 a9f011000 ffffff8845d11d60 38 1 400000000000012c referenced,uptodate,lru,active
fffffffee6ac5380 a2b14e000 ffffff8845d11d60 39 1 400000000000012c referenced,uptodate,lru,active
fffffffee71a0140 a46805000 ffffff8845d11d60 3a 1 400000000000012c referenced,uptodate,lru,active
fffffffee63df800 a0f7e0000 ffffff8845d11d60 3b 1 400000000000012c referenced,uptodate,lru,active
fffffffee891a800 aa46a0000 ffffff8845d11d60 3c 1 400000000000012c referenced,uptodate,lru,active
crash-android> rd -p a03dd6000 -e a03dd6100
a03dd6000: 0000000000000000 0000000000000000 ................
a03dd6010: 0000000000000000 0000000000000000 ................
a03dd6020: 0000000000000000 0000000000000000 ................
a03dd6030: 0000000000000000 0000000000000000 ................
a03dd6040: 0000000000000000 0000000000000000 ................
a03dd6050: 0000000000000000 0000000000000000 ................
a03dd6060: 0000000000000000 0000000000000000 ................
a03dd6070: 0000000000000000 0000000000000000 ................
a03dd6080: 0000000000000000 0000000000000000 ................
a03dd6090: 0000000000000000 0000000000000000 ................
a03dd60a0: 0000000000000000 0000000000000000 ................
a03dd60b0: 0000000000000000 0000000000000000 ................
a03dd60c0: 0000000000000000 0000000000000000 ................
a03dd60d0: 0000000000000000 0000000000000000 ................
a03dd60e0: 0000000000000000 0000000000000000 ................
a03dd60f0: 0000000000000000 0000000000000000 ................
crash-android>
crash-android> rd -p a7df3a000 -e a7df3a100
a7df3a000: 0000000000000000 0000000000000000 ................
a7df3a010: 0000000000000000 0000000000000000 ................
a7df3a020: 0000000000000000 0000000000000000 ................
a7df3a030: 0000000000000000 0000000000000000 ................
a7df3a040: 0000000000000000 0000000000000000 ................
a7df3a050: 0000000000000000 0000000000000000 ................
a7df3a060: 0000000000000000 0000000000000000 ................
a7df3a070: 0000000000000000 0000000000000000 ................
a7df3a080: 0000000000000000 0000000000000000 ................
a7df3a090: 0000000000000000 0000000000000000 ................
a7df3a0a0: 0000000000000000 0000000000000000 ................
a7df3a0b0: 0000000000000000 0000000000000000 ................
a7df3a0c0: 0000000000000000 0000000000000000 ................
a7df3a0d0: 0000000000000000 0000000000000000 ................
a7df3a0e0: 0000000000000000 0000000000000000 ................
a7df3a0f0: 0000000000000000 0000000000000000 ................
crash-android>
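So in the page cache recorded by the address_space of the file's inode, the pages at 0x23000 and 0x24000 are empty. Further checking shows that all 10 pages with OFFSET in [0x1D000, 0x27000) are zero, while the pages before and after them are correct. Only part of the code segment is affected, not all of it.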
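As a quick sanity check of the range, the half-open interval can be turned into page-cache indices. A minimal sketch, assuming 4 KiB pages:

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

start, end = 0x1D000, 0x27000  # half-open range of zeroed file offsets
zeroed = list(range(start // PAGE_SIZE, end // PAGE_SIZE))

print(len(zeroed), [hex(i) for i in zeroed])
# The two pages seen in the tombstones must be part of the zeroed range
crash_pages_covered = all(i in zeroed for i in (0x23, 0x24))
print(crash_pages_covered)
```

The indices run from 0x1d to 0x26, matching the INDEX column of the files -p listing.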
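Conclusion so far: the page cache contents were zeroed, so during this period any process on this device that loads libm.so and runs code in that range hits an illegal-instruction error.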
Memory traces
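I believe most engineers could get this far, but would stop here, unable to go any further. Zeroing 10 consecutive pages is a distinctive pattern, and one kernel function capable of doing that is zero_fill_bio.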
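The argument of that function is a bio structure, and a bio holds a bi_io_vec pointer; the bio_vec array it points to would contain the struct page addresses of the 10 pages we are looking for. So the memory signature to search for is the addresses of those 10 pages, and traces of them can indeed be found in memory.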
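The search itself is not shown in the original. One way to express the idea is to build the byte pattern a bio_vec array would have and look for it in a memory image. The sketch below assumes a 64-bit little-endian layout of `{ bv_page (8 bytes), bv_len (4 bytes), bv_offset (4 bytes) }`; `find_bvec_run` is a hypothetical helper and the buffer is synthetic.

```python
import struct


def find_bvec_run(mem: bytes, pages, bv_len: int = 0x1000, bv_offset: int = 0) -> int:
    """Return the offset in `mem` of a contiguous bio_vec array for `pages`, or -1."""
    needle = b"".join(struct.pack("<QII", p, bv_len, bv_offset) for p in pages)
    return mem.find(needle)


# struct page addresses of page-cache indices 0x1d and 0x1e, from the files -p listing
pages = [0xfffffffee4987ec0, 0xfffffffee8972740]

# Synthetic memory image: 0x20 bytes of padding, then the two entries, then padding
mem = bytes(0x20) + b"".join(struct.pack("<QII", p, 0x1000, 0) for p in pages) + bytes(0x10)
hit = find_bvec_run(mem, pages)
print(hex(hit))
```

On a real Ramdump the same pattern would be searched across kernel memory with whatever search facility the debugger provides.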
crash-android> rd ffffff8026f16000 -e ffffff8026f16100
ffffff8026f16000: fffffffee808d000 0000000000001000 ................ 0x1b
ffffff8026f16010: fffffffee90c2e00 0000000000001000 ................ 0x1c
ffffff8026f16020: fffffffee4987ec0 0000000000001000 .~.............. 0x1d
ffffff8026f16030: fffffffee8972740 0000000000001000 @'.............. 0x1e
ffffff8026f16040: fffffffee6bb7900 0000000000001000 .y.............. 0x1f
ffffff8026f16050: fffffffee7916f80 0000000000001000 .o.............. 0x20
ffffff8026f16060: fffffffee9435b40 0000000000001000 @[C............. 0x21
ffffff8026f16070: fffffffee7e03680 0000000000001000 .6.............. 0x22
ffffff8026f16080: fffffffee7f7ce80 0000000000001000 ................ 0x23
ffffff8026f16090: fffffffee60f7580 0000000000001000 .u.............. 0x24
ffffff8026f160a0: fffffffee7e22b40 0000000000001000 @+.............. 0x25
ffffff8026f160b0: fffffffee634d780 0000000000001000 ..4............. 0x26
ffffff8026f160c0: 0000000000000000 0000000000000000 ................
ffffff8026f160d0: 0000000000000000 0000000000000000 ................
ffffff8026f160e0: 0000000000000000 0000000000000000 ................
ffffff8026f160f0: 0000000000000000 0000000000000000 ................
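The memory search turns up a related trace that looks like the bio_vec array of a bio operating on exactly these consecutive pages (the trailing annotations are the page-cache indices, 0x1b through 0x26, which cover the 10 zeroed pages). The next step is to find the bio itself: its signature is memory that holds the vec pointer, so searching for the address ffffff8026f16000 yields the following trace.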
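To tie the page pointers in this array back to the physical addresses in the files -p listing, the PAGE and PHYSICAL columns of this dump follow a linear relation. A minimal sketch; the base address and the 64-byte struct page size are inferred from those two columns in this particular dump and are not general constants:

```python
PAGE_SHIFT = 12                     # assumption: 4 KiB pages
STRUCT_PAGE_SIZE = 0x40             # inferred from this dump
VMEMMAP_BASE = 0xfffffffebe000000   # inferred from this dump


def page_to_phys(page_addr: int) -> int:
    """Convert a struct page address to the physical address of its page frame."""
    pfn = (page_addr - VMEMMAP_BASE) // STRUCT_PAGE_SIZE
    return pfn << PAGE_SHIFT


# Entries annotated 0x23 and 0x24 in the bio_vec array
phys_23 = page_to_phys(0xfffffffee7f7ce80)
phys_24 = page_to_phys(0xfffffffee60f7580)
print(hex(phys_23), hex(phys_24))
```

The two results are the physical addresses that were read with rd -p and found to be all zeros.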
crash-android> rd ffffff8026f16f78 -e ffffff8026f17178
ffffff8026f16f78: 0000000000000000 0000000000000000 ................
ffffff8026f16f88: 0000000000000000 0000000000000000 ................
ffffff8026f16f98: 0000000000000000 0000000000000000 ................
ffffff8026f16fa8: 0000000000000000 0000000000000000 ................
ffffff8026f16fb8: 0000000000000000 0000000000000000 ................
ffffff8026f16fc8: 0000000000000000 0000000000000000 ................
ffffff8026f16fd8: 0000000000000000 0000000000000000 ................
ffffff8026f16fe8: 0000000000000000 0000000000000000 ................
ffffff8026f16ff8: 0000000000000000 321a84c0c34b3bb3 .........;K....2
ffffff8026f17008: 0000000000000000 0000000000000000 ................
ffffff8026f17018: 0000000100000000 0000000000004c40 ........@L......
ffffff8026f17028: 000000020000a000 ffffffff00000000 ................
ffffff8026f17038: 0000000000000000 0000000000000000 ................
ffffff8026f17048: 0000000000000000 0000000000000000 ................
ffffff8026f17058: 0000000000000000 0000000000000000 ................
ffffff8026f17068: 00000100000c0000 0000000000000001 ................
ffffff8026f17078: ffffff8026f16000 0000000000000000 .`.&............
ffffff8026f17088: 0000000000000000 0000000000000000 ................
ffffff8026f17098: 0000000000000000 ffffff8827f16500 .........e.'....
ffffff8026f170a8: 0000000000988000 ffffffebecca92a4 ................
ffffff8026f170b8: 0000000000000000 0000400400000000 .............@..
ffffff8026f170c8: 0000000000000000 ffffff801e149000 ................
ffffff8026f170d8: 0000000000000000 0000000000000000 ................
ffffff8026f170e8: 0000000000000000 0000000000000000 ................
ffffff8026f170f8: 0000000000000000 0000000000000000 ................
ffffff8026f17108: 0000000000000000 0000000000000000 ................
ffffff8026f17118: 0000000000000000 0000000000000000 ................
ffffff8026f17128: 0000000000000000 0000000000000000 ................
ffffff8026f17138: 0000000000000000 0000000000000000 ................
ffffff8026f17148: 0000000000000000 0000000000000000 ................
ffffff8026f17158: 0000000000000000 0000000000000000 ................
ffffff8026f17168: 0000000000000000 0000000000000000 ................
crash-android> struct bio ffffff8026f17000 -x
struct bio {
bi_next = 0x321a84c0c34b3bb3,
bi_bdev = 0x0,
bi_opf = 0x0,
bi_flags = 0x0,
bi_ioprio = 0x0,
bi_write_hint = WRITE_LIFE_NOT_SET,
bi_status = 0x0,
__bi_remaining = {
counter = 0x1
},
bi_iter = {
bi_sector = 0x4c40,
bi_size = 0xa000,
bi_idx = 0x2,
bi_bvec_done = 0x0
},
{
bi_cookie = 0xffffffff,
__bi_nr_segments = 0xffffffff
},
bi_end_io = 0x0,
bi_private = 0x0,
bi_blkg = 0x0,
bi_issue = {
value = 0x0
},
bi_iocost_cost = 0x0,
bi_crypt_context = 0x0,
bi_skip_dm_default_key = 0x0,
bi_vcnt = 0xc,
bi_max_vecs = 0x100,
__bi_cnt = {
counter = 0x1
},
bi_io_vec = 0xffffff8026f16000,
bi_pool = 0x0,
android_oem_data1 = 0x0,
__kabi_reserved1 = 0x0,
__kabi_reserved2 = 0x0,
bi_inline_vecs = 0xffffff8026f170a0
}
Because the file lives on an erofs file system, this residual data was most likely left behind by an erofs_fileio_rq.
crash-android> struct kiocb ffffff8026f170a0
struct kiocb {
ki_filp = 0xffffff8827f16500,
ki_pos = 9994240,
ki_complete = 0xffffffebecca92a4 <erofs_fileio_ki_complete>,
private = 0x0,
ki_flags = 0,
ki_ioprio = 16388,
{
ki_waitq = 0x0,
dio_complete = 0x0
}
}
crash-android> lp file 0xffffff8827f16500
/system/apex/com.android.runtime.apex
crash-android> struct erofs_fileio_rq ffffff8026f16000 -x
struct erofs_fileio_rq {
bvecs = {{
bv_page = 0xfffffffee808d000,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee90c2e00,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee4987ec0,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee8972740,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee6bb7900,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee7916f80,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee9435b40,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee7e03680,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee7f7ce80,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee60f7580,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee7e22b40,
bv_len = 0x1000,
bv_offset = 0x0
}, {
bv_page = 0xfffffffee634d780,
bv_len = 0x1000,
bv_offset = 0x0
}, {
...
bio = {
bi_next = 0x321a84c0c34b3bb3,
bi_bdev = 0x0,
bi_opf = 0x0,
bi_flags = 0x0,
bi_ioprio = 0x0,
bi_write_hint = WRITE_LIFE_NOT_SET,
bi_status = 0x0,
__bi_remaining = {
counter = 0x1
},
bi_iter = {
bi_sector = 0x4c40,
bi_size = 0xa000,
bi_idx = 0x2,
bi_bvec_done = 0x0
},
{
bi_cookie = 0xffffffff,
__bi_nr_segments = 0xffffffff
},
bi_end_io = 0x0,
bi_private = 0x0,
bi_blkg = 0x0,
bi_issue = {
value = 0x0
},
bi_iocost_cost = 0x0,
bi_crypt_context = 0x0,
bi_skip_dm_default_key = 0x0,
bi_vcnt = 0xc,
bi_max_vecs = 0x100,
__bi_cnt = {
counter = 0x1
},
bi_io_vec = 0xffffff8026f16000,
bi_pool = 0x0,
android_oem_data1 = 0x0,
__kabi_reserved1 = 0x0,
__kabi_reserved2 = 0x0,
bi_inline_vecs = 0xffffff8026f170a0
},
iocb = {
ki_filp = 0xffffff8827f16500,
ki_pos = 0x988000,
ki_complete = 0xffffffebecca92a4 <erofs_fileio_ki_complete>,
private = 0x0,
ki_flags = 0x0,
ki_ioprio = 0x4004,
{
ki_waitq = 0x0,
dio_complete = 0x0
}
},
sb = 0xffffff801e149000
}
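Summary
- On one access to libm.so, the lookup in the inode's address_space cache missed, so the page data was read again from the erofs file system (erofs_fileio_read_folio or erofs_fileio_readahead). An anomaly on this path likely caused 10 consecutive pages (0x1d~0x26) to be zeroed instead of filled with the original data. The IO still completed, the pages were marked uptodate, and they were added to the inode's address_space.
- composer-service was the first to hit ILL_ILLOPC. From then on, because those pages remained in the inode's address_space cache, any program calling the libm.so functions in that range took a page fault, entered the file-backed fault handler (filemap_fault), hit the address_space cache, and received the already corrupted pages (uptodate means the software treats them as clean and valid). Programs therefore kept failing with ILL_ILLOPC.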
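The numbers in the dump can be cross-checked against each other. The sketch below only restates values already shown (bi_size, bi_idx, bi_vcnt, bi_sector, ki_pos, the page range 0x1d~0x26, and the backtrace pc offset 0x23cf4). It assumes that bi_size is the byte count still outstanding, that the page range is counted from the start of libm.so, and that the backtrace offset maps directly to a file offset. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the crash dump; nothing here is measured independently.
constexpr std::uint64_t kPageSize      = 0x1000;    // bv_len of each bvec
constexpr std::uint64_t kBiSize        = 0xa000;    // bio.bi_iter.bi_size
constexpr std::uint64_t kBiIdx         = 0x2;       // bio.bi_iter.bi_idx
constexpr std::uint64_t kBiVcnt        = 0xc;       // bio.bi_vcnt
constexpr std::uint64_t kBiSector      = 0x4c40;    // bio.bi_iter.bi_sector
constexpr std::uint64_t kKiPos         = 0x988000;  // iocb.ki_pos
constexpr std::uint64_t kFirstZeroPage = 0x1d;      // first zeroed page (summary)
constexpr std::uint64_t kLastZeroPage  = 0x26;      // last zeroed page (summary)
constexpr std::uint64_t kPcOffset      = 0x23cf4;   // backtrace frame #00 in libm.so

// Number of whole pages covered by a byte count.
constexpr std::uint64_t pages_in(std::uint64_t bytes) { return bytes / kPageSize; }

// Page index that contains a given byte offset.
constexpr std::uint64_t page_index(std::uint64_t offset) { return offset / kPageSize; }

// Hypothetical helper: asserts that the dump values agree with one another.
void verify_dump_arithmetic() {
    assert(pages_in(kBiSize) == 10);                   // 0xa000 bytes = 10 pages
    assert(kBiVcnt - kBiIdx == 10);                    // 12 bvecs, 2 already consumed
    assert(kLastZeroPage - kFirstZeroPage + 1 == 10);  // 0x1d..0x26 is 10 pages
    assert((kBiSector << 9) == kKiPos);                // 512-byte sectors to bytes
    assert(page_index(kPcOffset) >= kFirstZeroPage &&
           page_index(kPcOffset) <= kLastZeroPage);    // pc lies in a zeroed page
}
```

Under those assumptions all three views give the same count of 10 pages, and the faulting pc falls on page 0x23, inside the zeroed range.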
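File system analysis
Trace the code flow from read_folio to erofs_fileio_ki_complete to see how execution reaches zero_fill_bio.
When vfs_iocb_iter_read enters filemap_get_pages, the read can be interrupted by a signal. It then returns the number of bytes read so far, with no error code. erofs_fileio_ki_complete therefore sees a size that differs from the expected size and calls bio_advance and zero_fill_bio, which zero the contents of the remaining pages.
folio_end_read then marks the pages uptodate, which poisons the page cache.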
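The effect of that path can be illustrated with a small userspace model. This is a simplified sketch, not the kernel code: ModelPage, simulate_read, and count_zeroed_uptodate are hypothetical names, the "file data" is a constant byte, and the only behaviour modelled is the one described above (a short read without an error code leads to a zero-filled remainder that is still marked uptodate).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kPageSize = 4096;

// Hypothetical stand-in for a page-cache page; not a kernel type.
struct ModelPage {
    std::vector<std::uint8_t> data;
    bool uptodate = false;
    ModelPage() : data(kPageSize, 0xEE) {}  // 0xEE marks "not filled yet"
};

// Models one read request of num_pages pages where only bytes_read bytes
// arrived before the read was interrupted.
std::vector<ModelPage> simulate_read(std::size_t num_pages, std::size_t bytes_read) {
    assert(num_pages > 0);
    std::vector<ModelPage> pages(num_pages);
    const std::size_t expected = num_pages * kPageSize;
    bytes_read = std::min(bytes_read, expected);

    // The bytes that were actually read (0x7F stands in for real file content).
    for (std::size_t off = 0; off < bytes_read; ++off)
        pages[off / kPageSize].data[off % kPageSize] = 0x7F;

    // Short read with no error: skip what was read and zero the rest,
    // mirroring the bio_advance + zero_fill_bio step described in the text.
    if (bytes_read > 0 && bytes_read != expected) {
        for (std::size_t off = bytes_read; off < expected; ++off)
            pages[off / kPageSize].data[off % kPageSize] = 0x00;
    }

    // The request counts as successful, so every page is marked uptodate,
    // including the zero-filled ones.
    const bool success = bytes_read > 0;
    for (ModelPage& p : pages) p.uptodate = success;
    return pages;
}

// Counts pages that are entirely zero yet flagged uptodate.
std::size_t count_zeroed_uptodate(const std::vector<ModelPage>& pages) {
    std::size_t n = 0;
    for (const ModelPage& p : pages) {
        const bool all_zero = std::all_of(p.data.begin(), p.data.end(),
                                          [](std::uint8_t b) { return b == 0; });
        if (p.uptodate && all_zero) ++n;
    }
    return n;
}
```

Calling simulate_read(12, 2 * kPageSize) mirrors the shape of the dump (12 bvecs, 2 consumed) and leaves 10 zeroed pages marked uptodate, while a full read of 12 * kPageSize bytes leaves none.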
Reproduction test
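#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#include <chrono>
#include <csignal>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <thread>

// MD5 is a helper class with update()/digest() whose definition is not shown
// here; include its header before this point.

// Assumed implementation (the original does not show it): drop the page cache
// so the next read has to go back to the file system. Requires root.
static void drop_caches() {
    sync();
    std::ofstream f("/proc/sys/vm/drop_caches");
    f << "3\n";
}

// Returns the MD5 digest of the file, or "" if it cannot be opened.
std::string md5file(const std::string& path) {
    std::ifstream f(path, std::ios::binary);
    if (!f) return "";
    MD5 md5;
    char buf[4096];
    while (f.read(buf, sizeof(buf)) || f.gcount())
        md5.update(reinterpret_cast<const uint8_t*>(buf), f.gcount());
    return md5.digest();
}

// Runs in the child: start reading the file with a cold cache and SIGKILL
// the process shortly afterwards so the read is interrupted by a signal.
int erofs_bug_zero_fill_bio(const char* filename) {
    struct stat sb;
    if (stat(filename, &sb) == -1)
        return 0;
    drop_caches();
    // Delay grows linearly with file size and is capped at 1000 us from 256 KiB.
    // Computed up front and captured by value so the detached thread never
    // touches this stack frame.
    const auto delay = std::chrono::microseconds(
        sb.st_size >= (256 * 1024) ? 1000 : static_cast<int>(0.0038146F * sb.st_size));
    std::thread([delay]() {
        std::this_thread::sleep_for(delay);
        syscall(SYS_kill, getpid(), SIGKILL);
    }).detach();
    std::string md5 = md5file(filename);
    std::cout << md5 << std::endl;
    return 0;
}

int main(int argc, char* argv[]) {
    if (argc < 2) return 1;
    drop_caches();
    // Reference digest, taken from a cold cache before any interrupted read.
    std::string md5 = md5file(argv[1]);
    while (1) {
        pid_t pid = fork();
        int status;
        if (pid == 0) {
            erofs_bug_zero_fill_bio(argv[1]);
            exit(0);
        } else if (pid > 0) {
            waitpid(pid, &status, 0);
            // Re-read through the (possibly poisoned) page cache and compare.
            std::string current_md5 = md5file(argv[1]);
            if (md5 != current_md5) {
                std::cout << "erofs bug zero_fill_bio!!\n"
                          << argv[1] << " md5sum miss match!!\n"
                          << md5 << " != " << current_md5 << std::endl;
                break;
            }
        } else {
            perror("fork");  // stop instead of spinning if fork keeps failing
            return 1;
        }
    }
    return 0;
}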
# ./data/erofs-detect /apex/com.android.runtime/lib64/bionic/libm.so
erofs bug zero_fill_bio!!
/apex/com.android.runtime/lib64/bionic/libm.so md5sum miss match!!
65f089be0c9b8cb2d4d7b9bfff44c50e != f53a942a9508d1a5ba5d3ba703ba71df
# md5sum /apex/com.android.runtime/lib64/bionic/libm.so
f53a942a9508d1a5ba5d3ba703ba71df /apex/com.android.runtime/lib64/bionic/libm.so
# echo 3 > /proc/sys/vm/drop_caches
# md5sum /apex/com.android.runtime/lib64/bionic/libm.so
65f089be0c9b8cb2d4d7b9bfff44c50e /apex/com.android.runtime/lib64/bionic/libm.so