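Crash Analysis Report for the 拓詞 App (com.towords)
【Problem Description】
The popular third-party app 拓詞 stops running as soon as it is opened, regardless of the system version or the version of 拓詞.
When the problem occurs, the system does not generate a tombstone file; only main.log contains the following information: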
pid: 17241, tid: 17276, name: Thread-413 >>> com.towords <<< signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0000001c
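【Analysis Steps】
Every time 拓詞 crashes, the debuggerd process crashes along with it, which is why no call stack is generated.
So the first step is to find out why debuggerd dies. Start with the core dumped when debuggerd crashed: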
(gdb) bt
#0 load_symbol_table (filename=filename@entry=0x411ae05c "/data/data/com.towords/files/libprotectClass.so") at system/core/libcorkscrew/symbol_table.c:94
#1 0x401039fe in load_ptrace_map_info_data (mi=0x411ae048, pid=<optimized out>) at system/core/libcorkscrew/ptrace.c:96
#2 load_ptrace_context ([email protected]
Looking at the source (abridged):
@system/core/libcorkscrew/symbol_table.c
symbol_table_t* load_symbol_table(const char *filename) {
    symbol_table_t* table = NULL;
    int fd = open(filename, O_RDONLY); // opens /data/data/com.towords/files/libprotectClass.so
    struct stat sb;
    size_t length = sb.st_size;
    char* base = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0); // map the file into memory
    Elf32_Ehdr *hdr = (Elf32_Ehdr*)base;
    Elf32_Shdr *shdr = (Elf32_Shdr*)(base + hdr->e_shoff); // locate the section header table via its offset
    int sym_idx = -1;
    int dynsym_idx = -1;
    for (Elf32_Half i = 0; i < hdr->e_shnum; i++) {
        if (shdr[i].sh_type == SHT_SYMTAB) { // <<<< look for the symbol table
            sym_idx = i;
        }
        ...
While walking the section headers of libprotectClass.so to find its symbol table, debuggerd runs the index i out of bounds.
(gdb) disassemble
Dump of assembler code for function load_symbol_table:
0x4012cabc <+0>: stmdb sp!, {r4, r5, r6, r7, r8, r9, r10, r11, lr}
0x4012cac0 <+4>: movs r1, #0
0x4012cac2 <+6>: sub sp, #148 ; 0x94
...
0x4012cb14 <+88>: mla r4, r1, r0, r7
=> 0x4012cb18 <+92>: ldr r3, [r4, #4]
The crash happens on the load from address r4+4. Check the register values:
(gdb) info reg
r0 0x1d 29
r1 0x28 40
r2 0x0 0
r3 0x0 0
r4 0x4016b00c 1075228684
r5 0x4013b000 1075032064
r6 0x1 1
r7 0x4016ab84 1075227524
r8 0xffffffff 4294967295
r9 0x1 1
r10 0x1 1
r11 0x4005b6c0 1074116288
r12 0x66 102
sp 0xbebe4f88 0xbebe4f88
lr 0x40096cef 1074359535
pc 0x40103b18 0x40103b18 <load_symbol_table+92>
cpsr 0x80010030 -2147418064
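r4 (0x4016b00c) sits just past a page boundary (0x4016b000), so the load is very likely an out-of-bounds access, which suggests that the ELF header has been tampered with.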
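The address arithmetic can be replayed from the register dump. The following is only an illustrative sketch: the 4 KiB page size is an assumption (it is not stated in the report), and the helper name `mla` is just a stand-in for the instruction at <+88>.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

def mla(rn, rm, ra):
    """Emulate ARM `mla rd, rn, rm, ra`: rd = rn * rm + ra, truncated to 32 bits."""
    return (rn * rm + ra) & 0xFFFFFFFF

# Register values taken from the gdb register dump.
r0 = 0x1D        # loop index i = 29
r1 = 0x28        # sizeof(Elf32_Shdr) = 40
r7 = 0x4016AB84  # shdr: start of the section header table in memory

r4 = mla(r1, r0, r7)                 # &shdr[i]
fault_addr = r4 + 4                  # `ldr r3, [r4, #4]` reads shdr[i].sh_type
page_start = r4 & ~(PAGE_SIZE - 1)   # page that contains &shdr[i]

print(hex(r4), hex(fault_addr), hex(page_start))
```

The computed r4 matches the dumped value, and shdr[29] is the first entry that lies entirely in the page after the one where the table starts.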
Use readelf to inspect the header of libprotectClass.so:
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          195460 (bytes into file)
  Flags:                             0x5000000, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         7
  Size of section headers:           108 (bytes)
  Number of section headers:         102
The last two values, the section header size and count, are abnormal, and the "Section header string table index" line is missing.
For comparison, a normal ELF header looks like this:
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0xd4c
  Start of program headers:          52 (bytes into file)
  Start of section headers:          8568 (bytes into file)
  Flags:                             0x5000000, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         8
  Size of section headers:           40 (bytes)
  Number of section headers:         25
  Section header string table index: 24
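Evidently the ELF header was corrupted on purpose to keep the file from being reverse engineered, so that many disassembly tools cannot parse this ELF file correctly.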
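The kind of sanity check that would have caught this header can be sketched in a few lines of Python. This is a hedged illustration, not what debuggerd or readelf actually do: the function names `parse_ehdr` and `section_table_problems` are made up for the example, the 52 header bytes are the ones shown in the "before" hex dump of this report, and 0x2ff1c is the file size quoted in the text.

```python
import struct

# Elf32_Ehdr layout: 16-byte e_ident followed by the fixed-size fields (52 bytes total).
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
ELF32_SHDR_SIZE = 40  # sizeof(Elf32_Shdr)

FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
          "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
          "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")

def parse_ehdr(data):
    """Unpack the first 52 bytes of a little-endian ELF32 file into a dict."""
    return dict(zip(FIELDS, ELF32_EHDR.unpack_from(data, 0)))

def section_table_problems(hdr, file_size):
    """Return a list of human-readable inconsistencies in the section header fields."""
    problems = []
    if hdr["e_shentsize"] != ELF32_SHDR_SIZE:
        problems.append("e_shentsize is %d, expected %d"
                        % (hdr["e_shentsize"], ELF32_SHDR_SIZE))
    table_end = hdr["e_shoff"] + hdr["e_shnum"] * ELF32_SHDR_SIZE
    if table_end > file_size:
        problems.append("section header table ends at 0x%x, past file size 0x%x"
                        % (table_end, file_size))
    if hdr["e_shnum"] and hdr["e_shstrndx"] >= hdr["e_shnum"]:
        problems.append("e_shstrndx %d is not below e_shnum %d"
                        % (hdr["e_shstrndx"], hdr["e_shnum"]))
    return problems

# First 52 bytes of the tampered libprotectClass.so.
tampered = bytes.fromhex(
    "7F454C46010101000000000000000000"
    "03002800010000000000000034000000"
    "84FB0200000000053400200007006C00"
    "66007800")

hdr = parse_ehdr(tampered)
problems = section_table_problems(hdr, 0x2FF1C)
for p in problems:
    print(p)
```

A loader that ran a check like this before indexing shdr[i] would reject the file instead of reading past the end of the mapping.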
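To get the debug information printed correctly, the header has to be restored to the correct values.
The real number of section headers can be computed as follows:
take the size of the .so file (0x2ff1c), subtract the "Start of section headers" value (0x2fb84), and divide by the correct section header size, 0x28 (40).
(gdb) p /x (0x2ff1c-0x2fb84)/0x28
$19 = 0x17
The section header string table index is usually the number of section headers minus one, which here is 0x16.
Use a binary editor to change the corresponding bytes in libprotectClass.so.
Before: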
7F 45 4C 46 01 01 01 00 00 00 00 00 00 00 00 00
03 00 28 00 01 00 00 00 00 00 00 00 34 00 00 00
84 FB 02 00 00 00 00 05 34 00 20 00 07 00 6C 00
66 00 78 00 06 00 00 00 34 00 00 00 34 00 00 00
After:
7F 45 4C 46 01 01 01 00 00 00 00 00 00 00 00 00
03 00 28 00 01 00 00 00 00 00 00 00 34 00 00 00
84 FB 02 00 00 00 00 05 34 00 20 00 07 00 28 00
17 00 16 00 06 00 00 00 34 00 00 00 34 00 00 00
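The same repair can be scripted instead of done by hand in a binary editor. The sketch below is illustrative and rests on the two assumptions the text itself makes: the section header table runs to the end of the file, and the string table index is the section count minus one. The function name `repair_header` is hypothetical.

```python
import struct

FILE_SIZE = 0x2FF1C       # size of libprotectClass.so, as quoted in the text
E_SHOFF_OFFSET = 32       # byte offset of e_shoff inside Elf32_Ehdr
E_SHENTSIZE_OFFSET = 46   # e_shentsize, e_shnum, e_shstrndx follow each other from here
SHDR_SIZE = 40            # sizeof(Elf32_Shdr)

def repair_header(header, file_size):
    """Return a copy of the header bytes with the three section fields recomputed."""
    fixed = bytearray(header)
    (e_shoff,) = struct.unpack_from("<I", fixed, E_SHOFF_OFFSET)
    # Assumption: the section header table is the last thing in the file.
    e_shnum = (file_size - e_shoff) // SHDR_SIZE
    # Assumption: the section name string table is the last section.
    e_shstrndx = e_shnum - 1
    struct.pack_into("<HHH", fixed, E_SHENTSIZE_OFFSET, SHDR_SIZE, e_shnum, e_shstrndx)
    return bytes(fixed)

# The 64 bytes shown in the "before" hex dump.
before = bytes.fromhex(
    "7F454C46010101000000000000000000"
    "03002800010000000000000034000000"
    "84FB0200000000053400200007006C00"
    "66007800060000003400000034000000")

after = repair_header(before, FILE_SIZE)
print(after[44:52].hex())
```

Only the six bytes at offsets 46 to 51 change; everything else, including the program header that starts at offset 52, is left untouched.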
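After pushing the patched file to the phone and rebooting to reproduce the problem, debuggerd still crashes, with exactly the same call stack.
The likely explanation is that the app rewrites this .so itself at startup. So remove write permission from the library with chmod 555 libprotectClass.so.
After rebooting and reproducing again, debuggerd no longer crashes, and the debug information is generated: a call stack involving libprotectClass.so, a coredump file, the maps file, and so on.
At the same time, the following warnings appear in main.log: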
08-06 21:35:04.303 5299 5299 W System.err: java.io.FileNotFoundException: /data/data/com.towords/files/libprotectClass.so: open failed: EACCES (Permission denied)
08-06 21:35:04.305 5299 5299 W System.err: at libcore.io.IoBridge.open(IoBridge.java:409)
08-06 21:35:04.305 5299 5299 W System.err: at java.io.FileOutputStream.<init>(FileOutputStream.java:88)
08-06 21:35:04.305 5299 5299 W System.err: at java.io.FileOutputStream.<init>(FileOutputStream.java:128)
08-06 21:35:04.306 5299 5299 W System.err: at java.io.FileOutputStream.<init>(FileOutputStream.java:117)
08-06 21:35:04.306 5299 5299 W System.err: at com.qihoo.util.StubApplication.copy(StubApplication.java:217)
08-06 21:35:04.306 5299 5299 W System.err: at com.qihoo.util.StubApplication.attachBaseContext(StubApplication.java:147)
08-06 21:35:04.306 5299 5299 W System.err: at android.app.Application.attach(Application.java:185)
08-06 21:35:04.306 5299 5299 W System.err: at android.app.Instrumentation.newApplication(Instrumentation.java:991)
08-06 21:35:04.306 5299 5299 W System.err: at android.app.Instrumentation.newApplication(Instrumentation.java:975)
08-06 21:35:04.306 5299 5299 W System.err: at android.app.LoadedApk.makeApplication(LoadedApk.java:504)
08-06 21:35:04.306 5299 5299 W System.err: at android.app.ActivityThread.handleBindApplication(ActivityThread.java:4314)
08-06 21:35:04.306 5299 5299 W System.err: at android.app.ActivityThread.access$1500(ActivityThread.java:138)
08-06 21:35:04.306 5299 5299 W System.err: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1261)
08-06 21:35:04.306 5299 5299 W System.err: at android.os.Handler.dispatchMessage(Handler.java:102)
08-06 21:35:04.307 5299 5299 W System.err: at android.os.Looper.loop(Looper.java:136)
08-06 21:35:04.307 5299 5299 W System.err: at android.app.ActivityThread.main(ActivityThread.java:5016)
08-06 21:35:04.307 5299 5299 W System.err: at java.lang.reflect.Method.invokeNative(Native Method)
08-06 21:35:04.307 5299 5299 W System.err: at java.lang.reflect.Method.invoke(Method.java:515)
08-06 21:35:04.307 5299 5299 W System.err: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:792)
08-06 21:35:04.307 5299 5299 W System.err: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:608)
08-06 21:35:04.307 5299 5299 W System.err: at dalvik.system.NativeStart.main(Native Method)
08-06 21:35:04.307 5299 5299 W System.err: Caused by: libcore.io.ErrnoException: open failed: EACCES (Permission denied)
08-06 21:35:04.308 5299 5299 W System.err: at libcore.io.Posix.open(Native Method)
08-06 21:35:04.308 5299 5299 W System.err: at libcore.io.BlockGuardOs.open(BlockGuardOs.java:110)
08-06 21:35:04.308 5299 5299 W System.err: at libcore.io.IoBridge.open(IoBridge.java:393)
08-06 21:35:04.309 5299 5299 W System.err: ... 20 more
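Clearly the app does rewrite libprotectClass.so at startup. Since the failure is only logged at W (warning) level, denying the write does not stop the app from running.
The com.qihoo.util.StubApplication frames suggest that 拓詞 uses a protection framework from Qihoo.
Back to the main question: why does 拓詞 itself crash?
With the app's coredump, maps, and tombstone available, the app can now be analyzed in full.
The tombstone shows the following: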
pid: 5299, tid: 5397, name: Thread-333 >>> com.towords <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0000001c
    r0 753d0628  r1 00000000  r2 42da5e60  r3 00000000
    r4 42da5e60  r5 42da5e60  r6 00000000  r7 7598f7d8
    r8 7598fb10  r9 7539ff0c  sl 00000001  fp 7598fb24
    ip 1d300001  sp 7598f748  lr 415479e7  pc 4155ac2e  cpsr 600b0030
backtrace:
    #00 pc 0005fc2e /system/lib/libdvm.so (dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list)+9)
    #01 pc 0004c9e3 /system/lib/libdvm.so
    #02 pc 0000ebbb <unknown>
Analyze the core with gdb:
(gdb) disassemble
Dump of assembler code for function dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list):
   0x4155ac24 <+0>: stmdb sp!, {r4, r5, r6, r7, r8, r9, r10, r11, lr}
   0x4155ac28 <+4>: mov r10, r3
   0x4155ac2a <+6>: sub sp, #28
   0x4155ac2c <+8>: movs r3, #0
=> 0x4155ac2e <+10>: ldr r5, [r1, #28]
   0x4155ac30 <+12>: mov r6, r0
r1 is NULL, so the load from [r1, #28] faults at address 0x1c, which matches the reported fault addr. r1 holds the Method* argument, which is passed down from the calling function.
Find the caller's return address on the stack, starting from sp:
0x7598f748: 0x42da5e60 0x415b5bd8 0x00000014 0x415245cc
0x7598f758: 0x42da5e60 0x753d0628 0x415a6c6c 0x42da5e60
                                             r4
0x7598f768: 0x42da5e60 0x00000000 0x7598f7d8 0x7598fb10
            r5         r6         r7         r8
0x7598f778: 0x7539ff0c 0x753d0638 0x7598fb24 0x415479e7
            r9         r10        r11        lr
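The register labels under the stack dump follow mechanically from the prologue of dvmCallMethodV (`stmdb sp!, {r4-r11, lr}` followed by `sub sp, #28`). As a hedged sketch of that bookkeeping, with a hypothetical helper named `label_frame`:

```python
def label_frame(sp, words, local_bytes, saved_regs):
    """Map each saved register to (stack address, saved value).

    sp          -- stack pointer after the prologue has run
    words       -- 32-bit words read from memory starting at sp
    local_bytes -- bytes reserved by `sub sp, #N` below the saved registers
    saved_regs  -- registers pushed by the prologue, lowest address first
    """
    first_index = local_bytes // 4
    return {
        reg: (sp + local_bytes + 4 * i, words[first_index + i])
        for i, reg in enumerate(saved_regs)
    }

sp = 0x7598F748  # sp from the tombstone
words = [
    0x42DA5E60, 0x415B5BD8, 0x00000014, 0x415245CC,
    0x42DA5E60, 0x753D0628, 0x415A6C6C, 0x42DA5E60,
    0x42DA5E60, 0x00000000, 0x7598F7D8, 0x7598FB10,
    0x7539FF0C, 0x753D0638, 0x7598FB24, 0x415479E7,
]
saved = ["r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "lr"]

frame = label_frame(sp, words, 28, saved)
caller_sp = sp + 28 + 4 * len(saved)  # sp value in the caller at the time of the call

for reg in saved:
    addr, value = frame[reg]
    print("%-3s @ 0x%08x = 0x%08x" % (reg, addr, value))
print("caller sp = 0x%08x" % caller_sp)
```

The saved lr (0x415479e7) is the return address into the caller, and the caller's sp is where the next stack dump in this analysis begins.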
The lr value places the return address, and therefore the calling function, around 0x415479e7:
(gdb) disassemble 0x415479e6
Dump of assembler code for function NewObjectV(JNIEnv*, jclass, jmethodID, va_list):
   0x415479a0 <+0>: push {r4, r5, r6, r7, lr}
   0x415479a2 <+2>: mov r5, r0
   0x415479a4 <+4>: sub sp, #28
   0x415479a6 <+6>: mov r4, r1
   0x415479a8 <+8>: add r0, sp, #12
   0x415479aa <+10>: mov r1, r5
   0x415479ac <+12>: mov r6, r2 ; jmethodID
   0x415479ae <+14>: mov r7, r3
   0x415479b0 <+16>: bl 0x41543c88 <ScopedJniThreadState::ScopedJniThreadState(_JNIEnv*)>
   0x415479b4 <+20>: mov r1, r4
   0x415479b6 <+22>: ldr r0, [sp, #12]
   0x415479b8 <+24>: bl 0x41544d00 <dvmDecodeIndirectRef(Thread*, _jobject*)>
   0x415479bc <+28>: mov r4, r0
   0x415479be <+30>: bl 0x41543974 <canAllocClass(ClassObject*)>
   0x415479c2 <+34>: cbz r0, 0x41547a02 <NewObjectV(JNIEnv*, jclass, jmethodID, va_list)+98>
   0x415479c4 <+36>: ldr r3, [r4, #44] ; 0x2c
   0x415479c6 <+38>: cmp r3, #7
   0x415479c8 <+40>: beq.n 0x415479e8 <NewObjectV(JNIEnv*, jclass, jmethodID, va_list)+72>
   0x415479ca <+42>: mov r0, r4
   0x415479cc <+44>: bl 0x41566010 <dvmInitClass(ClassObject*)>
   0x415479d0 <+48>: cbnz r0, 0x415479e8 <NewObjectV(JNIEnv*, jclass, jmethodID, va_list)+72>
   0x415479d2 <+50>: b.n 0x41547a02 <NewObjectV(JNIEnv*, jclass, jmethodID, va_list)+98>
   0x415479d4 <+52>: add r3, sp, #16
   0x415479d6 <+54>: ldr r0, [sp, #12]
   0x415479d8 <+56>: mov r1, r6 ; jmethodID
   0x415479da <+58>: mov r2, r4 ; Object*
   0x415479dc <+60>: stmia.w sp, {r3, r7}
   0x415479e0 <+64>: movs r3, #1
   0x415479e2 <+66>: bl 0x4155ac24 <dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list)>
=> 0x415479e6 <+70>: b.n 0x41547a04 <NewObjectV(JNIEnv*, jclass, jmethodID, va_list)+100>
   0x415479e8 <+72>: mov r0, r4
The Method* here is the jmethodID, which again is passed down from the caller, so keep unwinding the stack to the next function up:
0x7598f788: 0x7598f798 0x7598f7d8 0x00000000 0x753d0628
0x7598f798: 0x4185ceb0 0x415477c5 0x753d0cc8 0x415479a1
                                             r4
0x7598f7a8: 0x753d0cc8 0x754034bc 0x753ff80a 0x753f0bbd
            r5         r6         r7         lr
lr is 0x753f0bbd; look at the code around it:
    0x753f0b84: push {r3}
    0x753f0b86: push {r0, r1, r4, r5, r6, r7, lr}
    0x753f0b88: ldr r3, [r0, #0]
    0x753f0b8a: adds r5, r0, #0
    0x753f0b8c: adds r7, r2, #0
    0x753f0b8e: ldr r3, [r3, #24]
    0x753f0b90: blx r3
    0x753f0b92: ldr r6, [pc, #52]
    0x753f0b94: adds r1, r0, #0
    0x753f0b96: add r6, pc
    0x753f0b98: str r0, [r6, #0]
    0x753f0b9a: cmp r0, #0
    0x753f0b9c: beq.n 0x753f0bbe
    0x753f0b9e: ldr r3, [r5, #0]
    0x753f0ba0: adds r2, r7, #0
    0x753f0ba2: adds r0, r5, #0
    0x753f0ba4: adds r3, #8
    0x753f0ba6: ldr r4, [r3, #124] ; 0x7c
    0x753f0ba8: ldr r3, [sp, #28]
    0x753f0baa: blx r4
    0x753f0bac: adds r2, r0, #0
    0x753f0bae: ldr r0, [r5, #0]
    0x753f0bb0: add r3, sp, #32
    0x753f0bb2: str r3, [sp, #4]
    0x753f0bb4: ldr r4, [r0, #116] ; 0x74
    0x753f0bb6: ldr r1, [r6, #0]
    0x753f0bb8: adds r0, r5, #0
==> 0x753f0bba: blx r4 ; NewObjectV(JNIEnv*, jclass, jmethodID, va_list)
    0x753f0bbc: str r0, [r6, #4]
    0x753f0bbe: pop {r0, r1, r4, r5, r6, r7}
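The third argument passed to NewObjectV, r2, is the Method*.
This r2 is the jmethodID argument, and it is the return value of the following call:
0x753f0baa: blx r4
The code that sets up this r4 is: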
0x753f0b86: push {r0, r1, r4, r5, r6, r7, lr}
0x753f0b8a: adds r5, r0, #0
0x753f0b9e: ldr r3, [r5, #0]
0x753f0ba4: adds r3, #8
0x753f0ba6: ldr r4, [r3, #124] ; 0x7c
0x753f0ba8: ldr r3, [sp, #28]
0x753f0baa: blx r4
The stack contents at this point are:
0x7598f7b8: 0x753d0cc8 0x7598f7d8 0x753d0cc8 0x753c86a8
            r0         r1         r4         r5
0x7598f7c8: 0x753d0cc8 0x400c6384 0x753f0d35 0x753ff19c
            r6         r7         lr         r3
From this, r4 works out to 0x415477c5:
0x753f0b86: push {r0, r1, r4, r5, r6, r7, lr}
0x753f0b8a: adds r5, r0, #0       ; r5 = r0 = 0x753d0cc8
0x753f0b9e: ldr r3, [r5, #0]      ; r3 = [r5] = [0x753d0cc8] = 0x415a43ec
0x753f0ba4: adds r3, #8           ; r3 = 0x415a43ec + 8 = 0x415a43f4
0x753f0ba6: ldr r4, [r3, #124]    ; r4 = [0x415a43f4+124] = [0x415a4470] = 0x415477c5
0x753f0baa: blx r4
This function is GetMethodID():
(gdb) disassemble 0x415477c5
Dump of assembler code for function GetMethodID(JNIEnv*, jclass, char const*, char const*):
0x415477c4 <+0>: stmdb sp!, {r4, r5, r6, r7, r8, r9, lr}
0x415477c8 <+4>: mov r5, r0
0x415477ca <+6>: sub sp, #20
0x415477cc <+8>: mov r4, r1
...
The jmethodID is obtained by calling the VM's GetMethodID(), and that call returned 0.
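The indirect calls in the caller can also be cross-checked against the JNI function table: each `blx` target is loaded from the table that JNIEnv points to, so the byte offset divided by the pointer size gives the slot index. The sketch below lists only the slots relevant here (indices as laid out in jni.h, reproduced from memory and worth verifying against the header); `jni_function_at` is a hypothetical helper.

```python
POINTER_SIZE = 4  # 32-bit ARM

# Subset of the JNINativeInterface table: slot index -> function name.
JNI_SLOTS = {
    6: "FindClass",
    28: "NewObject",
    29: "NewObjectV",
    30: "NewObjectA",
    31: "GetObjectClass",
    33: "GetMethodID",
}

def jni_function_at(byte_offset):
    """Name the JNI function stored at the given byte offset in the function table."""
    index, remainder = divmod(byte_offset, POINTER_SIZE)
    if remainder:
        raise ValueError("offset 0x%x is not pointer-aligned" % byte_offset)
    return JNI_SLOTS.get(index, "slot %d (not in this subset)" % index)

# Offsets taken from the disassembly of the caller:
print(jni_function_at(24))       # ldr r3, [r3, #24]
print(jni_function_at(8 + 124))  # adds r3, #8 ; ldr r4, [r3, #124]
print(jni_function_at(116))      # ldr r4, [r0, #116]
```

Note that the GetMethodID load goes through `adds r3, #8` first, so its effective offset is 132 rather than 124.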
Next, find out which method's ID it was trying to obtain, which means decoding the call's arguments.
Code relevant to the first argument:
0x753f0b84: push {r3}
0x753f0b86: push {r0, r1, r4, r5, r6, r7, lr}
0x753f0b8a: adds r5, r0, #0       ; r5 = r0 = 0x753d0cc8
0x753f0ba2: adds r0, r5, #0       ; r0 = r5 = 0x753d0cc8
0x753f0baa: blx r4
From the signature GetMethodID(JNIEnv*, jclass, char const*, char const*), the first argument
r0 = 0x753d0cc8 is the JNIEnv*.
Code relevant to the second argument:
0x753f0b90: blx r3
0x753f0b92: ldr r6, [pc, #52]     ; r6 = [0x753f0bc8] = 0x00012922
0x753f0b94: adds r1, r0, #0       ; r1 = r0
0x753f0b96: add r6, pc            ; r6 += pc (0x753f0b96 + 4), giving 0x754034bc
0x753f0b98: str r0, [r6, #0]      ; [0x754034bc] = r0 = 0x4185ceb0
0x753f0baa: blx r4                ; GetMethodID(JNIEnv*, jclass, char const*, char const*)
r1 equals r0, the return value of blx r3, and that r0 is also stored to the memory r6 points to, so r1 is 0x4185ceb0.
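The PC-relative arithmetic is easy to get wrong by hand, because in Thumb state reading pc yields the address of the current instruction plus 4, and a literal load additionally aligns that value down to a multiple of 4. A small sketch of both rules, using the addresses from the listing (the helper names are illustrative):

```python
MASK32 = 0xFFFFFFFF

def thumb_pc(insn_addr):
    """Value an instruction observes when it reads pc in Thumb state."""
    return (insn_addr + 4) & MASK32

def ldr_literal_address(insn_addr, imm):
    """Address read by `ldr rt, [pc, #imm]`: Align(pc, 4) + imm."""
    return ((thumb_pc(insn_addr) & ~3) + imm) & MASK32

# 0x753f0b92: ldr r6, [pc, #52]
literal_addr = ldr_literal_address(0x753F0B92, 52)

# The analysis reports that this literal holds 0x00012922.
r6 = 0x00012922

# 0x753f0b96: add r6, pc
r6 = (r6 + thumb_pc(0x753F0B96)) & MASK32

print(hex(literal_addr), hex(r6))
```

The result agrees with the r6 value that NewObjectV later saved on the stack (0x754034bc).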
From the signature GetMethodID(JNIEnv*, jclass, char const*, char const*), the second argument is a ClassObject*:
(gdb) p *(ClassObject*)0x4185ceb0
$14 = {
  <Object> = {
    clazz = 0x416cc1e8,
    lock = 0
  },
  members of ClassObject:
  instanceData = {0, 0, 0, 0},
  descriptor = 0x6f21a8b9 <Address 0x6f21a8b9 out of bounds>,
  ...
According to the maps table, the descriptor address 0x6f21a8b9 falls inside the mapping of a …@classes.dex file:
6ec5c000-6edd4000 r--p 00000000 b3:1b 40972 [email protected]@[email protected]
...
6f14b000-6f14c000 r--p 004ef000 b3:1b 40972 [email protected]@[email protected]
6f14c000-6f5b0000 r--p 004f0000 b3:1b 40972 [email protected]@[email protected]
6f5b0000-6f669000 rw-p 00000000 00:04 9331 /dev/ashmem/dalvik-aux-structure (deleted)
Compute the offset relative to the start of the mapping:
(gdb) p /x 0x6f21a8b9-0x6ec5c000
$15 = 0x5be8b9
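Subtracting the base of the first mapping works here because the file is mapped contiguously from offset 0. A slightly more general approach is to find the mapping that contains the address and add that mapping's own file offset. The sketch below assumes the usual /proc/<pid>/maps column layout; the file name is not legible in the maps excerpt of this report, so `DEX_FILE` is a placeholder.

```python
def parse_maps_line(line):
    """Split one maps line into start, end, file offset and path."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return {"start": start, "end": end, "offset": int(fields[2], 16), "path": path}

def file_offset(maps_lines, addr):
    """Return (path, offset in file) for addr, or None if addr is not mapped."""
    for line in maps_lines:
        m = parse_maps_line(line)
        if m["start"] <= addr < m["end"]:
            return m["path"], addr - m["start"] + m["offset"]
    return None

# Lines from the maps excerpt; DEX_FILE stands in for the unreadable file name.
MAPS = [
    "6ec5c000-6edd4000 r--p 00000000 b3:1b 40972 DEX_FILE",
    "6f14b000-6f14c000 r--p 004ef000 b3:1b 40972 DEX_FILE",
    "6f14c000-6f5b0000 r--p 004f0000 b3:1b 40972 DEX_FILE",
    "6f5b0000-6f669000 rw-p 00000000 00:04 9331 /dev/ashmem/dalvik-aux-structure (deleted)",
]

result = file_offset(MAPS, 0x6F21A8B9)
print(result[0], hex(result[1]))
```

Both methods give the same file offset for the descriptor address.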
Open that …@classes.dex file in a binary editor and look at this offset:
0x5be8b9: 24 4C 61 6E 64 72 6F 69 64 2F 74 65 6C 65 70 68 6F 6E 79 2F 54 65 6C 65 70 68 6F 6E 79 4D 61 6E 61 67 65 72 3B 00
          $Landroid/telephony/TelephonyManager;
This confirms that the class this object represents is android/telephony/TelephonyManager.
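The hex bytes can be decoded without a binary editor. In the sketch below, the leading 0x24 byte is interpreted as a length prefix for the string that follows; that reading is an assumption based on how DEX files store string data (a ULEB128 length before the characters), and the helper name `read_c_string` is made up for the example.

```python
RAW = bytes.fromhex(
    "24 4C 61 6E 64 72 6F 69 64 2F 74 65 6C 65 70 68 6F 6E 79 2F "
    "54 65 6C 65 70 68 6F 6E 79 4D 61 6E 61 67 65 72 3B 00"
)

def read_c_string(buf, start=0):
    """Decode the NUL-terminated ASCII string that begins at buf[start]."""
    end = buf.index(b"\x00", start)
    return buf[start:end].decode("ascii")

length_prefix = RAW[0]               # 0x24, shown as '$' in the editor's ASCII column
descriptor = read_c_string(RAW, 1)   # type descriptor in "L<class>;" form
class_name = descriptor[1:-1]        # strip the leading 'L' and the trailing ';'

print(length_prefix, descriptor, class_name)
```

The prefix value equals the length of the descriptor, which is consistent with the length-prefix interpretation.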
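It is also easy to work out that the blx r3 here is a call to FindClass(),
which means FindClass() found the android/telephony/TelephonyManager class.
Code relevant to the third argument:
0x753f0ba0: adds r2, r7, #0
The value of r7 can be read directly from the stack frame of the callee, NewObjectV(), which saved it: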
0x7598f788: 0x7598f798 0x7598f7d8 0x00000000 0x753d0628
                                             Thread*
0x7598f798: 0x4185ceb0 0x415477c5 0x753d0cc8 0x415479a1
                                             r4
0x7598f7a8: 0x753d0cc8 0x754034bc 0x753ff80a 0x753f0bbd
            r5         r6         r7         lr
r2 = r7 = 0x753ff80a
From the signature GetMethodID(JNIEnv*, jclass, char const*, char const*), it is a string:
(gdb) x /s 0x753ff80a
<init>
The third argument is the string "<init>".
Code relevant to the fourth argument:
0x753f0b84: push {r3}                          ; sp = 0x7598f7d4, [0x7598f7d4] = r3 = 0x753ff19c
0x753f0b86: push {r0, r1, r4, r5, r6, r7, lr}  ; sp -= 28, sp = 0x7598f7b8
0x753f0ba8: ldr r3, [sp, #28]                  ; r3 = [sp+28] = [0x7598f7d4]
0x753f0baa: blx r4                             ; GetMethodID(JNIEnv*, jclass, char const*, char const*)
r3 is the value 0x753ff19c that the first instruction pushed onto the stack:
0x7598f7b8: 0x753d0cc8 0x7598f7d8 0x753d0cc8 0x753c86a8
            r0         r1         r4         r5
0x7598f7c8: 0x753d0cc8 0x400c6384 0x753f0d35 0x753ff19c
            r6         r7         lr         r3
From the signature GetMethodID(JNIEnv*, jclass, char const*, char const*), it is also a string:
(gdb) x /s 0x753ff19c
()V
So the logic here is roughly as follows:
jclass localClass = env->FindClass("android/telephony/TelephonyManager");
jmethodID localMethodID = env->GetMethodID(localClass, "<init>", "()V"); // returns NULL: no such constructor
jobject localObject = env->NewObject(localClass, localMethodID, NULL);   // crashes on the NULL jmethodID
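In other words, the app dies while calling the default constructor of android/telephony/TelephonyManager.
A search of the code shows that frameworks/telephony/base/java/android/telephony/TelephonyManager.java indeed has no default constructor,
whereas the stock Android code does have one.
The commit history shows that a colleague noticed nothing was calling this default constructor and removed it.
【Solution】
After the default constructor was added back, the app no longer crashes.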