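Understanding the Dalvik Virtual Machine in Depth: How the Interpreter Executes Code
Every Java method belongs to a class, so Dalvik only ever runs bytecode in two forms. The first is a class's methods, both static and non-static; the only difference between the two is whether a this argument is passed. The second is class initialization code, which runs when a class is loaded and initialized: static field initializers and explicit static initializer blocks.
Class initialization is implemented by dvmInitClass in dalvik/vm/oo/Class.cpp: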
bool dvmInitClass(ClassObject* clazz)
{
    ...
    dvmLockObject(self, (Object*) clazz);
    ...
    android_atomic_release_store(CLASS_INITIALIZING,
                                 (int32_t*)(void*)&clazz->status);
    dvmUnlockObject(self, (Object*) clazz);
    ...
    initSFields(clazz);

    /* Execute any static initialization code. */
    method = dvmFindDirectMethodByDescriptor(clazz, "<clinit>", "()V");
    if (method == NULL) {
        LOGVV("No <clinit> found for %s", clazz->descriptor);
    } else {
        LOGVV("Invoking %s.<clinit>", clazz->descriptor);
        JValue unused;
        dvmCallMethod(self, method, NULL, &unused);
    }
    ...
}
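As the code shows, class initialization consists of these main steps:
Lock the class object, so that initialization of a given class is serialized across threads
Initialize the static fields (initSFields)
Call <clinit>, which holds the static initializer blocks
The static initializer code lives in a method named <clinit>. So bytecode interpretation in Dalvik always comes down to interpreting a method of some class.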
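The three steps can be sketched in a few lines of standalone C++. This is only a toy model: ToyClass, ToyStatus and toyInitClass are hypothetical names, the two static field slots are arbitrary, and the final transition to an "initialized" status is an assumption, since that part of dvmInitClass is elided in the excerpt above.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical stand-ins; these are not the Dalvik types.
enum class ToyStatus { Loaded, Initializing, Initialized };

struct ToyClass {
    std::mutex lock;                        // plays the role of the class object's monitor
    ToyStatus status = ToyStatus::Loaded;
    std::vector<int> staticFields;          // filled in by the initSFields-like step
    std::function<void(ToyClass&)> clinit;  // empty when the class has no <clinit>
};

bool toyInitClass(ToyClass& clazz) {
    {
        std::lock_guard<std::mutex> guard(clazz.lock);
        // Simplification: a real VM would wait here if another thread
        // were still in the middle of initializing the class.
        if (clazz.status != ToyStatus::Loaded) return true;
        clazz.status = ToyStatus::Initializing;
    }   // the lock is released before any initializer code runs

    clazz.staticFields.assign(2, 0);         // step 2: default-initialize static fields
    if (clazz.clinit) clazz.clinit(clazz);   // step 3: run <clinit> if present

    std::lock_guard<std::mutex> guard(clazz.lock);
    clazz.status = ToyStatus::Initialized;   // assumed final state
    return true;
}
```

Calling toyInitClass twice on the same ToyClass runs the initializer only once, which is the property the lock and the status field are there to guarantee.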
The virtual machine uses the Method as the interpreter's unit of execution, with dvmCallMethod as the single entry point. It is defined in dalvik/vm/interp/Stack.cpp.
void dvmCallMethod(Thread* self, const Method* method, Object* obj,
    JValue* pResult, ...)
{
    va_list args;
    va_start(args, pResult);
    dvmCallMethodV(self, method, obj, false, pResult, args);
    va_end(args);
}

void dvmCallMethodV(Thread* self, const Method* method, Object* obj,
    bool fromJni, JValue* pResult, va_list args)
{
    ...
    if (dvmIsNativeMethod(method)) {
        TRACE_METHOD_ENTER(self, method);
        /*
         * Because we leave no space for local variables, "curFrame" points
         * directly at the method arguments.
         */
        (*method->nativeFunc)((u4*)self->interpSave.curFrame, pResult,
                              method, self);
        TRACE_METHOD_EXIT(self, method);
    } else {
        dvmInterpret(self, method, pResult);
    }
    ...
}
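A Java Method is either native or non-native. The code of a native method lives in a shared library (.so) and consists of machine instructions for the host CPU rather than VM bytecode.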
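If the method is native, dvmCallMethodV calls it directly through the nativeFunc function pointer; otherwise it calls dvmInterpret, the entry point of the interpreter.
Sketching the call stack of a running Dalvik method makes the whole flow clearer. Consider the following program: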
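The native-versus-interpreted branch can be reduced to a small standalone model. The sketch below uses hypothetical types (ToyMethod, ToyNativeFunc, Path) and a placeholder toyInterpret that only echoes the first code unit; the one real-world value is 0x0100, the ACC_NATIVE access flag.

```cpp
#include <cassert>
#include <cstdint>

struct ToyMethod;
using ToyNativeFunc = void (*)(const uint32_t* args, int64_t* result,
                               const ToyMethod* method);

constexpr uint32_t kAccNative = 0x0100;  // value of the ACC_NATIVE access flag

struct ToyMethod {
    uint32_t accessFlags;
    ToyNativeFunc nativeFunc;  // used only when kAccNative is set
    const uint16_t* insns;     // bytecode, used only on the interpreter path
};

enum class Path { Native, Interpreted };

// Placeholder interpreter: a real one would decode and execute insns.
void toyInterpret(const ToyMethod* method, int64_t* result) {
    *result = method->insns[0];
}

// Mirrors the if/else in dvmCallMethodV: native methods go through the
// function pointer, everything else goes to the interpreter.
Path toyCallMethod(const ToyMethod* method, const uint32_t* args, int64_t* result) {
    if (method->accessFlags & kAccNative) {
        method->nativeFunc(args, result, method);
        return Path::Native;
    }
    toyInterpret(method, result);
    return Path::Interpreted;
}

// A sample "native" implementation that adds its two arguments.
void toyAdd(const uint32_t* args, int64_t* result, const ToyMethod*) {
    *result = static_cast<int64_t>(args[0]) + args[1];
}
```

A ToyMethod built as {kAccNative, toyAdd, nullptr} takes the native path; one built with flags 0 and a code array takes the interpreter path.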
public class HelloWorld {
    public int foo(int i, int j) {
        int k = i + j;
        return k;
    }

    public static void main(String[] args) {
        System.out.print(new HelloWorld().foo(1, 2));
    }
}
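Dalvik works with two stacks: a Java stack and the VM's native stack. The native stack is the ordinary function call stack provided by the OS, while the Java stack is managed by the VM. On each dvmCallMethod, before the Method runs, the VM calls dvmPushInterpFrame (Java→Java) or dvmPushJNIFrame (Java→native). A JNI frame lacks the space for local variables that an interpreter frame has, because a native function's locals live on the VM's native stack, where pushing and popping is handled at the OS level. When dvmCallMethod finishes, it calls dvmPopFrame to pop the Java stack.
Running a Java Method therefore means that dvmInterpret decodes the Method's bytecode, reading arguments and local variables from the Java stack. The SaveBlock is a StackSaveArea structure holding the stack bookkeeping for the current function, including the return address. Running a native Method means running the Method's nativeFunc; once the native function itself is running, its arguments and local variables live on the VM's native stack.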
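To make the difference between the two frame kinds concrete, here is a rough model of a VM-managed Java stack. Everything in it is hypothetical (ToyFrame, ToyJavaStack, the register count used below); the one Dalvik convention it relies on is that a method's incoming arguments occupy the last registers of its frame, after the locals.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ToyFrame {
    const char* method;          // part of what a save area would record
    std::vector<uint32_t> regs;  // locals first, incoming arguments in the last slots
};

struct ToyJavaStack {
    std::vector<ToyFrame> frames;

    // Interpreted callee: reserve every register, then copy the arguments
    // into the tail of the register array.
    void pushInterpFrame(const char* method, std::size_t registersSize,
                         const std::vector<uint32_t>& ins) {
        assert(ins.size() <= registersSize);
        ToyFrame frame{method, std::vector<uint32_t>(registersSize, 0)};
        std::copy(ins.begin(), ins.end(),
                  frame.regs.begin() +
                      static_cast<std::ptrdiff_t>(registersSize - ins.size()));
        frames.push_back(frame);
    }

    // Native callee: only the arguments are stored, no slots for locals.
    void pushJniFrame(const char* method, const std::vector<uint32_t>& ins) {
        frames.push_back(ToyFrame{method, ins});
    }

    void popFrame() { frames.pop_back(); }
};
```

For a call like foo(1, 2) on an instance, the arguments would be this, 1 and 2. With an assumed register count of 5, pushInterpFrame leaves two local slots in front of them, while pushJniFrame stores just the three arguments.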
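A Method's nativeFunc is the entry point of a native function. Java method hooking techniques on Dalvik all work by changing the Method's attributes: SET_METHOD_FLAG(method, ACC_NATIVE) disguises the method as native, and nativeFunc is then set to the hook function. Clearly, a method hooked this way is no longer polymorphic.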
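The hooking trick can be illustrated with a toy method structure. The names below (HookableMethod, installHook, scalingHook) are hypothetical, and the original function pointer merely stands in for the method's bytecode body; a real hooking framework has to preserve and re-enter the original method in its own way.

```cpp
#include <cassert>
#include <cstdint>

struct HookableMethod;
using BridgeFunc = void (*)(const uint32_t* args, int64_t* result,
                            const HookableMethod* m);

constexpr uint32_t kHookAccNative = 0x0100;  // value of the ACC_NATIVE access flag

struct HookableMethod {
    uint32_t accessFlags;
    BridgeFunc nativeFunc;
    int64_t (*original)(const uint32_t* args);  // stand-in for the bytecode body
};

// Same dispatch rule as the VM: the native flag decides which path runs.
int64_t invoke(const HookableMethod* m, const uint32_t* args) {
    int64_t result = 0;
    if (m->accessFlags & kHookAccNative) {
        m->nativeFunc(args, &result, m);
    } else {
        result = m->original(args);
    }
    return result;
}

// Disguise the method as native and point nativeFunc at the hook.
void installHook(HookableMethod* m, BridgeFunc hook) {
    m->accessFlags |= kHookAccNative;
    m->nativeFunc = hook;
}

int64_t addBody(const uint32_t* args) {
    return static_cast<int64_t>(args[0]) + args[1];
}

// Example hook: calls through to the original, then alters the result.
void scalingHook(const uint32_t* args, int64_t* result, const HookableMethod* m) {
    *result = m->original(args) * 10;
}
```

Before installHook, invoke runs addBody directly; afterwards every call lands in scalingHook first, without the caller changing anything.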
The real native entry point is resolved by dvmResolveNativeMethod:
void dvmResolveNativeMethod(const u4* args, JValue* pResult,
    const Method* method, Thread* self)
{
    ClassObject* clazz = method->clazz;

    /*
     * If this is a static method, it could be called before the class
     * has been initialized.
     */
    if (dvmIsStaticMethod(method)) {
        if (!dvmIsClassInitialized(clazz) && !dvmInitClass(clazz)) {
            assert(dvmCheckException(dvmThreadSelf()));
            return;
        }
    } else {
        assert(dvmIsClassInitialized(clazz) ||
               dvmIsClassInitializing(clazz));
    }

    /* start with our internal-native methods */
    DalvikNativeFunc infunc = dvmLookupInternalNativeMethod(method);
    if (infunc != NULL) {
        /* resolution always gets the same answer, so no race here */
        IF_LOGVV() {
            char* desc = dexProtoCopyMethodDescriptor(&method->prototype);
            LOGVV("+++ resolved native %s.%s %s, invoking",
                clazz->descriptor, method->name, desc);
            free(desc);
        }
        if (dvmIsSynchronizedMethod(method)) {
            ALOGE("ERROR: internal-native can't be declared 'synchronized'");
            ALOGE("Failing on %s.%s", method->clazz->descriptor, method->name);
            dvmAbort();     // harsh, but this is VM-internal problem
        }
        DalvikBridgeFunc dfunc = (DalvikBridgeFunc) infunc;
        dvmSetNativeFunc((Method*) method, dfunc, NULL);
        dfunc(args, pResult, method, self);
        return;
    }

    /* now scan any DLLs we have loaded for JNI signatures */
    void* func = lookupSharedLibMethod(method);
    if (func != NULL) {
        /* found it, point it at the JNI bridge and then call it */
        dvmUseJNIBridge((Method*) method, func);
        (*method->nativeFunc)(args, pResult, method, self);
        return;
    }

    IF_ALOGW() {
        char* desc = dexProtoCopyMethodDescriptor(&method->prototype);
        ALOGW("No implementation found for native %s.%s:%s",
            clazz->descriptor, method->name, desc);
        free(desc);
    }

    dvmThrowUnsatisfiedLinkError("Native method not found", method);
}
dvmResolveNativeMethod first calls dvmLookupInternalNativeMethod to check whether the method is one of the VM's built-in natives, which mainly means searching the following table:
static DalvikNativeClass gDvmNativeMethodSet[] = {
    { "Ljava/lang/Object;",               dvm_java_lang_Object, 0 },
    { "Ljava/lang/Class;",                dvm_java_lang_Class, 0 },
    { "Ljava/lang/Double;",               dvm_java_lang_Double, 0 },
    { "Ljava/lang/Float;",                dvm_java_lang_Float, 0 },
    { "Ljava/lang/Math;",                 dvm_java_lang_Math, 0 },
    { "Ljava/lang/Runtime;",              dvm_java_lang_Runtime, 0 },
    { "Ljava/lang/String;",               dvm_java_lang_String, 0 },
    { "Ljava/lang/System;",               dvm_java_lang_System, 0 },
    { "Ljava/lang/Throwable;",            dvm_java_lang_Throwable, 0 },
    { "Ljava/lang/VMClassLoader;",        dvm_java_lang_VMClassLoader, 0 },
    { "Ljava/lang/VMThread;",             dvm_java_lang_VMThread, 0 },
    { "Ljava/lang/reflect/AccessibleObject;",
            dvm_java_lang_reflect_AccessibleObject, 0 },
    { "Ljava/lang/reflect/Array;",        dvm_java_lang_reflect_Array, 0 },
    { "Ljava/lang/reflect/Constructor;",
            dvm_java_lang_reflect_Constructor, 0 },
    { "Ljava/lang/reflect/Field;",        dvm_java_lang_reflect_Field, 0 },
    { "Ljava/lang/reflect/Method;",       dvm_java_lang_reflect_Method, 0 },
    { "Ljava/lang/reflect/Proxy;",        dvm_java_lang_reflect_Proxy, 0 },
    { "Ljava/util/concurrent/atomic/AtomicLong;",
            dvm_java_util_concurrent_atomic_AtomicLong, 0 },
    { "Ldalvik/bytecode/OpcodeInfo;",     dvm_dalvik_bytecode_OpcodeInfo, 0 },
    { "Ldalvik/system/VMDebug;",          dvm_dalvik_system_VMDebug, 0 },
    { "Ldalvik/system/DexFile;",          dvm_dalvik_system_DexFile, 0 },
    { "Ldalvik/system/VMRuntime;",        dvm_dalvik_system_VMRuntime, 0 },
    { "Ldalvik/system/Zygote;",           dvm_dalvik_system_Zygote, 0 },
    { "Ldalvik/system/VMStack;",          dvm_dalvik_system_VMStack, 0 },
    { "Lorg/apache/harmony/dalvik/ddmc/DdmServer;",
            dvm_org_apache_harmony_dalvik_ddmc_DdmServer, 0 },
    { "Lorg/apache/harmony/dalvik/ddmc/DdmVmInternal;",
            dvm_org_apache_harmony_dalvik_ddmc_DdmVmInternal, 0 },
    { "Lorg/apache/harmony/dalvik/NativeTestTarget;",
            dvm_org_apache_harmony_dalvik_NativeTestTarget, 0 },
    { "Lsun/misc/Unsafe;",                dvm_sun_misc_Unsafe, 0 },
    { NULL, NULL, 0 },
};
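If the method is not built in, the VM searches the shared libraries that have been loaded for the matching native function. The lookup follows the familiar JNI naming rule: com.xx.Helloworld.foobar maps to the symbol Java_com_xx_Helloworld_foobar.
Note that the function found this way is not itself installed as nativeFunc. In the subsequent dvmUseJNIBridge call, dvmCallJNIMethod becomes the nativeFunc; its main job is to translate the ins arguments in the Java stack frame described earlier into the arguments of a JNI function call. Xposed and dexposed install their own nativeFunc so that they take over execution of the native function themselves.
dvmInterpret is the entry point of the interpreter; it lives in interp/Interp.cpp.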
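As a rough illustration of the naming rule, the function below builds the short JNI symbol name from a class name and a method name. toyJniName is a hypothetical helper, not the VM's own code, and it covers only part of the JNI mangling rules: package separators become underscores and a literal underscore becomes _1. The escapes for other special characters and the signature suffix used for overloaded natives are left out.

```cpp
#include <cassert>
#include <string>

// Simplified JNI short-name mangling (assumption: ASCII names only,
// no overloaded native methods).
std::string toyJniName(const std::string& className, const std::string& methodName) {
    std::string out = "Java_";
    auto append = [&out](const std::string& part) {
        for (char c : part) {
            if (c == '.' || c == '/') {
                out += '_';      // package separator
            } else if (c == '_') {
                out += "_1";     // escaped underscore
            } else {
                out += c;
            }
        }
    };
    append(className);
    out += '_';                  // separates the class from the method
    append(methodName);
    return out;
}
```

With this helper, toyJniName("com.xx.Helloworld", "foobar") yields Java_com_xx_Helloworld_foobar, the symbol the VM would look for in the loaded libraries.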
void dvmInterpret(Thread* self, const Method* method, JValue* pResult)
{
    InterpSaveState interpSaveState;
    ExecutionSubModes savedSubModes;
    . . .
    interpSaveState = self->interpSave;
    self->interpSave.prev = &interpSaveState;
    . . .
    self->interpSave.method = method;
    self->interpSave.curFrame = (u4*) self->interpSave.curFrame;
    self->interpSave.pc = method->insns;
    . . .
    typedef void (*Interpreter)(Thread*);
    Interpreter stdInterp;
    if (gDvm.executionMode == kExecutionModeInterpFast)
        stdInterp = dvmMterpStd;
#if defined(WITH_JIT)
    else if (gDvm.executionMode == kExecutionModeJit ||
             gDvm.executionMode == kExecutionModeNcgO0 ||
             gDvm.executionMode == kExecutionModeNcgO1)
        stdInterp = dvmMterpStd;
#endif
    else
        stdInterp = dvmInterpretPortable;

    // Call the interpreter
    (*stdInterp)(self);

    *pResult = self->interpSave.retval;

    /* Restore interpreter state from previous activation */
    self->interpSave = interpSaveState;
#if defined(WITH_JIT)
    dvmJitCalleeRestore(calleeSave);
#endif
    if (savedSubModes != kSubModeNormal) {
        dvmEnableSubMode(self, savedSubModes);
    }
}
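One very important field of Thread is interpSave, of type InterpSaveState. It holds key state such as the current method, the pc, and the current stack frame, and dvmInterpret initializes it as soon as it is called.
Dalvik has two interpreters, dvmInterpretPortable and dvmMterpStd. The difference is that the former is written in C++ while the latter is written in assembly.
dvmInterpretPortable is defined in vm/mterp/out/InterpC-portable.cpp.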
void dvmInterpretPortable(Thread* self)
{
    . . .
    DvmDex* methodClassDex;     // curMethod->clazz->pDvmDex
    JValue retval;

    /* core state */
    const Method* curMethod;    // method we're interpreting
    const u2* pc;               // program counter
    u4* fp;                     // frame pointer
    u2 inst;                    // current instruction
    /* instruction decoding */
    u4 ref;                     // 16 or 32-bit quantity fetched directly
    u2 vsrc1, vsrc2, vdst;      // usually used for register indexes
    /* method call setup */
    const Method* methodToCall;
    bool methodCallRange;

    /* static computed goto table */
    DEFINE_GOTO_TABLE(handlerTable);

    /* copy state in */
    curMethod = self->interpSave.method;
    pc = self->interpSave.pc;
    fp = self->interpSave.curFrame;
    retval = self->interpSave.retval;

    methodClassDex = curMethod->clazz->pDvmDex;
    . . .
    FINISH(0);                  /* fetch and execute first instruction */

/*--- start of opcodes ---*/

/* File: c/OP_NOP.cpp */
HANDLE_OPCODE(OP_NOP)
    FINISH(1);
OP_END

/* File: c/OP_MOVE.cpp */
HANDLE_OPCODE(OP_MOVE /*vA, vB*/)
    vdst = INST_A(inst);
    vsrc1 = INST_B(inst);
    ILOGV("|move%s v%d,v%d %s(v%d=0x%08x)",
        (INST_INST(inst) == OP_MOVE) ? "" : "-object", vdst, vsrc1,
        kSpacing, vdst, GET_REGISTER(vsrc1));
    SET_REGISTER(vdst, GET_REGISTER(vsrc1));
    FINISH(1);
OP_END
    ...
}
The interpreter dispatches instructions through a jump table: DEFINE_GOTO_TABLE(handlerTable) defines the goto table for the opcodes.
FINISH(0) starts execution at the first instruction:
# define FINISH(_offset) {                                          \
        ADJUST_PC(_offset);                                         \
        inst = FETCH(0);                                            \
        if (self->interpBreak.ctl.subMode) {                        \
            dvmCheckBefore(pc, fp, self);                           \
        }                                                           \
        goto *handlerTable[INST_INST(inst)];                        \
    }

#define FETCH(_offset)     (pc[(_offset)])
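FETCH(0) fetches the instruction to execute, and the lookup in handlerTable jumps to that instruction's handler, which is one of the HANDLE_OPCODE definitions later in the function.
dvmMterpStd, by contrast, is an interpreter optimized for each target platform.
dvmMterpStd is optimized at the assembly level. Its entry point, dvmMterpStdRun, has a separate implementation for each instruction set; for ARMv7 with NEON, for example, the code is in mterp/out/InterpAsm-armv7-a-neon.S.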
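The fetch-then-dispatch loop can be imitated with a table of function pointers. This is a deliberate substitution: the portable interpreter uses computed goto (goto *handlerTable[...]), which is a compiler extension, whereas the sketch below sticks to standard C++. ToyVm, Handler and toyRun are hypothetical names, and only three opcodes are modelled, using the nop, move and return encodings as an illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct ToyVm {
    const uint16_t* pc;          // program counter, in 16-bit code units
    std::vector<uint32_t> regs;  // virtual registers
    bool running;
    int64_t retval;
};

using Handler = void (*)(ToyVm& vm, uint16_t inst);

enum : uint8_t { kOpNop = 0x00, kOpMove = 0x01, kOpReturn = 0x0f };

void opNop(ToyVm& vm, uint16_t) { vm.pc += 1; }

// move vA, vB: A is bits 8..11, B is bits 12..15 of the code unit.
void opMove(ToyVm& vm, uint16_t inst) {
    vm.regs[(inst >> 8) & 0x0f] = vm.regs[inst >> 12];
    vm.pc += 1;
}

// return vAA: AA is the high byte of the code unit.
void opReturn(ToyVm& vm, uint16_t inst) {
    vm.retval = vm.regs[inst >> 8];
    vm.running = false;
}

void opUnknown(ToyVm& vm, uint16_t) {
    vm.retval = -1;
    vm.running = false;
}

int64_t toyRun(const uint16_t* code, std::vector<uint32_t> regs) {
    Handler table[256];
    for (Handler& h : table) h = opUnknown;
    table[kOpNop] = opNop;
    table[kOpMove] = opMove;
    table[kOpReturn] = opReturn;

    ToyVm vm{code, std::move(regs), true, 0};
    while (vm.running) {
        uint16_t inst = vm.pc[0];      // like FETCH(0)
        table[inst & 0xff](vm, inst);  // like INST_INST(inst) indexing the table
    }
    return vm.retval;
}
```

Running the code units {0x0000, 0x1001, 0x000f} with registers {0, 42} executes nop, then move v0, v1, then return v0.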
dvmMterpStdRun:
#define MTERP_ENTRY1 \
    .save {r4-r10,fp,lr}; \
    stmfd   sp!, {r4-r10,fp,lr}         @ save 9 regs
#define MTERP_ENTRY2 \
    .pad    #4; \
    sub     sp, sp, #4                  @ align 64

    .fnstart
    MTERP_ENTRY1
    MTERP_ENTRY2

    /* save stack pointer, add magic word for debuggerd */
    str     sp, [r0, #offThread_bailPtr]  @ save SP for eventual return

    /* set up "named" registers, figure out entry point */
    mov     rSELF, r0                   @ set rSELF
    LOAD_PC_FP_FROM_SELF()              @ load rPC and rFP from "thread"
    ldr     rIBASE, [rSELF, #offThread_curHandlerTable] @ set rIBASE
    . . .
    /* start executing the instruction at rPC */
    FETCH_INST()                        @ load rINST from rPC
    GET_INST_OPCODE(ip)                 @ extract opcode from rINST
    GOTO_OPCODE(ip)                     @ jump to next instruction
    . . .
#define rPC     r4
#define rFP     r5
#define rSELF   r6
#define rINST   r7
#define rIBASE  r8
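In the non-JIT case, FETCH_INST first loads the instruction at pc into the rINST register. GET_INST_OPCODE then extracts the opcode with and _reg, rINST, #255, which puts the low 8 bits of rINST into the ip register, and GOTO_OPCODE jumps to the corresponding address.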
#define GOTO_OPCODE(_reg) add pc, rIBASE, _reg, lsl #6
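rIBASE holds curHandlerTable, the base address of the jump table. Each handler is aligned to 64 bytes (note the .balign 64 in front of a handler), so GOTO_OPCODE(ip) sets pc to rIBASE + opcode * 64, the address of the handler for that opcode.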
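The arithmetic behind the two macros is small enough to write out. The helpers below are hypothetical names for what and _reg, rINST, #255 and add pc, rIBASE, _reg, lsl #6 compute; the base address in the usage note is made up.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kHandlerSizeShift = 6;  // lsl #6: one handler slot per 64 bytes

// and _reg, rINST, #255 keeps the low 8 bits, which hold the opcode.
constexpr uint32_t instOpcode(uint16_t inst) { return inst & 255u; }

// add pc, rIBASE, _reg, lsl #6 computes base + opcode * 64.
constexpr uintptr_t handlerAddress(uintptr_t ibase, uint16_t inst) {
    return ibase + (static_cast<uintptr_t>(instOpcode(inst)) << kHandlerSizeShift);
}
```

For new-instance (opcode 0x22) and an assumed table base of 0x10000, handlerAddress gives 0x10000 + 0x22 * 64 = 0x10880. No table of pointers has to be read from memory: the handler address follows directly from the opcode.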
static Thread* allocThread(int interpStackSize)
#ifndef DVM_NO_ASM_INTERP
    thread->mainHandlerTable = dvmAsmInstructionStart;
    thread->altHandlerTable = dvmAsmAltInstructionStart;
    thread->interpBreak.ctl.curHandlerTable = thread->mainHandlerTable;
#endif
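So dvmAsmInstructionStart is the start of the jump table. It is defined in the same assembly source as dvmMterpStdRun, where you can find the interpreter code for every bytecode instruction.
Take the code for the new operator as an example. It first loads methodClassDex from the Thread (a DvmDex pointer), then loads the DvmDex's pResClasses to check whether the class has already been resolved. If not, it jumps to .LOP_NEW_INSTANCE_resolve to load the class; if so, it goes on to class initialization and allocation through dvmAllocObject. .LOP_NEW_INSTANCE_resolve loads the class by calling dvmResolveClass with the current method's clazz.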
/* ------------------------------ */
    .balign 64
.L_OP_NEW_INSTANCE: /* 0x22 */
/* File: armv5te/OP_NEW_INSTANCE.S */
    /*
     * Create a new instance of a class.
     */
    /* new-instance vAA, class@BBBB */
    ldr     r3, [rSELF, #offThread_methodClassDex]    @ r3<- pDvmDex
    FETCH(r1, 1)                        @ r1<- BBBB
    ldr     r3, [r3, #offDvmDex_pResClasses]    @ r3<- pDvmDex->pResClasses
    ldr     r0, [r3, r1, lsl #2]        @ r0<- resolved class
#if defined(WITH_JIT)
    add     r10, r3, r1, lsl #2         @ r10<- &resolved_class
#endif
    EXPORT_PC()                         @ req'd for init, resolve, alloc
    cmp     r0, #0                      @ already resolved?
    beq     .LOP_NEW_INSTANCE_resolve   @ no, resolve it now
.LOP_NEW_INSTANCE_resolved:   @ r0=class
    ldrb    r1, [r0, #offClassObject_status]    @ r1<- ClassStatus enum
    cmp     r1, #CLASS_INITIALIZED      @ has class been initialized?
    bne     .LOP_NEW_INSTANCE_needinit  @ no, init class now
.LOP_NEW_INSTANCE_initialized: @ r0=class
    mov     r1, #ALLOC_DONT_TRACK       @ flags for alloc call
    bl      dvmAllocObject              @ r0<- new object
    b       .LOP_NEW_INSTANCE_finish    @ continue

.LOP_NEW_INSTANCE_needinit:
    mov     r9, r0                      @ save r0
    bl      dvmInitClass                @ initialize class
    cmp     r0, #0                      @ check boolean result
    mov     r0, r9                      @ restore r0
    bne     .LOP_NEW_INSTANCE_initialized   @ success, continue
    b       common_exceptionThrown      @ failed, deal with init exception

    /*
     * Resolution required.  This is the least-likely path.
     *
     *  r1 holds BBBB
     */
.LOP_NEW_INSTANCE_resolve:
    ldr     r3, [rSELF, #offThread_method]  @ r3<- self->method
    mov     r2, #0                      @ r2<- false
    ldr     r0, [r3, #offMethod_clazz]  @ r0<- method->clazz
    bl      dvmResolveClass             @ r0<- resolved ClassObject ptr
    cmp     r0, #0                      @ got null?
    bne     .LOP_NEW_INSTANCE_resolved  @ no, continue
    b       common_exceptionThrown      @ yes, handle exception
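Read as C++, the control flow of the handler looks roughly like the sketch below. All types and names (NiClass, NiDex, niResolve, niNewInstance) are hypothetical simplifications, the two counters exist only to make the fast path visible, and the exception branches to common_exceptionThrown are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

struct NiClass { bool initialized = false; };
struct NiObject { const NiClass* clazz; };

struct NiDex {
    std::vector<NiClass*> resClasses;  // like pResClasses: null until resolved
    std::vector<NiClass> pool;         // hypothetical store of loadable classes
    int resolveCalls = 0;
    int initCalls = 0;
};

// Slow path, standing in for dvmResolveClass: resolve and cache the class.
NiClass* niResolve(NiDex& dex, uint16_t idx) {
    ++dex.resolveCalls;
    dex.resClasses[idx] = &dex.pool[idx];
    return dex.resClasses[idx];
}

std::unique_ptr<NiObject> niNewInstance(NiDex& dex, uint16_t idx) {
    NiClass* clazz = dex.resClasses[idx];                // cached lookup
    if (clazz == nullptr) clazz = niResolve(dex, idx);   // .LOP_NEW_INSTANCE_resolve
    if (!clazz->initialized) {                           // .LOP_NEW_INSTANCE_needinit
        ++dex.initCalls;
        clazz->initialized = true;
    }
    return std::unique_ptr<NiObject>(new NiObject{clazz});  // allocation step
}
```

Creating two instances of the same class index resolves and initializes the class only once; the second call goes straight from the cached pointer to allocation, which is why the assembly treats resolution as the least likely path.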
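About the author:
田力 (Tian Li). Founder of the 網易彩票 (NetEase Lottery) Android client and founder of 小米視頻 (Xiaomi Video). Currently technical manager at roobo and technical director for video cloud.
Follow the WeChat public account 磨劍石 for regular posts on technical insights and source code analysis. Thanks.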