
Understanding the Dalvik Virtual Machine in Depth: How the Interpreter Executes Code


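Dalvik executes code with an interpreter plus a JIT. With the interpreter, the virtual machine itself decodes and executes the bytecode (javac output converted to dex format) instead of translating it into the CPU's instruction set for the CPU to decode and execute. As you would expect, interpretation is relatively slow, which is why the JIT (Just In Time) compiler was introduced: it compiles frequently executed code into native code at run time. The JIT can be seen as a complementary optimization on top of the interpreter. Later came the AOT (Ahead Of Time) mode of the ART virtual machine, which compiles the bytecode statically when the APK is installed, so performance approaches that of statically compiled languages.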

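Every Java method belongs to a class, so Dalvik only ever executes two kinds of bytecode. The first is a class's Methods, both static and non-static; the only difference between the two is whether there is a this argument. The second is class initialization code, which runs when the class is initialized: the initialization of static fields and the explicit static initializer blocks.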



The class initialization code is in dvmInitClass in dalvik/vm/oo/Class.cpp:

bool dvmInitClass(ClassObject* clazz)
{
    ...
    dvmLockObject(self, (Object*) clazz);
    ...
    android_atomic_release_store(CLASS_INITIALIZING,
                                 (int32_t*)(void*)&clazz->status);
    dvmUnlockObject(self, (Object*) clazz);
    ...
    initSFields(clazz);

    /* Execute any static initialization code.
     */
    method = dvmFindDirectMethodByDescriptor(clazz, "<clinit>", "()V");
    if (method == NULL) {
        LOGVV("No <clinit> found for %s", clazz->descriptor);
    } else {
        LOGVV("Invoking %s.<clinit>", clazz->descriptor);
        JValue unused;
        dvmCallMethod(self, method, NULL, &unused);
    }
    ...
}

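As the code shows, the main steps of class initialization are:

Lock the class object, so that a given class is initialized by only one thread at a time

Initialize the static fields (initSFields)

Call <clinit>, the static initializer

The code from a class's static initializer blocks ends up in the <clinit> method. So bytecode interpretation in Dalvik always comes down to interpreting the methods of a class.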
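
The three steps above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions, not the Dalvik implementation: ToyClass, Status and initClass are hypothetical names, a std::mutex stands in for the class object's monitor, and a thread that finds the class already being initialized simply returns instead of waiting as a real VM would.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical stand-in for ClassObject; not the Dalvik definition.
enum class Status { Loaded, Initializing, Initialized };

struct ToyClass {
    std::mutex lock;                         // plays the role of the class object's monitor
    Status status = Status::Loaded;
    std::vector<int> staticFields;           // static field slots
    std::function<void(ToyClass&)> clinit;   // empty if the class has no <clinit>
};

bool initClass(ToyClass& c, std::size_t numStatics) {
    {
        // Step 1: take the lock and claim the initialization.
        std::lock_guard<std::mutex> guard(c.lock);
        if (c.status != Status::Loaded) {
            // Sketch only: a real VM would wait for the initializing thread.
            return c.status == Status::Initialized;
        }
        c.status = Status::Initializing;
    }
    // Step 2: give every static field its default value.
    c.staticFields.assign(numStatics, 0);
    // Step 3: run the static initializer, if the class has one.
    if (c.clinit) {
        c.clinit(c);
    }
    std::lock_guard<std::mutex> guard(c.lock);
    assert(c.status == Status::Initializing);
    c.status = Status::Initialized;
    return true;
}
```

Calling initClass a second time on the same ToyClass returns true without running the initializer again, which is the property the lock and the status field are there to guarantee.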



The virtual machine uses the Method as the interpreter's unit of execution, and the single entry point is dvmCallMethod, defined in dalvik/vm/interp/Stack.cpp.

void dvmCallMethod(Thread* self, const Method* method, Object* obj,
    JValue* pResult, ...)
{
    va_list args;
    va_start(args, pResult);
    dvmCallMethodV(self, method, obj, false, pResult, args);
    va_end(args);
}

void dvmCallMethodV(Thread* self, const Method* method, Object* obj,
    bool fromJni, JValue* pResult, va_list args)
{
   ...
    if (dvmIsNativeMethod(method)) {
        TRACE_METHOD_ENTER(self, method);
        /*
         * Because we leave no space for local variables, "curFrame" points
         * directly at the method arguments.
         */
        (*method->nativeFunc)((u4*)self->interpSave.curFrame, pResult,
                              method, self);
        TRACE_METHOD_EXIT(self, method);
    } else {
        dvmInterpret(self, method, pResult);
    }
   …
}

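A Java Method is either native or non-native. The code of a native function lives in a .so library and consists of native machine instructions rather than VM bytecode.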



If the method is native, the nativeFunc function pointer is called directly; otherwise dvmInterpret is called, which is the entry point of the interpreter.
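
The branch in dvmCallMethodV can be reduced to a small, self-contained sketch. ToyMethod, callMethod and toyInterpret are hypothetical names, and the "interpreter" here just returns the first code unit as a stand-in for real bytecode execution. The only real-world constant is 0x0100, the value of the ACC_NATIVE access flag.

```cpp
#include <cassert>
#include <cstdint>

struct ToyMethod;
using NativeFunc = void (*)(const uint32_t* args, int64_t* result, const ToyMethod* m);

constexpr uint32_t kAccNative = 0x0100;   // value of the ACC_NATIVE access flag

// Hypothetical, heavily reduced stand-in for Dalvik's Method struct.
struct ToyMethod {
    uint32_t accessFlags;
    NativeFunc nativeFunc;    // used only when kAccNative is set
    const uint16_t* insns;    // bytecode, used only for interpreted methods
};

// Stand-in for dvmInterpret: "executes" by returning the first code unit.
void toyInterpret(const ToyMethod* m, int64_t* result) {
    assert(m->insns != nullptr);
    *result = m->insns[0];
}

// Same shape as the branch in dvmCallMethodV.
void callMethod(const ToyMethod* m, const uint32_t* args, int64_t* result) {
    if (m->accessFlags & kAccNative) {
        (*m->nativeFunc)(args, result, m);   // native: call through the pointer
    } else {
        toyInterpret(m, result);             // otherwise: hand over to the interpreter
    }
}

// Example native implementation: adds its two arguments.
void nativeAdd(const uint32_t* args, int64_t* result, const ToyMethod*) {
    *result = static_cast<int64_t>(args[0]) + args[1];
}
```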

Drawing the call stack of a Dalvik function invocation makes the whole flow clearer.

public class HelloWorld {

    public int foo(int i, int j){
        int k = i + j;
        return k;
    }

    public static void main(String[] args) {
        System.out.print(new HelloWorld().foo(1, 2));
    }
}

[Figure: call stack for the HelloWorld example above (image not available)]


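The Dalvik virtual machine has two stacks: the Java stack and the VM's native stack. The native stack is the ordinary function call stack provided by the OS, while the Java stack is managed by the VM. On every dvmCallMethod, before the Method runs, the VM calls dvmPushInterpFrame (java→java) or dvmPushJNIFrame (java→native). A JNI frame differs from an interp frame in that it has no space for local variables: the locals of a native function live on the VM's native stack, where the OS pushes and pops them. When dvmCallMethod finishes, it calls dvmPopFrame to pop the Java stack.

So executing a Java Method means that dvmInterpret interprets the Method's bytecode, with the arguments and local variables all read from the Java stack. The SaveBlock is a StackSaveArea structure that holds the stack information for the current function, including the return address. Executing a native Method means running the Method's nativeFunc: the arguments arrive in the JNI frame on the Java stack (curFrame points directly at them), while the native code's own local variables live on the VM's native stack.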



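A Method's nativeFunc is the entry point of a native function. Java method hooking techniques on Dalvik generally work by changing the Method's attributes with SET_METHOD_FLAG(method, ACC_NATIVE) so that it masquerades as a native function, and then installing the hook function as nativeFunc. Evidently, a method hooked this way no longer behaves polymorphically.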
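
The hooking trick can be illustrated with a minimal sketch. Everything here is hypothetical (ToyMethod, installHook, the body function pointer that stands in for bytecode); it only shows the mechanism of flipping the native flag and swapping the function pointer, not how xposed or any other framework is actually implemented.

```cpp
#include <cassert>
#include <cstdint>

struct ToyMethod;
using NativeFunc = void (*)(const uint32_t* args, int64_t* result, const ToyMethod* m);
using Body = int64_t (*)(const uint32_t* args);

constexpr uint32_t kAccNative = 0x0100;   // value of the ACC_NATIVE access flag

// Hypothetical stand-in for Method; "body" replaces the interpreted bytecode.
struct ToyMethod {
    uint32_t accessFlags;
    NativeFunc nativeFunc;
    Body body;
};

void callMethod(const ToyMethod* m, const uint32_t* args, int64_t* result) {
    if (m->accessFlags & kAccNative) {
        (*m->nativeFunc)(args, result, m);
    } else {
        *result = m->body(args);
    }
}

// The hook: mark the method native and point nativeFunc at our function.
void installHook(ToyMethod* m, NativeFunc hook) {
    assert(hook != nullptr);
    m->accessFlags |= kAccNative;
    m->nativeFunc = hook;
}

int64_t addBody(const uint32_t* args) {
    return static_cast<int64_t>(args[0]) + args[1];
}

// Example hook: calls the original body, then alters the result.
void scalingHook(const uint32_t* args, int64_t* result, const ToyMethod* m) {
    *result = m->body(args) * 10;
}
```

After installHook, every call that goes through callMethod lands in the hook, because the dispatcher only looks at the flag and the pointer stored in the method itself.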

The default nativeFunc is dvmResolveNativeMethod (vm/Native.cpp):


void dvmResolveNativeMethod(const u4* args, JValue* pResult,
    const Method* method, Thread* self)
{
    ClassObject* clazz = method->clazz;

    /*
     * If this is a static method, it could be called before the class
     * has been initialized.
     */
    if (dvmIsStaticMethod(method)) {
        if (!dvmIsClassInitialized(clazz) && !dvmInitClass(clazz)) {
            assert(dvmCheckException(dvmThreadSelf()));
            return;
        }
    } else {
        assert(dvmIsClassInitialized(clazz) ||
               dvmIsClassInitializing(clazz));
    }

    /* start with our internal-native methods */
    DalvikNativeFunc infunc = dvmLookupInternalNativeMethod(method);
    if (infunc != NULL) {
        /* resolution always gets the same answer, so no race here */
        IF_LOGVV() {
            char* desc = dexProtoCopyMethodDescriptor(&method->prototype);
            LOGVV("+++ resolved native %s.%s %s, invoking",
                clazz->descriptor, method->name, desc);
            free(desc);
        }
        if (dvmIsSynchronizedMethod(method)) {
            ALOGE("ERROR: internal-native can't be declared 'synchronized'");
            ALOGE("Failing on %s.%s", method->clazz->descriptor, method->name);
            dvmAbort();     // harsh, but this is VM-internal problem
        }
        DalvikBridgeFunc dfunc = (DalvikBridgeFunc) infunc;
        dvmSetNativeFunc((Method*) method, dfunc, NULL);
        dfunc(args, pResult, method, self);
        return;
    }

    /* now scan any DLLs we have loaded for JNI signatures */
    void* func = lookupSharedLibMethod(method);
    if (func != NULL) {
        /* found it, point it at the JNI bridge and then call it */
        dvmUseJNIBridge((Method*) method, func);
        (*method->nativeFunc)(args, pResult, method, self);
        return;
    }

    IF_ALOGW() {
        char* desc = dexProtoCopyMethodDescriptor(&method->prototype);
        ALOGW("No implementation found for native %s.%s:%s",
            clazz->descriptor, method->name, desc);
        free(desc);
    }

    dvmThrowUnsatisfiedLinkError("Native method not found", method);
}

dvmResolveNativeMethod first calls dvmLookupInternalNativeMethod to check whether the function is one of the VM's built-in natives, which mainly means searching the following set:

static DalvikNativeClass gDvmNativeMethodSet[] = {
    { "Ljava/lang/Object;",               dvm_java_lang_Object, 0 },
    { "Ljava/lang/Class;",                dvm_java_lang_Class, 0 },
    { "Ljava/lang/Double;",               dvm_java_lang_Double, 0 },
    { "Ljava/lang/Float;",                dvm_java_lang_Float, 0 },
    { "Ljava/lang/Math;",                 dvm_java_lang_Math, 0 },
    { "Ljava/lang/Runtime;",              dvm_java_lang_Runtime, 0 },
    { "Ljava/lang/String;",               dvm_java_lang_String, 0 },
    { "Ljava/lang/System;",               dvm_java_lang_System, 0 },
    { "Ljava/lang/Throwable;",            dvm_java_lang_Throwable, 0 },
    { "Ljava/lang/VMClassLoader;",        dvm_java_lang_VMClassLoader, 0 },
    { "Ljava/lang/VMThread;",             dvm_java_lang_VMThread, 0 },
    { "Ljava/lang/reflect/AccessibleObject;",
            dvm_java_lang_reflect_AccessibleObject, 0 },
    { "Ljava/lang/reflect/Array;",        dvm_java_lang_reflect_Array, 0 },
    { "Ljava/lang/reflect/Constructor;",
            dvm_java_lang_reflect_Constructor, 0 },
    { "Ljava/lang/reflect/Field;",        dvm_java_lang_reflect_Field, 0 },
    { "Ljava/lang/reflect/Method;",       dvm_java_lang_reflect_Method, 0 },
    { "Ljava/lang/reflect/Proxy;",        dvm_java_lang_reflect_Proxy, 0 },
    { "Ljava/util/concurrent/atomic/AtomicLong;",
            dvm_java_util_concurrent_atomic_AtomicLong, 0 },
    { "Ldalvik/bytecode/OpcodeInfo;",     dvm_dalvik_bytecode_OpcodeInfo, 0 },
    { "Ldalvik/system/VMDebug;",          dvm_dalvik_system_VMDebug, 0 },
    { "Ldalvik/system/DexFile;",          dvm_dalvik_system_DexFile, 0 },
    { "Ldalvik/system/VMRuntime;",        dvm_dalvik_system_VMRuntime, 0 },
    { "Ldalvik/system/Zygote;",           dvm_dalvik_system_Zygote, 0 },
    { "Ldalvik/system/VMStack;",          dvm_dalvik_system_VMStack, 0 },
    { "Lorg/apache/harmony/dalvik/ddmc/DdmServer;",
            dvm_org_apache_harmony_dalvik_ddmc_DdmServer, 0 },
    { "Lorg/apache/harmony/dalvik/ddmc/DdmVmInternal;",
            dvm_org_apache_harmony_dalvik_ddmc_DdmVmInternal, 0 },
    { "Lorg/apache/harmony/dalvik/NativeTestTarget;",
            dvm_org_apache_harmony_dalvik_NativeTestTarget, 0 },
    { "Lsun/misc/Unsafe;",                dvm_sun_misc_Unsafe, 0 },
    { NULL, NULL, 0 },
};

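If it is not built in, the loaded .so libraries are scanned for the corresponding native function. The lookup follows the familiar JNI naming rule: com.xx.Helloworld.foobar maps to Java_com_xx_Helloworld_foobar.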

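Note that the native function found this way is not itself installed as nativeFunc. In the dvmUseJNIBridge call that follows, dvmCallJNIMethod becomes the nativeFunc. Its main job is to translate the ins arguments in the Java stack frame mentioned earlier into the parameters of a JNI function call. xposed/dexposed install their own nativeFunc and thereby take over execution of the method themselves.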
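
The JNI naming rule used for the shared-library lookup can be sketched as follows. The escape sequences (_1 for an underscore, _2 for a semicolon, _3 for an opening bracket) come from the JNI specification; the function names mangle and jniShortName are hypothetical, and the sketch deliberately skips the _0xxxx escape for other characters, so it asserts on anything that is not an ASCII letter or digit.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Escape one name component according to the JNI short-name rules.
std::string mangle(const std::string& s) {
    std::string out;
    for (char ch : s) {
        if (ch == '.' || ch == '/') {
            out += '_';            // package separator
        } else if (ch == '_') {
            out += "_1";           // literal underscore
        } else if (ch == ';') {
            out += "_2";           // appears in signatures
        } else if (ch == '[') {
            out += "_3";           // appears in signatures
        } else {
            // Sketch only: other characters would need the _0xxxx escape.
            assert(std::isalnum(static_cast<unsigned char>(ch)));
            out += ch;
        }
    }
    return out;
}

// Build the short JNI symbol name for a fully qualified class and a method.
std::string jniShortName(const std::string& cls, const std::string& method) {
    return "Java_" + mangle(cls) + "_" + mangle(method);
}
```

For overloaded native methods the JNI specification additionally defines a long name, formed by appending two underscores and the mangled argument signature; that part is left out here.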

dvmInterpret is the code entry point of the interpreter, located in interp/Interp.cpp:

void dvmInterpret(Thread* self, const Method* method, JValue* pResult)
{
    InterpSaveState interpSaveState;
    ExecutionSubModes savedSubModes;
    . . . 
    interpSaveState = self->interpSave;
    self->interpSave.prev = &interpSaveState; 
    . . . 

    self->interpSave.method = method;
    self->interpSave.curFrame = (u4*) self->interpSave.curFrame;
    self->interpSave.pc = method->insns;
    . . .
    typedef void (*Interpreter)(Thread*);
    Interpreter stdInterp;
    if (gDvm.executionMode == kExecutionModeInterpFast)
        stdInterp = dvmMterpStd;
#if defined(WITH_JIT)
    else if (gDvm.executionMode == kExecutionModeJit ||
             gDvm.executionMode == kExecutionModeNcgO0 ||
             gDvm.executionMode == kExecutionModeNcgO1)
        stdInterp = dvmMterpStd;
#endif
    else
        stdInterp = dvmInterpretPortable;

    // Call the interpreter
    (*stdInterp)(self);
    *pResult = self->interpSave.retval;

    /* Restore interpreter state from previous activation */
    self->interpSave = interpSaveState;
#if defined(WITH_JIT)
    dvmJitCalleeRestore(calleeSave);
#endif
    if (savedSubModes != kSubModeNormal) {
        dvmEnableSubMode(self, savedSubModes);
    }
}

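One very important field of Thread is interpSave, of type InterpSaveState. It holds key variables such as the current method, the pc and the current stack frame, and dvmInterpret initializes it at the start of every call.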
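
The save-and-restore pattern around interpSave is what makes dvmInterpret re-entrant. A minimal sketch, assuming a toy SaveState and ToyThread that only mimic a few fields of the real structures:

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of InterpSaveState.
struct SaveState {
    std::string method;
    int pc = 0;
    SaveState* prev = nullptr;
    int retval = 0;
};

struct ToyThread {
    SaveState interpSave;
};

// Mimics the shape of dvmInterpret: save caller state, run, restore.
int interpret(ToyThread* self, const std::string& method, int nested) {
    SaveState saved = self->interpSave;      // copy out the previous activation
    self->interpSave.prev = &saved;          // chain to it
    self->interpSave.method = method;
    self->interpSave.pc = 0;

    // "Run" the method: optionally re-enter the interpreter, then produce a value.
    int inner = nested > 0 ? interpret(self, method + "'", nested - 1) : 0;
    self->interpSave.retval = inner + 1;

    int result = self->interpSave.retval;
    assert(self->interpSave.method == method);   // the nested call restored our state
    self->interpSave = saved;                    // restore the previous activation
    return result;
}
```

Because each activation keeps its predecessor in a local variable on the native stack, nested calls unwind in the right order without any heap allocation.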

Dalvik has two interpreters: dvmInterpretPortable and dvmMterpStd. The difference is that the former is implemented in C++, while the latter is implemented in assembly.


dvmInterpretPortable is defined in vm/mterp/out/InterpC-portable.cpp:


void dvmInterpretPortable(Thread* self)
{
    . . .
    DvmDex* methodClassDex;     // curMethod->clazz->pDvmDex
    JValue retval;

    /* core state */
    const Method* curMethod;    // method we're interpreting
    const u2* pc;               // program counter
    u4* fp;                     // frame pointer
    u2 inst;                    // current instruction
    /* instruction decoding */
    u4 ref;                     // 16 or 32-bit quantity fetched directly
    u2 vsrc1, vsrc2, vdst;      // usually used for register indexes
    /* method call setup */
    const Method* methodToCall;
    bool methodCallRange;

    /* static computed goto table */
    DEFINE_GOTO_TABLE(handlerTable);
    /* copy state in */
    curMethod = self->interpSave.method;
    pc = self->interpSave.pc;
    fp = self->interpSave.curFrame;
    retval = self->interpSave.retval;   

    methodClassDex = curMethod->clazz->pDvmDex;

    . . . 
   
    FINISH(0);                  /* fetch and execute first instruction */
/*--- start of opcodes ---*/

/* File: c/OP_NOP.cpp */
HANDLE_OPCODE(OP_NOP)
    FINISH(1);
OP_END

/* File: c/OP_MOVE.cpp */
HANDLE_OPCODE(OP_MOVE /*vA, vB*/)
    vdst = INST_A(inst);
    vsrc1 = INST_B(inst);
    ILOGV("|move%s v%d,v%d %s(v%d=0x%08x)",
        (INST_INST(inst) == OP_MOVE) ? "" : "-object", vdst, vsrc1,
        kSpacing, vdst, GET_REGISTER(vsrc1));
    SET_REGISTER(vdst, GET_REGISTER(vsrc1));
    FINISH(1);
OP_END
…..
}

The interpreter dispatches instructions through a jump table. DEFINE_GOTO_TABLE(handlerTable) defines the goto table for the opcodes.


FINISH(0) starts execution at the first instruction:


# define FINISH(_offset) {                                                  \
        ADJUST_PC(_offset);                                                 \
        inst = FETCH(0);                                                    \
        if (self->interpBreak.ctl.subMode) {                                \
            dvmCheckBefore(pc, fp, self);                                   \
        }                                                                   \
        goto *handlerTable[INST_INST(inst)];                                \
    }

#define FETCH(_offset)     (pc[(_offset)])

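FETCH(0) fetches the instruction to execute next, and the lookup in handlerTable jumps to the point where that instruction is handled, namely the HANDLE_OPCODE definitions later in the function.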
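
The fetch-decode-dispatch cycle can be shown with a small table-driven interpreter. Note the swapped-in technique: the portable interpreter uses computed goto (`goto *handlerTable[...]`), which is a compiler extension, so this sketch uses a table of function pointers indexed by opcode instead. Interp, run and the handler names are hypothetical; the opcode values (0x00 nop, 0x01 move, 0x0e return-void, 0xb0 add-int/2addr) and the vA/vB bit layout follow the Dalvik bytecode format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical interpreter state: program counter and a small register file.
struct Interp {
    const uint16_t* pc;
    uint32_t regs[16];
    bool done;
};

using Handler = void (*)(Interp&, uint16_t inst);

// Decoding helpers, same bit layout as INST_INST / INST_A / INST_B.
inline unsigned instOp(uint16_t inst) { return inst & 0xff; }
inline unsigned instA(uint16_t inst)  { return (inst >> 8) & 0x0f; }
inline unsigned instB(uint16_t inst)  { return inst >> 12; }

void opNop(Interp& s, uint16_t) { s.pc += 1; }
void opMove(Interp& s, uint16_t inst) {
    s.regs[instA(inst)] = s.regs[instB(inst)];
    s.pc += 1;
}
void opAddInt2addr(Interp& s, uint16_t inst) {
    s.regs[instA(inst)] += s.regs[instB(inst)];
    s.pc += 1;
}
void opReturnVoid(Interp& s, uint16_t) { s.done = true; }
void opUnknown(Interp& s, uint16_t) {
    assert(false && "unhandled opcode");
    s.done = true;
}

// Build the 256-entry handler table once.
const Handler* handlerTable() {
    static Handler table[256];
    static const bool ready = [] {
        for (Handler& h : table) h = opUnknown;
        table[0x00] = opNop;
        table[0x01] = opMove;
        table[0x0e] = opReturnVoid;
        table[0xb0] = opAddInt2addr;
        return true;
    }();
    (void)ready;
    return table;
}

void run(Interp& s) {
    const Handler* table = handlerTable();
    while (!s.done) {
        uint16_t inst = *s.pc;            // FETCH(0)
        table[instOp(inst)](s, inst);     // dispatch on the low 8 bits
    }
}
```

With computed goto each handler jumps straight to the next one, which avoids returning to a central loop; the function-pointer version trades that for portability.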

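dvmMterpStd, by contrast, is the interpreter that has been optimized for each platform.
It is optimized at the assembly level: its entry point, dvmMterpStdRun, has a separate implementation for each target instruction set. For example, the code for armv7 with NEON is in mterp/out/InterpAsm-armv7-a-neon.S.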

dvmMterpStdRun:
#define MTERP_ENTRY1 \
    .save {r4-r10,fp,lr}; \
    stmfd   sp!, {r4-r10,fp,lr}         @ save 9 regs
#define MTERP_ENTRY2 \
    .pad    #4; \
    sub     sp, sp, #4                  @ align 64

    .fnstart
    MTERP_ENTRY1
    MTERP_ENTRY2

    /* save stack pointer, add magic word for debuggerd */
    str     sp, [r0, #offThread_bailPtr]  @ save SP for eventual return

    /* set up "named" registers, figure out entry point */
    mov     rSELF, r0                   @ set rSELF
    LOAD_PC_FP_FROM_SELF()              @ load rPC and rFP from "thread"
    ldr     rIBASE, [rSELF, #offThread_curHandlerTable] @ set rIBASE
    . . .
    /* start executing the instruction at rPC */
    FETCH_INST()                        @ load rINST from rPC
    GET_INST_OPCODE(ip)                 @ extract opcode from rINST
    GOTO_OPCODE(ip)                     @ jump to next instruction
    . . .

#define rPC     r4
#define rFP     r5
#define rSELF   r6
#define rINST   r7
#define rIBASE  r8

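Without the JIT, FETCH_INST first loads the instruction at rPC into the rINST register. GET_INST_OPCODE then extracts the opcode with and _reg, rINST, #255, which puts the low 8 bits of rINST into the ip register, and GOTO_OPCODE jumps to the corresponding address.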

#define GOTO_OPCODE(_reg) add pc, rIBASE, _reg, lsl #6

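rIBASE points to curHandlerTable, the base address of the jump table. GOTO_OPCODE(ip) sets pc to rIBASE plus the opcode shifted left by 6, that is, to the 64-byte slot in the table that holds the handler for that opcode.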
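
The address arithmetic behind GET_INST_OPCODE and GOTO_OPCODE is easy to check by hand. The helper names below are hypothetical; the mask of 255 and the shift by 6 are taken directly from the two macros.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kHandlerShift = 6;   // lsl #6: each handler slot is 64 bytes

// and _reg, rINST, #255: keep the low 8 bits of the 16-bit code unit.
inline unsigned opcodeOf(uint16_t inst) {
    return inst & 255;
}

// add pc, rIBASE, _reg, lsl #6: base of the table plus opcode * 64.
inline uintptr_t handlerAddress(uintptr_t ibase, unsigned opcode) {
    assert(opcode < 256);
    return ibase + (static_cast<uintptr_t>(opcode) << kHandlerShift);
}
```

For new-instance (opcode 0x22) the handler sits 0x22 * 64 = 0x880 bytes after the start of the table, which is why every handler in the assembly file is placed with a 64-byte alignment directive.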



static Thread* allocThread(int interpStackSize)
#ifndef DVM_NO_ASM_INTERP
    thread->mainHandlerTable = dvmAsmInstructionStart;
    thread->altHandlerTable = dvmAsmAltInstructionStart;
    thread->interpBreak.ctl.curHandlerTable = thread->mainHandlerTable;
#endif

So dvmAsmInstructionStart is the start of the jump table, defined alongside dvmMterpStdRun.
There you can find the interpreter code for every bytecode instruction.

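For example, the code for the new operator is shown below. It first loads the thread's methodClassDex (offThread_methodClassDex), which is a DvmDex pointer, and then loads the DvmDex's pResClasses to check whether the class has already been resolved. If it has not, execution jumps to LOP_NEW_INSTANCE_resolve to resolve the class; if it has, the class is initialized if necessary and the object is allocated with dvmAllocObject. LOP_NEW_INSTANCE_resolve calls dvmResolveClass, passing the current method's clazz as the referrer.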

/* ------------------------------ */
    .balign 64
.L_OP_NEW_INSTANCE: /* 0x22 */
/* File: armv5te/OP_NEW_INSTANCE.S */
    /*
     * Create a new instance of a class.
     */
    /* new-instance vAA, class@BBBB */
    ldr     r3, [rSELF, #offThread_methodClassDex]    @ r3<- pDvmDex
    FETCH(r1, 1)                        @ r1<- BBBB
    ldr     r3, [r3, #offDvmDex_pResClasses]    @ r3<- pDvmDex->pResClasses
    ldr     r0, [r3, r1, lsl #2]        @ r0<- resolved class
#if defined(WITH_JIT)
    add     r10, r3, r1, lsl #2         @ r10<- &resolved_class
#endif
    EXPORT_PC()                         @ req'd for init, resolve, alloc
    cmp     r0, #0                      @ already resolved?
    beq     .LOP_NEW_INSTANCE_resolve         @ no, resolve it now
.LOP_NEW_INSTANCE_resolved:   @ r0=class
    ldrb    r1, [r0, #offClassObject_status]    @ r1<- ClassStatus enum
    cmp     r1, #CLASS_INITIALIZED      @ has class been initialized?
    bne     .LOP_NEW_INSTANCE_needinit        @ no, init class now
.LOP_NEW_INSTANCE_initialized: @ r0=class
    mov     r1, #ALLOC_DONT_TRACK       @ flags for alloc call
    bl      dvmAllocObject              @ r0<- new object
    b       .LOP_NEW_INSTANCE_finish          @ continue


.LOP_NEW_INSTANCE_needinit:
    mov     r9, r0                      @ save r0
    bl      dvmInitClass                @ initialize class
    cmp     r0, #0                      @ check boolean result
    mov     r0, r9                      @ restore r0
    bne     .LOP_NEW_INSTANCE_initialized     @ success, continue
    b       common_exceptionThrown      @ failed, deal with init exception

    /*
     * Resolution required.  This is the least-likely path.
     *
     *  r1 holds BBBB
     */
.LOP_NEW_INSTANCE_resolve:
    ldr     r3, [rSELF, #offThread_method] @ r3<- self->method
    mov     r2, #0                      @ r2<- false
    ldr     r0, [r3, #offMethod_clazz]  @ r0<- method->clazz
    bl      dvmResolveClass             @ r0<- resolved ClassObject ptr
    cmp     r0, #0                      @ got null?
    bne     .LOP_NEW_INSTANCE_resolved        @ no, continue
    b       common_exceptionThrown      @ yes, handle exception
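
The control flow of the new-instance handler (cache lookup, resolve on miss, initialize if needed, allocate) can be restated as a short C++ sketch. All types and functions here are hypothetical simplifications: ToyDex.resolved plays the role of pResClasses, resolveClass stands in for dvmResolveClass, and setting a flag stands in for dvmInitClass.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct ToyClass {
    std::string name;
    bool initialized = false;
};

struct ToyObject {
    const ToyClass* clazz;
};

struct ToyDex {
    std::vector<std::string> classNames;             // class@BBBB -> descriptor
    std::vector<ToyClass*> resolved;                 // like pResClasses: null until resolved
    std::vector<std::unique_ptr<ToyClass>> loaded;   // owns the class objects
    int resolveCount = 0;
};

// Slow path, taken only the first time an index is used.
ToyClass* resolveClass(ToyDex& dex, unsigned idx) {
    dex.loaded.push_back(std::unique_ptr<ToyClass>(new ToyClass()));
    dex.loaded.back()->name = dex.classNames[idx];
    dex.resolved[idx] = dex.loaded.back().get();
    ++dex.resolveCount;
    return dex.resolved[idx];
}

std::unique_ptr<ToyObject> newInstance(ToyDex& dex, unsigned idx) {
    assert(idx < dex.resolved.size());
    ToyClass* clazz = dex.resolved[idx];         // already resolved?
    if (clazz == nullptr) {
        clazz = resolveClass(dex, idx);          // .LOP_NEW_INSTANCE_resolve
    }
    if (!clazz->initialized) {
        clazz->initialized = true;               // .LOP_NEW_INSTANCE_needinit
    }
    return std::unique_ptr<ToyObject>(new ToyObject{clazz});   // allocation
}
```

The second allocation of the same class skips both the resolve and the init branch, which matches the comment in the assembly that resolution is the least likely path.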


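About the author:

Tian Li (田力). Founder of the NetEase Lottery Android client and founder of Xiaomi Video. Currently technical manager at roobo and technical director for video cloud.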
