
JVM Source Code Analysis: The oop-klass Object Model

Overview

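HotSpot is implemented in C++, which is itself an object-oriented language with all the basic object-oriented features. The simplest way to represent Java objects would therefore be to generate one C++ class for each Java class.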

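The HotSpot JVM does not do this, however. Instead it uses the OOP-Klass model. OOP stands for Ordinary Object Pointer; an oop represents the instance data of an object (it looks like a pointer, but what matters is the object it points to). A Klass holds the metadata and method information that describe a Java class.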

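The model was chosen because the HotSpot designers did not want every object to carry a vtable pointer (a pointer to a virtual function table). The object model is therefore split into klass and oop: an oop contains no virtual functions, while a Klass has a virtual function table and can perform method dispatch.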
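To make the motivation concrete, here is a minimal C++ sketch of the split. All type names (KlassSketch, InstanceKlassSketch, OopSketch, VirtualOopSketch) are hypothetical and are not HotSpot's real classes. The sketch only illustrates that a header without virtual functions stays small, while dispatch can still happen through a shared klass pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for Klass: one per class, holds the virtual functions.
struct KlassSketch {
  virtual ~KlassSketch() {}
  virtual std::string name() const { return "Klass"; }
};

struct InstanceKlassSketch : KlassSketch {
  std::string name() const override { return "InstanceKlass"; }
};

// Hypothetical stand-in for oopDesc: plain data, no virtual functions,
// so the compiler adds no hidden vtable pointer to each object.
struct OopSketch {
  std::uintptr_t mark;
  const KlassSketch* klass;
};

// The same header with a virtual function; typical ABIs add a vtable pointer.
struct VirtualOopSketch {
  std::uintptr_t mark;
  const KlassSketch* klass;
  virtual ~VirtualOopSketch() {}
};

// Dispatch goes through the shared klass, not through the object itself.
std::string describe(const OopSketch& obj) {
  assert(obj.klass != nullptr);
  return obj.klass->name();
}
```

On common 64-bit ABIs sizeof(VirtualOopSketch) is larger than sizeof(OopSketch) by one pointer, which is the per-object cost the designers wanted to avoid.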

The oop-klass object model

klass

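Put simply, a Klass is the C++ counterpart of a Java class inside HotSpot; it describes the Java class.

A Klass has two main responsibilities:

  • Implementing the Java class at the language level
  • Implementing dispatch for Java objects

When is a Klass created? In general, when the JVM loads a class file it creates an instanceKlass in the method area to represent the class's metadata, including the constant pool, fields, and methods.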

oop

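A Klass is created while the class file is being loaded; an OOP is created when the running Java program creates an object with new.

An OOP object consists of the following parts:

  • Object header
    • Mark Word: stores per-object runtime information such as the hash code, GC generational age, lock state flags, the thread ID, and the epoch (timestamp) used by biased locking
    • Metadata pointer: points to the instanceKlass instance in the method area
  • Instance data. This is the actual payload, i.e. the contents of the fields. Fields are allocated in the order longs/doubles, ints, shorts/chars, bytes/booleans, oops (ordinary object pointers); fields of the same width are always placed together, which makes them easier to access later. Fields defined in a superclass come before fields defined in the subclass.
  • Alignment padding. It only serves as a placeholder and is not always present.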
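The three parts add up to the object size. The sketch below shows the arithmetic under stated assumptions: objects are aligned to 8 bytes (the usual 64-bit default), and the function names are illustrative rather than taken from HotSpot.

```cpp
#include <cassert>
#include <cstddef>

// Round size up to the next multiple of alignment (a power of two).
std::size_t align_up(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (size + alignment - 1) & ~(alignment - 1);
}

// Total object size = header + instance data + alignment padding.
// The 8-byte object alignment is an assumption of this sketch.
std::size_t object_size(std::size_t header_bytes, std::size_t field_bytes) {
  const std::size_t kObjectAlignment = 8;
  return align_up(header_bytes + field_bytes, kObjectAlignment);
}

// Number of padding bytes that had to be appended.
std::size_t padding_bytes(std::size_t header_bytes, std::size_t field_bytes) {
  return object_size(header_bytes, field_bytes) - (header_bytes + field_bytes);
}
```

Assuming a 12-byte header (8-byte Mark Word plus a 4-byte compressed metadata pointer), object_size(12, 4) is 16 with no padding, while object_size(12, 1) is also 16 but needs 3 bytes of padding.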

Example

Suppose we have the following code:

class Model
{
    public static int a = 1;
    public int b;

    public Model(int b) {
        this.b = b;
    }
}

public class Main {
    public static void main(String[] args) {
        int c = 10;
        Model modelA = new Model(2);
        Model modelB = new Model(3);
    }
}

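The OOP-Klass model for the code above is as follows: modelA and modelB each refer to a separate instanceOop on the heap holding its own value of b, and the metadata pointer in both object headers refers to the same instanceKlass for Model in the method area, so the class-level data is stored only once.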
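The relationship between the two Model objects and their shared class data can be sketched in C++ as below. ModelKlassSketch and ModelOopSketch are hypothetical stand-ins, not HotSpot types, and where exactly HotSpot keeps static fields depends on the JDK version; the sketch only shows that there is one copy per class and one b per object.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the instanceKlass of Model: exists once per class.
struct ModelKlassSketch {
  int static_a;   // models `static int a`; one copy for the whole class
};

// Hypothetical stand-in for an instanceOop of Model: one per `new Model(..)`.
struct ModelOopSketch {
  std::uintptr_t mark;       // Mark Word
  ModelKlassSketch* klass;   // metadata pointer
  int b;                     // instance field
};

// Models `new Model(b)`: every object gets its own b but shares the klass.
ModelOopSketch new_model(ModelKlassSketch* klass, int b) {
  assert(klass != nullptr);
  return ModelOopSketch{0, klass, b};
}
```

Creating two objects from the same ModelKlassSketch yields two distinct b values and two equal klass pointers.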

oop-klass in the JVM source code

oop.hpp

The oopDesc class describes the format of a Java object.

oopDesc contains two data members: _mark and _metadata.

  • _mark is the Mark Word. It stores per-object runtime information such as the hash code, GC generational age, lock state flags, the thread ID, and the epoch (timestamp) used by biased locking.
  • _metadata is the metadata pointer. It is a union: _klass is an ordinary pointer and _compressed_klass is a compressed class pointer; both point to the instanceKlass object.

// oopDesc is the top baseclass for objects classes.  The {name}Desc classes describe
// the format of Java objects so the fields can be accessed from C++.
// oopDesc is abstract.
// (see oopHierarchy for complete oop class hierarchy)
//
// no virtual functions allowed
class oopDesc {
  friend class VMStructs;
 private:
  volatile markOop  _mark;      // Mark Word
  union _metadata {             // metadata pointer
    wideKlassOop    _klass;
    narrowOop       _compressed_klass;
  } _metadata;

  // remaining members omitted
};
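The idea behind a compressed pointer sharing a slot with a full-width pointer can be sketched as follows. This is a simplified illustration, not HotSpot's implementation: CompressionSketch, encode, decode and MetadataSketch are hypothetical names, and the base and shift values are parameters chosen by the caller.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative parameters; a real VM chooses base and shift at startup.
struct CompressionSketch {
  std::uint64_t base;   // start of the addressable range
  unsigned shift;       // log2 of the object alignment (3 for 8 bytes)
};

// Encode a 64-bit address as a 32-bit offset, in units of the alignment.
std::uint32_t encode(const CompressionSketch& c, std::uint64_t addr) {
  assert(addr >= c.base);
  assert(((addr - c.base) & ((std::uint64_t(1) << c.shift) - 1)) == 0);
  std::uint64_t scaled = (addr - c.base) >> c.shift;
  assert(scaled <= UINT32_MAX);
  return static_cast<std::uint32_t>(scaled);
}

// Decode the 32-bit form back into the full address.
std::uint64_t decode(const CompressionSketch& c, std::uint32_t narrow) {
  return c.base + (static_cast<std::uint64_t>(narrow) << c.shift);
}

// Same idea as oopDesc::_metadata: one slot, read as wide or narrow.
union MetadataSketch {
  std::uint64_t wide;     // uncompressed pointer
  std::uint32_t narrow;   // compressed form
};
```

With a shift of 3, a 32-bit value can address 32 GB of 8-byte-aligned memory above the base, which is why the compressed form is enough for typical heaps.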

instanceOop.hpp

instanceOopDesc extends oopDesc; it represents an instantiated object of a Java class.

// An instanceOop is an instance of a Java Class
// Evaluating "new HashTable()" will create an instanceOop.

class instanceOopDesc : public oopDesc {
 public:
  // aligned header size.
  static int header_size() { return sizeof(instanceOopDesc)/HeapWordSize; }

  // If compressed, the offset of the fields of the instance may not be aligned.
  static int base_offset_in_bytes() {
    return UseCompressedOops ?
             klass_gap_offset_in_bytes() :
             sizeof(instanceOopDesc);
  }

  static bool contains_field_offset(int offset, int nonstatic_field_size) {
    int base_in_bytes = base_offset_in_bytes();
    return (offset >= base_in_bytes &&
            (offset-base_in_bytes) < nonstatic_field_size * heapOopSize);
  }
};
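The bounds check in contains_field_offset is easier to follow with concrete numbers. The sketch below repeats the same comparison but takes the header size and oop size as explicit parameters; that parameterization is an assumption made for illustration.

```cpp
#include <cassert>

// Mirrors the check in instanceOopDesc::contains_field_offset, with the
// header size and oop size passed in explicitly (illustrative parameters).
bool contains_field_offset(int offset, int base_in_bytes,
                           int nonstatic_field_size, int heap_oop_size) {
  assert(base_in_bytes >= 0 && nonstatic_field_size >= 0 && heap_oop_size > 0);
  return offset >= base_in_bytes &&
         (offset - base_in_bytes) < nonstatic_field_size * heap_oop_size;
}
```

With a 12-byte header, a nonstatic_field_size of 2 and a 4-byte heapOopSize, the valid field offsets are 12 through 19; offsets 11 and 20 are rejected.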

instanceKlass.hpp

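An instanceKlass is the VM-level representation of a Java class.

Its ClassState enum describes the state of the class during loading: allocated, loaded, linked, and initialized.

The layout of an instanceKlass includes the declared interfaces, fields, methods, constant pool, source file name, and so on.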

// An instanceKlass is the VM level representation of a Java class.
// It contains all information needed for at class at execution runtime.

class instanceKlass: public Klass {
  friend class VMStructs;
 public:

  enum ClassState {
    unparsable_by_gc = 0,               // object is not yet parsable by gc. Value of _init_state at object allocation.
    allocated,                          // allocated (but not yet linked)
    loaded,                             // loaded and inserted in class hierarchy (but not linked yet)
    linked,                             // successfully linked/verified (but not initialized yet)
    being_initialized,                  // currently running class initializer
    fully_initialized,                  // initialized (successfull final state)
    initialization_error                // error happened during initialization
  };

  // some content omitted
 protected:
  // Method array.
  objArrayOop     _methods;
  // Interface (klassOops) this class declares locally to implement.
  objArrayOop     _local_interfaces;
  // Instance and static variable information
  typeArrayOop    _fields;
  // Constant pool for this class.
  constantPoolOop _constants;
  // Class loader used to load this class, NULL if VM loader used.
  oop             _class_loader;
  // Inner classes
  typeArrayOop    _inner_classes;
  // Source file name
  Symbol*         _source_file_name;

  // some content omitted
};

markOop.hpp

markOop describes the format of a Java object header.

// The markOop describes the header of an object.
//
// Note that the mark is not a real oop but just a word.
// It is placed in the oop hierarchy for historical reasons.
//
// Bit-format of an object header (most significant first, big endian layout below):
//
//  32 bits:
//  --------
//             hash:25 ------------>| age:4    biased_lock:1 lock:2 (normal object)
//             JavaThread*:23 epoch:2 age:4    biased_lock:1 lock:2 (biased object)
//             size:32 ------------------------------------------>| (CMS free block)
//             PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
//  64 bits:
//  --------
//  unused:25 hash:31 -->| unused:1   age:4    biased_lock:1 lock:2 (normal object)
//  JavaThread*:54 epoch:2 unused:1   age:4    biased_lock:1 lock:2 (biased object)
//  PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
//  size:64 ----------------------------------------------------->| (CMS free block)
//
//  unused:25 hash:31 -->| cms_free:1 age:4    biased_lock:1 lock:2 (COOPs && normal object)
//  JavaThread*:54 epoch:2 cms_free:1 age:4    biased_lock:1 lock:2 (COOPs && biased object)
//  narrowOop:32 unused:24 cms_free:1 unused:4 promo_bits:3 ----->| (COOPs && CMS promoted object)
//  unused:21 size:35 -->| cms_free:1 unused:7 ------------------>| (COOPs && CMS free block)

class markOopDesc: public oopDesc {
 private:
  // Conversion
  uintptr_t value() const { return (uintptr_t) this; }

 public:
  // Constants
  enum { age_bits                 = 4,
         lock_bits                = 2,
         biased_lock_bits         = 1,
         max_hash_bits            = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits                = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits                 = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits               = 2
  };

  // The biased locking code currently requires that the age bits be
  // contiguous to the lock bits.
  enum { lock_shift               = 0,
         biased_lock_shift        = lock_bits,
         age_shift                = lock_bits + biased_lock_bits,
         cms_shift                = age_shift + age_bits,
         hash_shift               = cms_shift + cms_bits,
         epoch_shift              = hash_shift
  };
  // some content omitted
};
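The shift constants above determine how each field is extracted from the Mark Word. The following sketch decodes the 64-bit "normal object" layout (hash:31, one cms/unused bit, age:4, biased_lock:1, lock:2). The helper names (field, lock_of, age_of, hash_of, make_mark) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bit widths and shifts for the 64-bit "normal object" layout quoted above.
const unsigned kLockBits = 2, kBiasedLockBits = 1, kAgeBits = 4;
const unsigned kCmsBits = 1, kHashBits = 31;
const unsigned kLockShift = 0;
const unsigned kBiasedLockShift = kLockShift + kLockBits;       // 2
const unsigned kAgeShift = kBiasedLockShift + kBiasedLockBits;  // 3
const unsigned kHashShift = kAgeShift + kAgeBits + kCmsBits;    // 8

// Extract `bits` bits starting at bit `shift`.
std::uint64_t field(std::uint64_t mark, unsigned shift, unsigned bits) {
  assert(bits > 0 && bits < 64 && shift + bits <= 64);
  return (mark >> shift) & ((std::uint64_t(1) << bits) - 1);
}

std::uint64_t lock_of(std::uint64_t mark) {
  return field(mark, kLockShift, kLockBits);
}
bool biased_of(std::uint64_t mark) {
  return field(mark, kBiasedLockShift, kBiasedLockBits) != 0;
}
unsigned age_of(std::uint64_t mark) {
  return static_cast<unsigned>(field(mark, kAgeShift, kAgeBits));
}
std::uint64_t hash_of(std::uint64_t mark) {
  return field(mark, kHashShift, kHashBits);
}

// Build a mark word from its parts (helper for experimenting).
std::uint64_t make_mark(std::uint64_t hash, unsigned age, bool biased,
                        unsigned lock) {
  assert(hash < (std::uint64_t(1) << kHashBits) && age < 16 && lock < 4);
  return (hash << kHashShift) | (std::uint64_t(age) << kAgeShift) |
         (std::uint64_t(biased) << kBiasedLockShift) |
         (std::uint64_t(lock) << kLockShift);
}
```

Because the age field is 4 bits wide, the largest age an object can record is 15.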

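Creation of an instanceOopDesc object

The allocate_instance method

An instanceOopDesc object is created through instanceKlass::allocate_instance, which works as follows:
1. has_finalizer checks whether the class has a non-empty finalize method;
2. size_helper determines how much memory the new object needs;
3. CollectedHeap::obj_allocate requests that amount of memory from the heap and creates the instanceOopDesc object.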

instanceKlass.cpp

instanceOop instanceKlass::allocate_instance(TRAPS) {
  assert(!oop_is_instanceMirror(), "wrong allocation path");
  bool has_finalizer_flag = has_finalizer(); // Query before possible GC
  int size = size_helper();  // Query before forming handle.

  KlassHandle h_k(THREAD, as_klassOop());

  instanceOop i;

  i = (instanceOop)CollectedHeap::obj_allocate(h_k, size, CHECK_NULL);
  if (has_finalizer_flag && !RegisterFinalizersAtInit) {
    i = register_finalizer(i, CHECK_NULL);
  }
  return i;
}

The obj_allocate method

CollectedHeap::obj_allocate requests memory of the given size from the heap and creates the instanceOopDesc object. It is implemented as follows:

CollectedHeap.inline.hpp

oop CollectedHeap::obj_allocate(KlassHandle klass, int size, TRAPS) {
  debug_only(check_for_valid_allocation_state());
  assert(!Universe::heap()->is_gc_active(), "Allocation during gc not allowed");
  assert(size >= 0, "int won't convert to size_t");
  HeapWord* obj = common_mem_allocate_init(klass, size, CHECK_NULL);
  post_allocation_setup_obj(klass, obj);
  NOT_PRODUCT(Universe::heap()->check_for_bad_heap_word_value(obj, size));
  return (oop)obj;
}

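The common_mem_allocate_noinit method

obj_allocate calls common_mem_allocate_init, which obtains memory through common_mem_allocate_noinit and then zero-initializes it. common_mem_allocate_noinit works as follows:

1. If TLAB optimization is enabled, try to allocate from the TLAB and return the result on success (TLAB stands for ThreadLocalAllocBuffer, a block of memory private to a thread);

2. If the first step does not produce a result, call Universe::heap()->mem_allocate to allocate on the heap and return the result.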

HeapWord* CollectedHeap::common_mem_allocate_noinit(KlassHandle klass, size_t size, TRAPS) {

  // Clear unhandled oops for memory allocation.  Memory allocation might
  // not take out a lock if from tlab, so clear here.
  CHECK_UNHANDLED_OOPS_ONLY(THREAD->clear_unhandled_oops();)

  if (HAS_PENDING_EXCEPTION) {
    NOT_PRODUCT(guarantee(false, "Should not allocate with exception pending"));
    return NULL;  // caller does a CHECK_0 too
  }

  HeapWord* result = NULL;
  if (UseTLAB) {  // TLAB optimization is enabled
    result = allocate_from_tlab(klass, THREAD, size);
    if (result != NULL) {
      assert(!HAS_PENDING_EXCEPTION,
             "Unexpected exception, will result in uninitialized storage");
      return result;
    }
  }
  bool gc_overhead_limit_was_exceeded = false;
  result = Universe::heap()->mem_allocate(size,
                                          &gc_overhead_limit_was_exceeded);
  if (result != NULL) {
    NOT_PRODUCT(Universe::heap()->
      check_for_non_bad_heap_word_value(result, size));
    assert(!HAS_PENDING_EXCEPTION,
           "Unexpected exception, will result in uninitialized storage");
    THREAD->incr_allocated_bytes(size * HeapWordSize);

    AllocTracer::send_allocation_outside_tlab_event(klass, size * HeapWordSize);

    return result;
  }
  // remainder omitted: reached only when both allocation attempts fail
}
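The two-step policy (thread-local buffer first, shared heap second) can be sketched as below. TlabSketch, SharedHeapSketch and allocate_words are hypothetical names, and the model tracks only word counts rather than real memory, so it illustrates the control flow and nothing more.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical thread-local buffer: allocation is a pointer bump, no locking.
struct TlabSketch {
  std::size_t top;   // index of the next free word
  std::size_t end;   // one past the last word
  bool allocate(std::size_t words) {
    if (end - top < words) return false;
    top += words;
    return true;
  }
};

// Hypothetical shared heap, tracked only by how many words are in use.
struct SharedHeapSketch {
  std::size_t used;
  std::size_t capacity;
  bool allocate(std::size_t words) {
    if (capacity - used < words) return false;
    used += words;
    return true;
  }
};

enum class Source { Tlab, SharedHeap, Failed };

// Try the TLAB first when enabled, then fall back to the shared heap.
Source allocate_words(TlabSketch& tlab, SharedHeapSketch& heap,
                      bool use_tlab, std::size_t words) {
  assert(words > 0);
  if (use_tlab && tlab.allocate(words)) return Source::Tlab;
  if (heap.allocate(words)) return Source::SharedHeap;
  return Source::Failed;
}
```

The fast path touches only thread-private state, which is why a TLAB allocation does not need a lock.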

The mem_allocate method

Assuming the G1 garbage collector is in use, the method is implemented as follows:

g1CollectedHeap.cpp

HeapWord*
G1CollectedHeap::mem_allocate(size_t word_size,
                              bool*  gc_overhead_limit_was_exceeded) {
  assert_heap_not_locked_and_not_at_safepoint();

  // Loop until the allocation is satisfied, or unsatisfied after GC.
  for (int try_count = 1; /* we'll return */; try_count += 1) {
    unsigned int gc_count_before;

    HeapWord* result = NULL;
    if (!isHumongous(word_size)) {
      result = attempt_allocation(word_size, &gc_count_before);
    } else {
      result = attempt_allocation_humongous(word_size, &gc_count_before);
    }
    if (result != NULL) {
      return result;
    }

    // Create the garbage collection operation...
    VM_G1CollectForAllocation op(gc_count_before, word_size);
    // ...and get the VM thread to execute it.
    VMThread::execute(&op);

    if (op.prologue_succeeded() && op.pause_succeeded()) {
      // If the operation was successful we'll return the result even
      // if it is NULL. If the allocation attempt failed immediately
      // after a Full GC, it's unlikely we'll be able to allocate now.
      HeapWord* result = op.result();
      if (result != NULL && !isHumongous(word_size)) {
        // Allocations that take place on VM operations do not do any
        // card dirtying and we have to do it here. We only have to do
        // this for non-humongous allocations, though.
        dirty_young_block(result, word_size);
      }
      return result;
    } else {
      assert(op.result() == NULL,
             "the result should be NULL if the VM op did not succeed");
    }

    // Give a warning if we seem to be looping forever.
    if ((QueuedAllocationWarningCount > 0) &&
        (try_count % QueuedAllocationWarningCount == 0)) {
      warning("G1CollectedHeap::mem_allocate retries %d times", try_count);
    }
  }

  ShouldNotReachHere();
  return NULL;
}
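The overall shape of mem_allocate (try to allocate, and on failure run a collection and try again) can be sketched with a toy heap. HeapSketch and allocate_with_gc are hypothetical, and the explicit retry bound is an assumption added so that the sketch always terminates; the real G1 code decides differently when to stop.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical heap: `live` words survive a collection, `garbage` words do not.
struct HeapSketch {
  std::size_t capacity;
  std::size_t live;
  std::size_t garbage;
  int collections;

  bool try_allocate(std::size_t words) {
    if (capacity - live - garbage < words) return false;
    live += words;
    return true;
  }
  void collect() {
    garbage = 0;
    ++collections;
  }
};

// Try to allocate; on failure run a collection and retry. At most max_tries
// collections are run, followed by one final allocation attempt.
bool allocate_with_gc(HeapSketch& heap, std::size_t words, int max_tries) {
  assert(max_tries > 0);
  for (int try_count = 1; try_count <= max_tries; ++try_count) {
    if (heap.try_allocate(words)) return true;
    heap.collect();
  }
  return heap.try_allocate(words);
}
```

A request that fits after reclaiming garbage succeeds after one collection; a request larger than the heap fails once the retry bound is reached.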

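Layout of member variables in an object

Layout strategy

Fields are allocated in the order longs/doubles, ints, shorts/chars, bytes/booleans, oops (ordinary object pointers); fields of the same width are always placed together, which makes them easier to access later. Fields defined in a superclass come before fields defined in the subclass.

In fact, there are three allocation strategies:

First: Fields order: oops, longs/doubles, ints, shorts/chars, bytes

Second: Fields order: longs/doubles, ints, shorts/chars, bytes, oops

Third: Fields allocation: oops fields in super and sub classes are together.

The second strategy is the one normally used.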

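The JVM source for this is in classFileParser.cpp.

The parseClassFile method

This function parses the class file according to the JVM specification. It parses the following parts in order:

1. Meta-information of the class file: its magic number and its minor/major version numbers.

2. The constant pool.

3. The access flags and attributes of the class (whether it is a class or an interface, the index of the current class, the index of the superclass).

4. The description of the interfaces.

5. The description of the fields.

6. The description of the methods.

7. The description of the attributes.

In HotSpot, the layout of a class's member variables within an object is computed once per class, while the class file is being parsed.

The step is implemented as follows (taking a class with no superclass and no static fields as the example):

1. Check whether a superclass exists; if it does, get the size of its non-static fields;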

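    // Field size and offset computation
    // If there is no superclass, the inherited non-static field size is 0;
    // otherwise it is the superclass's non-static field size.
    int nonstatic_field_size = super_klass() == NULL ? 0 : super_klass->nonstatic_field_size();

2. Compute the offset of the first non-static field within the object;

instanceOopDesc::base_offset_in_bytes() in effect returns the size of the Java object header.

If there is no superclass, nonstatic_field_size is 0 and the offset of the first non-static field equals the size of the Java object header.

heapOopSize is the size of an oop. It depends on whether UseCompressedOops is enabled (it is by default): 4 bytes when enabled, 8 bytes otherwise.

Because nonstatic_field_size is measured in units of heapOopSize, it has to be multiplied by heapOopSize to obtain a byte offset.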

first_nonstatic_field_offset = instanceOopDesc::base_offset_in_bytes() +
                                   nonstatic_field_size * heapOopSize;
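A worked version of this formula, with the three inputs passed as plain parameters (an assumption of the sketch; in HotSpot they come from the VM's state):

```cpp
#include <cassert>

// Offset of the first field declared by this class: header size plus the
// space already taken by inherited fields (counted in heapOopSize units).
int first_nonstatic_field_offset(int base_offset_in_bytes,
                                 int super_nonstatic_field_size,
                                 int heap_oop_size) {
  assert(heap_oop_size == 4 || heap_oop_size == 8);
  return base_offset_in_bytes + super_nonstatic_field_size * heap_oop_size;
}
```

With a 12-byte header and no superclass fields the first field sits at offset 12; if the superclass already occupies 2 units of 4 bytes, it moves to offset 20.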

3. Get the number of fields of each type and initialize the next pointer to first;

The variable next_nonstatic_field_offset acts like a pointer.

first_nonstatic_field_offset = instanceOopDesc::base_offset_in_bytes() +
                                   nonstatic_field_size * heapOopSize;
    next_nonstatic_field_offset = first_nonstatic_field_offset; // initialize the next pointer to first

    unsigned int nonstatic_double_count = fac.count[NONSTATIC_DOUBLE]; // double and long fields
    unsigned int nonstatic_word_count   = fac.count[NONSTATIC_WORD];   // int and float fields
    unsigned int nonstatic_short_count  = fac.count[NONSTATIC_SHORT];  // short and char fields
    unsigned int nonstatic_byte_count   = fac.count[NONSTATIC_BYTE];   // byte and boolean fields
    unsigned int nonstatic_oop_count    = fac.count[NONSTATIC_OOP];    // oop fields

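4. Compute the offset of the first field type according to the allocation strategy;

With the first strategy, compute the offset of the oop fields and then that of the double fields;

With the second strategy, compute the offset of the double fields first;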

 if( allocation_style == 0 ) {  
      // Fields order: oops, longs/doubles, ints, shorts/chars, bytes
      next_nonstatic_oop_offset    = next_nonstatic_field_offset;
      next_nonstatic_double_offset = next_nonstatic_oop_offset +
                                      (nonstatic_oop_count * heapOopSize);
    } else if( allocation_style == 1 ) {
      // Fields order: longs/doubles, ints, shorts/chars, bytes, oops
      next_nonstatic_double_offset = next_nonstatic_field_offset;
    } else if( allocation_style == 2 ) {
      // the third allocation strategy is not discussed here
}
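The effect of the branch on the starting offsets can be sketched as a small function. StartOffsets and start_offsets are hypothetical names; only styles 0 and 1 are modelled, and -1 is used here to mean "not yet determined at this step".

```cpp
#include <cassert>

struct StartOffsets {
  int oop_offset;      // -1 when not yet determined at this step
  int double_offset;
};

StartOffsets start_offsets(int allocation_style, int first_field_offset,
                           int oop_count, int heap_oop_size) {
  assert(allocation_style == 0 || allocation_style == 1);
  if (allocation_style == 0) {
    // Style 0: oops come first, longs/doubles start right after them.
    return StartOffsets{first_field_offset,
                        first_field_offset + oop_count * heap_oop_size};
  }
  // Style 1: longs/doubles come first; oops are placed after the bytes later.
  return StartOffsets{-1, first_field_offset};
}
```

With a first field offset of 16, two oop fields and an 8-byte heapOopSize, style 0 starts the doubles at 32, while style 1 starts them at 16.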

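5. Compute the offset of each field type within the object;

Following the field order double >> word >> short >> byte:

word offset = double offset + (number of double fields * size of one double field in bytes)

short offset = word offset + (number of word fields * size of one word field in bytes)

byte offset = short offset + (number of short fields * size of one short field in bytes)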

  next_nonstatic_word_offset  = next_nonstatic_double_offset +
                                  (nonstatic_double_count * BytesPerLong);
    next_nonstatic_short_offset = next_nonstatic_word_offset +
                                  (nonstatic_word_count * BytesPerInt);
    next_nonstatic_byte_offset  = next_nonstatic_short_offset +
                                  (nonstatic_short_count * BytesPerShort);
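Putting the three formulas together, the sketch below computes every group's starting offset from the field counts. FieldCounts, FieldOffsets and layout are hypothetical names, and the sketch deliberately ignores the alignment and gap-filling adjustments that the real parser also performs.

```cpp
#include <cassert>

struct FieldCounts { int doubles, words, shorts, bytes; };

struct FieldOffsets {
  int double_offset, word_offset, short_offset, byte_offset, end_offset;
};

const int kBytesPerLong = 8, kBytesPerInt = 4, kBytesPerShort = 2;

// Each group starts where the previous, wider group ends.
FieldOffsets layout(int double_offset, const FieldCounts& n) {
  assert(n.doubles >= 0 && n.words >= 0 && n.shorts >= 0 && n.bytes >= 0);
  FieldOffsets o;
  o.double_offset = double_offset;
  o.word_offset   = o.double_offset + n.doubles * kBytesPerLong;
  o.short_offset  = o.word_offset + n.words * kBytesPerInt;
  o.byte_offset   = o.short_offset + n.shorts * kBytesPerShort;
  o.end_offset    = o.byte_offset + n.bytes;   // one byte per byte field
  return o;
}
```

For one long, two ints, one short and three bytes starting at offset 16, the groups start at 16, 24, 32 and 34, and the byte fields end at offset 37.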