
Hadoop 2.6.0 LightWeightGSet source code analysis


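LightWeightGSet stores its elements in an array and resolves hash collisions with linked lists. It never rehashes, so the internal array is never resized. The class does not support null elements.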

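The class is also not thread safe. It has two type parameters: the first is the key type used to look up elements; the second is the element type, which must be a subclass of the first and must implement the LinkedElement interface.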

/**
 * A low memory footprint {@link GSet} implementation,
 * which uses an array for storing the elements
 * and linked lists for collision resolution.
 *
 * No rehash will be performed.
 * Therefore, the internal array will never be resized.
 *
 * This class does not support null element.
 *
 * This class is not thread safe.
 *
 * @param <K> Key type for looking up the elements
 * @param <E> Element type, which must be
 *       (1) a subclass of K, and
 *       (2) implementing {@link LinkedElement} interface.
 */
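
To make the description above concrete, here is a minimal sketch of the same idea: a fixed-size array of buckets where each element carries its own next pointer. It is not the Hadoop implementation. The names FixedChainedSet, Linked and Block are hypothetical, and put() simply ignores duplicates, which is a simplification made for this example.

```java
import java.util.Objects;

/** Hypothetical, simplified stand-in for a LightWeightGSet-style set (not Hadoop code). */
class FixedChainedSet {
    /** Elements carry their own "next" pointer, so no wrapper node is allocated. */
    interface Linked {
        void setNext(Linked next);
        Linked getNext();
    }

    /** Illustrative element type; the name and field are assumptions. */
    static final class Block implements Linked {
        final long id;
        private Linked next;
        Block(long id) { this.id = id; }
        @Override public void setNext(Linked next) { this.next = next; }
        @Override public Linked getNext() { return next; }
        @Override public boolean equals(Object o) {
            return o instanceof Block && ((Block) o).id == id;
        }
        @Override public int hashCode() { return Long.hashCode(id); }
    }

    private final Linked[] entries; // allocated once, never resized or rehashed
    private final int hashMask;     // valid because the capacity is a power of two
    private int size = 0;

    FixedChainedSet(int capacityPowerOfTwo) {
        entries = new Linked[capacityPowerOfTwo];
        hashMask = capacityPowerOfTwo - 1;
    }

    private int indexOf(Object key) {
        return key.hashCode() & hashMask;
    }

    /** Walk the bucket's chain until an equal element is found. */
    Block get(Object key) {
        for (Linked e = entries[indexOf(key)]; e != null; e = e.getNext()) {
            if (e.equals(key)) return (Block) e;
        }
        return null;
    }

    /** Insert at the head of the chain; null is rejected, duplicates are ignored here. */
    void put(Block element) {
        Objects.requireNonNull(element, "null elements are not supported");
        if (get(element) != null) return;
        int i = indexOf(element);
        element.setNext(entries[i]);
        entries[i] = element;
        size++;
    }

    int size() { return size; }

    public static void main(String[] args) {
        FixedChainedSet set = new FixedChainedSet(4);
        set.put(new Block(1));
        set.put(new Block(5)); // collides with 1 in a 4-slot table and is chained
        set.put(new Block(2));
        System.out.println(set.size()); // 3
    }
}
```

Because the array never grows, chains get longer as more elements are added, so the capacity chosen at construction time determines lookup cost for the lifetime of the set.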


The components are all straightforward; the only part that takes some effort to follow is the iterator:

public class SetIterator implements Iterator<E> {
    /** The starting modification for fail-fast. */
    private int iterModification = modification;
    /** The current index of the entry array. */
    private int index = -1;
    private LinkedElement cur = null;
    // next always points at the upcoming element; it is set to the first
    // element at construction time. After next() is called, next is reset
    // to null and stays null until ensureNext() is called.
    private LinkedElement next = nextNonemptyEntry();
    private boolean trackModification = true;

    /** Find the next nonempty entry starting at (index + 1). */
    private LinkedElement nextNonemptyEntry() {
      for(index++; index < entries.length && entries[index] == null; index++);
      return index < entries.length? entries[index]: null;
    }

    private void ensureNext() {
      if (trackModification && modification != iterModification) {
        throw new ConcurrentModificationException("modification=" + modification
            + " != iterModification = " + iterModification);
      }
      if (next != null) {
        return;
      }
      if (cur == null) {
        return;
      }
      next = cur.getNext();
      if (next == null) {
        next = nextNonemptyEntry();
      }
    }

    @Override
    public boolean hasNext() {
      ensureNext();
      return next != null;
    }

    @Override
    public E next() {
      ensureNext();
      if (next == null) {
        throw new IllegalStateException("There are no more elements");
      }
      cur = next;
      next = null;
      return convert(cur);
    }

    @SuppressWarnings("unchecked")
    @Override
    public void remove() {
      ensureNext();
      if (cur == null) {
        throw new IllegalStateException("There is no current element "
            + "to remove");
      }
      LightWeightGSet.this.remove((K)cur);
      iterModification++;
      cur = null;
    }

    public void setTrackModification(boolean trackModification) {
      this.trackModification = trackModification;
    }
}
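
The lazy handling of next is easier to see in a stripped-down version. The sketch below is not Hadoop code: LazyNextIteratorSketch, Node and unlink() are hypothetical names, modification tracking is left out, and unlink() is assumed to clear the removed node's next pointer. Under that assumption, remove() has to call ensureNext() before unlinking, otherwise the successor of cur could no longer be found.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical miniature of the cur/next/ensureNext pattern (not Hadoop code). */
class LazyNextIteratorSketch {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    final Node[] entries;

    LazyNextIteratorSketch(int capacity) { entries = new Node[capacity]; }

    /** Push onto the head of the bucket chosen by value mod capacity. */
    void add(int value) {
        int i = Math.floorMod(value, entries.length);
        Node n = new Node(value);
        n.next = entries[i];
        entries[i] = n;
    }

    /** Unlink a node from its chain and clear its next pointer (an assumption of this sketch). */
    void unlink(Node target) {
        int i = Math.floorMod(target.value, entries.length);
        if (entries[i] == target) {
            entries[i] = target.next;
        } else {
            Node prev = entries[i];
            while (prev != null && prev.next != target) prev = prev.next;
            if (prev == null) return; // not in the table
            prev.next = target.next;
        }
        target.next = null;
    }

    Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int index = -1;
            private Node cur = null;
            private Node next = nextNonemptyEntry(); // first element, resolved up front

            private Node nextNonemptyEntry() {
                for (index++; index < entries.length && entries[index] == null; index++) { }
                return index < entries.length ? entries[index] : null;
            }

            /** Resolve next from cur only when it is not already known. */
            private void ensureNext() {
                if (next != null || cur == null) return;
                next = cur.next;
                if (next == null) next = nextNonemptyEntry();
            }

            @Override public boolean hasNext() { ensureNext(); return next != null; }

            @Override public Integer next() {
                ensureNext();
                if (next == null) throw new NoSuchElementException();
                cur = next;
                next = null; // unknown again until ensureNext() runs
                return cur.value;
            }

            @Override public void remove() {
                ensureNext(); // must run while cur is still linked
                if (cur == null) throw new IllegalStateException("no current element");
                unlink(cur);
                cur = null;
            }
        };
    }

    public static void main(String[] args) {
        LazyNextIteratorSketch s = new LazyNextIteratorSketch(3);
        for (int v = 0; v < 6; v++) s.add(v);
        for (Iterator<Integer> it = s.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) it.remove(); // drop even values while iterating
        }
        for (Iterator<Integer> it = s.iterator(); it.hasNext();) {
            System.out.println(it.next()); // prints 3, 1, 5 on separate lines
        }
    }
}
```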



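computeCapacity() is a utility method that works out how many objects a container can hold when it is given a certain percentage of the maximum memory.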

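It takes two parameters: the first is the percentage of the maximum memory to use; the second is a name, which has no effect on the result and is only used in log output.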



/**
   * Let t = percentage of max memory.
   * Let e = round(log_2 t).
   * Then, we choose capacity = 2^e/(size of reference),
   * unless it is outside the close interval [1, 2^30].
   */
  public static int computeCapacity(double percentage, String mapName) {
    return computeCapacity(Runtime.getRuntime().maxMemory(), percentage,
        mapName);
  }
  
  @VisibleForTesting
  static int computeCapacity(long maxMemory, double percentage,
      String mapName) {
    if (percentage > 100.0 || percentage < 0.0) {
      throw new HadoopIllegalArgumentException("Percentage " + percentage
          + " must be greater than or equal to 0 "
          + " and less than or equal to 100");
    }
    if (maxMemory < 0) {
      throw new HadoopIllegalArgumentException("Memory " + maxMemory
          + " must be greater than or equal to 0");
    }
    if (percentage == 0.0 || maxMemory == 0) {
      return 0;
    }
    //VM detection
    //See http://java.sun.com/docs/hotspot/HotSpotFAQ.html#64bit_detection
    final String vmBit = System.getProperty("sun.arch.data.model");

    //Percentage of max memory
    final double percentDivisor = 100.0/percentage;
    final double percentMemory = maxMemory/percentDivisor;
    
    //compute capacity
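    /*
     * In detail: e1 is the base-2 logarithm of percentMemory, rounded to the
     * nearest integer. For example, if percentMemory is 1024, the result is 10.
     * The Math class has no base-2 logarithm, so it is computed indirectly:
     * take the natural logarithm and divide by the natural logarithm of 2.
     * For example, Math.log(1024)/Math.log(2) gives 10.
     * Adding 0.5 before the cast rounds to the nearest integer.
     * Suppose the memory to use is 1 GB; then e1 is 30.
     * For e2: on a 32-bit VM, subtract 2, because a reference takes
     * 2^2 = 4 bytes. e2 is then 28 and c is 2^28, i.e. 256M references of
     * 4 bytes each, 1 GB in total.
     * On a 64-bit VM, e2 is 27, i.e. 128M references of 8 bytes each,
     * again 1 GB in total.
     */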
    final int e1 = (int)(Math.log(percentMemory)/Math.log(2.0) + 0.5);
    final int e2 = e1 - ("32".equals(vmBit)? 2: 3);
    final int exponent = e2 < 0? 0: e2 > 30? 30: e2;
    final int c = 1 << exponent;

    LOG.info("Computing capacity for map " + mapName);
    LOG.info("VM type = " + vmBit + "-bit");
    LOG.info(percentage + "% max memory "
        + StringUtils.TraditionalBinaryPrefix.long2String(maxMemory, "B", 1)
        + " = "
        + StringUtils.TraditionalBinaryPrefix.long2String((long) percentMemory,
            "B", 1));
    LOG.info("capacity = 2^" + exponent + " = " + c + " entries");
    return c;
  }
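
The arithmetic can be tried out without Hadoop on the classpath. The sketch below repeats only the exponent calculation; CapacitySketch and the referenceBytesLog2 parameter are hypothetical names introduced here (the original reads the sun.arch.data.model property instead), and argument validation and logging are left out.

```java
/** Hypothetical re-derivation of the capacity arithmetic (no Hadoop or logging dependencies). */
class CapacitySketch {
    /**
     * @param maxMemory          bytes available, e.g. Runtime.getRuntime().maxMemory()
     * @param percentage         share of maxMemory to use, between 0 and 100
     * @param referenceBytesLog2 2 for 4-byte references, 3 for 8-byte references
     */
    static int capacity(long maxMemory, double percentage, int referenceBytesLog2) {
        if (percentage == 0.0 || maxMemory == 0) {
            return 0;
        }
        double percentMemory = maxMemory / (100.0 / percentage);
        // log2(x) = ln(x) / ln(2); adding 0.5 before the cast rounds to nearest
        int e1 = (int) (Math.log(percentMemory) / Math.log(2.0) + 0.5);
        int e2 = e1 - referenceBytesLog2;
        // clamp the exponent to [0, 30] so the result stays in [1, 2^30]
        int exponent = e2 < 0 ? 0 : Math.min(e2, 30);
        return 1 << exponent;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        System.out.println(capacity(oneGiB, 100.0, 2)); // 268435456 = 2^28
        System.out.println(capacity(oneGiB, 100.0, 3)); // 134217728 = 2^27
        System.out.println(capacity(oneGiB, 2.0, 3));   // 2097152   = 2^21
    }
}
```

The last call shows the effect of rounding: 2% of 1 GiB is about 2^24.36 bytes, which rounds down to 2^24, so with 8-byte references the capacity becomes 2^21 entries.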

