Java Container Classes (2): List Source Code Analysis

Definition

The first two paragraphs of the official Java API documentation for List read as follows:

An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
Unlike sets, lists typically allow duplicate elements. More formally, lists typically allow pairs of elements e1 and e2 such that e1.equals(e2), and they typically allow multiple null elements if they allow null elements at all. It is not inconceivable that someone might wish to implement a list that prohibits duplicates, by throwing runtime exceptions when the user attempts to insert them, but we expect this usage to be rare.

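Roughly speaking: a List is an ordered Collection (also called a sequence). The interface gives precise control over where each element is inserted, and elements can be accessed by integer index (their position in the list) or found by searching the list.
Unlike a Set, a List typically allows duplicate elements. More formally, it typically allows pairs of elements e1 and e2 such that e1.equals(e2), and it typically allows multiple null elements if it allows null at all. Someone could implement a List that rejects duplicates by throwing a runtime exception on insertion, but the documentation expects such usage to be rare.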
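The properties described above can be seen in a few lines. This is a minimal sketch that uses ArrayList as the List implementation; the class name ListBasicsDemo and the helper build() are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: shows duplicates, nulls, and positional access.
class ListBasicsDemo {
    // Builds a small list that contains duplicate and null elements.
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("a");     // a duplicate of the element at index 0 is accepted
        list.add(null);    // ArrayList permits null
        list.add(null);    // and it permits more than one null
        list.add(1, "x");  // positional insert: "b" and later elements shift right
        return list;
    }

    public static void main(String[] args) {
        List<String> list = build();
        System.out.println(list);                   // [a, x, b, a, null, null]
        System.out.println(list.get(2));            // b
        System.out.println(list.indexOf("a"));      // 0
        System.out.println(list.lastIndexOf("a"));  // 3
    }
}
```

Other List implementations may be stricter; for example, an implementation is free to reject null, which is why the documentation says "typically".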

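New methods: listIterator

Next, look at the methods that List adds on top of Collection. Methods such as add and get take an index parameter, which shows that List extends Collection into an ordered container whose elements can be addressed by position. Note in particular the two new iterator methods: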

//Returns a list iterator over the elements in this list (in proper sequence).
ListIterator<E> listIterator();
//Returns a list iterator over the elements in this list (in proper sequence),
//starting at the specified position in the list.
ListIterator<E> listIterator(int index);

Here is the source of ListIterator, with the less important comments removed:

/**
 * An iterator for lists that allows the programmer
 * to traverse the list in either direction, modify
 * the list during iteration, and obtain the iterator's
 * current position in the list.
 * In short: a list iterator can move in either direction, modify
 * the list while iterating, and report its current position.
 */
public interface ListIterator<E> extends Iterator<E> {
    /**
     * Returns {@code true} if this list iterator has more elements when
     * traversing the list in the forward direction. (In other words,
     * returns {@code true} if {@link #next} would return an element rather
     * than throwing an exception.)
     * In short: true if a forward call to next() would return an element.
     */
    boolean hasNext();

    /**
     * Returns the next element in the list and advances the cursor position.
     * This method may be called repeatedly to iterate through the list,
     * or intermixed with calls to {@link #previous} to go back and forth.
     * (Note that alternating calls to {@code next} and {@code previous}
     * will return the same element repeatedly.)
     * In short: returns the next element and advances the cursor; call it repeatedly to traverse the list.
     */
    E next();

    /**
     * Returns {@code true} if this list iterator has more elements when
     * traversing the list in the reverse direction.  (In other words,
     * returns {@code true} if {@link #previous} would return an element
     * rather than throwing an exception.)
     * In short: true if a backward call to previous() would return an element.
     */
    boolean hasPrevious();

    /**
     * Returns the previous element in the list and moves the cursor
     * position backwards.  This method may be called repeatedly to
     * iterate through the list backwards, or intermixed with calls to
     * {@link #next} to go back and forth.  (Note that alternating calls
     * to {@code next} and {@code previous} will return the same
     * element repeatedly.)
     * In short: returns the previous element and moves the cursor backwards.
     */
    E previous();

    /**
     * Returns the index of the element that would be returned by a
     * subsequent call to {@link #next}. (Returns list size if the list
     * iterator is at the end of the list.)
     * In short: the index of the element that a later call to next() would return.
     */
    int nextIndex();

    /**
     * Returns the index of the element that would be returned by a
     * subsequent call to {@link #previous}. (Returns -1 if the list
     * iterator is at the beginning of the list.)
     */
    int previousIndex();
    void remove();
    /**
     * Replaces the last element returned by {@link #next} or
     * {@link #previous} with the specified element (optional operation).
     * This call can be made only if neither {@link #remove} nor {@link
     * #add} have been called after the last call to {@code next} or
     * {@code previous}.
     */
    void set(E e);

    /**
     * Inserts the specified element into the list (optional operation).
     * The element is inserted immediately before the element that
     * would be returned by {@link #next}, if any, and after the element
     * that would be returned by {@link #previous}, if any.  (If the
     * list contains no elements, the new element becomes the sole element
     * on the list.)  The new element is inserted before the implicit
     * cursor: a subsequent call to {@code next} would be unaffected, and a
     * subsequent call to {@code previous} would return the new element.
     * (This call increases by one the value that would be returned by a
     * call to {@code nextIndex} or {@code previousIndex}.)
     */
    void add(E e);
}

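Inserting into or removing from a collection while iterating over it is error-prone. For an unordered collection, a plain Iterator offers at most remove(), and there is no meaningful position at which to insert. Because a List is ordered, it provides ListIterator, which can add, set, and remove elements during traversal and can move in both directions.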
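As an illustration of modifying a list during traversal, here is a small sketch. The class ListIteratorDemo and its methods edit() and reversed() are invented for this example; only the ListIterator calls come from the interface shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class: edits a list in place through a ListIterator.
class ListIteratorDemo {
    // Drops odd values, multiplies multiples of four by ten,
    // and inserts the negated value after every other even value.
    static List<Integer> edit(List<Integer> source) {
        List<Integer> list = new ArrayList<>(source);
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            int value = it.next();
            if (value % 2 != 0) {
                it.remove();         // removes the element just returned by next()
            } else if (value % 4 == 0) {
                it.set(value * 10);  // replaces the element just returned by next()
            } else {
                it.add(-value);      // inserted before the cursor, so next() skips it
            }
        }
        return list;
    }

    // Walks the list backwards, starting with the cursor after the last element.
    static List<Integer> reversed(List<Integer> source) {
        List<Integer> out = new ArrayList<>();
        ListIterator<Integer> it = source.listIterator(source.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(edit(Arrays.asList(1, 2, 3, 4, 5, 6)));  // [2, -2, 40, 6, -6]
        System.out.println(reversed(Arrays.asList(1, 2, 3)));       // [3, 2, 1]
    }
}
```

Doing the same edits through list.remove(...) or list.add(...) inside a for-each loop over an ArrayList would normally fail with ConcurrentModificationException, which is why the changes go through the iterator.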

ArrayList

From the Java documentation, the main characteristics are:

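  1. ArrayList is a resizable-array implementation of the List interface.
  2. It permits all elements, including null.
  3. The size, isEmpty, get, set, and iterator operations run in constant time. add runs in amortized constant time, so adding n elements takes O(n) time.
  4. ArrayList is not synchronized.

Now let's take a brief look at the ArrayList source. Only part of it is analyzed here.
All elements are stored in an Object array (elementData), and a separate size field tracks how many slots are actually in use.
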
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    private static final long serialVersionUID = 8683452581122892189L;

    /**
     * Default initial capacity.
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * Shared empty array instance used for empty instances.
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};

    /**
     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     */
    transient Object[] elementData; // non-private to simplify nested class access

    /**
     * The size of the ArrayList (the number of elements it contains).
     */
    private int size;
}

Now let's look at how add is implemented:

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

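Every call to add checks whether the backing array has room. If it does not, grow computes a new capacity of roughly 1.5 times the old one (oldCapacity + (oldCapacity >> 1)), then calls Arrays.copyOf to allocate a longer array and copy the existing elements into it.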
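To make the growth rule concrete, the sketch below replays the capacity arithmetic from the source quoted above without touching ArrayList itself (its capacity is private). GrowthDemo, nextCapacity(), and capacitiesFor() are hypothetical names, and the overflow and MAX_ARRAY_SIZE handling is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the capacity arithmetic in the quoted grow() method.
class GrowthDemo {
    static final int DEFAULT_CAPACITY = 10;  // same value as in the quoted source

    // new capacity = old + old / 2, but never less than the requested minimum.
    // Assumption: overflow and MAX_ARRAY_SIZE checks are omitted for brevity.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return Math.max(newCapacity, minCapacity);
    }

    // Lists every capacity a default-constructed list passes through
    // while `elements` items are appended one at a time.
    static List<Integer> capacitiesFor(int elements) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 0;  // a default-constructed list starts with an empty array
        for (int size = 0; size < elements; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                // First add: inflate straight to the default capacity.
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity > capacity) {
                capacity = nextCapacity(capacity, minCapacity);
                capacities.add(capacity);
            }
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesFor(50));  // [10, 15, 22, 33, 49, 73]
    }
}
```

Because each reallocation multiplies the capacity by about 1.5, appending 50 elements triggers only six array copies, which is where the amortized constant cost of add comes from. When the final size is known in advance, passing it to the ArrayList(int initialCapacity) constructor avoids the intermediate copies.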
Next, let's see how remove is implemented.

public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
    return oldValue;
}

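In other words, remove uses System.arraycopy to shift every element after the removed index one position to the left, then sets the now-unused last slot to null so that the garbage collector can reclaim the object.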
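The same shift can be tried on a plain array. This is only a sketch of the technique, not ArrayList's own code: ArrayRemoveDemo and removeAt() are hypothetical, and the bounds check is a simplified stand-in for rangeCheck.

```java
import java.util.Arrays;

// Hypothetical demo class: removes one element from an array-backed sequence.
class ArrayRemoveDemo {
    // Removes data[index] from the first `size` slots and returns the new size.
    static int removeAt(Object[] data, int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        int numMoved = size - index - 1;  // how many elements sit after the removed one
        if (numMoved > 0) {
            // Shift the tail one slot to the left, overwriting data[index].
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null;  // clear the stale last slot so it can be collected
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null, null};
        int size = removeAt(data, 4, 1);
        System.out.println(size);                   // 3
        System.out.println(Arrays.toString(data));  // [a, c, d, null, null, null]
    }
}
```

The cost is proportional to the number of elements after the removed index, so removing from the end is cheap while removing from the front moves the whole array.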

LinkedList

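LinkedList is a sequence container backed by a linked list. Both it and ArrayList are sequence containers; one stores its elements in an array, the other in a doubly linked list.

Arrays vs. linked lists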

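  1. Lookup. An array is more efficient because it can jump straight to an index, whereas a linked list has to walk node by node from one end (LinkedList starts from whichever end is closer to the index).
  2. Insertion and removal, especially in the middle. Here the linked list is far more convenient: once the position has been reached, it only has to break the link, insert or remove the node, and reconnect the neighbours. An array has to shift every following element backward or forward.
  3. Memory allocation. When an array reaches its allocated length, a larger array must be allocated and the data migrated to it. A linked list simply creates nodes as needed.

These are the differences between LinkedList and ArrayList. Choose whichever List better fits the use case.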
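One practical consequence of the lookup difference is that index-based loops suit ArrayList but not LinkedList. The sketch below uses the standard RandomAccess marker interface (which ArrayList implements, as shown in its class declaration) to pick a traversal strategy; AccessPatternDemo and sum() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical demo class: chooses a traversal style based on RandomAccess.
class AccessPatternDemo {
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Array-backed list: get(i) is a direct array read.
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            // Linked list: get(i) would walk the nodes each time,
            // so iterate once with the list's own iterator instead.
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> array = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linked = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(array instanceof RandomAccess);   // true
        System.out.println(linked instanceof RandomAccess);  // false
        System.out.println(sum(array));                      // 6
        System.out.println(sum(linked));                     // 6
    }
}
```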

Source code analysis

The fields of LinkedList:

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    /**
     * Pointer to first node.
     * Invariant: (first == null && last == null) ||
     *            (first.prev == null && first.item != null)
     */
    transient Node<E> first;

    /**
     * Pointer to last node.
     * Invariant: (first == null && last == null) ||
     *            (last.next == null && last.item != null)
     */
    transient Node<E> last;

    /**
     * Constructs an empty list.
     */
    public LinkedList() {
    }
}

The definition of Node, which is a static nested class inside LinkedList:

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;
    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

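Every LinkedList holds references to the head node (first) and the tail node (last) of the list.

The most basic insertion and removal operations on the linked list are listed below: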

private void linkFirst(E e) {
    final Node<E> f = first;
    final Node<E> newNode = new Node<>(null, e, f);
    first = newNode;
    if (f == null)
        last = newNode;
    else
        f.prev = newNode;
    size++;
    modCount++;
}

void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}
    
void linkBefore(E e, Node<E> succ) {
    // assert succ != null;
    final Node<E> pred = succ.prev;
    final Node<E> newNode = new Node<>(pred, e, succ);
    succ.prev = newNode;
    if (pred == null)
        first = newNode;
    else
        pred.next = newNode;
    size++;
    modCount++;
}

private E unlinkFirst(Node<E> f) {
    // assert f == first && f != null;
    final E element = f.item;
    final Node<E> next = f.next;
    f.item = null;
    f.next = null; // help GC
    first = next;
    if (next == null)
        last = null;
    else
        next.prev = null;
    size--;
    modCount++;
    return element;
}

private E unlinkLast(Node<E> l) {
    // assert l == last && l != null;
    final E element = l.item;
    final Node<E> prev = l.prev;
    l.item = null;
    l.prev = null; // help GC
    last = prev;
    if (prev == null)
        first = null;
    else
        prev.next = null;
    size--;
    modCount++;
    return element;
}

E unlink(Node<E> x) {
    // assert x != null;
    final E element = x.item;
    final Node<E> next = x.next;
    final Node<E> prev = x.prev;

    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }

    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }

    x.item = null;
    size--;
    modCount++;
    return element;
}

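The methods above are the core of the linked list: insertion at the head, tail, and middle, and removal at the head, tail, and middle. The public methods are all built around them. LinkedList also implements the Deque interface, which extends Queue, so it supports queue and stack operations such as push, pop, and peek as well.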
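A short sketch of those Deque operations on a LinkedList follows. LinkedListDequeDemo and run() are illustrative names; the comments note which internal link or unlink method each call is expected to reach, based on the source above.

```java
import java.util.Deque;
import java.util.LinkedList;

// Hypothetical demo class: uses LinkedList through the Deque interface.
class LinkedListDequeDemo {
    static String run() {
        Deque<String> deque = new LinkedList<>();
        deque.push("a");                 // adds at the head (linkFirst): [a]
        deque.push("b");                 // [b, a]
        deque.addLast("c");              // adds at the tail (linkLast): [b, a, c]
        String head = deque.peek();      // reads the head without removing it: b
        String popped = deque.pop();     // removes the head (unlinkFirst): [a, c]
        String tail = deque.pollLast();  // removes the tail (unlinkLast): [a]
        return head + popped + tail + deque;
    }

    public static void main(String[] args) {
        System.out.println(run());  // bbc[a]
    }
}
```

Declaring the variable as Deque rather than LinkedList keeps the code independent of the implementation, so ArrayDeque could be swapped in if list-style indexed access is not needed.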

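Summary

ArrayList: a List backed by a resizable array; duplicate elements allowed; fast access to elements by index.
LinkedList: a List backed by a doubly linked list; duplicate elements allowed; fast insertion and removal of elements.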