Following the Source Code: The Internal Storage of ArrayList, LinkedList, HashMap, and HashSet

阿新 • Published: 2019-02-15

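With some free time lately, I decided to look into how Java's commonly used collections store their data internally. Why bother? Because each storage scheme suits a different usage pattern. Linked storage can grow or shrink freely and makes insertion and deletion cheap, but reaching an element means walking the chain node by node from one end, so reading by position is slow. Sequential (array) storage gives fast random access by index, but inserting or deleting may require shifting the other elements, so those operations are slow. Everyday code constantly reads from and writes to collections, so using them efficiently requires understanding how they hold their data in memory.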

ArrayList
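As the name suggests, its underlying storage is an array, that is, sequential storage. Below is an excerpt from the ArrayList source. (The excerpts in this article appear to come from the Android class library rather than OpenJDK, which is why names such as EmptyArray and HashMapEntry may differ from what you see in your own JDK.)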

public class ArrayList<E> extends AbstractList<E> implements Cloneable, Serializable, RandomAccess {
    /**
     * The minimum amount by which the capacity of an ArrayList will increase.
     * This tuning parameter controls a time-space tradeoff. This value (12)
     * gives empirically good results and is arguably consistent with the
     * RI's specified default initial capacity of 10: instead of 10, we start
     * with 0 (sans allocation) and jump to 12.
     */
    private static final int MIN_CAPACITY_INCREMENT = 12;

    /**
     * The number of elements in this list.
     */
    int size;

    /**
     * The elements in this list, followed by nulls.
     */
    transient Object[] array;

    /**
     * Constructs a new instance of {@code ArrayList} with the specified
     * initial capacity.
     *
     * @param capacity
     *            the initial capacity of this {@code ArrayList}.
     */
    public ArrayList(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("capacity < 0: " + capacity);
        }
        array = (capacity == 0 ? EmptyArray.OBJECT : new Object[capacity]);
    }

    /**
     * Constructs a new {@code ArrayList} instance with zero initial capacity.
     */
    public ArrayList() {
        array = EmptyArray.OBJECT;
    }

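The default no-argument constructor contains a single statement, and array is declared as Object[], which is Java's syntax for an array type; an array is a form of sequential storage. A Java array must be given a length when it is created, and that length cannot change afterwards, so when the list outgrows its array it has to allocate a larger one and copy the elements across. This confirms the guess above: ArrayList suits workloads dominated by reads.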
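To make the "fixed-length array, growable list" idea concrete, here is a minimal sketch of an array-backed list. SimpleArrayList and its growth policy (start at 4, then double) are assumptions made for illustration; they are not the policy of the ArrayList excerpt above, which uses its own MIN_CAPACITY_INCREMENT tuning.

```java
import java.util.Arrays;

public class GrowableArrayDemo {
    /** Hypothetical array-backed list; the growth policy is illustrative only. */
    static final class SimpleArrayList<E> {
        private Object[] array = new Object[0];
        private int size;

        void add(E element) {
            if (size == array.length) {
                // An array cannot be resized, so allocate a bigger one and copy.
                int newCapacity = (array.length == 0) ? 4 : array.length * 2;
                array = Arrays.copyOf(array, newCapacity);
            }
            array[size++] = element;
        }

        @SuppressWarnings("unchecked")
        E get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index: " + index + ", size: " + size);
            }
            return (E) array[index]; // direct index lookup, no traversal
        }

        int size() {
            return size;
        }

        int capacity() {
            return array.length;
        }
    }

    public static void main(String[] args) {
        SimpleArrayList<String> list = new SimpleArrayList<>();
        for (int i = 0; i < 5; i++) {
            list.add("item" + i);
        }
        // prints: item3 size=5 capacity=8
        System.out.println(list.get(3) + " size=" + list.size() + " capacity=" + list.capacity());
    }
}
```

Reads cost a single array access, while an add that hits the capacity limit pays for a full copy. That asymmetry is the reason array-backed lists favor read-heavy use.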

LinkedList
"Link" refers to a link in a chain, so LinkedList uses linked storage.

 /**
     * Constructs a new empty instance of {@code LinkedList}.
     */
    public LinkedList() {
        voidLink = new Link<E>(null, null, null);
        voidLink.previous = voidLink;
        voidLink.next = voidLink;
    }

private static final class Link<ET> {
        ET data;

        Link<ET> previous, next;

        Link(ET o, Link<ET> p, Link<ET> n) {
            data = o;
            previous = p;
            next = n;
        }
    }

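Above are a LinkedList constructor and the Link constructor. Each element of a LinkedList is held in a Link, which stores the element itself together with references to the previous and next Link. The constructor creates voidLink, a placeholder node whose previous and next both point back to itself, so an empty list is a ring containing only that node.
LinkedList suits workloads with frequent insertions and deletions.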
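The ring-with-placeholder layout can be sketched as follows. This is a simplified illustration, not the library's code: Node, SimpleLinkedList, and the method names are hypothetical stand-ins for Link and voidLink.

```java
public class SentinelListDemo {
    /** Hypothetical node type mirroring the Link class shown above. */
    static final class Node<E> {
        E data;
        Node<E> previous, next;

        Node(E data, Node<E> previous, Node<E> next) {
            this.data = data;
            this.previous = previous;
            this.next = next;
        }
    }

    static final class SimpleLinkedList<E> {
        // Placeholder node; in an empty list it points to itself both ways.
        private final Node<E> sentinel = new Node<>(null, null, null);
        private int size;

        SimpleLinkedList() {
            sentinel.previous = sentinel;
            sentinel.next = sentinel;
        }

        void addFirst(E element) {
            insertBefore(sentinel.next, element);
        }

        void addLast(E element) {
            insertBefore(sentinel, element);
        }

        // Insertion rewires a few references; no other element is moved.
        private void insertBefore(Node<E> successor, E element) {
            Node<E> node = new Node<>(element, successor.previous, successor);
            successor.previous.next = node;
            successor.previous = node;
            size++;
        }

        // Reading by position has to walk the chain from the start.
        E get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index: " + index + ", size: " + size);
            }
            Node<E> node = sentinel.next;
            for (int i = 0; i < index; i++) {
                node = node.next;
            }
            return node.data;
        }

        int size() {
            return size;
        }
    }

    public static void main(String[] args) {
        SimpleLinkedList<String> list = new SimpleLinkedList<>();
        list.addLast("b");
        list.addLast("c");
        list.addFirst("a");
        // prints: abc
        System.out.println(list.get(0) + list.get(1) + list.get(2));
    }
}
```

Because the placeholder node is always present, inserting at the head, at the tail, or into an empty list all go through the same code path with no null checks.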

HashMap
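With this one, the name does not reveal the storage type. It tells us that this is a collection of key-value pairs stored by hash value, but how is that collection laid out underneath? Let's look at the code.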

   /**
     * Constructs a new empty {@code HashMap} instance.
     */
    @SuppressWarnings("unchecked")
    public HashMap() {
        table = (HashMapEntry<K, V>[]) EMPTY_TABLE;
        threshold = -1; // Forces first put invocation to replace EMPTY_TABLE
    }

/**
     * The hash table. If this hash map contains a mapping for null, it is
     * not represented this hash table.
     */
    transient HashMapEntry<K, V>[] table;

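The constructor shows that HashMap keeps its data in table, and table is an array, so HashMap's underlying storage is also array-based. Because entries are placed by hash, an entry's slot in the array is computed from its key's hash value rather than from insertion order. As noted above, an array's length cannot change, so how does HashMap add more data once it holds more than its current capacity allows?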

/**
     * Maps the specified key to the specified value.
     *
     * @param key
     *            the key.
     * @param value
     *            the value.
     * @return the value of any previous mapping with the specified key or
     *         {@code null} if there was no such mapping.
     */
    @Override public V put(K key, V value) {
        if (key == null) {
            return putValueForNullKey(value);
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                preModify(e);
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }

        // No entry for (non-null) key is present; create one
        modCount++;
        if (size++ > threshold) {
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }
        addNewEntry(key, value, hash, index);
        return null;
    }
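The put method above picks a bucket with hash & (tab.length - 1). As a small sketch of why that works: when the table length is a power of two, the mask keeps only the low bits of the hash, which gives the same result as a non-negative modulo. BucketIndexDemo and bucketIndex are hypothetical names, and the demo feeds in raw hash values rather than reproducing Collections.secondaryHash.

```java
public class BucketIndexDemo {
    /** Maps a hash to a bucket; valid only when tableLength is a power of two. */
    static int bucketIndex(int hash, int tableLength) {
        if (tableLength <= 0 || Integer.bitCount(tableLength) != 1) {
            throw new IllegalArgumentException("table length must be a power of two: " + tableLength);
        }
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int tableLength = 16;
        int[] hashes = {0, 5, 16, 21, -7, "hello".hashCode()};
        for (int hash : hashes) {
            int masked = bucketIndex(hash, tableLength);
            int modulo = Math.floorMod(hash, tableLength);
            // The two columns agree for every hash, including negative ones.
            System.out.println(hash + " -> mask " + masked + ", floorMod " + modulo);
        }
    }
}
```

The mask is cheaper than a division and never produces a negative index, which is one reason hash tables of this style keep their length at a power of two.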

Look at the line tab = doubleCapacity(); in put:

/**
     * Doubles the capacity of the hash table. Existing entries are placed in
     * the correct bucket on the enlarged table. If the current capacity is,
     * MAXIMUM_CAPACITY, this method is a no-op. Returns the table, which
     * will be new unless we were already at MAXIMUM_CAPACITY.
     */
    private HashMapEntry<K, V>[] doubleCapacity() {
        HashMapEntry<K, V>[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            return oldTable;
        }
        int newCapacity = oldCapacity * 2;
        HashMapEntry<K, V>[] newTable = makeTable(newCapacity);
        if (size == 0) {
            return newTable;
        }

        for (int j = 0; j < oldCapacity; j++) {
            /*
             * Rehash the bucket using the minimum number of field writes.
             * This is the most subtle and delicate code in the class.
             */
            HashMapEntry<K, V> e = oldTable[j];
            if (e == null) {
                continue;
            }
            int highBit = e.hash & oldCapacity;
            HashMapEntry<K, V> broken = null;
            newTable[j | highBit] = e;
            for (HashMapEntry<K, V> n = e.next; n != null; e = n, n = n.next) {
                int nextHighBit = n.hash & oldCapacity;
                if (nextHighBit != highBit) {
                    if (broken == null)
                        newTable[j | nextHighBit] = n;
                    else
                        broken.next = n;
                    broken = e;
                    highBit = nextHighBit;
                }
            }
            if (broken != null)
                broken.next = null;
        }
        return newTable;
    }

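When an insertion pushes the entry count past the threshold, HashMap allocates a new array twice as long as the old one and moves the existing entries into the new array.
HashMap, too, suits workloads dominated by reads.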
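The expression newTable[j | highBit] in doubleCapacity relies on a property of doubling: an entry in old bucket j can only end up in bucket j or in bucket j + oldCapacity, depending on one extra bit of its hash. The sketch below checks that property on a few values; RehashSplitDemo and newIndex are hypothetical names, and it assumes the old capacity is a power of two.

```java
public class RehashSplitDemo {
    /** Bucket of a hash after the table doubles, computed the way doubleCapacity does. */
    static int newIndex(int hash, int oldCapacity) {
        int j = hash & (oldCapacity - 1);   // bucket in the old table
        int highBit = hash & oldCapacity;   // either 0 or oldCapacity
        return j | highBit;                 // j, or j + oldCapacity
    }

    public static void main(String[] args) {
        int oldCapacity = 8;
        int[] hashes = {3, 11, 19, 27};     // all share old bucket 3
        for (int hash : hashes) {
            int direct = hash & (2 * oldCapacity - 1); // index recomputed from scratch
            System.out.println("hash " + hash
                    + ": old bucket " + (hash & (oldCapacity - 1))
                    + ", new bucket " + newIndex(hash, oldCapacity)
                    + ", recomputed " + direct);
        }
    }
}
```

With an old capacity of 8, hashes 3 and 19 stay in bucket 3 while 11 and 27 move to bucket 11. Knowing the destination from a single bit is what lets the real method split each chain while writing as few fields as possible.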

HashSet
HashSet is a set built on hash values. How does it store its elements? See below.

public class HashSet<E> extends AbstractSet<E> implements Set<E>, Cloneable,
        Serializable {

    private static final long serialVersionUID = -5024744406713321676L;

    transient HashMap<E, HashSet<E>> backingMap;

    /**
     * Constructs a new empty instance of {@code HashSet}.
     */
    public HashSet() {
        this(new HashMap<E, HashSet<E>>());
    }

HashSet(HashMap<E, HashSet<E>> backingMap) {
        this.backingMap = backingMap;
    }

It creates a HashMap inside. So that is what a HashSet really is! Its elements are stored in a HashMap (as the map's keys), so HashSet also suits workloads dominated by reads.
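A set backed by a map can be sketched in a few lines. This is an illustration of the idea, not the library's code: SimpleHashSet is a hypothetical class, and the shared PRESENT marker used as the map value is an assumption (the excerpt above declares the value type as HashSet<E> instead).

```java
import java.util.HashMap;

public class MapBackedSetDemo {
    /** Hypothetical set that stores its elements as the keys of a HashMap. */
    static final class SimpleHashSet<E> {
        // Dummy value shared by every key; only the keys carry information.
        private static final Object PRESENT = new Object();
        private final HashMap<E, Object> backingMap = new HashMap<>();

        /** Returns true if the element was not already present. */
        boolean add(E element) {
            return backingMap.put(element, PRESENT) == null;
        }

        boolean contains(Object element) {
            return backingMap.containsKey(element);
        }

        boolean remove(Object element) {
            return backingMap.remove(element) != null;
        }

        int size() {
            return backingMap.size();
        }
    }

    public static void main(String[] args) {
        SimpleHashSet<String> set = new SimpleHashSet<>();
        System.out.println(set.add("a"));      // first insert of "a"
        System.out.println(set.add("a"));      // duplicate is rejected by the map
        System.out.println(set.contains("a") + " size=" + set.size());
    }
}
```

Uniqueness comes for free: a map holds at most one entry per key, so adding a duplicate just overwrites the dummy value.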
