[JDK 1.8] String Source Code Analysis

String

Class declaration

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence

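First, note that String is declared final, so it cannot be subclassed. Together with the private final character array among its fields, this is part of what makes String an immutable class.
It implements Serializable, Comparable<String> (essentially the compareTo method) and CharSequence, which declares length(), charAt(int), subSequence(int, int) and toString().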
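
As a small illustration of what the two interfaces buy you, here is a minimal sketch. The class and method names (StringDeclarationDemo, countUpperCase, sortedCopy) are hypothetical and exist only for this example; they are not part of the JDK.

```java
import java.util.Arrays;

class StringDeclarationDemo {
    // Accepts any CharSequence; a String can be passed because
    // String implements CharSequence.
    static int countUpperCase(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isUpperCase(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Sorts a copy of the input using the natural ordering
    // that String.compareTo provides through Comparable<String>.
    static String[] sortedCopy(String[] input) {
        String[] copy = Arrays.copyOf(input, input.length);
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(countUpperCase("Hello World")); // 2
        String[] sorted = sortedCopy(new String[] {"pear", "apple", "fig"});
        System.out.println(Arrays.toString(sorted)); // [apple, fig, pear]
    }
}
```

The same countUpperCase method would also accept a StringBuilder, which is the practical point of programming against CharSequence.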

Member variables

    /** Backing storage for the characters */
    private final char value[];

    /** Cached hash code */
    private int hash; // Default to 0

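Constructors

The constructors fall into three main groups: those that take a byte[], those that take a char[], and those that take a StringBuilder or StringBuffer.

byte[] constructors

The most important, or at least the most commonly used, job of the byte[] constructors is decoding bytes into characters with a given charset.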

    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }

    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

Let's look at an example:

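    import java.io.UnsupportedEncodingException;

    public class CharsetMatrix {
        public static void main(String[] args) throws UnsupportedEncodingException {
            String test = "中文";
            String[] csn = new String[] {"ISO-8859-1", "GBK", "UTF-8"};
            for (int i = 0; i < csn.length; i++) {
                // Encode the text with charset i
                byte[] bt = test.getBytes(csn[i]);
                for (int j = 0; j < csn.length; j++) {
                    // Decode with charset j, then undo that step and decode with charset i
                    String str = new String(bt, csn[j]);
                    String res = new String(str.getBytes(csn[j]), csn[i]);
                    System.out.print(res + "\t");
                }
                System.out.println();
            }
        }
    }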

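The result (row and column labels added for readability; each row is the charset used to encode, each column the charset used for the intermediate decode):

                ISO-8859-1  GBK     UTF-8
    ISO-8859-1  ??          ??      ??
    GBK         中文        中文    锟斤拷锟斤拷
    UTF-8       中文        中文    中文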

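Why can't any combination in the ISO-8859-1 row be restored?
The ISO-8859-1 code table contains no Chinese characters, so "中文".getBytes("ISO-8859-1") cannot produce meaningful bytes for "中文": each unmappable character is encoded as '?' (0x3F). The information is already lost at that point, so no later new String() call can bring it back. The GBK row fails in the UTF-8 column for a related reason: the GBK bytes are not valid UTF-8, so decoding replaces them with U+FFFD, and re-encoding those replacement characters and reading the bytes as GBK yields the familiar 锟斤拷.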
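
The two failure modes can be checked directly. The following is a minimal sketch; the class and method names (CharsetRoundTripDemo, roundTrip, decodeGbkBytesAsUtf8) are hypothetical. It uses the Charset-based overloads of getBytes and the String constructor, which do not throw UnsupportedEncodingException, and it hard-codes the GBK bytes of "中文" (D6 D0 CE C4) so that it does not depend on the GBK charset being installed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetRoundTripDemo {
    // Encodes the text with a charset and decodes the bytes with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Decodes the GBK bytes of the two-character text as if they were UTF-8.
    // The bytes are not valid UTF-8, so the decoder substitutes U+FFFD.
    static String decodeGbkBytesAsUtf8() {
        byte[] gbkBytes = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        return new String(gbkBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters of the article's example

        // ISO-8859-1 cannot represent either character; each becomes '?' (63).
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.toString(latin1)); // [63, 63]
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1)); // ??

        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8))); // true

        // Wrongly decoded GBK bytes contain the replacement character.
        System.out.println(decodeGbkBytesAsUtf8().indexOf('\uFFFD') >= 0); // true
    }
}
```

In both lossy cases the damage happens in a single step (encoding to ISO-8859-1, or decoding invalid bytes as UTF-8); everything after that step only shuffles the already-corrupted data.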

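char[] and StringBuilder/StringBuffer constructors

These mostly come down to Arrays.copyOf(): the constructor copies the incoming characters into a new array, so later changes to the source array or builder do not affect the String.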
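
The effect of that copy can be observed from the outside. A minimal sketch follows; the class and method names (DefensiveCopyDemo, buildThenMutate, buildThenAppend) are hypothetical.

```java
class DefensiveCopyDemo {
    // Builds a String from the array, then overwrites the array's first element.
    // The returned String keeps the original characters because the
    // constructor copied them.
    static String buildThenMutate(char[] source) {
        String s = new String(source);
        source[0] = 'X';
        return s;
    }

    // Builds a String from the builder, then appends to the builder.
    // The returned String is unaffected by the later append.
    static String buildThenAppend(StringBuilder builder) {
        String s = new String(builder);
        builder.append('!');
        return s;
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c'};
        System.out.println(buildThenMutate(chars)); // abc
        System.out.println(new String(chars));      // Xbc

        StringBuilder sb = new StringBuilder("abc");
        System.out.println(buildThenAppend(sb));    // abc
        System.out.println(sb);                     // abc!
    }
}
```

Without the copy, the caller would still hold a reference to the String's backing array and could change its contents, which would break immutability.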

Key methods

hashCode()

    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

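This is simply the value of s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], where s[i] is the i-th character and n is the length, computed in 32-bit int arithmetic (overflow wraps around). The result is cached in the hash field, so a string with a nonzero hash is only hashed once.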
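
As a worked check of the formula, take "abc": 97*31^2 + 98*31 + 99 = 93217 + 3038 + 99 = 96354. The sketch below recomputes the hash with the same loop and compares it with String.hashCode(); the class and method names (StringHashDemo, polynomialHash) are hypothetical.

```java
class StringHashDemo {
    // Recomputes the polynomial hash with Horner's rule, the same
    // recurrence (h = 31 * h + c) used in the loop above.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("abc"));                     // 96354
        System.out.println(polynomialHash("abc") == "abc".hashCode()); // true
        System.out.println(polynomialHash(""));                        // 0
    }
}
```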

intern()

    // Native method
    /**
     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     * <p>
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     * <p>
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     * <p>
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * <cite>The Java&trade; Language Specification</cite>.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     */
    public native String intern();
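
The contract described in the Javadoc can be observed with reference comparisons. A minimal sketch follows; the class and method names (InternDemo, sameReference) are hypothetical.

```java
class InternDemo {
    // Reference comparison, spelled out so the intent is explicit.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "hello";           // literals are interned automatically
        String copy = new String("hello");  // a distinct object with equal contents

        System.out.println(copy.equals(literal));                 // true
        System.out.println(sameReference(copy, literal));         // false
        System.out.println(sameReference(copy.intern(), literal)); // true
    }
}
```

The last line is the "s.intern() == t.intern() if and only if s.equals(t)" guarantee in action: both references resolve to the single pooled instance.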