A Deep Dive into the Java Source: Character.Subset and Character.UnicodeBlock

阿新 • Published: 2019-01-13

The HashMap load factor

new HashMap<>(128);

    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

Load factor (loadFactor)

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /**
     * The default load factor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The next size value at which to resize (capacity * load factor).
     * Once the number of entries exceeds this threshold, the HashMap resizes.
     * @serial
     */
    int threshold;

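A higher load factor improves space utilization but increases lookup and insertion time.

The load factor (loadFactor) describes how full the hash table is allowed to get. The larger the load factor, the more entries the table holds before it grows: space utilization is higher, but collisions become more likely. Conversely, the smaller the load factor, the fewer entries the table holds: collisions are less likely, but more space is wasted.

The more likely collisions are, the more a lookup costs; the less likely they are, the cheaper and faster a lookup is.

A balance must therefore be struck between collision probability and space utilization. This is essentially the classic time-space trade-off in data structures.
The default capacity is 16, so the default threshold is 16 * 0.75 = 12.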
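The arithmetic above can be sketched in a few lines. This is only an illustration: the class and method names are made up for this example, and the code mirrors the calculation described in the text rather than the actual java.util.HashMap internals.

```java
// Illustrative sketch only: LoadFactorSketch and its methods are hypothetical
// helpers, not part of java.util.HashMap.
class LoadFactorSketch {

    /** Rounds a requested capacity up to the next power of two (capped at 2^30). */
    static int roundUpToPowerOfTwo(int requested) {
        int capacity = 1;
        while (capacity < requested && capacity < (1 << 30)) {
            capacity <<= 1;
        }
        return capacity;
    }

    /** Threshold = capacity * load factor, truncated to an int. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));                        // 12
        System.out.println(roundUpToPowerOfTwo(100));                    // 128
        System.out.println(threshold(roundUpToPowerOfTwo(100), 0.75f));  // 96
    }
}
```

Because the capacity must be a power of two, a requested capacity of 100 ends up as 128 in this sketch, which gives a threshold of 96 with the default load factor.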

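A HashMap stores its entries as follows.

It first uses the key's hashCode to find the slot where the entry belongs.
The number of slots is limited to the space the HashMap has allocated; suppose there are 10. (The real HashMap always uses a power-of-two table and a bit mask instead of %, so 10 is only a simplified illustration.)
When the first element arrives, hashCode % 10 decides which of the 10 slots it goes into.

This raises a problem: hashCode % 10 may point to a slot that already holds another object.
In that case equals is called to compare the keys. If they are equal, the key is the same and the existing value is replaced.
If they are not equal, the slot becomes a hash bucket holding both objects in order, with the later arrival placed after the earlier one.

HashMap lookups are meant to be efficient: the hashCode leads directly to the slot and the value is fetched in a single step.
The collision case described above adds extra operations, however.
When the slot for a hashCode holds two objects, it is not clear which one is wanted, so each is compared with the supplied key through equals, and the one that matches is returned.

As shown above, with only 10 slots it is very easy for the hashCodes of two objects to map to the same slot.
The two objects then share a bucket, which makes both lookup and insertion slower.

This is where the load factor parameter comes in. With a load factor of 0.75 and 100 slots, the HashMap needs to resize once about 75 elements have been inserted. Otherwise the buckets grow long, and lookup and insertion slow down because the entries in a bucket have to be compared with equals one by one.
The load factor cannot be made very small either. A value such as 0.01 is unsuitable because it wastes a great deal of memory: the HashMap would resize almost every time an object is added.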
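The bucket behaviour described above can be sketched with a toy chained hash table. This is a simplified illustration under stated assumptions: ToyHashMap is a hypothetical class, it uses a fixed number of buckets with a modulo index (as in the text's example of 10 slots), it never resizes, and it does not accept null keys. The real java.util.HashMap differs in all of these respects.

```java
import java.util.ArrayList;
import java.util.List;

// Toy chained hash table, for illustration only (hypothetical class).
class ToyHashMap<K, V> {

    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    ToyHashMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<Entry<K, V>> bucketFor(K key) {
        // floorMod keeps the index non-negative even for negative hash codes
        int index = Math.floorMod(key.hashCode(), buckets.size());
        return buckets.get(index);
    }

    /** Stores the mapping; returns the previous value, or null if there was none. */
    V put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> e : bucket) {
            if (e.key.equals(key)) {      // equal key: replace the value
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        bucket.add(new Entry<>(key, value)); // different key: append to the chain
        return null;
    }

    /** Walks the bucket and compares keys with equals, one by one. */
    V get(K key) {
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyHashMap<Integer, String> map = new ToyHashMap<>(10);
        map.put(3, "three");
        map.put(13, "thirteen");   // 13 % 10 == 3, so it shares bucket 3
        map.put(3, "THREE");       // equal key, so the value is replaced
        System.out.println(map.get(3));   // THREE
        System.out.println(map.get(13));  // thirteen
    }
}
```

The longer a bucket's chain gets, the more equals calls a lookup needs, which is the cost the load factor is meant to keep in check.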

Java enums

The old approach: defining constants in an interface

public interface IConstants {
    String MON = "Mon";
    String TUE = "Tue";
    String WED = "Wed";
    String THU = "Thu";
    String FRI = "Fri";
    String SAT = "Sat";
    String SUN = "Sun";
}

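An enum type is created with the enum keyword, which implies that the new type is a subclass of java.lang.Enum (an abstract class). Enum types follow the general pattern class Enum<E extends Enum<E>>, where E is the name of the enum type. Each value of the enum type is passed to the protected Enum(String name, int ordinal) constructor: the name of the value is turned into a string, and the ordinal records the position at which the value was declared.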

public enum EnumTest {
    MON, TUE, WED, THU, FRI, SAT, SUN;
}

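Conceptually, this code invokes Enum(String name, int ordinal) seven times. The lines below are not legal Java, because Enum is abstract and its constructor is protected; they only show what the compiler arranges on your behalf: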

new Enum<EnumTest>("MON",0);
new Enum<EnumTest>("TUE",1);
new Enum<EnumTest>("WED",2);
    ... ...

public class Test {
    public static void main(String[] args) {
        // values() returns the constants in declaration order
        for (EnumTest e : EnumTest.values()) {
            System.out.println(e.toString());
        }
         
        System.out.println("---------------- separator ------------------");
         
        EnumTest test = EnumTest.TUE;
        switch (test) {
        case MON:
            System.out.println("Today is Monday");
            break;
        case TUE:
            System.out.println("Today is Tuesday");
            break;
        // ... ...
        default:
            System.out.println(test);
            break;
        }
    }
}


MON
TUE
WED
THU
FRI
SAT
SUN
---------------- separator ------------------
Today is Tuesday

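An enum can be treated much like an ordinary class: it can define its own fields and methods. The difference is that an enum cannot use the extends keyword to inherit from another class, because it already extends java.lang.Enum (Java has single inheritance).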
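As a minimal sketch of an enum that carries its own field and method, consider the following. The Weekday type and its isWeekend method are hypothetical names introduced only for this illustration.

```java
// Hypothetical enum for illustration; Weekday and isWeekend are not from the JDK.
enum Weekday {
    MON(false), TUE(false), WED(false), THU(false), FRI(false), SAT(true), SUN(true);

    private final boolean weekend;

    Weekday(boolean weekend) {   // enum constructors are implicitly private
        this.weekend = weekend;
    }

    boolean isWeekend() {
        return weekend;
    }
}

class WeekdayDemo {
    public static void main(String[] args) {
        System.out.println(Weekday.SAT.isWeekend());  // true
        System.out.println(Weekday.MON.isWeekend());  // false
        // Every enum constant is an instance of java.lang.Enum
        Object constant = Weekday.MON;
        System.out.println(constant instanceof Enum); // true
    }
}
```

Each constant passes its own argument to the constructor, so the data lives next to the constant it describes instead of in a separate lookup table.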

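Common methods of enum objects

int compareTo(E o)
Compares this enum constant with the specified object for order.

Class<E> getDeclaringClass()
Returns the Class object corresponding to this enum constant's enum type.

String name()
Returns the name of this enum constant, exactly as declared in its enum declaration.

int ordinal()
Returns the ordinal of this enum constant (its position in the enum declaration, where the initial constant has ordinal zero).

String toString()
Returns the name of this enum constant, as contained in the declaration.

static <T extends Enum<T>> T valueOf(Class<T> enumType, String name)
Returns the enum constant of the specified enum type with the specified name.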

public class Test {
    public static void main(String[] args) {
        EnumTest test = EnumTest.TUE;
         
        // compareTo(E o) returns the difference of the two ordinals, which is
        // not limited to -1/0/1, so normalize it with Integer.signum first
        switch (Integer.signum(test.compareTo(EnumTest.MON))) {
        case -1:
            System.out.println("TUE comes before MON");
            break;
        case 1:
            System.out.println("TUE comes after MON");
            break;
        default:
            System.out.println("TUE and MON are at the same position");
            break;
        }
         
        // getDeclaringClass()
        System.out.println("getDeclaringClass(): " + test.getDeclaringClass().getName());
         
        // name() and toString()
        System.out.println("name(): " + test.name());
        System.out.println("toString(): " + test.toString());
         
        // ordinal(): the returned value starts at 0
        System.out.println("ordinal(): " + test.ordinal());
    }
}



TUE comes after MON
getDeclaringClass(): com.test.EnumTest
name(): TUE
toString(): TUE
ordinal(): 1
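The valueOf method from the list above is not exercised in the test program, so here is a small sketch of it. The Day enum is a stand-in for EnumTest, and parseOrDefault is a hypothetical helper added for this example.

```java
// Self-contained sketch; Day stands in for the article's EnumTest.
class ValueOfDemo {
    enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

    /** Returns the matching constant, or the fallback if the name is unknown. */
    static Day parseOrDefault(String name, Day fallback) {
        try {
            return Enum.valueOf(Day.class, name);
        } catch (IllegalArgumentException e) {
            // thrown when no constant with exactly this name exists
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(Enum.valueOf(Day.class, "TUE"));   // TUE
        System.out.println(Day.valueOf("SUN").ordinal());     // 6
        System.out.println(parseOrDefault("tue", Day.MON));   // MON
    }
}
```

The lookup is case-sensitive and must match the declared identifier exactly, which is why "tue" falls back to the default here.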

Modifier and Type	Method and Description
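`static int`	`charCount(int codePoint)` Determines the number of `char` values needed to represent the specified character (Unicode code point).
`char`	`charValue()` Returns the value of this `Character` object.
`static int`	`codePointAt(char[] a, int index)` Returns the code point at the given index of the `char` array.
`static int`	`codePointAt(char[] a, int index, int limit)` Returns the code point at the given index of the `char` array, where only array elements with `index` less than `limit` can be used.
`static int`	`codePointAt(CharSequence seq, int index)` Returns the code point at the given index of the `CharSequence`.
`static int`	`codePointBefore(char[] a, int index)` Returns the code point preceding the given index of the `char` array.
`static int`	`codePointBefore(char[] a, int index, int start)` Returns the code point preceding the given index of the `char` array, where only array elements with `index` greater than or equal to `start` can be used.
`static int`	`codePointBefore(CharSequence seq, int index)` Returns the code point preceding the given index of the `CharSequence`.
`static int`	`codePointCount(char[] a, int offset, int count)` Returns the number of Unicode code points in a subarray of the `char` array argument.
`static int`	`codePointCount(CharSequence seq, int beginIndex, int endIndex)` Returns the number of Unicode code points in the text range of the specified char sequence.
`static int`	`compare(char x, char y)` Compares two `char` values numerically.
`int`	`compareTo(Character anotherCharacter)` Compares two `Character` objects numerically.
`static int`	`digit(char ch, int radix)` Returns the numeric value of the character `ch` in the specified radix.
`static int`	`digit(int codePoint, int radix)` Returns the numeric value of the specified character (Unicode code point) in the specified radix.
`boolean`	`equals(Object obj)` Compares this object against the specified object.
`static char`	`forDigit(int digit, int radix)` Determines the character representation for a specific digit in the specified radix.
`static byte`	`getDirectionality(char ch)` Returns the Unicode directionality property for the given character.
`static byte`	`getDirectionality(int codePoint)` Returns the Unicode directionality property for the given character (Unicode code point).
`static String`	`getName(int codePoint)` Returns the Unicode name of the specified character `codePoint`, or null if the code point is unassigned.
`static int`	`getNumericValue(char ch)` Returns the `int` value that the specified Unicode character represents.
`static int`	`getNumericValue(int codePoint)` Returns the `int` value that the specified character (Unicode code point) represents.
`static int`	`getType(char ch)` Returns a value indicating a character's general category.
`static int`	`getType(int codePoint)` Returns a value indicating a character's general category.
`int`	`hashCode()` Returns a hash code for this `Character`; equal to the result of invoking `charValue()`.
`static int`	`hashCode(char value)` Returns a hash code for a `char` value; compatible with `Character.hashCode()`.
`static char`	`highSurrogate(int codePoint)` Returns the leading surrogate (a high surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding.
`static boolean`	`isAlphabetic(int codePoint)` Determines if the specified character (Unicode code point) is alphabetic.
`static boolean`	`isBmpCodePoint(int codePoint)` Determines whether the specified character (Unicode code point) is in the Basic Multilingual Plane (BMP).
`static boolean`	`isDefined(char ch)` Determines if a character is defined in Unicode.
`static boolean`	`isDefined(int codePoint)` Determines if a character (Unicode code point) is defined in Unicode.
`static boolean`	`isDigit(char ch)` Determines if the specified character is a digit.
`static boolean`	`isDigit(int codePoint)` Determines if the specified character (Unicode code point) is a digit.
`static boolean`	`isHighSurrogate(char ch)` Determines if the given `char` value is a Unicode high-surrogate code unit (also known as a leading-surrogate code unit).
`static boolean`	`isIdentifierIgnorable(char ch)` Determines if the specified character should be regarded as an ignorable character in a Java identifier or a Unicode identifier.
`static boolean`	`isIdentifierIgnorable(int codePoint)` Determines if the specified character (Unicode code point) should be regarded as an ignorable character in a Java identifier or a Unicode identifier.
`static boolean`	`isIdeographic(int codePoint)` Determines if the specified character (Unicode code point) is a CJKV (Chinese, Japanese, Korean and Vietnamese) ideograph, as defined by the Unicode Standard.
`static boolean`	`isISOControl(char ch)` Determines if the specified character is an ISO control character.
`static boolean`	`isISOControl(int codePoint)` Determines if the referenced character (Unicode code point) is an ISO control character.
`static boolean`	`isJavaIdentifierPart(char ch)` Determines if the specified character may be part of a Java identifier as other than the first character.
`static boolean`	`isJavaIdentifierPart(int codePoint)` Determines if the character (Unicode code point) may be part of a Java identifier as other than the first character.
`static boolean`	`isJavaIdentifierStart(char ch)` Determines if the specified character is permissible as the first character in a Java identifier.
`static boolean`	`isJavaIdentifierStart(int codePoint)` Determines if the character (Unicode code point) is permissible as the first character in a Java identifier.
`static boolean`	`isJavaLetter(char ch)` Deprecated. Replaced by isJavaIdentifierStart(char).
`static boolean`	`isJavaLetterOrDigit(char ch)` Deprecated. Replaced by isJavaIdentifierPart(char).
`static boolean`	`isLetter(char ch)` Determines if the specified character is a letter.
`static boolean`	`isLetter(int codePoint)` Determines if the specified character (Unicode code point) is a letter.
`static boolean`	`isLetterOrDigit(char ch)` Determines if the specified character is a letter or digit.
`static boolean`	`isLetterOrDigit(int codePoint)` Determines if the specified character (Unicode code point) is a letter or digit.
`static boolean`	`isLowerCase(char ch)` Determines if the specified character is a lowercase character.
`static boolean`	`isLowerCase(int codePoint)` Determines if the specified character (Unicode code point) is a lowercase character.
`static boolean`	`isLowSurrogate(char ch)` Determines if the given `char` value is a Unicode low-surrogate code unit (also known as a trailing-surrogate code unit).
`static boolean`	`isMirrored(char ch)` Determines whether the character is mirrored according to the Unicode specification.
`static boolean`	`isMirrored(int codePoint)` Determines whether the specified character (Unicode code point) is mirrored according to the Unicode specification.
`static boolean`	`isSpace(char ch)` Deprecated. Replaced by isWhitespace(char).
`static boolean`	`isSpaceChar(char ch)` Determines if the specified character is a Unicode space character.
`static boolean`	`isSpaceChar(int codePoint)` Determines if the specified character (Unicode code point) is a Unicode space character.
`static boolean`	`isSupplementaryCodePoint(int codePoint)` Determines whether the specified character (Unicode code point) is in the supplementary character range.
`static boolean`	`isSurrogate(char ch)` Determines if the given `char` value is a Unicode surrogate code unit.
`static boolean`	`isSurrogatePair(char high, char low)` Determines whether the specified pair of `char` values is a valid Unicode surrogate pair.
`static boolean`	`isTitleCase(char ch)` Determines if the specified character is a titlecase character.
`static boolean`	`isTitleCase(int codePoint)` Determines if the specified character (Unicode code point) is a titlecase character.
`static boolean`	`isUnicodeIdentifierPart(char ch)` Determines if the specified character may be part of a Unicode identifier as other than the first character.
`static boolean`	`isUnicodeIdentifierPart(int codePoint)` Determines if the specified character (Unicode code point) may be part of a Unicode identifier as other than the first character.
`static boolean`	`isUnicodeIdentifierStart(char ch)` Determines if the specified character is permissible as the first character in a Unicode identifier.
`static boolean`	`isUnicodeIdentifierStart(int codePoint)` Determines if the specified character (Unicode code point) is permissible as the first character in a Unicode identifier.
`static boolean`	`isUpperCase(char ch)` Determines if the specified character is an uppercase character.
`static boolean`	`isUpperCase(int codePoint)` Determines if the specified character (Unicode code point) is an uppercase character.
`static boolean`	`isValidCodePoint(int codePoint)` Determines whether the specified code point is a valid Unicode code point value.
`static boolean`	`isWhitespace(char ch)` Determines if the specified character is white space according to Java.
`static boolean`	`isWhitespace(int codePoint)` Determines if the specified character (Unicode code point) is white space according to Java.
`static char`	`lowSurrogate(int codePoint)` Returns the trailing surrogate (a low surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding.
`static int`	`offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset)` Returns the index within the given `char` subarray that is offset from the given `index` by `codePointOffset` code points.
`static int`	`offsetByCodePoints(CharSequence seq, int index, int codePointOffset)` Returns the index within the given char sequence that is offset from the given `index` by `codePointOffset` code points.
`static char`	`reverseBytes(char ch)` Returns the value obtained by reversing the order of the bytes in the specified `char` value.
`static char[]`	`toChars(int codePoint)` Converts the specified character (Unicode code point) to its UTF-16 representation stored in a `char` array.
`static int`	`toChars(int codePoint, char[] dst, int dstIndex)` Converts the specified character (Unicode code point) to its UTF-16 representation.
`static int`	`toCodePoint(char high, char low)` Converts the specified surrogate pair to its supplementary code point value.
`static char`	`toLowerCase(char ch)` Converts the character argument to lowercase using case mapping information from the UnicodeData file.
`static int`	`toLowerCase(int codePoint)` Converts the character (Unicode code point) argument to lowercase using case mapping information from the UnicodeData file.
`String`	`toString()` Returns a `String` object representing this `Character`'s value.
`static String`	`toString(char c)` Returns a `String` object representing the specified `char`.
`static char`	`toTitleCase(char ch)` Converts the character argument to titlecase using case mapping information from the UnicodeData file.
`static int`	`toTitleCase(int codePoint)` Converts the character (Unicode code point) argument to titlecase using case mapping information from the UnicodeData file.
`static char`	`toUpperCase(char ch)` Converts the character argument to uppercase using case mapping information from the UnicodeData file.
`static int`	`toUpperCase(int codePoint)` Converts the character (Unicode code point) argument to uppercase using case mapping information from the UnicodeData file.
`static Character`	`valueOf(char c)` Returns a `Character` instance representing the specified `char` value.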
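Several of the methods in the table come in a `char` form and an `int codePoint` form because a supplementary character does not fit in a single `char`. The following sketch, which uses only methods listed above, shows how they fit together; the class name and the countCodePoints helper are made up for this example.

```java
// Illustration of the code point methods; CodePointDemo is a hypothetical class.
class CodePointDemo {

    /** Counts Unicode code points by walking the string one code point at a time. */
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = Character.codePointAt(s, i);
            i += Character.charCount(cp);   // 1 for BMP, 2 for supplementary
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int cp = 0x1F600;                       // a supplementary code point
        char[] units = Character.toChars(cp);   // its UTF-16 surrogate pair
        String s = "a" + new String(units);

        System.out.println(s.length());                                      // 3
        System.out.println(countCodePoints(s));                              // 2
        System.out.println(Character.isSurrogatePair(units[0], units[1]));   // true
        System.out.println(Character.toCodePoint(units[0], units[1]) == cp); // true
        System.out.println(Character.highSurrogate(cp) == units[0]);         // true
    }
}
```

String.length() counts `char` values, so the string has length 3 even though it contains only two characters.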

Java source code

package java.lang;

import java.util.Arrays;
import java.util.Map;
import java.util.HashMap;
import java.util.Locale;


public final
class Character implements java.io.Serializable, Comparable<Character> {

    public static final int MIN_RADIX = 2;

    public static final int MAX_RADIX = 36;
	
    public static final char MIN_VALUE = '\u0000';	
	
    public static final char MAX_VALUE = '\uFFFF';
	
    @SuppressWarnings("unchecked")
    public static final Class<Character> TYPE = (Class<Character>) Class.getPrimitiveClass("char");

    public static final byte UNASSIGNED = 0;

    public static final byte UPPERCASE_LETTER = 1;	

    public static final byte LOWERCASE_LETTER = 2;
	
	......
	
    public static final byte DECIMAL_DIGIT_NUMBER = 9;
	
    static final int ERROR = 0xFFFFFFFF;
	
    public static final byte DIRECTIONALITY_UNDEFINED = -1;	
	
    public static final char MIN_HIGH_SURROGATE = '\uD800';
	
    public static final char MIN_LOW_SURROGATE  = '\uDC00';
	
    public static final char MIN_SURROGATE = MIN_HIGH_SURROGATE;
	
    public static final char MAX_SURROGATE = MAX_LOW_SURROGATE;
	
    public static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000;

    private final char value;

    private static final long serialVersionUID = 3786198910865385080L;
	
    public Character(char value) {
        this.value = value;
    }

    private static class CharacterCache {
        private CharacterCache(){}

        static final Character cache[] = new Character[127 + 1];

        static {
            for (int i = 0; i < cache.length; i++)
                cache[i] = new Character((char)i);
        }
    }

    public static Character valueOf(char c) {
        if (c <= 127) { // must cache
            return CharacterCache.cache[(int)c];
        }
        return new Character(c);
    }

    public char charValue() {
        return value;
    }
	
    @Override
    public int hashCode() {
        return Character.hashCode(value);
    }	
	
    public static int hashCode(char value) {
        return (int)value;
    }

    public boolean equals(Object obj) {
        if (obj instanceof Character) {
            return value == ((Character)obj).charValue();
        }
        return false;
    }
	
    public String toString() {
        char buf[] = {value};
        return String.valueOf(buf);
    }

    public static String toString(char c) {
        return String.valueOf(c);
    }
	
    public static int getType(char ch) {
        return getType((int)ch);
    }

    public static int getType(int codePoint) {
        return CharacterData.of(codePoint).getType(codePoint);
    }
	
    public static int compare(char x, char y) {
        return x - y;
    }	
	
    public int compareTo(Character anotherCharacter) {
        return compare(this.value, anotherCharacter.value);
    }	
	
    // SIZE must be declared before BYTES, which refers to it
    public static final int SIZE = 16;	
	
    public static final int BYTES = SIZE / Byte.SIZE;	
	
    public static String getName(int codePoint) {
        if (!isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException();
        }
        String name = CharacterName.get(codePoint);
        if (name != null)
            return name;
        if (getType(codePoint) == UNASSIGNED)
            return null;
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        if (block != null)
            return block.toString().replace('_', ' ') + " "
                   + Integer.toHexString(codePoint).toUpperCase(Locale.ENGLISH);
        // should never come here
        return Integer.toHexString(codePoint).toUpperCase(Locale.ENGLISH);
    }
	
    public static class Subset  {

        private String name;	
	
        protected Subset(String name) {
            if (name == null) {
                throw new NullPointerException("name");
            }
            this.name = name;
        }
        public final boolean equals(Object obj) {
            return (this == obj);
        }	
        public final int hashCode() {
            return super.hashCode();
        }	
        public final String toString() {
            return name;
        }
    }
	
    public static final class UnicodeBlock extends Subset {

        private static Map<String, UnicodeBlock> map = new HashMap<>(256);

        /**
         * Creates a UnicodeBlock with the given identifier name.
         * This name must be the same as the block identifier.
         */
        private UnicodeBlock(String idName) {
            super(idName);
            map.put(idName, this);
        }

        private UnicodeBlock(String idName, String alias) {
            this(idName);
            map.put(alias, this);
        }

        private UnicodeBlock(String idName, String... aliases) {
            this(idName);
            for (String alias : aliases)
                map.put(alias, this);
        }

        public static final UnicodeBlock  BASIC_LATIN =
            new UnicodeBlock("BASIC_LATIN",
                             "BASIC LATIN",
                             "BASICLATIN");		
							 
        public static final UnicodeBlock ARMENIAN =
            new UnicodeBlock("ARMENIAN");		

        public static final UnicodeBlock PHAGS_PA =
            new UnicodeBlock("PHAGS_PA",
                             "PHAGS-PA");
							 
        private static final int blockStarts[] = {
            0x0000,   // 0000..007F; Basic Latin
            0x0080,   // 0080..00FF; Latin-1 Supplement
            0x0100,   // 0100..017F; Latin Extended-A
            0x0180,   // 0180..024F; Latin Extended-B
            0x0250,   // 0250..02AF; IPA Extensions
		};

        private static final UnicodeBlock[] blocks = {
            BASIC_LATIN,
            LATIN_1_SUPPLEMENT,
            LATIN_EXTENDED_A,
            LATIN_EXTENDED_B,							 
        };	
		
        public static UnicodeBlock of(char c) {
            return of((int)c);
        }

        public static UnicodeBlock of(int codePoint) {
            if (!isValidCodePoint(codePoint)) {
                throw new IllegalArgumentException();
            }

            int top, bottom, current;
            bottom = 0;
            top = blockStarts.length;
            current = top/2;

            // invariant: top > current >= bottom && codePoint >= unicodeBlockStarts[bottom]
            while (top - bottom > 1) {
                if (codePoint >= blockStarts[current]) {
                    bottom = current;
                } else {
                    top = current;
                }
                current = (top + bottom) / 2;
            }
            return blocks[current];
        }	

        public static final UnicodeBlock forName(String blockName) {
            UnicodeBlock block = map.get(blockName.toUpperCase(Locale.US));
            if (block == null) {
                throw new IllegalArgumentException();
            }
            return block;
        }		
}							 

    public static enum UnicodeScript {
        /**
         * Unicode script "Common".
         */
        COMMON,

        /**
         * Unicode script "Latin".
         */
        LATIN,

        /**
         * Unicode script "Greek".
         */
        GREEK,
		/**
         * Unicode script "Takri".
         */
        TAKRI,

        /**
         * Unicode script "Miao".
         */
        MIAO,

        /**
         * Unicode script "Unknown".
         */
        UNKNOWN;

        private static final int[] scriptStarts = {
            0x0000,   // 0000..0040; COMMON
            0x0041,   // 0041..005A; LATIN
            0x005B,   // 005B..0060; COMMON
            0x0061,   // 0061..007A; LATIN
            0x20000,  // 20000..E0000; HAN
            0xE0001,  // E0001..E00FF; COMMON
            0xE0100,  // E0100..E01EF; INHERITED
            0xE01F0   // E01F0..10FFFF; UNKNOWN
        };

        // one entry per scriptStarts element, in the same order
        private static final UnicodeScript[] scripts = {
            COMMON,
            LATIN,
            COMMON,
            LATIN,
            HAN,
            COMMON,	
            INHERITED,
            UNKNOWN
        };
		
        private static HashMap<String, Character.UnicodeScript> aliases;
        static {
            aliases = new HashMap<>(128);
            aliases.put("ARAB", ARABIC);
            aliases.put("ZINH", INHERITED);
            aliases.put("ZYYY", COMMON);
            aliases.put("ZZZZ", UNKNOWN);
        }	
	
        public static UnicodeScript of(int codePoint) {
            if (!isValidCodePoint(codePoint))
                throw new IllegalArgumentException();
            int type = getType(codePoint);
            // leave SURROGATE and PRIVATE_USE for table lookup
            if (type == UNASSIGNED)
                return UNKNOWN;
            int index = Arrays.binarySearch(scriptStarts, codePoint);
            if (index < 0)
                index = -index - 2;
            return scripts[index];
        }
	
        public static final UnicodeScript forName(String scriptName) {
            scriptName = scriptName.toUpperCase(Locale.ENGLISH);
                                 //.replace(' ', '_'));
            UnicodeScript sc = aliases.get(scriptName);
            if (sc != null)
                return sc;
            return valueOf(scriptName);
        }
    }
	
    public static boolean isJavaIdentifierStart(char ch) {
        return isJavaIdentifierStart((int)ch);
    }	
	
    public static boolean isJavaIdentifierStart(int codePoint) {
        return CharacterData.of(codePoint).isJavaIdentifierStart(codePoint);
    }

    public static boolean isJavaIdentifierPart(char ch) {
        return isJavaIdentifierPart((int)ch);
    }
	
    public static boolean isJavaIdentifierPart(int codePoint) {
        return CharacterData.of(codePoint).isJavaIdentifierPart(codePoint);
    }
	
    public static boolean isUnicodeIdentifierStart(char ch) {
        return isUnicodeIdentifierStart((int)ch);
    }

    public static boolean isUnicodeIdentifierStart(int codePoint) {
        return CharacterData.of(codePoint).isUnicodeIdentifierStart(codePoint);
    }

    public static boolean isUnicodeIdentifierPart(char ch) {
        return isUnicodeIdentifierPart((int)ch);
    }

    public static boolean isUnicodeIdentifierPart(int codePoint) {
        return CharacterData.of(codePoint).isUnicodeIdentifierPart(codePoint);
    }

    public static boolean isIdentifierIgnorable(char ch) {
        return isIdentifierIgnorable((int)ch);
    }
    public static boolean isIdentifierIgnorable(int codePoint) {
        return CharacterData.of(codePoint).isIdentifierIgnorable(codePoint);
    }
	
    public static char toLowerCase(char ch) {
        return (char)toLowerCase((int)ch);
    }

    public static int toLowerCase(int codePoint) {
        return CharacterData.of(codePoint).toLowerCase(codePoint);
    }	

    public static char toUpperCase(char ch) {
        return (char)toUpperCase((int)ch);
    }

    public static int toUpperCase(int codePoint) {
        return CharacterData.of(codePoint).toUpperCase(codePoint);
    }	
	
    public static char toTitleCase(char ch) {
        return (char)toTitleCase((int)ch);
    }	
	
    public static int toTitleCase(int codePoint) {
        return CharacterData.of(codePoint).toTitleCase(codePoint);
    }
	
    public static int digit(char ch, int radix) {
        return digit((int)ch, radix);
    }	
	
    public static int digit(int codePoint, int radix) {
        return CharacterData.of(codePoint).digit(codePoint, radix);
    }
	
    public static int getNumericValue(char ch) {
        return getNumericValue((int)ch);
    }
	
    public static int getNumericValue(int codePoint) {
        return CharacterData.of(codePoint).getNumericValue(codePoint);
    }
	
    @Deprecated
    public static boolean isSpace(char ch) {
        return (ch <= 0x0020) &&
            (((((1L << 0x0009) |
            (1L << 0x000A) |
            (1L << 0x000C) |
            (1L << 0x000D) |
            (1L << 0x0020)) >> ch) & 1L) != 0);
    }
	
    public static boolean isSpaceChar(char ch) {
        return isSpaceChar((int)ch);
    }
	
    public static boolean isSpaceChar(int codePoint) {
        return ((((1 << Character.SPACE_SEPARATOR) |
                  (1 << Character.LINE_SEPARATOR) |
                  (1 << Character.PARAGRAPH_SEPARATOR)) >> getType(codePoint)) & 1)
            != 0;
    }
	
    public static boolean isWhitespace(char ch) {
        return isWhitespace((int)ch);
    }
	
    public static boolean isWhitespace(int codePoint) {
        return CharacterData.of(codePoint).isWhitespace(codePoint);
    }

    public static boolean isISOControl(char ch) {
        return isISOControl((int)ch);
    }
	
    public static boolean isISOControl(int codePoint) {
        // Optimized form of:
        //     (codePoint >= 0x00 && codePoint <= 0x1F) ||
        //     (codePoint >= 0x7F && codePoint <= 0x9F);
        return codePoint <= 0x9F &&
            (codePoint >= 0x7F || (codePoint >>> 5 == 0));
    }
	
    public static char forDigit(int digit, int radix) {
        if ((digit >= radix) || (digit < 0)) {
            return '\0';
        }
        if ((radix < Character.MIN_RADIX) || (radix > Character.MAX_RADIX)) {
            return '\0';
        }
        if (digit < 10) {
            return (char)('0' + digit);
        }
        return (char)('a' - 10 + digit);
    }	
	
    public static byte getDirectionality(char ch) {
        return getDirectionality((int)ch);
    }
	
    public static byte getDirectionality(int codePoint) {
        return CharacterData.of(codePoint).getDirectionality(codePoint);
    }

    public static boolean isMirrored(char ch) {
        return isMirrored((int)ch);
    }	
	
    public static boolean isMirrored(int codePoint) {
        return CharacterData.of(codePoint).isMirrored(codePoint);
    }
	
    static int toUpperCaseEx(int codePoint) {
        assert isValidCodePoint(codePoint);
        return CharacterData.of(codePoint).toUpperCaseEx(codePoint);
    }
	
    static char[] toUpperCaseCharArray(int codePoint) {
        // As of Unicode 6.0, 1:M uppercasings only happen in the BMP.
        assert isBmpCodePoint(codePoint);
        return CharacterData.of(codePoint).toUpperCaseCharArray(codePoint);
    }

    public static char reverseBytes(char ch) {
        return (char) (((ch & 0xFF00) >> 8) | (ch << 8));
    }
}
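From the caller's side, the UnicodeBlock and UnicodeScript lookups in the listing above can be exercised as in the sketch below. It uses only the public JDK methods shown in the source; the class name and the describe helper are hypothetical.

```java
// Usage sketch for Character.UnicodeBlock and Character.UnicodeScript.
class UnicodeBlockDemo {

    /** Returns "BLOCK/SCRIPT" for a code point (hypothetical helper). */
    static String describe(int codePoint) {
        return Character.UnicodeBlock.of(codePoint) + "/"
             + Character.UnicodeScript.of(codePoint);
    }

    public static void main(String[] args) {
        // Block lookup by char and by code point
        System.out.println(Character.UnicodeBlock.of('a'));      // BASIC_LATIN
        System.out.println(Character.UnicodeBlock.of(0x0100));   // LATIN_EXTENDED_A

        // forName goes through the alias map, so the spaced name also works
        System.out.println(Character.UnicodeBlock.forName("Basic Latin")
                == Character.UnicodeBlock.BASIC_LATIN);          // true

        // Script lookup and alias resolution
        System.out.println(Character.UnicodeScript.of('A'));          // LATIN
        System.out.println(Character.UnicodeScript.of('1'));          // COMMON
        System.out.println(Character.UnicodeScript.forName("Zyyy"));  // COMMON

        System.out.println(describe('a'));                       // BASIC_LATIN/LATIN
    }
}
```

Because Subset.equals is final and compares identity, each block is a singleton and can be compared with ==, as the forName check does.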

package java.lang;

abstract class CharacterData {
    abstract int getProperties(int ch);
    abstract int getType(int ch);
    abstract boolean isWhitespace(int ch);
    abstract boolean isMirrored(int ch);
    abstract boolean isJavaIdentifierStart(int ch);
    abstract boolean isJavaIdentifierPart(int ch);
    abstract boolean isUnicodeIdentifierStart(int ch);
    abstract boolean isUnicodeIdentifierPart(int ch);
    abstract boolean isIdentifierIgnorable(int ch);
    abstract int toLowerCase(int ch);
    abstract int toUpperCase(int ch);
    abstract int toTitleCase(int ch);
    abstract int digit(int ch, int radix);
    abstract int getNumericValue(int ch);
    abstract byte getDirectionality(int ch);

    //need to implement for JSR204
    int toUpperCaseEx(int ch) {
        return toUpperCase(ch);
    }

    char[] toUpperCaseCharArray(int ch) {
        return null;
    }

    boolean isOtherLowercase(int ch) {
        return false;
    }

    boolean isOtherUppercase(int ch) {
        return false;
    }

    boolean isOtherAlphabetic(int ch) {
        return false;
    }

    boolean isIdeographic(int ch) {
        return false;
    }

    // Character <= 0xff (basic latin) is handled by internal fast-path
    // to avoid initializing large tables.
    // Note: performance of this "fast-path" code may be sub-optimal
    // in negative cases for some accessors due to complicated ranges.
    // Should revisit after optimization of table initialization.

    static final CharacterData of(int ch) {
        if (ch >>> 8 == 0) {     // fast-path
            return CharacterDataLatin1.instance;
        } else {
            switch(ch >>> 16) {  //plane 00-16
            case(0):
                return CharacterData00.instance;
            case(1):
                return CharacterData01.instance;
            case(2):
                return CharacterData02.instance;
            case(14):
                return CharacterData0E.instance;
            case(15):   // Private Use
            case(16):   // Private Use
                return CharacterDataPrivateUse.instance;
            default:
                return CharacterDataUndefined.instance;
            }
        }
    }
}
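The dispatch in CharacterData.of relies on two unsigned shifts: `ch >>> 8 == 0` selects the Latin-1 fast path, and `ch >>> 16` yields the Unicode plane number. A minimal sketch of that arithmetic follows, with hypothetical helper names.

```java
// Sketch of the shifts used by CharacterData.of; PlaneSketch is hypothetical.
class PlaneSketch {

    /** Unicode plane number: 0 for the BMP, up to 16 for valid code points. */
    static int plane(int codePoint) {
        return codePoint >>> 16;
    }

    /** True when the code point fits in 8 bits, i.e. U+0000 to U+00FF. */
    static boolean isLatin1(int codePoint) {
        return codePoint >>> 8 == 0;
    }

    public static void main(String[] args) {
        System.out.println(isLatin1('A'));      // true
        System.out.println(isLatin1(0x0100));   // false
        System.out.println(plane(0x4E2D));      // 0
        System.out.println(plane(0x1F600));     // 1
        System.out.println(plane(0x10FFFF));    // 16
    }
}
```

Since `>>>` is an unsigned shift, a negative argument produces a large positive plane number and falls through to the default case in the listing instead of matching a real plane.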


class CharacterData00 extends CharacterData {
    int getProperties(int ch) {
        char offset = (char)ch;
        int props = A[Y[X[offset>>5]|((offset>>1)&0xF)]|(offset&0x1)];
        return props;
    }

    int getPropertiesEx(int ch) {
        char offset = (char)ch;
        int props = B[Y[X[offset>>5]|((offset>>1)&0xF)]|(offset&0x1)];
        return props;
    }

    int getType(int ch) {
        int props = getProperties(ch);
        return (props & 0x1F);
    }
	
    boolean isJavaIdentifierPart(int ch) {
        int props = getProperties(ch);
        return ((props & 0x00003000) != 0);
    }

    boolean isUnicodeIdentifierStart(int ch) {
        int props = getProperties(ch);
        return ((props & 0x00007000) == 0x00007000);
    }

    boolean isUnicodeIdentifierPart(int ch) {
        int props = getProperties(ch);
        return ((props & 0x00001000) != 0);
    }
	
    int toLowerCase(int ch) {
        int mapChar = ch;
        int val = getProperties(ch);

        if ((val & 0x00020000) != 0) {
          if ((val & 0x07FC0000) == 0x07FC0000) {
            switch(ch) {
              // map the offset overflow chars
            case 0x0130 : mapChar = 0x0069; break;
            case 0x2126 : mapChar = 0x03C9; break;
            case 0x212A : mapChar = 0x006B; break;
            case 0x212B : mapChar = 0x00E5; break;
            case 0xA78D : mapChar = 0x0265; break;
            case 0xA7AA : mapChar = 0x0266; break;
              // default mapChar is already set, so no
              // need to redo it here.
              // default       : mapChar = ch;
            }
          }
          else {
            int offset = val << 5 >> (5+18);
            mapChar = ch + offset;
          }
        }
        return mapChar;
    }	

    static {
            charMap = new char[][][] {
        { {'\u00DF'}, {'\u0053', '\u0053', } },
        { {'\u0130'}, {'\u0130', } },
        { {'\u0149'}, {'\u02BC', '\u004E', } },
        { {'\uFB13'}, {'\u0544', '\u0546', } },
        { {'\uFB14'}, {'\u0544', '\u0535', } },
        { {'\uFB15'}, {'\u0544', '\u053B', } },
        { {'\uFB16'}, {'\u054E', '\u0546', } },
        { {'\uFB17'}, {'\u0544', '\u053D', } },
    };
        { // THIS CODE WAS AUTOMATICALLY CREATED BY GenerateCharacter:
            char[] data = A_DATA.toCharArray();
            assert (data.length == (930 * 2));
            int i = 0, j = 0;
            while (i < (930 * 2)) {
                int entry = data[i++] << 16;
                A[j++] = entry | data[i++];
            }
        }

    }        
}
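The expression `val << 5 >> (5+18)` in toLowerCase extracts a signed 9-bit field from bits 18 to 26 of the property word: the left shift moves bit 26 into the sign position, and the arithmetic right shift then sign-extends it. The sketch below demonstrates this with made-up property words. The encode helper and the constant names are assumptions for illustration; only the decode expression and the two mask values are taken from the listing.

```java
// Sketch of the signed offset field; names other than the decode expression
// and the mask values are hypothetical.
class CaseOffsetSketch {
    static final int HAS_LOWER   = 0x00020000; // flag tested in toLowerCase
    static final int OFFSET_MASK = 0x07FC0000; // 9-bit field, bits 18..26

    /** Packs a signed offset into the 9-bit field (hypothetical helper). */
    static int encode(int offset) {
        return ((offset << 18) & OFFSET_MASK) | HAS_LOWER;
    }

    /** Same expression as in the listing: sign-extends bits 18..26. */
    static int decode(int props) {
        return props << 5 >> (5 + 18);
    }

    public static void main(String[] args) {
        int props = encode(32);                            // e.g. 'A' -> 'a'
        System.out.println(decode(props));                 // 32
        System.out.println((char) ('A' + decode(props)));  // a
        System.out.println(decode(encode(-8)));            // -8
    }
}
```

A field with all nine bits set is the pattern the listing checks with `(val & 0x07FC0000) == 0x07FC0000`, which is why those characters are mapped through the explicit switch instead of an offset.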

package java.lang;

/** The CharacterData class encapsulates the large tables found in
    Java.lang.Character. */

class CharacterDataPrivateUse extends CharacterData {

    int getProperties(int ch) {
        return 0;
    }

    int getType(int ch) {
	return (ch & 0xFFFE) == 0xFFFE
	    ? Character.UNASSIGNED
	    : Character.PRIVATE_USE;
    }

    boolean isJavaIdentifierStart(int ch) {
		return false;
    }

    boolean isJavaIdentifierPart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierStart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierPart(int ch) {
		return false;
    }

    boolean isIdentifierIgnorable(int ch) {
		return false;
    }

    int toLowerCase(int ch) {
		return ch;
    }

    int toUpperCase(int ch) {
		return ch;
    }

    int toTitleCase(int ch) {
		return ch;
    }

    int digit(int ch, int radix) {
		return -1;
    }

    int getNumericValue(int ch) {
		return -1;
    }

    boolean isWhitespace(int ch) {
		return false;
    }

    byte getDirectionality(int ch) {
	return (ch & 0xFFFE) == 0xFFFE
	    ? Character.DIRECTIONALITY_UNDEFINED
	    : Character.DIRECTIONALITY_LEFT_TO_RIGHT;
    }

    boolean isMirrored(int ch) {
		return false;
    }

    static final CharacterData instance = new CharacterDataPrivateUse();
    private CharacterDataPrivateUse() {};
}
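The effect of CharacterDataPrivateUse.getType can be observed through the public Character.getType method. In planes 15 and 16 every code point is reported as private use except the last two of each plane, which match `(ch & 0xFFFE) == 0xFFFE` and are reported as unassigned. The class and helper names below are hypothetical.

```java
// Observing the private-use planes through the public API.
class PrivateUseDemo {

    /** Names the two categories of interest (hypothetical helper). */
    static String category(int codePoint) {
        int type = Character.getType(codePoint);
        if (type == Character.PRIVATE_USE) {
            return "PRIVATE_USE";
        }
        if (type == Character.UNASSIGNED) {
            return "UNASSIGNED";
        }
        return "OTHER(" + type + ")";
    }

    public static void main(String[] args) {
        System.out.println(category(0xF0000));    // PRIVATE_USE
        System.out.println(category(0xFFFFE));    // UNASSIGNED
        System.out.println(category(0x10FFFD));   // PRIVATE_USE
        System.out.println(category(0x10FFFF));   // UNASSIGNED
    }
}
```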

package java.lang;

/** The CharacterData class encapsulates the large tables found in
    Java.lang.Character. */

class CharacterDataUndefined extends CharacterData {

    int getProperties(int ch) {
        return 0;
    }

    int getType(int ch) {
	return Character.UNASSIGNED;
    }

    boolean isJavaIdentifierStart(int ch) {
		return false;
    }

    boolean isJavaIdentifierPart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierStart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierPart(int ch) {
		return false;
    }

    boolean isIdentifierIgnorable(int ch) {
		return false;
    }

    int toLowerCase(int ch) {
		return ch;
    }

    int toUpperCase(int ch) {
		return ch;
    }

    int toTitleCase(int ch) {
		return ch;
    }

    int digit(int ch, int radix) {
		return -1;
    }

    int getNumericValue(int ch) {
		return -1;
    }

    boolean isWhitespace(int ch) {
		return false;
    }

    byte getDirectionality(int ch) {
		return Character.DIRECTIONALITY_UNDEFINED;
    }

    boolean isMirrored(int ch) {
		return false;
    }

    static final CharacterData instance = new CharacterDataUndefined();
    private CharacterDataUndefined() {};
}
