Dissecting the implementation of Go's map
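This article is based on the Go 1.10 source code.
In Go, a map is a collection of key/value pairs. Underneath it is a hash table that resolves collisions by chaining. The compiler cooperates with the runtime so that every map type shares a single copy of the implementation code.
Comparison with other languages
C++'s std::map organizes its entries in a red-black tree: lookups are somewhat slower than in a hash table, but performance is very stable. Templates generate the code at compile time, which is efficient at run time but causes code bloat and longer compile times.
Java's HashMap uses a hash table with linked lists that can turn into red-black trees: once the number of entries in one bucket exceeds a threshold, that bucket's list is converted to a red-black tree. So that all maps can share one copy of the code, Java only allows subclasses of Object as map keys; the drawback is that primitive types must be boxed in an object before they can be used with a map.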
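1. Choosing the hash function
Hash functions come in cryptographic and non-cryptographic kinds. Cryptographic hashes are generally used for message digests, signatures and similar purposes; typical examples are MD5, SHA-1 and SHA-256. Non-cryptographic hashes are generally used for lookup, which is the use case for a map. Two things matter most when choosing one: performance and collision probability.
For a performance comparison of specific hash functions, see: http://aras-p.info/blog/2016/08/09/More-Hash-Function-Tests/
The hash algorithm Go uses depends on the hardware: if the CPU supports AES instructions it uses an AES-based hash, otherwise it uses memhash, whose implementation is modeled on xxhash and cityhash and is very fast.
When mapping a hash value to a bucket, Go keeps the number of buckets at a power of two. With m = 2^B we have n % m = n & (m-1), so a bit mask replaces the expensive modulo operation.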
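The mask trick can be checked with a few lines of Go. This is a minimal sketch, not runtime code: bucketIndex is a hypothetical helper name, and the hash values are made up.

```go
package main

import "fmt"

// bucketIndex (hypothetical helper) picks a bucket for hash when the table
// has 1<<B buckets. Because the bucket count is a power of two, masking the
// low B bits gives the same result as hash % (1<<B).
func bucketIndex(hash uint64, B uint8) uint64 {
	mask := uint64(1)<<B - 1
	return hash & mask
}

func main() {
	const B = 3 // 8 buckets
	for _, h := range []uint64{0, 7, 8, 13, 1<<40 + 5} {
		// The last two columns are always equal.
		fmt.Println(h, bucketIndex(h, B), h%(1<<B))
	}
}
```

The equivalence only holds for power-of-two bucket counts, which is why the runtime stores B (the log of the count) instead of the count itself.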
2. Structure
First, let's look at the structures that make up a map:
    // A header for a Go map.
    type hmap struct {
        // Note: the format of the hmap is also encoded in cmd/compile/internal/gc/reflect.go.
        // Make sure this stays in sync with the compiler's definition.
        count     int    // # live cells == size of map. Must be first (used by len() builtin)
        flags     uint8
        B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
        noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
        hash0     uint32 // hash seed

        buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
        oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
        nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)

        extra *mapextra // optional fields
    }

    // mapextra holds fields that are not present on all maps.
    type mapextra struct {
        // If both key and value do not contain pointers and are inline, then we mark bucket
        // type as containing no pointers. This avoids scanning such maps.
        // However, bmap.overflow is a pointer. In order to keep overflow buckets
        // alive, we store pointers to all overflow buckets in hmap.extra.overflow and hmap.extra.oldoverflow.
        // overflow and oldoverflow are only used if key and value do not contain pointers.
        // overflow contains overflow buckets for hmap.buckets.
        // oldoverflow contains overflow buckets for hmap.oldbuckets.
        // The indirection allows to store a pointer to the slice in hiter.
        overflow    *[]*bmap
        oldoverflow *[]*bmap

        // nextOverflow holds a pointer to a free overflow bucket.
        nextOverflow *bmap
    }

    // A bucket for a Go map.
    type bmap struct {
        // tophash generally contains the top byte of the hash value
        // for each key in this bucket. If tophash[0] < minTopHash,
        // tophash[0] is a bucket evacuation state instead.
        tophash [bucketCnt]uint8
        // Followed by bucketCnt keys and then bucketCnt values.
        // NOTE: packing all the keys together and then all the values together makes the
        // code a bit more complicated than alternating key/value/key/value/... but it allows
        // us to eliminate padding which would be needed for, e.g., map[int64]int8.
        // Followed by an overflow pointer.
    }
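A map is built mainly from three structures:
- hmap --- the outermost data structure of a map; it holds the map's basic information, such as its size and its buckets.
- mapextra --- records extra information about the map, for example the overflow buckets.
- bmap --- represents a bucket. Each bucket holds at most 8 key/value pairs and ends with an overflow field that points to the next bmap. Note that the key, value and overflow fields are not declared explicitly; they are reached through offsets computed from maptype.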

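[Figure: hmap.001.png]
hmap.extra.nextOverflow points to the preallocated overflow buckets; once they are used up, it becomes nil.
hmap.noverflow is the number of overflow buckets. It is exact when B is less than 16 and approximate when B is 16 or more.
hmap.count is the current number of elements in the map, that is, the value len() returns.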
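2.1 Design rationale
With the structures introduced, let's look at why they are designed this way.
2.1.1 bmap details
When collisions occur in a Go map, it does not allocate one node per key and link the nodes into a list. The unit of chaining is the bmap, and one bmap holds 8 key/value pairs. This reduces the number of objects, lightens the memory-management load and helps the GC.
If an insert finds all 8 cells of a bmap occupied, a new bmap (an overflow bucket) is chained after it. Preallocated overflow buckets are used first; once they run out, a new one is allocated with malloc. Note that a Go map never shrinks its bucket array, so memory use only goes up: deleting every key in an overflow bucket does not by itself free that bucket.
The top 8 bits of the hash are stored in the bucket's tophash field. Since a bucket holds at most 8 pairs, tophash has type [8]uint8. Storing the top byte lets a lookup reject most non-matching keys without a full key comparison, which speeds it up. When the top byte is smaller than the constant minTopHash, minTopHash is added to it, because values in [0, minTopHash) are reserved as special markers. To look up a key, compute its hash, search tophash for the top byte, and compare the full keys only where tophash matches.
There is something I am not sure about here: 1. why is minTopHash added only when the value is below minTopHash, and 2. why is it an addition rather than a bit operation? If top happens to fall in [0, minTopHash), or overflows into that range after minTopHash is added, couldn't that cause a false match?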
    // tophash calculates the tophash value for hash.
    func tophash(hash uintptr) uint8 {
        top := uint8(hash >> (sys.PtrSize*8 - 8))
        if top < minTopHash {
            top += minTopHash
        }
        return top
    }
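On the question of whether adding minTopHash can cause a false match, the sketch below may help. It assumes the marker constants as I recall them from Go 1.10 (empty=0 through minTopHash=4; treat the exact values as an assumption) and fixes the hash width at 64 bits. The addition only runs when top is at most 3, so the result is at most 7 and cannot wrap around. Two different top bytes can end up with the same stored value (0 and 4 both become 4), but tophash is only a filter: the full key comparison that follows decides the match, so the cost is an extra comparison, not a wrong answer.

```go
package main

import "fmt"

// Marker values assumed from Go 1.10's runtime/hashmap.go: tophash values
// below minTopHash describe the state of a cell rather than a hash.
const (
	empty          = 0
	evacuatedEmpty = 1
	evacuatedX     = 2
	evacuatedY     = 3
	minTopHash     = 4
)

// tophash mirrors the runtime function for a 64-bit hash.
func tophash(hash uint64) uint8 {
	top := uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash // top <= 3 here, so the sum is at most 7: no overflow
	}
	return top
}

func main() {
	for _, raw := range []uint64{0, 3, 4, 7, 255} {
		fmt.Printf("top byte %3d is stored as %3d\n", raw, tophash(raw<<56))
	}
}
```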
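Within a bmap all keys are stored together, followed by all values. This avoids the padding that an alternating key/value layout would need for alignment (for example in map[int64]int8).
When a key is larger than 128 bytes, the bucket's key slot stores a pointer to the actual key contents; the same applies to values.
Go (as of 1.10) has no generics. To make map work for any key and value types, Go defines a maptype type that records which hash function the key type uses, the bucket size, how keys are compared and so on; generic behavior is implemented through this descriptor.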
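The padding saved by grouping keys and values can be measured with unsafe.Sizeof. The two struct types below are hypothetical stand-ins for a map[int64]int8 bucket (the real bmap declares only tophash); they are a sketch for comparing layouts, not the runtime's definition.

```go
package main

import (
	"fmt"
	"unsafe"
)

// interleaved stores key/value/key/value...; every pair is padded so that
// the next int64 key stays aligned.
type interleaved struct {
	tophash [8]uint8
	pairs   [8]struct {
		k int64
		v int8
	}
	overflow uintptr
}

// packed stores all keys and then all values, the way bmap lays them out.
type packed struct {
	tophash  [8]uint8
	keys     [8]int64
	values   [8]int8
	overflow uintptr
}

// sizes returns the size in bytes of both layouts on the current platform.
func sizes() (interleavedSize, packedSize uintptr) {
	return unsafe.Sizeof(interleaved{}), unsafe.Sizeof(packed{})
}

func main() {
	i, p := sizes()
	fmt.Println("interleaved:", i, "bytes, packed:", p, "bytes")
}
```

On a typical 64-bit platform I would expect 144 bytes against 88 bytes; the exact numbers depend on the platform's alignment rules, but the packed layout should never be larger.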
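2.1.2 Growth design
As the chains hanging off the buckets get longer, performance degrades, so the map has to grow by increasing the number of buckets.
When the number of elements exceeds 6.5 times the number of buckets, the map grows and the bucket count doubles. After the table grows, the old data has to be moved to the new table (the source calls this evacuate). The move is not done in one go but incrementally (during insert and remove), which amortizes the cost of growing. To avoid the case where some bucket is never touched and growth never finishes, evacuation also proceeds sequentially: after a write evacuates the bucket it touches, it also evacuates the next not-yet-evacuated bucket in order, so in the worst case n writes are enough to finish evacuating a table with n old buckets.
Growth creates a new table twice the size of the old one. After an old bucket has been moved to the new table it is not removed from oldbuckets; instead it is marked as evacuated.
Only when every bucket has been moved from the old table to the new one is oldbuckets released. What if the threshold is exceeded again while growth is in progress? If the map is already growing, no new growth is started.
With the overall idea covered, let's read the source for creation, lookup, assignment and deletion.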
3. Source code walkthrough
3.1 Creation
    // makemap implements Go map creation for make(map[k]v, hint).
    // If the compiler has determined that the map or the first bucket
    // can be created on the stack, h and/or bucket may be non-nil.
    // If h != nil, the map can be created directly in h.
    // If h.buckets != nil, bucket pointed to can be used as the first bucket.
    func makemap(t *maptype, hint int, h *hmap) *hmap {
        if hint < 0 || hint > int(maxSliceCap(t.bucket.size)) {
            hint = 0
        }

        // initialize Hmap
        if h == nil {
            h = new(hmap)
        }
        h.hash0 = fastrand()

        // find size parameter which will hold the requested # of elements
        B := uint8(0)
        for overLoadFactor(hint, B) {
            B++
        }
        h.B = B

        // allocate initial hash table
        // if B == 0, the buckets field is allocated lazily later (in mapassign)
        // If hint is large zeroing this memory could take a while.
        if h.B != 0 {
            var nextOverflow *bmap
            h.buckets, nextOverflow = makeBucketArray(t, h.B, nil)
            if nextOverflow != nil {
                h.extra = new(mapextra)
                h.extra.nextOverflow = nextOverflow
            }
        }

        return h
    }
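hint guides how many buckets are created along with the map. If B comes out as 0 (a hint of 8 or less), no buckets are allocated up front; they are allocated lazily on the first assignment. The rest of the flow is simple: set the hash seed, compute the bucket count, allocate the buckets.
Next, let's see what allocating the buckets actually does: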
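To see how hint turns into B, the loop from makemap can be run on its own. The sketch below copies overLoadFactor with bucketShift inlined and the constants written out (bucketCnt=8, loadFactorNum=13, loadFactorDen=2); initialB is a hypothetical helper name that wraps the loop.

```go
package main

import "fmt"

const (
	bucketCnt     = 8
	loadFactorNum = 13
	loadFactorDen = 2
)

// overLoadFactor follows the Go 1.10 runtime function, with bucketShift(B)
// written as uintptr(1)<<B.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

// initialB (hypothetical helper) repeats the loop in makemap: grow B until
// hint elements no longer exceed the load factor.
func initialB(hint int) uint8 {
	B := uint8(0)
	for overLoadFactor(hint, B) {
		B++
	}
	return B
}

func main() {
	for _, hint := range []int{0, 8, 9, 14, 100, 1000} {
		B := initialB(hint)
		fmt.Printf("hint=%d B=%d buckets=%d\n", hint, B, 1<<B)
	}
}
```

Any hint up to 8 keeps B at 0, because a single bucket already holds 8 pairs; that is the case in which makemap leaves the bucket allocation to mapassign.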
    // makeBucketArray initializes a backing array for map buckets.
    // 1<<b is the minimum number of buckets to allocate.
    // dirtyalloc should either be nil or a bucket array previously
    // allocated by makeBucketArray with the same t and b parameters.
    // If dirtyalloc is nil a new backing array will be alloced and
    // otherwise dirtyalloc will be cleared and reused as backing array.
    func makeBucketArray(t *maptype, b uint8, dirtyalloc unsafe.Pointer) (buckets unsafe.Pointer, nextOverflow *bmap) {
        base := bucketShift(b)
        nbuckets := base
        // For small b, overflow buckets are unlikely.
        // Avoid the overhead of the calculation.
        if b >= 4 {
            // Add on the estimated number of overflow buckets
            // required to insert the median number of elements
            // used with this value of b.
            nbuckets += bucketShift(b - 4)
            sz := t.bucket.size * nbuckets
            up := roundupsize(sz)
            if up != sz {
                nbuckets = up / t.bucket.size
            }
        }

        if dirtyalloc == nil {
            buckets = newarray(t.bucket, int(nbuckets))
        } else {
            // dirtyalloc was previously generated by
            // the above newarray(t.bucket, int(nbuckets))
            // but may not be empty.
            buckets = dirtyalloc
            size := t.bucket.size * nbuckets
            if t.bucket.kind&kindNoPointers == 0 {
                memclrHasPointers(buckets, size)
            } else {
                memclrNoHeapPointers(buckets, size)
            }
        }

        if base != nbuckets {
            // We preallocated some overflow buckets.
            // To keep the overhead of tracking these overflow buckets to a minimum,
            // we use the convention that if a preallocated overflow bucket's overflow
            // pointer is nil, then there are more available by bumping the pointer.
            // We need a safe non-nil pointer for the last overflow bucket; just use buckets.
            nextOverflow = (*bmap)(add(buckets, base*uintptr(t.bucketsize)))
            last := (*bmap)(add(buckets, (nbuckets-1)*uintptr(t.bucketsize)))
            last.setoverflow(t, (*bmap)(buckets))
        }
        return buckets, nextOverflow
    }
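By default 2^b buckets are created. If b is 4 or more, some extra overflow buckets are preallocated as well. Every preallocated overflow bucket except the last has a nil overflow pointer; the last one's overflow pointer points to the first element of the bucket array and acts as a sentinel that marks the end.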
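The nil-means-more convention and the sentinel can be modeled without unsafe pointers. The sketch below is a toy model under stated assumptions: bucket, table, newTable, takeOverflow and countPreallocated are hypothetical names, an index replaces the runtime's pointer arithmetic, and the extra buckets that roundupsize may add are ignored.

```go
package main

import "fmt"

// bucket is a toy stand-in for bmap: only the overflow link matters here.
type bucket struct {
	overflow *bucket
}

// table models the array returned by makeBucketArray: the first base entries
// are regular buckets, the tail holds preallocated overflow buckets.
type table struct {
	buckets      []bucket
	nextOverflow *bucket // next free preallocated bucket, nil when exhausted
	next         int     // index of nextOverflow (replaces pointer arithmetic)
}

func newTable(base, prealloc int) *table {
	t := &table{buckets: make([]bucket, base+prealloc)}
	if prealloc > 0 {
		t.next = base
		t.nextOverflow = &t.buckets[base]
		// Sentinel: the last preallocated bucket points at buckets[0].
		t.buckets[len(t.buckets)-1].overflow = &t.buckets[0]
	}
	return t
}

// takeOverflow hands out an overflow bucket. A nil overflow pointer on the
// current free bucket means more preallocated buckets follow; a non-nil one
// marks the last. The bool reports whether the bucket came from the pool.
func (t *table) takeOverflow() (*bucket, bool) {
	if t.nextOverflow == nil {
		return &bucket{}, false // pool exhausted: allocate a fresh bucket
	}
	ovf := t.nextOverflow
	if ovf.overflow == nil {
		t.next++
		t.nextOverflow = &t.buckets[t.next]
	} else {
		ovf.overflow = nil // clear the sentinel before handing the bucket out
		t.nextOverflow = nil
	}
	return ovf, true
}

// countPreallocated drains the pool and counts how many buckets it held.
func countPreallocated(base, prealloc int) int {
	t := newTable(base, prealloc)
	n := 0
	for {
		if _, ok := t.takeOverflow(); !ok {
			return n
		}
		n++
	}
}

func main() {
	// For b=4 the runtime asks for 1<<4 regular buckets plus 1<<(4-4) extra.
	fmt.Println(countPreallocated(16, 1))
	fmt.Println(countPreallocated(32, 2))
}
```

The appeal of the convention is that no separate counter is needed: one pointer in mapextra plus the overflow field the buckets already have is enough to track the pool.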

[Figure: simplified creation flow]
3.2 Lookup
    // mapaccess1 returns a pointer to h[key]. Never returns nil, instead
    // it will return a reference to the zero object for the value type if
    // the key is not in the map.
    // NOTE: The returned pointer may keep the whole map live, so don't
    // hold onto it for very long.
    func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
        if raceenabled && h != nil {
            callerpc := getcallerpc()
            pc := funcPC(mapaccess1)
            racereadpc(unsafe.Pointer(h), callerpc, pc)
            raceReadObjectPC(t.key, key, callerpc, pc)
        }
        if msanenabled && h != nil {
            msanread(key, t.key.size)
        }
        if h == nil || h.count == 0 {
            return unsafe.Pointer(&zeroVal[0])
        }
        if h.flags&hashWriting != 0 {
            throw("concurrent map read and map write")
        }
        alg := t.key.alg
        hash := alg.hash(key, uintptr(h.hash0))
        m := bucketMask(h.B)
        b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize)))
        if c := h.oldbuckets; c != nil {
            if !h.sameSizeGrow() {
                // There used to be half as many buckets; mask down one more power of two.
                m >>= 1
            }
            oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize)))
            if !evacuated(oldb) {
                b = oldb
            }
        }
        top := tophash(hash)
        for ; b != nil; b = b.overflow(t) {
            for i := uintptr(0); i < bucketCnt; i++ {
                if b.tophash[i] != top {
                    continue
                }
                k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
                if t.indirectkey {
                    k = *((*unsafe.Pointer)(k))
                }
                if alg.equal(key, k) {
                    v := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
                    if t.indirectvalue {
                        v = *((*unsafe.Pointer)(v))
                    }
                    return v
                }
            }
        }
        return unsafe.Pointer(&zeroVal[0])
    }
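- Locate the bucket first. If the map is growing and this bucket has not been moved to the new table yet, search the old table instead.
- Search the bucket sequentially, using the top byte as a fast filter; when the top byte matches, compare the keys, and return the value on a match. If the key is not in the current bucket, continue into the overflow buckets; if it is in none of them, return the zero value.
Note that a lookup does no evacuation work while the map is growing, and that a lookup running concurrently with a write makes the runtime throw.
Also note that t.bucketsize is not the size of the bmap struct alone: it is bmap plus the stored keys, the stored values and the overflow pointer. That is why bucket addresses are computed with t.bucketsize rather than with the size of bmap.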
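The search inside a bucket chain can be sketched with ordinary typed fields. This is a simplified model, assuming a string key, an int value and a caller-supplied 64-bit hash; bucket, lookup and demoChain are hypothetical names, and the hashes are made up so that both keys share a top byte.

```go
package main

import "fmt"

const (
	bucketCnt  = 8
	minTopHash = 4 // assumed from Go 1.10
)

// Two made-up hashes with the same top byte.
const (
	hashA = uint64(0xAB) << 56
	hashB = uint64(0xAB)<<56 | 1
)

// bucket is a simplified bmap: the real one has no typed key/value fields
// and is addressed through offsets taken from maptype.
type bucket struct {
	tophash  [bucketCnt]uint8
	keys     [bucketCnt]string
	values   [bucketCnt]int
	overflow *bucket
}

func tophash(hash uint64) uint8 {
	top := uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash
	}
	return top
}

// lookup walks the chain: the one-byte tophash comparison filters cells,
// and only cells that pass it pay for a full key comparison.
func lookup(b *bucket, hash uint64, key string) (int, bool) {
	top := tophash(hash)
	for ; b != nil; b = b.overflow {
		for i := 0; i < bucketCnt; i++ {
			if b.tophash[i] != top {
				continue
			}
			if b.keys[i] == key {
				return b.values[i], true
			}
		}
	}
	return 0, false // the runtime returns a pointer to a shared zero value here
}

// demoChain builds a head bucket holding "a" and an overflow bucket holding "b".
func demoChain() *bucket {
	ovf := &bucket{}
	ovf.tophash[0], ovf.keys[0], ovf.values[0] = tophash(hashB), "b", 2
	head := &bucket{overflow: ovf}
	head.tophash[0], head.keys[0], head.values[0] = tophash(hashA), "a", 1
	return head
}

func main() {
	head := demoChain()
	fmt.Println(lookup(head, hashA, "a"))
	fmt.Println(lookup(head, hashB, "b"))
	fmt.Println(lookup(head, hashB, "missing"))
}
```

Empty cells keep tophash 0, which is below minTopHash and therefore never equals a real top byte, so the filter skips them for free.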

[Figure: simplified lookup flow]
3.3 Assignment
    // Like mapaccess, but allocates a slot for the key if it is not present in the map.
    func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
        if h == nil {
            panic(plainError("assignment to entry in nil map"))
        }
        if raceenabled {
            callerpc := getcallerpc()
            pc := funcPC(mapassign)
            racewritepc(unsafe.Pointer(h), callerpc, pc)
            raceReadObjectPC(t.key, key, callerpc, pc)
        }
        if msanenabled {
            msanread(key, t.key.size)
        }
        if h.flags&hashWriting != 0 {
            throw("concurrent map writes")
        }
        alg := t.key.alg
        hash := alg.hash(key, uintptr(h.hash0))

        // Set hashWriting after calling alg.hash, since alg.hash may panic,
        // in which case we have not actually done a write.
        h.flags |= hashWriting

        if h.buckets == nil {
            h.buckets = newobject(t.bucket) // newarray(t.bucket, 1)
        }

    again:
        bucket := hash & bucketMask(h.B)
        if h.growing() {
            growWork(t, h, bucket)
        }
        b := (*bmap)(unsafe.Pointer(uintptr(h.buckets) + bucket*uintptr(t.bucketsize)))
        top := tophash(hash)

        var inserti *uint8
        var insertk unsafe.Pointer
        var val unsafe.Pointer
        for {
            for i := uintptr(0); i < bucketCnt; i++ {
                if b.tophash[i] != top {
                    if b.tophash[i] == empty && inserti == nil {
                        inserti = &b.tophash[i]
                        insertk = add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
                        val = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
                    }
                    continue
                }
                k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
                if t.indirectkey {
                    k = *((*unsafe.Pointer)(k))
                }
                if !alg.equal(key, k) {
                    continue
                }
                // already have a mapping for key. Update it.
                if t.needkeyupdate {
                    typedmemmove(t.key, k, key)
                }
                val = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
                goto done
            }
            ovf := b.overflow(t)
            if ovf == nil {
                break
            }
            b = ovf
        }

        // Did not find mapping for key. Allocate new cell & add entry.

        // If we hit the max load factor or we have too many overflow buckets,
        // and we're not already in the middle of growing, start growing.
        if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
            hashGrow(t, h)
            goto again // Growing the table invalidates everything, so try again
        }

        if inserti == nil {
            // all current buckets are full, allocate a new one.
            newb := h.newoverflow(t, b)
            inserti = &newb.tophash[0]
            insertk = add(unsafe.Pointer(newb), dataOffset)
            val = add(insertk, bucketCnt*uintptr(t.keysize))
        }

        // store new key/value at insert position
        if t.indirectkey {
            kmem := newobject(t.key)
            *(*unsafe.Pointer)(insertk) = kmem
            insertk = kmem
        }
        if t.indirectvalue {
            vmem := newobject(t.elem)
            *(*unsafe.Pointer)(val) = vmem
        }
        typedmemmove(t.key, insertk, key)
        *inserti = top
        h.count++

    done:
        if h.flags&hashWriting == 0 {
            throw("concurrent map writes")
        }
        h.flags &^= hashWriting
        if t.indirectvalue {
            val = *((*unsafe.Pointer)(val))
        }
        return val
    }
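- If the hash table is growing and the bucket this write touches has not been moved to the new table yet, evacuate it first (growth is covered in detail below).
- Search the bucket chain for the key while remembering the first empty cell. If the key is found, update its value; if not, the remembered empty cell is where the new entry goes.
- If the key is not found, check whether the map needs to grow. If it does and no growth is in progress, start growing and go back to the first step.
- If the key is not found, no growth is needed and there is no empty cell, allocate an overflow bucket, hang it on the end of the chain and store the entry in the new bucket's first cell.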
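The steps above, minus the growth check, can be sketched on a single bucket chain. This is a simplified model and not the runtime's code: bucket, assign, chainLen and fakeHash are hypothetical names, keys are ints and values are strings, and fakeHash just places a distinct small number in the top byte.

```go
package main

import "fmt"

const (
	bucketCnt  = 8
	empty      = 0 // assumed from Go 1.10: tophash value of an empty cell
	minTopHash = 4
)

type bucket struct {
	tophash  [bucketCnt]uint8
	keys     [bucketCnt]int
	values   [bucketCnt]string
	overflow *bucket
}

// fakeHash is a stand-in hash for small non-negative keys.
func fakeHash(key int) uint64 { return uint64(0x10+key) << 56 }

func tophash(hash uint64) uint8 {
	top := uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash
	}
	return top
}

// assign updates key if it is present, otherwise stores it in the first
// empty cell seen, appending an overflow bucket when the chain is full.
// It reports whether a new cell was used.
func assign(head *bucket, hash uint64, key int, value string) bool {
	top := tophash(hash)
	var insB *bucket // bucket holding the first empty cell seen
	insI := -1
	b := head
	for {
		for i := 0; i < bucketCnt; i++ {
			if b.tophash[i] != top {
				if b.tophash[i] == empty && insB == nil {
					insB, insI = b, i
				}
				continue
			}
			if b.keys[i] == key {
				b.values[i] = value // key already present: update in place
				return false
			}
		}
		if b.overflow == nil {
			break
		}
		b = b.overflow
	}
	if insB == nil {
		// Every cell in the chain is full: hang a new bucket on the tail.
		b.overflow = &bucket{}
		insB, insI = b.overflow, 0
	}
	insB.tophash[insI], insB.keys[insI], insB.values[insI] = top, key, value
	return true
}

// chainLen counts the buckets in a chain, the head included.
func chainLen(b *bucket) int {
	n := 0
	for ; b != nil; b = b.overflow {
		n++
	}
	return n
}

func main() {
	head := &bucket{}
	for k := 0; k < 9; k++ {
		assign(head, fakeHash(k), k, "v")
	}
	fmt.Println("buckets in chain:", chainLen(head))
	fmt.Println("new cell used on update:", assign(head, fakeHash(3), 3, "w"))
}
```

Remembering the first empty cell during the search is what lets a single pass over the chain serve both the update case and the insert case.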
3.4 Deletion
    func mapdelete(t *maptype, h *hmap, key unsafe.Pointer) {
        if raceenabled && h != nil {
            callerpc := getcallerpc()
            pc := funcPC(mapdelete)
            racewritepc(unsafe.Pointer(h), callerpc, pc)
            raceReadObjectPC(t.key, key, callerpc, pc)
        }
        if msanenabled && h != nil {
            msanread(key, t.key.size)
        }
        if h == nil || h.count == 0 {
            return
        }
        if h.flags&hashWriting != 0 {
            throw("concurrent map writes")
        }

        alg := t.key.alg
        hash := alg.hash(key, uintptr(h.hash0))

        // Set hashWriting after calling alg.hash, since alg.hash may panic,
        // in which case we have not actually done a write (delete).
        h.flags |= hashWriting

        bucket := hash & bucketMask(h.B)
        if h.growing() {
            growWork(t, h, bucket)
        }
        b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize)))
        top := tophash(hash)
    search:
        for ; b != nil; b = b.overflow(t) {
            for i := uintptr(0); i < bucketCnt; i++ {
                if b.tophash[i] != top {
                    continue
                }
                k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
                k2 := k
                if t.indirectkey {
                    k2 = *((*unsafe.Pointer)(k2))
                }
                if !alg.equal(key, k2) {
                    continue
                }
                // Only clear key if there are pointers in it.
                if t.indirectkey {
                    *(*unsafe.Pointer)(k) = nil
                } else if t.key.kind&kindNoPointers == 0 {
                    memclrHasPointers(k, t.key.size)
                }
                v := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
                if t.indirectvalue {
                    *(*unsafe.Pointer)(v) = nil
                } else if t.elem.kind&kindNoPointers == 0 {
                    memclrHasPointers(v, t.elem.size)
                } else {
                    memclrNoHeapPointers(v, t.elem.size)
                }
                b.tophash[i] = empty
                h.count--
                break search
            }
        }

        if h.flags&hashWriting == 0 {
            throw("concurrent map writes")
        }
        h.flags &^= hashWriting
    }
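- If the map is growing and the bucket being operated on has not been evacuated yet, evacuate it.
- Find the matching key and clear its cell: key and value slots that hold pointers are zeroed so the GC can reclaim what they referenced, the tophash entry is set to empty and count is decremented. The cell itself stays in its bucket; no bucket memory is returned at this point.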
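3.5 Growth
From the assignment and deletion paths we can see that writes drive growth: assignment decides whether to start it, and both assignment and deletion call growWork to move data along. The check that decides whether to grow looks like this: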
    // overLoadFactor reports whether count items placed in 1<<B buckets is over loadFactor.
    func overLoadFactor(count int, B uint8) bool {
        return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
    }

    // tooManyOverflowBuckets reports whether noverflow buckets is too many for a map with 1<<B buckets.
    // Note that most of these overflow buckets must be in sparse use;
    // if use was dense, then we'd have already triggered regular map growth.
    func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
        // If the threshold is too low, we do extraneous work.
        // If the threshold is too high, maps that grow and shrink can hold on to lots of unused memory.
        // "too many" means (approximately) as many overflow buckets as regular buckets.
        // See incrnoverflow for more details.
        if B > 15 {
            B = 15
        }
        // The compiler doesn't see here that B < 16; mask B to generate shorter shift code.
        return noverflow >= uint16(1)<<(B&15)
    }

    {
        ....
        // If we hit the max load factor or we have too many overflow buckets,
        // and we're not already in the middle of growing, start growing.
        if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
            hashGrow(t, h)
            goto again // Growing the table invalidates everything, so try again
        }
        ....
    }
In plain words, the code means:
    func overLoadFactor(count int, B uint8) bool {
        // In words: element count > 8 && element count > number of buckets * 6.5.
        // loadFactorNum is the constant 13 and loadFactorDen is the constant 2, hence 6.5.
        // The number of buckets does not include overflow buckets.
        return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
    }

    func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
        if B > 15 {
            B = 15
        }
        // In words: there are at least as many overflow buckets as regular
        // buckets, with the bucket count capped at 2^15.
        return noverflow >= uint16(1)<<(B&15)
    }

    // Pseudocode for the check in mapassign:
    // if not already growing && (elements per bucket over the limit || too many overflow buckets) {
    //     grow
    // }
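Plugging numbers into the two conditions shows which kind of growth each one leads to. In the sketch below the two predicates follow the runtime functions with bucketShift inlined, while growKind is a hypothetical helper that only names the outcome; it is not part of the runtime.

```go
package main

import "fmt"

const (
	bucketCnt     = 8
	loadFactorNum = 13
	loadFactorDen = 2
)

func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
	if B > 15 {
		B = 15
	}
	return noverflow >= uint16(1)<<(B&15)
}

// growKind (hypothetical) reports what an insert into a map with count
// elements, noverflow overflow buckets and 1<<B buckets would start.
func growKind(count int, noverflow uint16, B uint8, growing bool) string {
	switch {
	case growing:
		return "none" // a growth already in progress is never restarted
	case overLoadFactor(count+1, B):
		return "double" // too many elements per bucket: twice the buckets
	case tooManyOverflowBuckets(noverflow, B):
		return "same-size" // sparse overflow chains: rebuild at the same size
	}
	return "none"
}

func main() {
	// B=3 means 8 buckets, so the load-factor limit is 6.5*8 = 52 elements.
	fmt.Println(growKind(51, 0, 3, false))
	fmt.Println(growKind(52, 0, 3, false))
	fmt.Println(growKind(10, 8, 3, false))
	fmt.Println(growKind(52, 0, 3, true))
}
```

The same-size case is the one hashGrow marks with the sameSizeGrow flag: the bucket count stays put and the entries are repacked into fresh buckets.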
Once the check decides that growth is needed, the first step is to grow the hash table itself:
    // Only grows the hash table; no data is moved here.
    func hashGrow(t *maptype, h *hmap) {
        // If we've hit the load factor, get bigger.
        // Otherwise, there are too many overflow buckets,
        // so keep the same number of buckets and "grow" laterally.
        bigger := uint8(1)
        if !overLoadFactor(h.count+1, h.B) {
            bigger = 0
            h.flags |= sameSizeGrow
        }
        oldbuckets := h.buckets
        newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)

        flags := h.flags &^ (iterator | oldIterator)
        if h.flags&iterator != 0 {
            flags |= oldIterator
        }
        // commit the grow (atomic wrt gc)
        h.B += bigger
        h.flags = flags
        h.oldbuckets = oldbuckets
        h.buckets = newbuckets
        h.nevacuate = 0
        h.noverflow = 0

        if h.extra != nil && h.extra.overflow != nil {
            // Promote current overflow buckets to the old generation.
            if h.extra.oldoverflow != nil {
                throw("oldoverflow is not nil")
            }
            h.extra.oldoverflow = h.extra.overflow
            h.extra.overflow = nil
        }
        if nextOverflow != nil {
            if h.extra == nil {
                h.extra = new(mapextra)
            }
            h.extra.nextOverflow = nextOverflow
        }

        // the actual copying of the hash table data is done incrementally
        // by growWork() and evacuate().
    }
hashGrow only allocates space and initializes fields; the actual moving is done in growWork:
    func growWork(t *maptype, h *hmap, bucket uintptr) {
        // make sure we evacuate the oldbucket corresponding
        // to the bucket we're about to use
        evacuate(t, h, bucket&h.oldbucketmask())

        // evacuate one more oldbucket to make progress on growing
        if h.growing() {
            evacuate(t, h, h.nevacuate)
        }
    }
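evacuate is the function that moves one specific bucket. As the code shows, growWork evacuates two buckets: the one passed in, and h.nevacuate, a counter that advances sequentially. Consider what would happen if only the bucket being written (assigned or deleted) were evacuated: some buckets might never be touched, growth would never finish and the map would stay in the growing state forever. So one extra sequential evacuation is performed each time; in theory, with N old buckets, at most N writes are guaranteed to finish the evacuation.
Now let's look at how evacuate is implemented: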
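The claim that N writes are enough can be checked with a toy model that tracks only which old buckets have been evacuated. The sketch below assumes every write lands on the same bucket, which is the worst case for the first evacuate call; grower and writesToFinish are hypothetical names, and the 1024-step cap inside advanceEvacuationMark is left out.

```go
package main

import "fmt"

// grower is a toy model of a growing map: one flag per old bucket.
type grower struct {
	evacuated []bool
	nevacuate int // every old bucket below this index is evacuated
}

func (g *grower) growing() bool { return g.nevacuate < len(g.evacuated) }

// evacuate marks one old bucket as moved and, like the runtime, advances
// the progress mark when that bucket is the one the mark points at.
func (g *grower) evacuate(old int) {
	g.evacuated[old] = true
	if old == g.nevacuate {
		g.nevacuate++
		for g.nevacuate < len(g.evacuated) && g.evacuated[g.nevacuate] {
			g.nevacuate++
		}
	}
}

// growWork mirrors the two evacuate calls made on each write.
func (g *grower) growWork(bucket int) {
	g.evacuate(bucket & (len(g.evacuated) - 1)) // like bucket & h.oldbucketmask()
	if g.growing() {
		g.evacuate(g.nevacuate)
	}
}

// writesToFinish counts the writes, all hitting the same bucket, needed
// until no old bucket is left. oldBuckets must be a power of two.
func writesToFinish(oldBuckets, bucket int) int {
	g := &grower{evacuated: make([]bool, oldBuckets)}
	writes := 0
	for g.growing() {
		g.growWork(bucket)
		writes++
	}
	return writes
}

func main() {
	for _, n := range []int{1, 8, 64} {
		fmt.Printf("old buckets=%d writes needed=%d\n", n, writesToFinish(n, 0))
	}
}
```

Because the second evacuate call always targets the lowest bucket that is still pending, every write makes progress regardless of which bucket it touches.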
    func evacuate(t *maptype, h *hmap, oldbucket uintptr) {
        b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)))
        newbit := h.noldbuckets()
        if !evacuated(b) {
            // TODO: reuse overflow buckets instead of using new ones, if there
            // is no iterator using the old buckets.  (If !oldIterator.)

            // xy contains the x and y (low and high) evacuation destinations.
            var xy [2]evacDst
            x := &xy[0]
            x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize)))
            x.k = add(unsafe.Pointer(x.b), dataOffset)
            x.v = add(x.k, bucketCnt*uintptr(t.keysize))

            if !h.sameSizeGrow() {
                // Only calculate y pointers if we're growing bigger.
                // Otherwise GC can see bad pointers.
                y := &xy[1]
                y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize)))
                y.k = add(unsafe.Pointer(y.b), dataOffset)
                y.v = add(y.k, bucketCnt*uintptr(t.keysize))
            }

            for ; b != nil; b = b.overflow(t) {
                k := add(unsafe.Pointer(b), dataOffset)
                v := add(k, bucketCnt*uintptr(t.keysize))
                for i := 0; i < bucketCnt; i, k, v = i+1, add(k, uintptr(t.keysize)), add(v, uintptr(t.valuesize)) {
                    top := b.tophash[i]
                    if top == empty {
                        b.tophash[i] = evacuatedEmpty
                        continue
                    }
                    if top < minTopHash {
                        throw("bad map state")
                    }
                    k2 := k
                    if t.indirectkey {
                        k2 = *((*unsafe.Pointer)(k2))
                    }
                    var useY uint8
                    if !h.sameSizeGrow() {
                        // Compute hash to make our evacuation decision (whether we need
                        // to send this key/value to bucket x or bucket y).
                        hash := t.key.alg.hash(k2, uintptr(h.hash0))
                        if h.flags&iterator != 0 && !t.reflexivekey && !t.key.alg.equal(k2, k2) {
                            // If key != key (NaNs), then the hash could be (and probably
                            // will be) entirely different from the old hash. Moreover,
                            // it isn't reproducible. Reproducibility is required in the
                            // presence of iterators, as our evacuation decision must
                            // match whatever decision the iterator made.
                            // Fortunately, we have the freedom to send these keys either
                            // way. Also, tophash is meaningless for these kinds of keys.
                            // We let the low bit of tophash drive the evacuation decision.
                            // We recompute a new random tophash for the next level so
                            // these keys will get evenly distributed across all buckets
                            // after multiple grows.
                            useY = top & 1
                            top = tophash(hash)
                        } else {
                            if hash&newbit != 0 {
                                useY = 1
                            }
                        }
                    }

                    if evacuatedX+1 != evacuatedY {
                        throw("bad evacuatedN")
                    }

                    b.tophash[i] = evacuatedX + useY // evacuatedX + 1 == evacuatedY
                    dst := &xy[useY]                 // evacuation destination

                    if dst.i == bucketCnt {
                        dst.b = h.newoverflow(t, dst.b)
                        dst.i = 0
                        dst.k = add(unsafe.Pointer(dst.b), dataOffset)
                        dst.v = add(dst.k, bucketCnt*uintptr(t.keysize))
                    }
                    dst.b.tophash[dst.i&(bucketCnt-1)] = top // mask dst.i as an optimization, to avoid a bounds check
                    if t.indirectkey {
                        *(*unsafe.Pointer)(dst.k) = k2 // copy pointer
                    } else {
                        typedmemmove(t.key, dst.k, k) // copy value
                    }
                    if t.indirectvalue {
                        *(*unsafe.Pointer)(dst.v) = *(*unsafe.Pointer)(v)
                    } else {
                        typedmemmove(t.elem, dst.v, v)
                    }
                    dst.i++
                    // These updates might push these pointers past the end of the
                    // key or value arrays.  That's ok, as we have the overflow pointer
                    // at the end of the bucket to protect against pointing past the
                    // end of the bucket.
                    dst.k = add(dst.k, uintptr(t.keysize))
                    dst.v = add(dst.v, uintptr(t.valuesize))
                }
            }
            // Unlink the overflow buckets & clear key/value to help GC.
            if h.flags&oldIterator == 0 && t.bucket.kind&kindNoPointers == 0 {
                b := add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))
                // Preserve b.tophash because the evacuation
                // state is maintained there.
                ptr := add(b, dataOffset)
                n := uintptr(t.bucketsize) - dataOffset
                memclrHasPointers(ptr, n)
            }
        }

        if oldbucket == h.nevacuate {
            advanceEvacuationMark(h, t, newbit)
        }
    }
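The x/y decision in evacuate comes down to one bit of the hash. When the table doubles, a key from old bucket i can only land in new bucket i (x) or new bucket i+newbit (y), depending on whether hash&newbit is set. The sketch below checks that this agrees with masking the hash against the new table size; destination is a hypothetical helper and the hashes are made-up small numbers.

```go
package main

import "fmt"

// destination reports where a key from old bucket oldbucket lands when the
// table doubles from newbit buckets to 2*newbit buckets.
func destination(hash, oldbucket, newbit uint64) uint64 {
	if hash&newbit != 0 {
		return oldbucket + newbit // "y": the upper half of the new table
	}
	return oldbucket // "x": same index as before
}

func main() {
	const newbit = 4 // the old table had 4 buckets, the new one has 8
	for _, hash := range []uint64{1, 5, 9, 13} {
		old := hash & (newbit - 1)
		dst := destination(hash, old, newbit)
		// The last column is the index a direct lookup in the new table
		// would compute; it always equals dst.
		fmt.Printf("hash=%d old=%d new=%d direct=%d\n", hash, old, dst, hash&(2*newbit-1))
	}
}
```

This is why one old bucket can be evacuated independently of all the others: its entries never mix with entries from a different old bucket.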
nevacuate is advanced in advanceEvacuationMark, which keeps going past buckets that are already evacuated, by at most 1024 per call.