
geth Structure and Source Code Analysis

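Part 1: The overall structure of the geth client
When we set up the private chain, we specified that all of its data be kept under the private-geth directory. The directory shown here has already been used for mining.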

At that time we also placed the genesis file, genesis.json, in this directory.

root@i-5tthrr8u:/home/ubuntu/private-geth# ll
total 16
drwxr-xr-x 3 root   root   4096 Jul  2 17:02 ./
drwxr-xr-x 6 ubuntu ubuntu 4096 Jul  4 14:07 ../
drwx------ 5 root   root   4096 Jul  2 17:41 data/
-rw-r--r-- 1 root root 529 Jul 2 16:29 genesis.json

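Now enter the directory that actually holds the data, private-geth/data/00.
The geth directory stores the blockchain data.
The keystore directory stores the account (user) information for this chain.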

root@i-5tthrr8u:/home/ubuntu/private-geth/data/00# ll
total 20
drwx------ 4 root root 4096 Jul  2 17:23 ./
drwx------ 5 root root 4096 Jul  2 17:41 ../
drwxr-xr-x 5 root root 4096 Jul  2 17:02 geth/
-rw------- 1 root root 1391 Jul  2 17:58 history
drwx------ 2 root root 4096 Jul  2 17:10 keystore/

This node has already created two accounts, so keystore now contains two account key files.

root@i-5tthrr8u:/home/ubuntu/private-geth/data/00/keystore# ll
total 16
drwx------ 2 root root 4096 Jul  2 17:10 ./
drwx------ 4 root root 4096 Jul  2 17:23 ../
-rw------- 1 root root  491 Jul  2 17:02 UTC--2017-07-02T09-02-56.470592674Z--28b769b3b9109afd1e9e50a9312c5a3bfae8a699
-rw------- 1 root root  491 Jul  2 17:10 UTC--2017-07-02T09-10-28.087401309Z--b4e2e2514eae3684157bf34a0cee2c07c431cf92

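Every account is defined by a key pair: a private key and a public key. Accounts are indexed by their address, which is derived from the public key by taking the last 20 bytes of the public key's Keccak-256 hash. Each private key/address pair is encoded in a key file. Key files are JSON text files that you can open and view in any text editor. The critical part of a key file, the account's private key, is normally encrypted with the password you set when you created the account. Key file names have the form UTC--(creation timestamp)--(address). Accounts are listed in lexicographic order, but because of the timestamp format this is in effect the order of creation. If you need to locate a key file, look in the keystore subdirectory of the Ethereum node's data directory. Next, let us open one of the key files in keystore and look at its contents: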
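The file-name convention described above can be illustrated with a small sketch. It only uses the two file names from the listing; the names keyFileInfo, parseKeyFileName and sortedKeyFiles are made up for this example and are not part of geth.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// keyFileInfo holds the two variable parts of a keystore file name.
// (Hypothetical helper type, not a geth type.)
type keyFileInfo struct {
	Timestamp string // creation time, e.g. 2017-07-02T09-02-56.470592674Z
	Address   string // hex address without the 0x prefix
}

// parseKeyFileName splits a name of the form UTC--timestamp--address.
func parseKeyFileName(name string) (keyFileInfo, error) {
	parts := strings.Split(name, "--")
	if len(parts) != 3 || parts[0] != "UTC" {
		return keyFileInfo{}, fmt.Errorf("unexpected key file name: %q", name)
	}
	return keyFileInfo{Timestamp: parts[1], Address: parts[2]}, nil
}

// sortedKeyFiles returns a lexicographically sorted copy of names.
// Because the timestamp is fixed-width and zero-padded, this is also
// the order in which the files were created.
func sortedKeyFiles(names []string) []string {
	out := append([]string(nil), names...)
	sort.Strings(out)
	return out
}

func main() {
	names := []string{
		"UTC--2017-07-02T09-10-28.087401309Z--b4e2e2514eae3684157bf34a0cee2c07c431cf92",
		"UTC--2017-07-02T09-02-56.470592674Z--28b769b3b9109afd1e9e50a9312c5a3bfae8a699",
	}
	for _, n := range sortedKeyFiles(names) {
		info, err := parseKeyFileName(n)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(info.Timestamp, info.Address)
	}
}
```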

root@i-5tthrr8u:/home/ubuntu/private-geth/data/00/keystore# vim UTC--2017-07-02T09-02-56.470592674Z--28b769b3b9109afd1e9e50a9312c5a3bfae8a699 

{"address":"28b769b3b9109afd1e9e50a9312c5a3bfae8a699",
"crypto":{
"cipher":"aes-128-ctr",
"ciphertext":"89ce1513b4b5a325735891b559c361ce696bb2c173a7a1b290549e79dad8f847",
"cipherparams":{"iv":"982c86418fae2dd39e04d1e51528cffa"},
"kdf":"scrypt",
"kdfparams":{"dklen":32,"n":262144,"p":1,"r":8,"salt":"4227384ea0e3d15af1bac190f7e01d392543d0a5ca1ec931c1d340f87845f771"},
"mac":"46cffc6e4f57fa27b69e53dc4ae43a03ce1b93f24c132aa4655f53ddf215f112"},
"id":"e516b9d4-2161-4648-b3db-fc2ef1c3739c",
"version":3
}
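As a rough illustration of how such a key file can be inspected programmatically, here is a minimal sketch that decodes the non-secret fields with the standard library. The struct covers only fields visible in the JSON above; the names keyFile, sampleKeyFile and parseKeyFile are assumptions made for this example, and the sketch does not attempt to decrypt the private key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyFile mirrors a subset of the fields shown in the key file above.
type keyFile struct {
	Address string `json:"address"`
	Crypto  struct {
		Cipher    string `json:"cipher"`
		KDF       string `json:"kdf"`
		KDFParams struct {
			DKLen int `json:"dklen"`
			N     int `json:"n"`
			P     int `json:"p"`
			R     int `json:"r"`
		} `json:"kdfparams"`
	} `json:"crypto"`
	Version int `json:"version"`
}

// sampleKeyFile is a shortened copy of the file above
// (ciphertext, iv, salt, mac and id are left out).
const sampleKeyFile = `{"address":"28b769b3b9109afd1e9e50a9312c5a3bfae8a699",
"crypto":{"cipher":"aes-128-ctr","kdf":"scrypt",
"kdfparams":{"dklen":32,"n":262144,"p":1,"r":8}},
"version":3}`

// parseKeyFile decodes the JSON text of a key file.
func parseKeyFile(data []byte) (keyFile, error) {
	var kf keyFile
	if err := json.Unmarshal(data, &kf); err != nil {
		return keyFile{}, err
	}
	return kf, nil
}

func main() {
	kf, err := parseKeyFile([]byte(sampleKeyFile))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("address=%s cipher=%s kdf=%s n=%d version=%d\n",
		kf.Address, kf.Crypto.Cipher, kf.Crypto.KDF, kf.Crypto.KDFParams.N, kf.Version)
}
```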

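Warning: remember your password and back up your key files. To send transactions from an account, including sending ether, you need both the key file and the password. Make sure the key file is backed up and the password is memorized, and store both as securely as possible. There is no escape route here: if you lose the key file or forget the password, you lose all of the account's ether. The account cannot be accessed without the password, and there is no "forgot my password" option. So do not forget your password.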

Next, enter the geth directory, where you can see the chaindata, lightchaindata, and nodes directories.

root@i-5tthrr8u:/home/ubuntu/private-geth/data/00/geth# ll
total 24
drwxr-xr-x 5 root root 4096 Jul  2 17:02 ./
drwx------ 4 root root 4096 Jul  2 17:23 ../
drwxr-xr-x 2 root root 4096 Jul  4 14:12 chaindata/
drwxr-xr-x 2 root root 4096 Jul  2 17:02 lightchaindata/
-rw-r--r-- 1 root root    0 Jul  2 17:02 LOCK
-rw------- 1 root root   64 Jul  2 17:02 nodekey
drwxr-xr-x 2 root root 4096 Jul  4 15:55 nodes/

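Enter nodes, a LevelDB database of the peers this node knows about. (Our private chain has three nodes and the listing below happens to show three .ldb files, but .ldb files are LevelDB table files whose number depends on compaction, not on the number of nodes.)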

root@i-5tthrr8u:/home/ubuntu/private-geth/data/00/geth/nodes# ll
total 5316
drwxr-xr-x 2 root root    4096 Jul  4 15:55 ./
drwxr-xr-x 5 root root    4096 Jul  2 17:02 ../
-rw-r--r-- 1 root root  405250 Jul  4 15:57 000033.log
-rw-r--r-- 1 root root 2132979 Jul  4 15:55 000035.ldb
-rw-r--r-- 1 root root 2131238 Jul  4 15:55 000036.ldb
-rw-r--r-- 1 root root  739354 Jul  4 15:55 000037.ldb
-rw-r--r-- 1 root root      16 Jul  4 14:12 CURRENT
-rw-r--r-- 1 root root       0 Jul  2 17:02 LOCK
-rw-r--r-- 1 root root    8187 Jul  4 15:55 LOG
-rw-r--r-- 1 root root    4557 Jul  4 15:55 MANIFEST-000013

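Enter chaindata. The blockchain is ultimately stored locally as LevelDB .ldb files. (One might expect one .ldb file per block, but that is not how LevelDB works: each .ldb file is a sorted table holding many key/value pairs, so a single file can contain data for many blocks.)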

root@i-5tthrr8u:/home/ubuntu/private-geth/data/00/geth/chaindata# ll
total 52
drwxr-xr-x 2 root root  4096 Jul  5 09:51 ./
drwxr-xr-x 5 root root  4096 Jul  2 17:02 ../
-rw-r--r-- 1 root root  5288 Jul  2 17:56 000008.ldb
-rw-r--r-- 1 root root 11681 Jul  4 14:12 000009.ldb
-rw-r--r-- 1 root root  8921 Jul  4 14:13 000010.log
-rw-r--r-- 1 root root    16 Jul  4 14:12 CURRENT
-rw-r--r-- 1 root root     0 Jul  2 17:02 LOCK
-rw-r--r-- 1 root root  2807 Jul  4 14:12 LOG
-rw-r--r-- 1 root root   346 Jul  4 14:12 MANIFEST-000011

Enter lightchaindata

root@i-5tthrr8u:/home/ubuntu/private-geth/data/00/geth/lightchaindata# ll
total 24
drwxr-xr-x 2 root root 4096 Jul  2 17:02 ./
drwxr-xr-x 5 root root 4096 Jul  2 17:02 ../
-rw-r--r-- 1 root root 1237 Jul  2 17:02 000001.log
-rw-r--r-- 1 root root   16 Jul  2 17:02 CURRENT
-rw-r--r-- 1 root root    0 Jul  2 17:02 LOCK
-rw-r--r-- 1 root root  358 Jul  2 17:02 LOG
-rw-r--r-- 1 root root   54 Jul  2 17:02 MANIFEST-000000

Part 2: The structure of the source code

1 core/types/block.go
The first thing to look at is the structure of a block.

// Block represents an entire block in the Ethereum blockchain.

type Block struct {
    header       *Header
    uncles       []*Header
    transactions Transactions

    // caches: the hash and size fields avoid the performance cost of repeated hash/sign computations
    hash atomic.Value
    size atomic.Value

    // Td is used by package core to store the total difficulty
    // of the chain up to and including the block (cumulative mining difficulty).
    td *big.Int

    // These fields are used by package eth to track
    // inter-peer block relay.
    ReceivedAt   time.Time
    ReceivedFrom interface{}
}
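The caching idea behind the hash and size fields can be sketched in isolation. This is a simplified illustration, not geth's code: cachedBlock is a made-up type, SHA-256 from the standard library stands in for Keccak-256 over the RLP-encoded header, and hashCalls exists only to make the cache hit visible.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync/atomic"
)

// cachedBlock is a toy stand-in for a block with a cached hash.
type cachedBlock struct {
	payload   []byte
	hash      atomic.Value // holds a [32]byte once computed
	hashCalls int          // demo counter; not safe for concurrent use
}

// Hash returns the cached hash, computing it only on first use.
func (b *cachedBlock) Hash() [32]byte {
	if h := b.hash.Load(); h != nil {
		return h.([32]byte)
	}
	b.hashCalls++
	h := sha256.Sum256(b.payload) // assumption: SHA-256 instead of Keccak-256
	b.hash.Store(h)
	return h
}

func main() {
	b := &cachedBlock{payload: []byte("header bytes")}
	first := b.Hash()
	second := b.Hash()
	fmt.Printf("%x\n", first)
	fmt.Println("same hash:", first == second, "computed", b.hashCalls, "time(s)")
}
```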
This is the structure of a block body. The body is a mutable data container that mainly holds the transaction list and the uncle list.
// Body is a simple (mutable, non-safe) data container for storing and moving
// a block's data contents (transactions and uncles) together.
type Body struct {
    Transactions []*Transaction
    Uncles       []*Header
}
This is the block header struct. Its fields should all be familiar, so they are not explained here.
// Header represents a block header in the Ethereum blockchain.
type Header struct {
    ParentHash  common.Hash    `json:"parentHash"       gencodec:"required"`
    UncleHash   common.Hash    `json:"sha3Uncles"       gencodec:"required"`
    Coinbase    common.Address `json:"miner"            gencodec:"required"`
    Root        common.Hash    `json:"stateRoot"        gencodec:"required"`
    TxHash      common.Hash    `json:"transactionsRoot" gencodec:"required"`
    ReceiptHash common.Hash    `json:"receiptsRoot"     gencodec:"required"`
    Bloom       Bloom          `json:"logsBloom"        gencodec:"required"`
    Difficulty  *big.Int       `json:"difficulty"       gencodec:"required"`
    Number      *big.Int       `json:"number"           gencodec:"required"`
    GasLimit    *big.Int       `json:"gasLimit"         gencodec:"required"`
    GasUsed     *big.Int       `json:"gasUsed"          gencodec:"required"`
    Time        *big.Int       `json:"timestamp"        gencodec:"required"`
    Extra       []byte         `json:"extraData"        gencodec:"required"`
    MixDigest   common.Hash    `json:"mixHash"          gencodec:"required"`
    Nonce       BlockNonce     `json:"nonce"            gencodec:"required"`
}
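The ParentHash field is what links headers into a chain. Below is a minimal sketch of that linkage under simplifying assumptions: toyHeader keeps only three fields, and SHA-256 over a hand-rolled byte layout replaces geth's Keccak-256 over the RLP encoding. All names are illustrative.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// toyHeader keeps just enough fields to show parent linkage.
type toyHeader struct {
	ParentHash [32]byte
	Number     uint64
	Extra      []byte
}

// Hash hashes the header fields (assumption: SHA-256, not Keccak-256/RLP).
func (h toyHeader) Hash() [32]byte {
	buf := make([]byte, 0, 40+len(h.Extra))
	buf = append(buf, h.ParentHash[:]...)
	var n [8]byte
	binary.BigEndian.PutUint64(n[:], h.Number)
	buf = append(buf, n[:]...)
	buf = append(buf, h.Extra...)
	return sha256.Sum256(buf)
}

// buildChain creates n headers, each pointing at its predecessor.
func buildChain(n int) []toyHeader {
	headers := make([]toyHeader, 0, n)
	for i := 0; i < n; i++ {
		h := toyHeader{Number: uint64(i), Extra: []byte(fmt.Sprintf("block-%d", i))}
		if i > 0 {
			h.ParentHash = headers[i-1].Hash()
		}
		headers = append(headers, h)
	}
	return headers
}

// verifyChain checks numbering and parent hashes for consecutive headers.
func verifyChain(headers []toyHeader) error {
	for i := 1; i < len(headers); i++ {
		parent, child := headers[i-1], headers[i]
		if child.Number != parent.Number+1 {
			return fmt.Errorf("header %d: non-consecutive number", i)
		}
		if child.ParentHash != parent.Hash() {
			return fmt.Errorf("header %d: parent hash mismatch", i)
		}
	}
	return nil
}

func main() {
	chain := buildChain(3)
	fmt.Println("valid chain:", verifyChain(chain))
	chain[1].Extra = []byte("tampered")
	fmt.Println("after tampering:", verifyChain(chain))
}
```

Changing any field of an earlier header changes its hash, so every later header's ParentHash stops matching; that is the property the real chain relies on.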

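2 The transaction struct
core/types/transaction.go

1. A contract-creation transaction differs from an ordinary transaction in that Recipient == nil.
2. A Transaction can be encoded and decoded with the RLP algorithm.
3. The hash/size/from fields are caches that avoid the performance cost of repeated hash/sign computations.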
type Transaction struct {
    data txdata
    // caches
    hash atomic.Value
    size atomic.Value
    from atomic.Value
}

type txdata struct {
    AccountNonce uint64          `json:"nonce"    gencodec:"required"`
    Price        *big.Int        `json:"gasPrice" gencodec:"required"`
    GasLimit     *big.Int        `json:"gas"      gencodec:"required"`
    Recipient    *common.Address `json:"to"       rlp:"nil"` // nil means contract creation
    Amount       *big.Int        `json:"value"    gencodec:"required"`
    Payload      []byte          `json:"input"    gencodec:"required"`

    // Signature values
    V *big.Int `json:"v" gencodec:"required"`
    R *big.Int `json:"r" gencodec:"required"`
    S *big.Int `json:"s" gencodec:"required"`

    // This is only used when marshaling to JSON.
    Hash *common.Hash `json:"hash" rlp:"-"`
}
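The rule that a nil Recipient means contract creation can be shown with a small sketch. The toyTx type below is a cut-down, hypothetical stand-in for txdata (plain uint64 instead of *big.Int, a local address type instead of common.Address).

```go
package main

import "fmt"

// address is a local stand-in for common.Address.
type address [20]byte

// toyTx keeps only the fields needed for the nil-recipient rule.
type toyTx struct {
	Nonce   uint64
	To      *address // nil means contract creation
	Value   uint64
	Payload []byte
}

// isContractCreation reports whether the transaction has no recipient.
func (tx toyTx) isContractCreation() bool {
	return tx.To == nil
}

// describe gives a one-line summary of what the transaction does.
func describe(tx toyTx) string {
	if tx.isContractCreation() {
		return fmt.Sprintf("contract creation with %d bytes of init code", len(tx.Payload))
	}
	return fmt.Sprintf("call or transfer to %x", tx.To[:])
}

func main() {
	to := address{0x28, 0xb7}
	fmt.Println(describe(toyTx{To: &to, Value: 1}))
	fmt.Println(describe(toyTx{Payload: []byte{0x60, 0x60, 0x60, 0x40}}))
}
```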

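3 We just saw the receipt root (ReceiptHash) in the block header, so what exactly goes into it? It commits to the block's receipts. A receipt is the result of one transaction and mainly contains the post-state (PostState), the gas consumed, the bloom filter, and the logs.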


// Receipt represents the results of a transaction.
type Receipt struct {
    // Consensus fields
    PostState         []byte   `json:"root"              gencodec:"required"`
    CumulativeGasUsed *big.Int `json:"cumulativeGasUsed" gencodec:"required"`
    Bloom             Bloom    `json:"logsBloom"         gencodec:"required"`
    Logs              []*Log   `json:"logs"              gencodec:"required"`

    // Implementation fields (don't reorder!)
    TxHash          common.Hash    `json:"transactionHash" gencodec:"required"`
    ContractAddress common.Address `json:"contractAddress"`
    GasUsed         *big.Int       `json:"gasUsed" gencodec:"required"`
}
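CumulativeGasUsed is a running total over the transactions of a block, so the gas of an individual transaction is the difference between consecutive receipts. A minimal sketch of that arithmetic, using a hypothetical toyReceipt type and uint64 instead of *big.Int:

```go
package main

import "fmt"

// toyReceipt keeps only the running gas total of a receipt.
type toyReceipt struct {
	CumulativeGasUsed uint64
}

// perTxGas derives each transaction's own gas from the running totals.
func perTxGas(receipts []toyReceipt) ([]uint64, error) {
	out := make([]uint64, 0, len(receipts))
	var prev uint64
	for i, r := range receipts {
		if r.CumulativeGasUsed < prev {
			return nil, fmt.Errorf("receipt %d: cumulative gas decreased", i)
		}
		out = append(out, r.CumulativeGasUsed-prev)
		prev = r.CumulativeGasUsed
	}
	return out, nil
}

func main() {
	receipts := []toyReceipt{{21000}, {63000}, {84000}}
	gas, err := perTxGas(receipts)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(gas)
}
```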

4 Transactions are packed into blocks one by one, but how do blocks become a blockchain?
core/blockchain.go

// BlockChain represents the canonical chain given a database with a genesis block. The Blockchain manages chain imports, reverts, chain reorganisations.
// Importing blocks in to the block chain happens according to the set of rules defined by the two stage Validator (validation happens in two stages). Processing of blocks is done using the Processor which processes the included transaction (stage one: validating the transactions). The validation of the state is done in the second part of the Validator (stage two: validating the state). Failing results in aborting of the import.
// The BlockChain also helps in returning blocks from **any** chain included in the database as well as blocks that represents the canonical chain. It's important to note that GetBlock can return any block and does not need to be included in the canonical one where as GetBlockByNumber always represents the canonical chain.


type BlockChain struct {
    config *params.ChainConfig // chain & network configuration

    hc           *HeaderChain
    chainDb      ethdb.Database // local database
    eventMux     *event.TypeMux
    genesisBlock *types.Block

    mu      sync.RWMutex // global mutex for locking chain operations
    chainmu sync.RWMutex // blockchain insertion lock
    procmu  sync.RWMutex // block processor lock

    checkpoint       int          // checkpoint counts towards the new checkpoint
    currentBlock     *types.Block // Current head of the block chain
    currentFastBlock *types.Block // Current head of the fast-sync chain (may be above the block chain!)

    stateCache   *state.StateDB // State database to reuse between imports (contains state cache)
    bodyCache    *lru.Cache     // Cache for the most recent block bodies
    bodyRLPCache *lru.Cache     // Cache for the most recent block bodies in RLP encoded format
    blockCache   *lru.Cache     // Cache for the most recent entire blocks
    futureBlocks *lru.Cache     // future blocks are blocks added for later processing

    quit    chan struct{} // blockchain quit channel
    running int32         // running must be called atomically
    // procInterrupt must be atomically called
    procInterrupt int32          // interrupt signaler for block processing
    wg            sync.WaitGroup // chain processing wait group for shutting down

    engine    consensus.Engine
    processor Processor // block processor interface
    validator Validator // block and state validator interface
    vmConfig  vm.Config

    badBlocks *lru.Cache // Bad block cache
}

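Notes: 1. BlockChain has no need for structured queries, only lookups by hash, so a key/value database is the most convenient choice. 2. The underlying storage is LevelDB, which performs well.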
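The hash-keyed access pattern, together with the GetBlock versus GetBlockByNumber distinction from the comment above, can be sketched with two in-memory maps. This is an illustration only: toyChainDB and its methods are invented for the example, maps stand in for LevelDB, and SHA-256 stands in for the real block hash.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash [32]byte

// storedBlock is a toy block identified by the hash of its contents.
type storedBlock struct {
	Number uint64
	Parent hash
	Data   string
}

// Hash derives the lookup key (assumption: SHA-256 over a text encoding).
func (b storedBlock) Hash() hash {
	return sha256.Sum256([]byte(fmt.Sprintf("%d|%x|%s", b.Number, b.Parent, b.Data)))
}

// toyChainDB stores every block by hash and tracks the canonical hash per height.
type toyChainDB struct {
	blocks    map[hash]storedBlock
	canonical map[uint64]hash
}

func newToyChainDB() *toyChainDB {
	return &toyChainDB{blocks: map[hash]storedBlock{}, canonical: map[uint64]hash{}}
}

// put stores a block and optionally marks it as canonical for its height.
func (db *toyChainDB) put(b storedBlock, canonical bool) hash {
	h := b.Hash()
	db.blocks[h] = b
	if canonical {
		db.canonical[b.Number] = h
	}
	return h
}

// getBlock can return any stored block, canonical or not.
func (db *toyChainDB) getBlock(h hash) (storedBlock, bool) {
	b, ok := db.blocks[h]
	return b, ok
}

// getBlockByNumber only ever returns the canonical block at that height.
func (db *toyChainDB) getBlockByNumber(n uint64) (storedBlock, bool) {
	h, ok := db.canonical[n]
	if !ok {
		return storedBlock{}, false
	}
	return db.getBlock(h)
}

func main() {
	db := newToyChainDB()
	genesis := db.put(storedBlock{Number: 0, Data: "genesis"}, true)
	db.put(storedBlock{Number: 1, Parent: genesis, Data: "main"}, true)
	side := db.put(storedBlock{Number: 1, Parent: genesis, Data: "side"}, false)

	byHash, _ := db.getBlock(side)
	byNumber, _ := db.getBlockByNumber(1)
	fmt.Println("by hash:", byHash.Data, "| by number:", byNumber.Data)
}
```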

5 StateDB stores the world state
core/state/statedb.go

// StateDBs within the ethereum protocol are used to store anything
// within the merkle trie. StateDBs take care of caching and storing
// nested states. It's the general query interface to retrieve:
// * Contracts
// * Accounts
type StateDB struct {
    db            ethdb.Database // local database
    trie          *trie.SecureTrie
    pastTries     []*trie.SecureTrie
    codeSizeCache *lru.Cache

    // This map holds 'live' objects, which will get modified while processing a state transition.
    stateObjects           map[common.Address]*stateObject
    stateObjectsDirty      map[common.Address]struct{}
    stateObjectsDestructed map[common.Address]struct{}

    // The refund counter, also used by state transitioning.
    refund *big.Int

    thash, bhash common.Hash
    txIndex      int
    logs         map[common.Hash][]*types.Log
    logSize      uint

    preimages map[common.Hash][]byte

    // Journal of state modifications. This is the backbone of
    // Snapshot and RevertToSnapshot.
    journal        journal
    validRevisions []revision
    nextRevisionId int

    lock sync.Mutex
}

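Notes: 1. StateDB keeps a complete record of how transactions execute. 2. The core of StateDB is its state objects. 3. In the stateObjects map, the key is the account's Address, and each entry records the account's balance, nonce, code, and codeHash, as well as the {string: Hash} entries in its storage trie.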
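The journal field is described in the struct as the backbone of Snapshot and RevertToSnapshot. The general idea can be sketched as a list of undo actions; the toyState type below is an assumption-laden simplification (string addresses, balances only, closures as journal entries), not the actual StateDB implementation.

```go
package main

import "fmt"

// toyState tracks balances and journals every change as an undo closure.
type toyState struct {
	balances map[string]uint64
	journal  []func()
}

func newToyState() *toyState {
	return &toyState{balances: map[string]uint64{}}
}

// SetBalance changes a balance and records how to undo the change.
func (s *toyState) SetBalance(addr string, v uint64) {
	prev, existed := s.balances[addr]
	s.journal = append(s.journal, func() {
		if existed {
			s.balances[addr] = prev
		} else {
			delete(s.balances, addr)
		}
	})
	s.balances[addr] = v
}

// Snapshot returns an identifier for the current journal position.
func (s *toyState) Snapshot() int {
	return len(s.journal)
}

// RevertToSnapshot undoes, newest first, every change made after the snapshot.
func (s *toyState) RevertToSnapshot(id int) {
	for i := len(s.journal) - 1; i >= id; i-- {
		s.journal[i]()
	}
	s.journal = s.journal[:id]
}

func main() {
	s := newToyState()
	s.SetBalance("alice", 100)
	snap := s.Snapshot()
	s.SetBalance("alice", 40)
	s.SetBalance("bob", 60)
	s.RevertToSnapshot(snap)
	_, bobExists := s.balances["bob"]
	fmt.Println("alice:", s.balances["alice"], "bob exists:", bobExists)
}
```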

Next, let us look at the stateObject struct.
core/state/state_object.go

// stateObject represents an Ethereum account which is being modified.
//
// The usage pattern is as follows:
// First you need to obtain a state object.
// Account values can be accessed and modified through the object.
// Finally, call CommitTrie to write the modified storage trie into a database.
type stateObject struct {
    address common.Address // Ethereum address of this account
    data    Account
    db      *StateDB

    // DB error.
    // State objects are used by the consensus core and VM which are
    // unable to deal with database-level errors. Any error that occurs
    // during a database read is memoized here and will eventually be returned
    // by StateDB.Commit.
    dbErr error

    // Write caches.
    trie *trie.SecureTrie // storage trie, which becomes non-nil on first access
    code Code             // contract bytecode, which gets set when code is loaded

    cachedStorage Storage // Storage entry cache to avoid duplicate reads
    dirtyStorage  Storage // Storage entries that need to be flushed to disk

    // Cache flags.
    // When an object is marked suicided it will be delete from the trie
    // during the "update" phase of the state transition.
    dirtyCode bool // true if the code was updated
    suicided  bool
    touched   bool
    deleted   bool
    onDirty   func(addr common.Address) // Callback method to mark a state object newly dirty
}
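The comments on cachedStorage and dirtyStorage suggest a read cache plus a set of pending writes. The sketch below illustrates that pattern with plain maps; toyObject, its methods and the reads counter are hypothetical, and a map stands in for the storage trie and disk.

```go
package main

import "fmt"

// toyObject models an account's storage with a read cache and a dirty set.
type toyObject struct {
	backing map[string]string // stand-in for the storage trie / database
	cached  map[string]string // entries already read or written
	dirty   map[string]string // entries waiting to be flushed
	reads   int               // number of reads that reached the backing store
}

func newToyObject(backing map[string]string) *toyObject {
	return &toyObject{backing: backing, cached: map[string]string{}, dirty: map[string]string{}}
}

// GetState serves from the cache and falls back to the backing store once per key.
func (o *toyObject) GetState(key string) string {
	if v, ok := o.cached[key]; ok {
		return v
	}
	o.reads++
	v := o.backing[key]
	o.cached[key] = v
	return v
}

// SetState updates the cache and marks the entry as dirty.
func (o *toyObject) SetState(key, value string) {
	o.cached[key] = value
	o.dirty[key] = value
}

// Commit flushes dirty entries to the backing store and reports how many were written.
func (o *toyObject) Commit() int {
	n := len(o.dirty)
	for k, v := range o.dirty {
		o.backing[k] = v
		delete(o.dirty, k)
	}
	return n
}

func main() {
	obj := newToyObject(map[string]string{"slot0": "a"})
	obj.GetState("slot0")
	obj.GetState("slot0")
	obj.SetState("slot1", "b")
	fmt.Println("backing reads:", obj.reads, "flushed:", obj.Commit())
}
```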

Now look at an interface onto the state, which lets you query an account's balance, nonce, code, and storage.

// ChainStateReader wraps access to the state trie of the canonical blockchain. Note that implementations of the interface may be unable to return state values for old blocks.
// In many cases, using CallContract can be preferable to reading raw contract storage.

type ChainStateReader interface {
    BalanceAt(ctx context.Context, account common.Address, blockNumber *big.Int) (*big.Int, error)
    StorageAt(ctx context.Context, account common.Address, key common.Hash, blockNumber *big.Int) ([]byte, error)
    CodeAt(ctx context.Context, account common.Address, blockNumber *big.Int) ([]byte, error)
    NonceAt(ctx context.Context, account common.Address, blockNumber *big.Int) (uint64, error)
}

With all the structures now clear, what does the actual validation process look like?
core/state_processor.go
core/state_transition.go
core/block_validator.go

StateProcessor: 1. calls StateTransition to validate (execute) each Transaction; 2. computes gas, receipts, and uncle rewards.

// StateProcessor is a basic Processor, which takes care of transitioning
// state from one point to another.
//
// StateProcessor implements Processor.
type StateProcessor struct {
    config *params.ChainConfig // Chain configuration options
    bc     *BlockChain         // Canonical block chain
    engine consensus.Engine    // Consensus engine used for block rewards
}

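StateTransition
1. Validates (executes) the Transaction.
2. Deducts the gas needed for the transaction's data payload (transaction.data.payload).
3. Runs the code in the VM (creating a contract or executing one). Gas is consumed automatically as the VM runs; if the gas runs out, the VM exits on its own.
4. Refunds the unused gas to sender.balance.
5. Converts the consumed gas into balance and credits it to the current env.Coinbase().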

/*
The State Transitioning Model

A state transition is a change made when a transaction is applied to the current world state
The state transitioning model does all the necessary work to work out a valid new state root.

1) Nonce handling
2) Pre pay gas
3) Create a new state object if the recipient is \0*32
4) Value transfer
== If contract creation ==
  4a) Attempt to run transaction data
  4b) If valid, use result as code for the new state object
== end ==
5) Run Script section
6) Derive new state root
*/
type StateTransition struct {
    gp         *GasPool
    msg        Message
    gas        uint64
    gasPrice   *big.Int
    initialGas *big.Int
    value      *big.Int
    data       []byte
    state      vm.StateDB

    evm *vm.EVM
}
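The gas bookkeeping listed for StateTransition (pre-pay, consume, refund, pay the coinbase) can be sketched as plain arithmetic. This is a simplified model under stated assumptions: applyGas is a made-up function, balances are uint64 without overflow checks, gasNeeded is given up front instead of being measured by a VM, and value transfer and the refund counter are ignored.

```go
package main

import (
	"errors"
	"fmt"
)

// applyGas models the gas flow of one transaction.
// It returns the gas actually charged and whether execution ran out of gas.
func applyGas(bal map[string]uint64, sender, coinbase string,
	gasLimit, gasNeeded, gasPrice uint64) (gasUsed uint64, outOfGas bool, err error) {

	// Pre-pay: the sender buys gasLimit units up front.
	upfront := gasLimit * gasPrice
	if bal[sender] < upfront {
		return 0, false, errors.New("insufficient balance to pay for gas")
	}
	bal[sender] -= upfront

	// Execute: if more gas is needed than was bought, everything is consumed.
	gasUsed = gasNeeded
	if gasNeeded > gasLimit {
		gasUsed = gasLimit
		outOfGas = true
	}

	// Refund the unused gas to the sender and pay the used gas to the coinbase.
	bal[sender] += (gasLimit - gasUsed) * gasPrice
	bal[coinbase] += gasUsed * gasPrice
	return gasUsed, outOfGas, nil
}

func main() {
	bal := map[string]uint64{"sender": 1000000}
	used, oog, err := applyGas(bal, "sender", "miner", 50000, 21000, 2)
	fmt.Println("used:", used, "out of gas:", oog, "err:", err)
	fmt.Println("sender:", bal["sender"], "miner:", bal["miner"])
}
```

Whatever happens during execution, the sender's loss and the coinbase's gain are equal in this model, which is the invariant the refund and payment steps are meant to preserve.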

BlockValidator
1. Validates UsedGas
2. Validates Bloom
3. Validates receiptSha
4. Validates stateDB.IntermediateRoot

// BlockValidator is responsible for validating block headers, uncles and
// processed state.
//
// BlockValidator implements Validator.
type BlockValidator struct {
    config *params.ChainConfig // Chain configuration options
    bc     *BlockChain         // Canonical block chain
    engine consensus.Engine    // Consensus engine used for validating
}
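The four checks listed for BlockValidator amount to comparing what the header claims with what local processing produced. A minimal sketch of that comparison follows; blockHeader, processResult and validateState are names chosen for this example, and the hashes are treated as opaque byte arrays rather than being derived from real receipts or tries.

```go
package main

import "fmt"

// blockHeader holds the values a block claims about its own execution.
type blockHeader struct {
	GasUsed     uint64
	Bloom       [256]byte
	ReceiptHash [32]byte
	Root        [32]byte
}

// processResult holds the values obtained by executing the block locally.
type processResult struct {
	GasUsed     uint64
	Bloom       [256]byte
	ReceiptHash [32]byte
	Root        [32]byte
}

// validateState compares the header's claims with the local results, in order.
func validateState(h blockHeader, r processResult) error {
	switch {
	case h.GasUsed != r.GasUsed:
		return fmt.Errorf("invalid gas used (header: %d local: %d)", h.GasUsed, r.GasUsed)
	case h.Bloom != r.Bloom:
		return fmt.Errorf("invalid bloom")
	case h.ReceiptHash != r.ReceiptHash:
		return fmt.Errorf("invalid receipt root (header: %x local: %x)", h.ReceiptHash, r.ReceiptHash)
	case h.Root != r.Root:
		return fmt.Errorf("invalid state root (header: %x local: %x)", h.Root, r.Root)
	}
	return nil
}

func main() {
	header := blockHeader{GasUsed: 21000}
	local := processResult{GasUsed: 21000}
	fmt.Println("matching:", validateState(header, local))

	local.Root[0] = 1
	fmt.Println("mismatch:", validateState(header, local))
}
```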

Notice that both the state and the blocks above are written to a db database, so let us look at the LevelDB database struct.

type LDBDatabase struct {
    fn string      // filename for reporting
    db *leveldb.DB // LevelDB instance

    getTimer       gometrics.Timer // Timer for measuring the database get request counts and latencies
    putTimer       gometrics.Timer // Timer for measuring the database put request counts and latencies
    delTimer       gometrics.Timer // Timer for measuring the database delete request counts and latencies
    missMeter      gometrics.Meter // Meter for measuring the missed database get requests
    readMeter      gometrics.Meter // Meter for measuring the database get request data usage