The Science of the Blockchain: Notes (Part 1)

阿新 • Published: 2018-12-01

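I. Distributed Systems

1. What Is a Distributed System?

A distributed system is a software system built on top of a network. Today's computing and information systems are inherently distributed. A mobile phone, for example, shares data with the cloud and itself contains multiple processors and storage units.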

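2. Reasons for Distribution

(1) Geography: large companies operate at many sites, and each site has many computers that handle computation and other tasks.
(2) Parallelism: multi-core processors and computing clusters are used to speed up computation.
(3) Reliability: data is replicated on different machines, which guards against data loss.
(4) Availability: replicated data can be accessed quickly.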

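3. Advantages and Drawbacks of Distribution

Distributed systems have many advantages: they increase storage and computing capacity, and they make it possible to connect spatially separated locations. They also have a drawback: consistency problems, which arise frequently in distributed systems. A single machine may fail only once every few years, but a distributed system with millions of nodes may see a failure every minute.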
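The failure-rate claim above can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that each node fails on average once every 3 years, that failures are independent, and that the system has 2 million nodes. None of these numbers come from the notes.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def minutes_between_failures(nodes: int, years_per_node_failure: float) -> float:
    """Expected time between failures anywhere in the system.

    Assumes independent nodes that each fail once per
    `years_per_node_failure` years on average (an illustrative model).
    """
    per_node_minutes = years_per_node_failure * MINUTES_PER_YEAR
    return per_node_minutes / nodes


single_machine = minutes_between_failures(1, 3)        # 1576800.0 minutes, i.e. 3 years
large_system = minutes_between_failures(2_000_000, 3)  # about 0.79 minutes
```

Under these assumptions the large system sees a failure somewhere more often than once a minute, even though every individual machine is quite reliable.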

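II. Fault Tolerance & Paxos

We cannot change the physical fact that failures happen frequently in a distributed system, but we can ask that the system tolerate some failures and keep working. How, then, do we build a fault-tolerant distributed system?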

1. A Naive Client-Server Algorithm

Algorithm 1. Naive Client-Server Algorithm
1: Client sends commands one at a time to server

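A distributed system consists of many nodes. Each node can perform local computation and can send messages to other nodes (message passing).

Algorithm 1 implements message passing between a client and a server, but it has two problems:
(1) Message corruption: a message is received, but its content is damaged. Adding extra information to the message, such as a checksum, solves this problem.
(2) Message loss: a message fails to reach the receiver. In this case Algorithm 1 does not work correctly, so it needs to be improved.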
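As a minimal sketch of the checksum idea, the helpers below prefix each payload with a CRC32 checksum and reject a message whose checksum no longer matches. The function names `pack` and `unpack` and the 4-byte framing are assumptions made for this example, not something the notes prescribe.

```python
import zlib
from typing import Optional


def pack(payload: bytes) -> bytes:
    """Prefix the payload with its 4-byte CRC32 checksum."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def unpack(message: bytes) -> Optional[bytes]:
    """Return the payload, or None if the checksum does not match."""
    checksum, payload = message[:4], message[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None  # content was damaged in transit
    return payload


message = pack(b"x = x + 1")
# Flip one bit of the last byte to simulate corruption on the wire.
corrupted = message[:-1] + bytes([message[-1] ^ 0x01])
```

Here `unpack(message)` returns the original payload while `unpack(corrupted)` returns `None`. A checksum only detects corruption; it does nothing about a message that never arrives.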

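2. A Client-Server Algorithm with Acknowledgments

Algorithm 2. Client-Server Algorithm with Acknowledgments
1: Client sends commands one at a time to server
2: Server acknowledges every command
3: If the client does not receive an acknowledgment within a reasonable time, the client resends the command

In Algorithm 2 the client sends the next command only after it has received the acknowledgment for the previous one (Line 1). Acknowledgments can be lost too, in which case the client resends the command (Line 3). This algorithm is the basis of many reliable protocols, such as TCP.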
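The resend loop of Algorithm 2 can be sketched with a simulated lossy link. `LossyServer`, its `drop` set and `send_all` are hypothetical names for this illustration. The sequence numbers that stop a resent command from being executed twice are an addition that is not part of Algorithm 2 as listed above.

```python
class LossyServer:
    """Hypothetical server behind a link that drops selected transmissions."""

    def __init__(self, drop):
        self.drop = set(drop)   # indices of transmissions (requests or acks) that are lost
        self.transmissions = 0
        self.executed = []      # commands executed, in order
        self.seen = set()       # sequence numbers already executed

    def _lost(self):
        lost = self.transmissions in self.drop
        self.transmissions += 1
        return lost

    def deliver(self, seq, command):
        """Return True if the client ends up receiving an acknowledgment."""
        if self._lost():          # the request never reaches the server
            return False
        if seq not in self.seen:  # a resent command must not run twice
            self.seen.add(seq)
            self.executed.append(command)
        return not self._lost()   # the acknowledgment itself may be lost


def send_all(server, commands, max_tries=10):
    """Send commands one at a time, resending until each one is acknowledged."""
    for seq, command in enumerate(commands):
        for _ in range(max_tries):
            if server.deliver(seq, command):
                break
        else:
            raise TimeoutError(f"no acknowledgment for command {seq}")


server = LossyServer(drop={0, 2})  # lose the first request and the first ack
send_all(server, ["a", "b"])
```

After this run `server.executed` is `["a", "b"]`: command `a` was transmitted three times but executed once.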
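The algorithm extends easily to multiple servers: the client sends each command to all servers, and the command counts as successfully executed once the client has received an acknowledgment from every server.
What about multiple clients? Here the problem of variable message delay appears: even messages sent between the same two nodes may take different amounts of time, so servers may execute commands in different orders and end up in inconsistent states. For example, suppose Algorithm 2 runs in a system with two clients u1 and u2 and two servers s1 and s2. Both clients issue commands that update a variable x on the servers, and initially x = 0. Client u1 sends the command x = x + 1 and client u2 sends x = 2 * x. Because of variable message delay, s1 may receive u1's message first while s2 receives u2's message first. Then s1 computes x = (0 + 1) * 2 = 2 and s2 computes x = 2 * 0 + 1 = 1, and the two servers are in inconsistent states.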
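The example can be replayed in a few lines. The helper `apply_in_order` is a hypothetical name; it simply applies a list of commands to x in the order a server happened to receive them.

```python
def increment(x):
    """Command sent by client u1: x = x + 1."""
    return x + 1


def double(x):
    """Command sent by client u2: x = 2 * x."""
    return 2 * x


def apply_in_order(commands, x=0):
    """Apply the commands to x in the order the server received them."""
    for command in commands:
        x = command(x)
    return x


x_on_s1 = apply_in_order([increment, double])  # u1's message arrives first: 2
x_on_s2 = apply_in_order([double, increment])  # u2's message arrives first: 1
```

Both servers executed the same two commands, yet they disagree on x because the commands do not commute.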

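3. State Replication

To resolve the inconsistency caused by variable message delay, we introduce state replication: a set of nodes achieves state replication if all nodes execute the commands c1, c2, c3, … in the same order. Two algorithms that aim to achieve state replication are:
(1) State replication with a serializer

Algorithm 3. State Replication with a Serializer
1: Clients send commands one at a time to the serializer
2: Serializer forwards commands one at a time to all other servers
3: Once the serializer received all acknowledgments, it notifies the client about the success

State replication on a single server is trivial, so we can designate one server as the serializer and distribute all commands through it, which yields state replication automatically. This is what Algorithm 3 does.
The serializer only forwards commands, but what if it fails? The whole system depends on it, so once it fails the entire system is paralyzed. Is there a more distributed way to achieve state replication?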
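A minimal in-process sketch of Algorithm 3 follows. The classes `Server` and `Serializer` and the method `submit` are names assumed for this example, and message passing is modelled as direct method calls that never fail.

```python
class Server:
    """A replica that applies commands to a single variable x."""

    def __init__(self):
        self.x = 0
        self.log = []  # names of executed commands, in order

    def execute(self, name, command):
        self.log.append(name)
        self.x = command(self.x)
        return True    # stands in for the acknowledgment


class Serializer(Server):
    """The one server through which every command is ordered."""

    def __init__(self, replicas):
        super().__init__()
        self.replicas = replicas

    def submit(self, name, command):
        """Execute locally, forward to all replicas, report success once all acked."""
        self.execute(name, command)
        acks = [replica.execute(name, command) for replica in self.replicas]
        return all(acks)


replicas = [Server(), Server()]
serializer = Serializer(replicas)
serializer.submit("increment", lambda x: x + 1)  # from client u1
serializer.submit("double", lambda x: 2 * x)     # from client u2
```

Every node ends with `x == 2` and the same log, because the serializer fixed one order for everyone. The sketch also makes the weakness visible: no command reaches any replica without passing through `serializer`.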
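(2) Two-phase protocol

Algorithm 4. Two-Phase Protocol

     Phase 1
1: Client asks all servers for the lock
     Phase 2
2: if client receives lock from every server then
3:     Client sends command reliably to each server, and gives the lock back
4: else
5:     Client gives the received locks back
6:     Client waits, and then starts with Phase 1 again
7: end if

Algorithm 4 uses mutual-exclusion locks to achieve state replication, but it does not solve the problem of node failures. In fact it is worse than Algorithm 3 with its serializer: Algorithm 3 needs only one node, the serializer, to respond, whereas Algorithm 4 needs every server to respond!
Would it work to change Phase 1 so that a client tries to acquire only a majority of the locks? That raises new problems. If two or more clients try to acquire a majority of the locks at the same time, one or more of them must give up all the locks they already hold to avoid deadlock. But if such a client crashes before releasing its locks, the system is deadlocked. Do we need a different concept to solve these problems?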
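One round of Algorithm 4, and the way a crashed lock holder blocks everyone else, can be sketched as below. `LockServer` and `try_command` are hypothetical names, servers are plain in-process objects, and a crash is modelled simply as a client that acquires a lock and never releases it.

```python
class LockServer:
    def __init__(self):
        self.holder = None  # client currently holding this server's lock
        self.x = 0

    def acquire(self, client):
        if self.holder is None:
            self.holder = client
            return True
        return False

    def release(self, client):
        if self.holder == client:
            self.holder = None


def try_command(client, servers, command):
    """One round of the two-phase protocol; True if the command ran everywhere."""
    granted = [s for s in servers if s.acquire(client)]  # Phase 1
    if len(granted) == len(servers):                     # Phase 2
        for s in servers:
            s.x = command(s.x)
            s.release(client)
        return True
    for s in granted:                                    # give received locks back
        s.release(client)
    return False


servers = [LockServer() for _ in range(3)]
first = try_command("u1", servers, lambda x: x + 1)    # succeeds on all servers
servers[0].acquire("crashed client")                   # holder crashes, never releases
blocked = try_command("u2", servers, lambda x: 2 * x)  # can never get every lock
```

`first` is `True` and `blocked` is `False`. However often u2 retries, the lock on `servers[0]` is never returned, which is the failure mode that motivates a weaker primitive than a lock.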

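4. A Naive Ticket Protocol

Definition of a ticket: a ticket is a weaker form of a lock, with the following properties:
(1) Reissuable: a server can issue a new ticket even if previously issued tickets have not been returned.
(2) Ticket expiration: a server accepts only the most recently issued ticket t.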

Algorithm 5. Naive Ticket Protocol

     Phase 1
1: Client asks all servers for a ticket
     Phase 2
2: if a majority of the servers replied then
3:     Client sends command together with ticket to each server
4:     Server stores command only if ticket is still valid, and replies to client
5: else
6:    Client waits, and then starts with Phase 1 again
7: end if
     Phase 3
8: if client hears a positive answer from a majority of the servers then
9:     Client tells servers to execute the stored command
10: else
11:    Client waits, and then starts with Phase 1 again
12: end if

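This algorithm has the following problem. Suppose there are two clients u1 and u2 and three servers s1, s2 and s3. (1) u1 has successfully stored command c1 on s1, s2 and s3. (2) Before u1 reaches Phase 3, u2 sends command c2 to s2 and s3, where it is stored successfully and overwrites c1. (3) u1 and u2 both tell the servers to execute the stored command. Some servers then execute c1 and others execute c2, so the states are inconsistent.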
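The failure scenario can be replayed step by step. `TicketServer` and `run_scenario` are hypothetical names, and the model assumes each server numbers its own tickets with a simple counter.

```python
class TicketServer:
    def __init__(self):
        self.latest = 0       # most recently issued ticket
        self.stored = None    # command stored in Phase 2
        self.executed = None  # command executed in Phase 3

    def issue(self):
        """Issue a new ticket; all earlier tickets expire."""
        self.latest += 1
        return self.latest

    def store(self, ticket, command):
        """Store the command only if the ticket is still valid."""
        if ticket == self.latest:
            self.stored = command
            return True
        return False

    def execute(self):
        self.executed = self.stored


def run_scenario():
    s1, s2, s3 = servers = [TicketServer() for _ in range(3)]
    # (1) u1 gets tickets from all servers and stores c1 everywhere.
    u1_tickets = [s.issue() for s in servers]
    assert all(s.store(t, "c1") for s, t in zip(servers, u1_tickets))
    # (2) u1 is slow; u2 gets newer tickets from s2 and s3 and stores c2 there.
    u2_tickets = [s.issue() for s in (s2, s3)]
    assert all(s.store(t, "c2") for s, t in zip((s2, s3), u2_tickets))
    # (3) The servers are told to execute whatever they have stored.
    for s in servers:
        s.execute()
    return [s.executed for s in servers]


result = run_scenario()
```

`result` is `["c1", "c2", "c2"]`: u1 saw a majority of positive answers for c1, yet two of the three servers execute c2.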

5. Paxos

Algorithm 6. Paxos

      Client (Proposer)                        Server (Acceptor)
      Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
      c        ~ command to execute            Tmax = 0    ~ largest issued ticket
      t = 0    ~ ticket number to try
                                               C = ⊥       ~ stored command
                                               Tstore = 0  ~ ticket used to store C
      Phase 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
  1: t = t + 1
  2: Ask all servers for ticket t
                                               3: if t > Tmax then
                                               4:     Tmax = t
                                               5:     Answer with ok(Tstore, C)
                                               6: end if
      Phase 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
  7: if a majority answers ok then
  8:     Pick (Tstore, C) with largest Tstore
  9:     if Tstore > 0 then
 10:         c = C
 11:     end if
 12:     Send propose(t, c) to same majority
 13: end if
                                              14: if t = Tmax then
                                              15:     C = c
                                              16:     Tstore = t
                                              17:     Answer success
                                              18: end if
      Phase 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 19: if a majority answers success then
 20:     Send execute(c) to every server
 21: end if

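Paxos improves on the naive ticket protocol: in Phase 1 a server no longer just hands out a ticket, it also tells the client which command it currently has stored. If u2 learns that u1 has already stored c1, it no longer tries to store c2 and instead supports u1 by proposing c1 itself. All clients then push the same command, so the order in which servers receive the clients' messages no longer causes inconsistency.
Paxos does require that fewer than half of the servers crash. If half (or more) of the servers crash, Paxos cannot make progress, because clients can no longer reach a majority.
In the end we have a reasonably good algorithm, Paxos, which achieves state replication and reaches a consistent state even if a minority of the nodes crash.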
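The listing for Algorithm 6 can be turned into a small single-process sketch. `Acceptor` and `propose_once` are names assumed for this example, `None` stands for ⊥, and the `reachable` parameter is a hypothetical way to model servers that have crashed or cannot be reached. Real deployments exchange these messages over a network and handle retries and timeouts, which are left out here.

```python
class Acceptor:
    def __init__(self):
        self.t_max = 0       # largest issued ticket
        self.command = None  # stored command C (None stands for ⊥)
        self.t_store = 0     # ticket used to store C

    def ticket(self, t):
        """Phase 1, lines 3-6: issue ticket t and report the stored command."""
        if t > self.t_max:
            self.t_max = t
            return (self.t_store, self.command)
        return None

    def propose(self, t, c):
        """Phase 2, lines 14-18: store c if ticket t is still the latest."""
        if t == self.t_max:
            self.command, self.t_store = c, t
            return True
        return False


def propose_once(acceptors, t, c, reachable=None):
    """One attempt with ticket t; returns the command to execute, or None."""
    reachable = acceptors if reachable is None else reachable
    majority = len(acceptors) // 2 + 1
    replies = [(a, a.ticket(t)) for a in reachable]
    oks = [(a, reply) for a, reply in replies if reply is not None]
    if len(oks) < majority:
        return None
    t_store, stored = max((reply for _, reply in oks), key=lambda r: r[0])
    if t_store > 0:
        c = stored  # adopt the command that is already stored
    successes = sum(a.propose(t, c) for a, _ in oks)
    return c if successes >= majority else None


servers = [Acceptor() for _ in range(3)]
chosen_by_u1 = propose_once(servers, 1, "c1", reachable=servers[:2])
chosen_by_u2 = propose_once(servers, 2, "c2", reachable=servers[1:])
```

Both calls return `"c1"`: u2 learns from the overlapping server that c1 is already stored and proposes it instead of c2. With only one of the three servers reachable, `propose_once` returns `None`, matching the requirement that a majority must respond.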
