ZooKeeper Notes: Implementing Cluster Leader Election with zk

 

I. Requirements

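In a master-slave cluster we assume the hardware is fragile and may go down at any moment. When the master dies, one of the slaves has to be chosen as the new master. ZooKeeper makes this kind of leader election easy to implement.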

 

II. Analysis

Leader election involves two questions:

1. Who becomes the leader?

2. How do the followers find out that the leader has died?

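Take the first question, who becomes the leader. It can be viewed as contention for a mutex in multithreaded programming: there is only one lock, and only one party can acquire it. Here a single ZooKeeper node, /leader-info, plays the role of the lock, and every machine in the cluster tries to create it. Because node creation in ZooKeeper is atomic, exactly one machine succeeds and all the others fail. The machine that succeeds becomes the leader and the rest become followers. The /leader-info node usually also stores some information about the leader so that followers can connect to it to exchange data or receive commands, but that happens after the election and is outside the scope of this article.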
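To make the "create as a lock" idea concrete, here is a minimal sketch that needs no ZooKeeper server. It is an assumption-laden stand-in: the class CreateAsLockSketch and its methods tryCreate, read and raceForLeader are hypothetical names, and a ConcurrentHashMap plays the role of the znode tree, with putIfAbsent imitating the atomic create.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for the znode tree; this is NOT the ZooKeeper client API. */
class CreateAsLockSketch {

    private final ConcurrentMap<String, byte[]> znodes = new ConcurrentHashMap<>();

    /** Imitates an atomic create: returns true only for the single caller that creates the path. */
    boolean tryCreate(String path, String data) {
        return znodes.putIfAbsent(path, data.getBytes(StandardCharsets.UTF_8)) == null;
    }

    /** Returns the data stored at the path, or null if the path does not exist. */
    String read(String path) {
        byte[] data = znodes.get(path);
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    /** Lets n threads race to create /leader-info and returns how many of them succeeded. */
    static int raceForLeader(int n) {
        CreateAsLockSketch store = new CreateAsLockSketch();
        AtomicInteger winners = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            String name = "node-" + i;
            threads[i] = new Thread(() -> {
                if (store.tryCreate("/leader-info", name)) {
                    winners.incrementAndGet();
                }
            }, name);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("winners: " + raceForLeader(10)); // prints "winners: 1"
    }
}
```

Whichever thread gets there first, the count of winners is always one, which is the property the election relies on.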

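The second question is how the followers are told that the leader has died. By lifetime, ZooKeeper nodes are either persistent or ephemeral. An ephemeral node is bound to the session that created it and is deleted when that session ends, which makes it usable as a liveness check and lets followers detect that the leader has gone offline. It is enough to create /leader-info as an ephemeral node and have each follower register a watcher on it for the deletion event. When the leader dies, ZooKeeper deletes /leader-info and sends an event notification to every follower watching it. On seeing that the leader is gone, each follower sets its state to looking and starts a new round of election.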
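The interplay of ephemeral nodes and watchers can be modeled in a few lines. The sketch below is a toy in-memory model, not ZooKeeper itself: EphemeralWatchSketch, createEphemeral, watchDelete and expireSession are hypothetical names, and session expiry is triggered by hand instead of by a missed heartbeat. Watchers are one-shot here, so they are removed once they fire.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** In-memory model of ephemeral nodes and one-shot delete watchers; NOT the ZooKeeper API. */
class EphemeralWatchSketch {

    private final Map<String, String> ownerByPath = new HashMap<>();            // path -> owning session
    private final Map<String, List<Runnable>> deleteWatchers = new HashMap<>(); // path -> pending watchers

    /** Creates the node for the given session; returns false if the path already exists. */
    synchronized boolean createEphemeral(String path, String sessionId) {
        return ownerByPath.putIfAbsent(path, sessionId) == null;
    }

    /** Registers a one-shot delete watcher; returns false if the path does not exist. */
    synchronized boolean watchDelete(String path, Runnable callback) {
        if (!ownerByPath.containsKey(path)) {
            return false;
        }
        deleteWatchers.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
        return true;
    }

    synchronized boolean exists(String path) {
        return ownerByPath.containsKey(path);
    }

    /** Simulates session expiry: deletes the session's nodes, then fires their watchers once. */
    void expireSession(String sessionId) {
        List<Runnable> toFire = new ArrayList<>();
        synchronized (this) {
            Iterator<Map.Entry<String, String>> it = ownerByPath.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, String> entry = it.next();
                if (entry.getValue().equals(sessionId)) {
                    String path = entry.getKey();
                    it.remove();
                    List<Runnable> watchers = deleteWatchers.remove(path);
                    if (watchers != null) {
                        toFire.addAll(watchers);
                    }
                }
            }
        }
        for (Runnable callback : toFire) {
            callback.run(); // run callbacks outside the lock so they may call back into the store
        }
    }

    /** One leader, two watching followers, then the leader's session expires. */
    static List<String> demo() {
        EphemeralWatchSketch zk = new EphemeralWatchSketch();
        List<String> notified = new ArrayList<>();
        zk.createEphemeral("/leader-info", "session-leader");
        zk.watchDelete("/leader-info", () -> notified.add("follower-1"));
        zk.watchDelete("/leader-info", () -> notified.add("follower-2"));
        zk.expireSession("session-leader");
        return notified;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [follower-1, follower-2]
    }
}
```

In a real deployment each callback would start a new election round, as described above.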

 

To summarize the election flow:

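1. Every machine in the cluster sets its state to looking and gets ready for the election.

2. Every machine in the looking state tries to create the /leader-info node.

3. The machine whose create succeeds changes its state to leader and writes some information about itself into the node. Machines whose create fails set their state to follower and try to read the leader's information from /leader-info, then run whatever logic a leader change requires.

4. When a follower reads the data of /leader-info, it may get a KeeperException.NoNodeException, because the leader died right after becoming leader (or network jitter or some other cause made its session expire). A follower that sees KeeperException.NoNodeException knows the cluster has no leader, so it sets its state back to looking and starts a new round of election.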
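The four steps above can be sketched as a small state machine. This is only an illustration under stated assumptions: the Store interface, ScriptedStore and elect are hypothetical names invented for this sketch, and an empty Optional stands in for KeeperException.NoNodeException.

```java
import java.util.Optional;

/** Election flow as a loop over a hypothetical store; NOT the ZooKeeper client API. */
class ElectionFlowSketch {

    enum Status { LOOKING, LEADER, FOLLOWER }

    /** Hypothetical minimal view of the /leader-info node. */
    interface Store {
        boolean tryCreate(String data);  // step 2: true only for the single winner
        Optional<String> readLeader();   // empty models NoNodeException
    }

    /** Runs election rounds until this node is leader or follower, or maxRounds is used up. */
    static Status elect(Store store, String self, int maxRounds) {
        Status status = Status.LOOKING;                  // step 1
        for (int round = 0; round < maxRounds && status == Status.LOOKING; round++) {
            if (store.tryCreate(self)) {                 // steps 2 and 3: create succeeded
                status = Status.LEADER;
            } else if (store.readLeader().isPresent()) { // step 3: someone else is leader
                status = Status.FOLLOWER;
            }
            // step 4: node vanished between create and read, stay LOOKING and retry
        }
        return status;
    }

    /** Scripted store: optionally lets the current leader die just before the first read. */
    static class ScriptedStore implements Store {
        private String leader;
        private boolean leaderDiesBeforeFirstRead;

        ScriptedStore(String initialLeader, boolean leaderDiesBeforeFirstRead) {
            this.leader = initialLeader;
            this.leaderDiesBeforeFirstRead = leaderDiesBeforeFirstRead;
        }

        @Override
        public boolean tryCreate(String data) {
            if (leader != null) {
                return false;
            }
            leader = data;
            return true;
        }

        @Override
        public Optional<String> readLeader() {
            if (leaderDiesBeforeFirstRead) {
                leaderDiesBeforeFirstRead = false;
                leader = null; // the short-lived leader's session expires here
            }
            return Optional.ofNullable(leader);
        }
    }

    public static void main(String[] args) {
        System.out.println(elect(new ScriptedStore("node-x", false), "node-1", 5)); // prints FOLLOWER
        System.out.println(elect(new ScriptedStore("node-x", true), "node-1", 5));  // prints LEADER
    }
}
```

The sketch retries in a bounded loop instead of recursing, so a long run of short-lived leaders cannot grow the call stack. That is a design choice of this illustration only.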

 

III. Implementation

Node.java:

package cc11001100.zookeeper.leaderElection;

import cc11001100.zookeeper.utils.ZooKeeperUtil;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

import java.io.IOException;
import java.io.UnsupportedEncodingException;

/**
 * Represents one node in the cluster; an election decides whether it is the leader or a follower
 *
 * @author CC11001100
 */
public class Node {

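	// volatile: written from the ZooKeeper event thread (watcher callback), read from the node's own thread
	private volatile Status status;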
	private String nodeForLeaderInfo;
	private ZooKeeper zooKeeper;

	public Node(String listenerNodeForLeader) throws IOException {
		this.nodeForLeaderInfo = listenerNodeForLeader;
		this.zooKeeper = ZooKeeperUtil.getZooKeeper();
		lookingForLeader();
	}

	public void lookingForLeader() {
		status = Status.LOOKING;
		try {
			String leaderInfo = Thread.currentThread().getName();
			zooKeeper.create(nodeForLeaderInfo, leaderInfo.getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
			// If the previous call did not throw, this node is now the leader
			status = Status.LEADER;
			String logMsg = Thread.currentThread().getName() + " is leader";
			System.out.println(logMsg);
		} catch (KeeperException.NodeExistsException e) {
			// The node already exists, so another node has registered as leader; this node is a follower
			status = Status.FOLLOWER;
			try {
				byte[] leaderInfoBytes = zooKeeper.getData(nodeForLeaderInfo, event -> {
					if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
						lookingForLeader();
					}
				}, null);
				String logMsg = Thread.currentThread().getName() + " is follower, master is " + new String(leaderInfoBytes, "UTF-8");
				System.out.println(logMsg);
			} catch (KeeperException.NoNodeException e1) {
				// NoNode while reading the leader info means the leader was short-lived: it died right after winning
				lookingForLeader();
			} catch (KeeperException | InterruptedException | UnsupportedEncodingException e1) {
				e1.printStackTrace();
			}
		} catch (KeeperException | InterruptedException e) {
			e.printStackTrace();
		}
	}

	public void shutdown() {
		try {
			if (zooKeeper != null) {
				zooKeeper.close();
			}
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}

	public Status getStatus() {
		return status;
	}

	// Role of the current node
	public enum Status {
		LOOKING, // election in progress
		LEADER, // election finished, this node is the leader
		FOLLOWER; // election finished, this node is a follower
	}

}

LeaderElectionTest.java:

package cc11001100.zookeeper.leaderElection;

import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/**
 * @author CC11001100
 */
public class LeaderElectionTest {

	private static void sleep(long mils) {
		try {
			TimeUnit.MILLISECONDS.sleep(mils);
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}

	public static void main(String[] args) throws IOException {

		final String LEADER_INFO_NODE = "/leader-info";
		int nodeNum = 10;
		AtomicLong idGenerator = new AtomicLong();
		AtomicInteger activeNodeCount = new AtomicInteger();
		while (true) {
			if (activeNodeCount.get() >= nodeNum) {
				sleep(10);
				continue;
			}

			// Starting a thread takes time; treat it as a machine booting and count the machine as joined before it boots
			activeNodeCount.incrementAndGet();
			new Thread(() -> {
				try {
					Node node = new Node(LEADER_INFO_NODE);
					while (true) {
						sleep(1000);
						// For the sake of the experiment, the leader randomly shuts itself down...
						if (node.getStatus() == Node.Status.LEADER && Math.random() < 0.3) {
							String logMsg = "----------------------------- " + Thread.currentThread().getName() + " shutdown -----------------------------";
							System.out.println(logMsg);
							node.shutdown();
							break;
						}
					}
				} catch (IOException e) {
					e.printStackTrace();
				} finally {
					activeNodeCount.decrementAndGet();
				}
			}, "node-" + idGenerator.getAndIncrement()).start();
		}
	}

}

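Console output (after the first round, re-election runs inside the watcher callback on the ZooKeeper client's event thread, which is why the names carry an -EventThread suffix):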

...
node-4 is leader
node-3 is follower, master is node-4
node-0 is follower, master is node-4
node-9 is follower, master is node-4
node-7 is follower, master is node-4
node-5 is follower, master is node-4
node-1 is follower, master is node-4
node-6 is follower, master is node-4
node-8 is follower, master is node-4
node-2 is follower, master is node-4
----------------------------- node-4 shutdown -----------------------------
node-0-EventThread is leader
node-6-EventThread is follower, master is node-0-EventThread
node-3-EventThread is follower, master is node-0-EventThread
node-7-EventThread is follower, master is node-0-EventThread
node-1-EventThread is follower, master is node-0-EventThread
node-5-EventThread is follower, master is node-0-EventThread
node-9-EventThread is follower, master is node-0-EventThread
node-2-EventThread is follower, master is node-0-EventThread
node-8-EventThread is follower, master is node-0-EventThread
node-10 is follower, master is node-0-EventThread
----------------------------- node-0 shutdown -----------------------------
node-6-EventThread is leader
node-7-EventThread is follower, master is node-6-EventThread
node-1-EventThread is follower, master is node-6-EventThread
node-3-EventThread is follower, master is node-6-EventThread
node-10-EventThread is follower, master is node-6-EventThread
node-9-EventThread is follower, master is node-6-EventThread
node-5-EventThread is follower, master is node-6-EventThread
node-2-EventThread is follower, master is node-6-EventThread
node-8-EventThread is follower, master is node-6-EventThread
node-11 is follower, master is node-6-EventThread
----------------------------- node-6 shutdown -----------------------------
node-1-EventThread is leader
node-10-EventThread is follower, master is node-1-EventThread
node-7-EventThread is follower, master is node-1-EventThread
node-11-EventThread is follower, master is node-1-EventThread
node-8-EventThread is follower, master is node-1-EventThread
node-5-EventThread is follower, master is node-1-EventThread
node-9-EventThread is follower, master is node-1-EventThread
node-3-EventThread is follower, master is node-1-EventThread
node-2-EventThread is follower, master is node-1-EventThread
node-12 is follower, master is node-1-EventThread
----------------------------- node-1 shutdown -----------------------------
node-3-EventThread is leader
node-12-EventThread is follower, master is node-3-EventThread
node-11-EventThread is follower, master is node-3-EventThread
node-5-EventThread is follower, master is node-3-EventThread
node-7-EventThread is follower, master is node-3-EventThread
node-9-EventThread is follower, master is node-3-EventThread
node-2-EventThread is follower, master is node-3-EventThread
node-10-EventThread is follower, master is node-3-EventThread
node-8-EventThread is follower, master is node-3-EventThread
node-13 is follower, master is node-3-EventThread
----------------------------- node-3 shutdown -----------------------------
node-5-EventThread is leader
node-13-EventThread is follower, master is node-5-EventThread
node-12-EventThread is follower, master is node-5-EventThread
node-7-EventThread is follower, master is node-5-EventThread
node-11-EventThread is follower, master is node-5-EventThread
node-10-EventThread is follower, master is node-5-EventThread
node-9-EventThread is follower, master is node-5-EventThread
node-2-EventThread is follower, master is node-5-EventThread
node-8-EventThread is follower, master is node-5-EventThread
node-14 is follower, master is node-5-EventThread
----------------------------- node-5 shutdown -----------------------------
node-7-EventThread is leader
node-13-EventThread is follower, master is node-7-EventThread
node-12-EventThread is follower, master is node-7-EventThread
node-9-EventThread is follower, master is node-7-EventThread
node-11-EventThread is follower, master is node-7-EventThread
node-14-EventThread is follower, master is node-7-EventThread
node-10-EventThread is follower, master is node-7-EventThread
node-8-EventThread is follower, master is node-7-EventThread
node-2-EventThread is follower, master is node-7-EventThread
node-15 is follower, master is node-7-EventThread
----------------------------- node-7 shutdown -----------------------------
node-14-EventThread is leader
node-13-EventThread is follower, master is node-14-EventThread
node-11-EventThread is follower, master is node-14-EventThread
node-2-EventThread is follower, master is node-14-EventThread
node-12-EventThread is follower, master is node-14-EventThread
node-15-EventThread is follower, master is node-14-EventThread
node-10-EventThread is follower, master is node-14-EventThread
node-9-EventThread is follower, master is node-14-EventThread
node-8-EventThread is follower, master is node-14-EventThread
node-16 is follower, master is node-14-EventThread
----------------------------- node-14 shutdown -----------------------------
node-13-EventThread is leader
node-12-EventThread is follower, master is node-13-EventThread
node-15-EventThread is follower, master is node-13-EventThread
node-9-EventThread is follower, master is node-13-EventThread
node-10-EventThread is follower, master is node-13-EventThread
node-2-EventThread is follower, master is node-13-EventThread
node-8-EventThread is follower, master is node-13-EventThread
node-11-EventThread is follower, master is node-13-EventThread
node-16-EventThread is follower, master is node-13-EventThread
node-17 is follower, master is node-13-EventThread
...

 

 
