
MySQL-MHA Cluster Deployment (binlog replication)

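There are plenty of tutorials online covering the theory behind MHA, so it is not explained here; a blog link is recommended instead.

MHA theory: http://www.ywnds.com/?p=8094

The MHA packages have to be downloaded via Google, or bought on CSDN.

How to build MHA, step by step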

# The four servers are assigned as follows
10.0.102.214         test3            MHA manager node
10.0.102.204         test2            master node
10.0.102.179         test1            slave node (standby master)
10.0.102.221         mgt01            slave node

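# This setup uses binlog-based replication with one master and two slaves, so that topology has to be configured first.
# Note that the slave acting as the standby master must have binary logging enabled and log_slave_updates set.
# The binlog-based replication process in MySQL is described here:
https://www.cnblogs.com/wxzhe/p/10051114.html

# Building the MySQL master-slave topology itself is not covered in this walkthrough.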

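Step 1: Build the replication topology, i.e. one master and two slaves. [Official MHA does not support one master with a single slave; Alibaba is rumored to have patched the source to support it, but the official layout is used here.]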

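Note that the following must be added to the configuration of the server acting as the standby master:

log-bin=                        # enable the binary log
log_slave_updates               # write the changes applied by the SQL thread to the binary log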
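
A quick way to confirm that a standby master's configuration file carries both settings is sketched below. `check_candidate_cnf` is a hypothetical helper, not part of MHA or MySQL, and it only inspects the file text; it does not query the running server.

```shell
# Hypothetical helper (not part of MHA or MySQL): check that a my.cnf-style
# file enables binary logging and log_slave_updates. It only reads the file;
# it does not query the running server.
check_candidate_cnf() {
    cnf="$1"
    rc=0
    # Accept log-bin or log_bin, with or without a value.
    if ! grep -Eq '^[[:space:]]*log[-_]bin([[:space:]=#]|$)' "$cnf"; then
        echo "missing: log-bin"
        rc=1
    fi
    # Accept log_slave_updates or log-slave-updates unless set to 0/OFF.
    if ! grep -E '^[[:space:]]*log[-_]slave[-_]updates([[:space:]=#]|$)' "$cnf" |
            grep -Evq '=[[:space:]]*(0|OFF|off)'; then
        echo "missing: log_slave_updates"
        rc=1
    fi
    return "$rc"
}
```

Calling it as `check_candidate_cnf /etc/my.cnf` (the path is an assumption; use wherever your MySQL configuration lives) prints nothing and returns 0 when both settings are present.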

Step 2: Install MHA

The MHA node package must be installed on every server in the MHA cluster:

# yum install epel-release perl-DBD-MySQL perl-CPAN -y             # install dependencies
# ls                                                               # in the directory holding the tarball (src)
mha4mysql-node-0.56.tar.gz
# tar zxvf mha4mysql-node-0.56.tar.gz    -C  ../                   # extract into the parent directory
# cd ../
# cd mha4mysql-node-0.56/
# ls
AUTHORS  bin  COPYING  debian  inc  lib  Makefile.PL  MANIFEST  META.yml  README  rpm  t
# perl Makefile.PL                                                 # generate the Makefile
# make && make install                                             # build and install

# cd /usr/local/bin      # after installation, the following files are generated under /usr/local/bin
# ls
apply_diff_relay_logs  filter_mysqlbinlog  purge_relay_logs  save_binary_logs
# ll
total 44
-r-xr-xr-x 1 root root 16367 Dec  8 10:29 apply_diff_relay_logs
-r-xr-xr-x 1 root root  4807 Dec  8 10:29 filter_mysqlbinlog
-r-xr-xr-x 1 root root  8261 Dec  8 10:29 purge_relay_logs
-r-xr-xr-x 1 root root  7525 Dec  8 10:29 save_binary_logs

Note: the steps above must be run on every node in the MHA cluster!
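
Since the node package has to be present everywhere, a small check like the one below can be run on each host after installation. It is only a sketch: `missing_node_tools` is a hypothetical helper, and the four tool names are taken from the /usr/local/bin listing above.

```shell
# Hypothetical helper: print the MHA node tools that are NOT installed as
# executables in the given directory (default /usr/local/bin).
missing_node_tools() {
    dir="${1:-/usr/local/bin}"
    for tool in apply_diff_relay_logs filter_mysqlbinlog purge_relay_logs save_binary_logs; do
        [ -x "$dir/$tool" ] || echo "$tool"
    done
}
```

Empty output from `missing_node_tools` means all four tools were found on that host.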

Next, install MHA-manager on the management node of the MHA cluster.

# First install the packages MHA-manager depends on
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
# Install MHA-manager
tar zxvf mha4mysql-manager-0.56.tar.gz
cd mha4mysql-manager-0.56/
perl Makefile.PL
make && make install
cp -frp samples/scripts/* /usr/local/bin     # copy the sample scripts to /usr/local/bin so that PATH does not need to change
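Script descriptions:

master_ip_failover: manages the VIP during automatic failover; optional. With keepalived you can write your own script to manage the VIP instead, for example by monitoring mysql and stopping keepalived when mysql fails, so that the VIP floats over automatically.

master_ip_online_change: manages the VIP during an online switch; optional, and can likewise be replaced by a simple shell script.

power_manager: shuts down the host after a failure; optional.

send_report: sends an alert after a failover; optional, and can be replaced by a simple shell script.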

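Step 3: Configure MHA

Configuring MHA mainly means writing the MHA configuration files and creating the directories they refer to.

The samples/ directory mentioned above also contains a conf directory with two configuration file templates: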

 

# ls
app1.cnf  masterha_default.cnf

 

Copy the configuration templates into /etc:

mkdir /etc/masterha -p             # create the directory for MHA's configuration files under /etc [any name works; ideally one that says what it holds]
cp * /etc/masterha/

First edit the masterha_default.cnf file:

# cat /etc/masterha_default.cnf
[server default]
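# Monitoring user (mha here); it must have been granted privileges
user=mha
# Password of the monitoring user created earlier
password=123456
# Replication user name used in the replication setup
repl_user=repl
# Password of the replication user
repl_password=123456
# SSH login user
ssh_user=root
# SSH port (defaults to 22 if omitted)
ssh_port=22
# Interval in seconds between pings sent to the master; the default is 3, and failover starts automatically after three attempts without a response
ping_interval=3
# Directory where the MySQL master stores its binlogs, so that MHA can find the master's binary logs
master_binlog_dir= /data/mysql/
# Directory on the MySQL master where binlogs are saved during a switch; create it on the master (defaults to /var/tmp if omitted)
remote_workdir=/data/log/masterha

# If monitoring from MHA Manager to the master fails, MHA Manager also tries to reach the master from the other hosts (here test1 and mgt01)
secondary_check_script= masterha_secondary_check -s test1 -s mgt01 --user=root --port=22 --master_host=test2 --master_port=3306
# Switch script for automatic failover (the sample script has flaws and needs to be adapted)
#master_ip_failover_script=/usr/local/bin/master_ip_failover
# Switch script for a manual online switch (the sample script has flaws and needs to be adapted)
#master_ip_online_change_script=/usr/local/bin/master_ip_online_change
# Script that sends an alert after a switch (can be written yourself)
#report_script=/usr/local/bin/send_report
# Script that shuts down the failed host (its main purpose is to power off the host to prevent split-brain; not used here)
#shutdown_script=""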

The listing above explains each masterha_default.cnf parameter; below is my own configuration:

[server default]
user=root
password=123456
ssh_user=root
ssh_port=22
ping_interval=3
repl_user=repl
repl_password=123456

master_binlog_dir= /data/mysql/

remote_workdir=/data/log/masterha

secondary_check_script= masterha_secondary_check -s test1 -s mgt01 --user=root --port=22 --master_host=test2 --master_port=3306

master_ip_failover_script= /usr/local/bin/master_ip_failover
# shutdown_script= /script/masterha/power_manager
report_script= /usr/local/bin/send_report
# master_ip_online_change_script= /script/masterha/master_ip_online_change

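Then edit the app1.cnf file.

app1.cnf applies to a single application only, but its parameters take precedence over those in masterha_default.cnf, and it usually repeats all of the masterha_default.cnf parameters. MHA can monitor several replication clusters, with one configuration file per cluster distinguished by name; since there is only one cluster here, app1.cnf is the only such file.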

# cat /etc/masterha/app1.cnf
[server default]
manager_log=/data/log/app1/manager.log
manager_workdir=/data/log/app1
master_binlog_dir=/data/mysql
password=123456
ping_interval=3
remote_workdir=/data/log/masterha
repl_password=123456
repl_user=repl
report_script=/usr/local/bin/send_report
secondary_check_script=masterha_secondary_check -s test1 -s mgt01 --user=root --port=22 --master_host=test2 --master_port=3306
ssh_port=22
ssh_user=root
user=root

[server1]
hostname=10.0.102.204
port=3306
candidate_master=1

[server2]
candidate_master=1
hostname=10.0.102.179
port=3306

[server3]
hostname=10.0.102.221
no_master=1
port=3306

Most parameters in this file are self-explanatory. Note that the directories named in the configuration file have to be created separately:

mkdir -p /data/log/masterha
mkdir /data/log/app1
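
To avoid missing a directory, the paths can be read straight from the configuration file. The sketch below assumes simple `key=value` lines like those in app1.cnf above; `mha_dirs` is a hypothetical helper, not an MHA tool.

```shell
# Hypothetical helper: print the directories an MHA config file refers to,
# i.e. manager_workdir, remote_workdir, and the directory holding manager_log.
mha_dirs() {
    awk -F= '
        {
            # trim whitespace around key and value
            gsub(/^[ \t]+|[ \t]+$/, "", $1)
            gsub(/^[ \t]+|[ \t]+$/, "", $2)
        }
        $1 == "manager_workdir" || $1 == "remote_workdir" { print $2 }
        $1 == "manager_log" { sub("/[^/]*$", "", $2); print $2 }
    ' "$1" | sort -u
}
```

For the app1.cnf shown above this prints /data/log/app1 and /data/log/masterha, which matches the two mkdir commands; `mha_dirs /etc/masterha/app1.cnf | xargs mkdir -p` would create them in one go. Keep in mind that remote_workdir is used on the MySQL nodes, while the manager_* paths are used on the manager node.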

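candidate_master=1 marks a server as a candidate master: after a failover this slave is promoted to master even if it is not the slave holding the latest events in the cluster. By default, MHA does not pick a slave as the new master if it is more than 100 MB of relay logs behind the master, because recovering such a slave takes a long time. Setting check_repl_delay=0 makes MHA ignore replication delay when choosing a new master. This is very useful on hosts with candidate_master=1, because it ensures the candidate becomes the new master during a switch.

Likewise, a slave configured as a candidate master must have binary logging and log_slave_updates enabled.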

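Set how relay logs are purged (on every slave node)

Add relay_log_purge=0 to the MySQL configuration file. A setting made in the file only takes effect after a restart; the variable can also be changed at runtime with SET GLOBAL relay_log_purge=0.

Note: during a switch, MHA relies on relay log information to recover the slaves, so automatic relay log purging must be turned OFF and relay logs purged manually. By default, relay logs on a slave are deleted automatically once the SQL thread has executed them. In an MHA environment, however, those relay logs may be needed to recover other slaves, so automatic deletion has to be disabled. Periodic purging has to take replication delay into account: on an ext3 file system, deleting a large file takes a while and can cause serious replication delay. To avoid that, a hard link to the relay log is created first, because on Linux deleting a large file through a hard link is fast. (In MySQL, the same hard-link approach is commonly used when dropping large tables.)

The MHA node package includes the purge_relay_logs tool. It creates hard links for the relay logs, executes SET GLOBAL relay_log_purge=1, waits a few seconds for the SQL thread to switch to a new relay log, and then executes SET GLOBAL relay_log_purge=0.

The purge_relay_logs options are: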

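--user mysql                 # user name
--password mysql             # password
--port                       # port number
--workdir                    # where the hard links to the relay logs are created; the default is /var/tmp. Hard links cannot be created across partitions, so specify a location on the same partition as the relay logs. After the script succeeds, the hard-linked relay log files are deleted.
--disable_relay_log_purge    # by default, if relay_log_purge=1 the script purges nothing and exits. With this option, the script sets relay_log_purge to 0 when it finds it at 1, purges the relay logs, and leaves the parameter OFF afterwards.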

Set up a script that purges relay logs periodically

# cat purge_relay.sh
#!/bin/bash
user=root
passwd=123456
port=3306
log_dir='/data/masterha/log'
work_dir='/data'
purge='/usr/local/bin/purge_relay_logs'
 
if [ ! -d $log_dir ];then
  mkdir $log_dir -p
fi
 
$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

Add the script above to a scheduled cron job:

# crontab -l
0 4 * * * sh /root/purge_relay.sh

Deleting relay logs with the purge_relay_logs script does not block the SQL thread.

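Step 4: Set up passwordless SSH authentication

The MHA manager node must be able to reach the other nodes in the cluster without a password.

The MySQL nodes must also be able to reach one another without a password.

The passwordless SSH setup itself is not written out here.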
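
For readers who want a starting point anyway, one common approach is sketched below. It assumes key-based login for root (the ssh_user in the configuration), the four addresses from the server table, and the standard OpenSSH tools ssh-keygen and ssh-copy-id. `print_ssh_setup` is a hypothetical helper that only prints the commands so they can be reviewed before being run.

```shell
# Addresses from the server table at the top of this post.
MHA_HOSTS="10.0.102.214 10.0.102.204 10.0.102.179 10.0.102.221"

# Hypothetical helper: print (do not execute) the commands that create a
# key pair if none exists and copy the public key to every host.
print_ssh_setup() {
    echo "[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    for h in $MHA_HOSTS; do
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$h"
    done
}
```

Running the printed commands on each node covers both requirements above, manager to nodes and nodes to each other. It is a superset of what is strictly needed, since every host also receives its own key.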

Use MHA to check that SSH works:

# masterha_check_ssh --conf=/etc/masterha/app1.cnf

If it succeeds, move on to the next step and check replication.

[Some blogs say to comment out master_ip_failover_script= /usr/local/bin/master_ip_failover in the configuration file for now, otherwise this check will not pass. In my test I did not comment it out and the check still succeeded.]

# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Dec  8 17:03:38 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec  8 17:03:38 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec  8 17:03:38 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Dec  8 17:03:38 2018 - [info] MHA::MasterMonitor version 0.56.
Sat Dec  8 17:03:38 2018 - [info] GTID failover mode = 0
Sat Dec  8 17:03:38 2018 - [info] Dead Servers:
Sat Dec  8 17:03:38 2018 - [info] Alive Servers:
Sat Dec  8 17:03:38 2018 - [info]   10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:03:38 2018 - [info]   10.0.102.179(10.0.102.179:3306)
Sat Dec  8 17:03:38 2018 - [info]   10.0.102.221(10.0.102.221:3306)
Sat Dec  8 17:03:38 2018 - [info] Alive Slaves:
Sat Dec  8 17:03:38 2018 - [info]   10.0.102.179(10.0.102.179:3306)  Version=5.7.22-log (oldest major version between slaves) log-bin:enabled
Sat Dec  8 17:03:38 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:03:38 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Dec  8 17:03:38 2018 - [info]   10.0.102.221(10.0.102.221:3306)  Version=5.7.22 (oldest major version between slaves) log-bin:disabled
Sat Dec  8 17:03:38 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:03:38 2018 - [info]     Not candidate for the new Master (no_master is set)
Sat Dec  8 17:03:38 2018 - [info] Current Alive Master: 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:03:38 2018 - [info] Checking slave configurations..
Sat Dec  8 17:03:38 2018 - [info]  read_only=1 is not set on slave 10.0.102.179(10.0.102.179:3306).
Sat Dec  8 17:03:38 2018 - [info]  read_only=1 is not set on slave 10.0.102.221(10.0.102.221:3306).
Sat Dec  8 17:03:38 2018 - [warning]  log-bin is not set on slave 10.0.102.221(10.0.102.221:3306). This host cannot be a master.
Sat Dec  8 17:03:38 2018 - [info] Checking replication filtering settings..
Sat Dec  8 17:03:38 2018 - [info]  binlog_do_db= , binlog_ignore_db= 
Sat Dec  8 17:03:38 2018 - [info]  Replication filtering check ok.
Sat Dec  8 17:03:38 2018 - [info] GTID (with auto-pos) is not supported
Sat Dec  8 17:03:38 2018 - [info] Starting SSH connection tests..
Sat Dec  8 17:03:40 2018 - [info] All SSH connection tests passed successfully.
Sat Dec  8 17:03:40 2018 - [info] Checking MHA Node version..
Sat Dec  8 17:03:40 2018 - [info]  Version check ok.
Sat Dec  8 17:03:40 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sat Dec  8 17:03:40 2018 - [info] HealthCheck: SSH to 10.0.102.204 is reachable.
Sat Dec  8 17:03:41 2018 - [info] Master MHA Node version is 0.56.
Sat Dec  8 17:03:41 2018 - [info] Checking recovery script configurations on 10.0.102.204(10.0.102.204:3306)..
Sat Dec  8 17:03:41 2018 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql --output_file=/data/log/masterha/save_binary_logs_test --manager_version=0.56 --start_file=test2-bin.000007 
Sat Dec  8 17:03:41 2018 - [info]   Connecting to root@10.0.102.204(10.0.102.204:22).. 
  Creating /data/log/masterha if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /data/mysql, up to test2-bin.000007
Sat Dec  8 17:03:41 2018 - [info] Binlog setting check done.
Sat Dec  8 17:03:41 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Dec  8 17:03:41 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.0.102.179 --slave_ip=10.0.102.179 --slave_port=3306 --workdir=/data/log/masterha --target_version=5.7.22-log --manager_version=0.56 --relay_log_info=/data/mysql/relay-log.info  --relay_dir=/data/mysql/  --slave_pass=xxx
Sat Dec  8 17:03:41 2018 - [info]   Connecting to root@10.0.102.179(10.0.102.179:22).. 
  Checking slave recovery environment settings..
    Opening /data/mysql/relay-log.info ... ok.
    Relay log found at /data/mysql, up to test1-relay-bin.000002
    Temporary relay log file is /data/mysql/test1-relay-bin.000002
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Dec  8 17:03:41 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.0.102.221 --slave_ip=10.0.102.221 --slave_port=3306 --workdir=/data/log/masterha --target_version=5.7.22 --manager_version=0.56 --relay_log_info=/data/mysql/relay-log.info  --relay_dir=/data/mysql/  --slave_pass=xxx
Sat Dec  8 17:03:41 2018 - [info]   Connecting to root@10.0.102.221(10.0.102.221:22).. 
  Checking slave recovery environment settings..
    Opening /data/mysql/relay-log.info ... ok.
    Relay log found at /data/mysql, up to mgt01-relay-bin.000002
    Temporary relay log file is /data/mysql/mgt01-relay-bin.000002
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Dec  8 17:03:41 2018 - [info] Slaves settings check done.
Sat Dec  8 17:03:41 2018 - [info] 
10.0.102.204(10.0.102.204:3306) (current master)
 +--10.0.102.179(10.0.102.179:3306)
 +--10.0.102.221(10.0.102.221:3306)

Sat Dec  8 17:03:41 2018 - [info] Checking replication health on 10.0.102.179..
Sat Dec  8 17:03:41 2018 - [info]  ok.
Sat Dec  8 17:03:41 2018 - [info] Checking replication health on 10.0.102.221..
Sat Dec  8 17:03:41 2018 - [info]  ok.
Sat Dec  8 17:03:41 2018 - [warning] master_ip_failover_script is not defined.
Sat Dec  8 17:03:41 2018 - [warning] shutdown_script is not defined.
Sat Dec  8 17:03:41 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
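
Because the output is long, it can help to test only for the final verdict line. The sketch below relies on nothing more than the "MySQL Replication Health is OK." line shown in the output above; `repl_health_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) when the text on stdin contains the
# final verdict line printed by masterha_check_repl on success.
repl_health_ok() {
    grep -q 'MySQL Replication Health is OK\.'
}
```

A possible use is `masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | repl_health_ok && echo "replication check passed"`, where `2>&1` is there so the verdict is seen whichever stream it is written to.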

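I once hit a case where the replication check kept listing one server under Dead Servers although the cluster itself was healthy in every respect; the cause turned out to be broken local name resolution. [Check the /etc/hosts file and the known_hosts file under the ssh directory; freshly built servers usually do not have this problem.]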
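
When chasing this kind of problem, it is worth comparing what /etc/hosts says on each node with the server table at the top of this post. The sketch below is a hypothetical helper, `hosts_lookup`, that reads a hosts-format file directly instead of going through the system resolver.

```shell
# Hypothetical helper: print the address that a hosts-format file maps a
# name to (first match wins). Defaults to /etc/hosts.
hosts_lookup() {
    name="$1"
    file="${2:-/etc/hosts}"
    awk -v n="$name" '
        /^[ \t]*#/ { next }              # skip comment lines
        {
            sub(/#.*/, "")               # drop trailing comments
            for (i = 2; i <= NF; i++)
                if ($i == n) { print $1; exit }
        }
    ' "$file"
}
```

With the layout used here, `hosts_lookup test2` should print 10.0.102.204 on every node; a different address or empty output points at the kind of resolution error described above. A stale known_hosts entry can be dropped with `ssh-keygen -R <host>`.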

Check the status of MHA-manager

# masterha_check_status --conf=/etc/masterha/app1.cnf 
app1 is stopped(2:NOT_RUNNING).

Start MHA-manager

# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover &

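# Parameter notes
remove_dead_master_conf: with this option, MHA Manager automatically removes the dead master's entry from the configuration file once failover finishes. Without it, the dead master's entry stays in the file, and restarting MHA Manager after the failover fails with an error (there is a dead slave previous dead master).
ignore_last_failover: by default, MHA Manager refuses to start a failover if the previous failover failed or happened too recently (within 8 hours by default); with this option it proceeds regardless of the last failover.
ignore_fail_on_start: by default, the master monitor process stops if one or more slaves are down, even when they are marked ignore_fail. With --ignore_fail_on_start, a down slave marked ignore_fail does not stop the master monitor.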

Check the status after startup:

# masterha_check_status --conf=/etc/masterha/app1.cnf 
app1 (pid:18866) is running(0:PING_OK), master:10.0.102.204

If startup reports no errors, the MHA cluster has been built successfully.
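
For monitoring scripts, the state and the current master can be pulled out of the status line. The sketch below is based only on the two sample lines shown in this post; `mha_state` and `mha_master` are hypothetical helper names, and other MHA versions may format the line differently.

```shell
# Hypothetical helpers that parse one masterha_check_status line from stdin.

# Print the "code:STATE" pair, e.g. 0:PING_OK or 2:NOT_RUNNING.
mha_state() {
    sed -n 's/.*(\([0-9][0-9]*:[A-Z_]*\)).*/\1/p'
}

# Print the master address when the line carries one.
mha_master() {
    sed -n 's/.*master:\([0-9.]*\).*/\1/p'
}
```

For example, `masterha_check_status --conf=/etc/masterha/app1.cnf | mha_state` would print 0:PING_OK for the running manager shown above.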

MHA-manager can be stopped with the following command:

 masterha_stop --conf=/etc/masterha/app1.cnf

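Finally, run a failover test.

Stop the master of the MySQL replication cluster and see whether MHA switches to a slave automatically. Before the test it is best to write some data; here I loaded some with tpcc: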

./tpcc_load -h 10.0.102.204 -P 3306 -d tpcc_test -u root -p 123456 -w 3

How to use tpcc for testing: https://www.cnblogs.com/wxzhe/p/10027474.html

Stop the current master server:

# service mysqld stop
Shutting down MySQL............ SUCCESS!
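
The manager log written during a failover is long, so a filter for the key lines can be handy. The sketch below assumes the log path from manager_log in app1.cnf and matches only message texts that appear in the log of this post; `failover_summary` and `new_master` are hypothetical helper names.

```shell
# Log path taken from manager_log in app1.cnf.
MHA_LOG="/data/log/app1/manager.log"

# Hypothetical helper: show the master-down notice, the phase markers and
# the line naming the new master.
failover_summary() {
    grep -E 'Master is down!|\* Phase [0-9.]+:|New master is ' "${1:-$MHA_LOG}"
}

# Hypothetical helper: print only the address of the new master.
new_master() {
    sed -n 's/.*New master is \([0-9.]*\)(.*/\1/p' "${1:-$MHA_LOG}"
}
```

In this test `new_master` would print 10.0.102.179, the candidate master.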

Then look at the MHA manager log:

Sat Dec  8 17:21:50 2018 - [info] Executing secondary network check script: masterha_secondary_check -s test1 -s mgt01 --user=root --port=22 --master_host=test2 --master_port=3306  --user=root  --master_host=10.0.102.204  --master_ip=10.0.102.204  --master_port=3306 --master_user=root --master_password=123456 --ping_type=SELECT
Sat Dec  8 17:21:50 2018 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql --output_file=/data/log/masterha/save_binary_logs_test --manager_version=0.56 --binlog_prefix=test2-bin
Monitoring server test1 is reachable, Master is not reachable from test1. OK.
Sat Dec  8 17:21:50 2018 - [info] HealthCheck: SSH to 10.0.102.204 is reachable.
Monitoring server mgt01 is reachable, Master is not reachable from mgt01. OK.
Sat Dec  8 17:21:50 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Sat Dec  8 17:21:53 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sat Dec  8 17:21:53 2018 - [warning] Connection failed 2 time(s)..
Sat Dec  8 17:21:56 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sat Dec  8 17:21:56 2018 - [warning] Connection failed 3 time(s)..
Sat Dec  8 17:21:59 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sat Dec  8 17:21:59 2018 - [warning] Connection failed 4 time(s)..
Sat Dec  8 17:21:59 2018 - [warning] Master is not reachable from health checker!
Sat Dec  8 17:21:59 2018 - [warning] Master 10.0.102.204(10.0.102.204:3306) is not reachable!
Sat Dec  8 17:21:59 2018 - [warning] SSH is reachable.
Sat Dec  8 17:21:59 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Sat Dec  8 17:21:59 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec  8 17:21:59 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec  8 17:21:59 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Dec  8 17:21:59 2018 - [info] GTID failover mode = 0
Sat Dec  8 17:21:59 2018 - [info] Dead Servers:
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info] Alive Servers:
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.179(10.0.102.179:3306)
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.221(10.0.102.221:3306)
Sat Dec  8 17:21:59 2018 - [info] Alive Slaves:
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.179(10.0.102.179:3306)  Version=5.7.22-log (oldest major version between slaves) log-bin:enabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.221(10.0.102.221:3306)  Version=5.7.22 (oldest major version between slaves) log-bin:disabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Not candidate for the new Master (no_master is set)
Sat Dec  8 17:21:59 2018 - [info] Checking slave configurations..
Sat Dec  8 17:21:59 2018 - [info]  read_only=1 is not set on slave 10.0.102.179(10.0.102.179:3306).
Sat Dec  8 17:21:59 2018 - [info]  read_only=1 is not set on slave 10.0.102.221(10.0.102.221:3306).
Sat Dec  8 17:21:59 2018 - [warning]  log-bin is not set on slave 10.0.102.221(10.0.102.221:3306). This host cannot be a master.
Sat Dec  8 17:21:59 2018 - [info] Checking replication filtering settings..
Sat Dec  8 17:21:59 2018 - [info]  Replication filtering check ok.
Sat Dec  8 17:21:59 2018 - [info] Master is down!
Sat Dec  8 17:21:59 2018 - [info] Terminating monitoring script.
Sat Dec  8 17:21:59 2018 - [info] Got exit code 20 (Master dead).
Sat Dec  8 17:21:59 2018 - [info] MHA::MasterFailover version 0.56.
Sat Dec  8 17:21:59 2018 - [info] Starting master failover.
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] * Phase 1: Configuration Check Phase..
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] GTID failover mode = 0
Sat Dec  8 17:21:59 2018 - [info] Dead Servers:
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info] Checking master reachability via MySQL(double check)...
Sat Dec  8 17:21:59 2018 - [info]  ok.
Sat Dec  8 17:21:59 2018 - [info] Alive Servers:
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.179(10.0.102.179:3306)
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.221(10.0.102.221:3306)
Sat Dec  8 17:21:59 2018 - [info] Alive Slaves:
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.179(10.0.102.179:3306)  Version=5.7.22-log (oldest major version between slaves) log-bin:enabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.221(10.0.102.221:3306)  Version=5.7.22 (oldest major version between slaves) log-bin:disabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Not candidate for the new Master (no_master is set)
Sat Dec  8 17:21:59 2018 - [info] Starting Non-GTID based failover. 
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Sat Dec  8 17:21:59 2018 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Sat Dec  8 17:21:59 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sat Dec  8 17:21:59 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] * Phase 3: Master Recovery Phase..
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] The latest binary log file/position on all slaves is test2-bin.000007:154
Sat Dec  8 17:21:59 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.179(10.0.102.179:3306)  Version=5.7.22-log (oldest major version between slaves) log-bin:enabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.221(10.0.102.221:3306)  Version=5.7.22 (oldest major version between slaves) log-bin:disabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Not candidate for the new Master (no_master is set)
Sat Dec  8 17:21:59 2018 - [info] The oldest binary log file/position on all slaves is test2-bin.000007:154
Sat Dec  8 17:21:59 2018 - [info] Oldest slaves:
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.179(10.0.102.179:3306)  Version=5.7.22-log (oldest major version between slaves) log-bin:enabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Dec  8 17:21:59 2018 - [info]   10.0.102.221(10.0.102.221:3306)  Version=5.7.22 (oldest major version between slaves) log-bin:disabled
Sat Dec  8 17:21:59 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:21:59 2018 - [info]     Not candidate for the new Master (no_master is set)
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Sat Dec  8 17:21:59 2018 - [info] 
Sat Dec  8 17:21:59 2018 - [info] Fetching dead master's binary logs..
Sat Dec  8 17:21:59 2018 - [info] Executing command on the dead master 10.0.102.204(10.0.102.204:3306): save_binary_logs --command=save --start_file=test2-bin.000007  --start_pos=154 --binlog_dir=/data/mysql --output_file=/data/log/masterha/saved_master_binlog_from_10.0.102.204_3306_20181208172159.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.56
  Creating /data/log/masterha if not exists..    ok.
 Concat binary/relay logs from test2-bin.000007 pos 154 to test2-bin.000007 EOF into /data/log/masterha/saved_master_binlog_from_10.0.102.204_3306_20181208172159.binlog ..
 Binlog Checksum enabled
  Dumping binlog format description event, from position 0 to 154.. ok.
  Dumping effective binlog data from /data/mysql/test2-bin.000007 position 154 to tail(177).. ok.
 Binlog Checksum enabled
 Concat succeeded.
Sat Dec  8 17:22:00 2018 - [info] scp from root@10.0.102.204:/data/log/masterha/saved_master_binlog_from_10.0.102.204_3306_20181208172159.binlog to local:/data/log/app1/saved_master_binlog_from_10.0.102.204_3306_20181208172159.binlog succeeded.
Sat Dec  8 17:22:00 2018 - [info] HealthCheck: SSH to 10.0.102.179 is reachable.
Sat Dec  8 17:22:00 2018 - [info] HealthCheck: SSH to 10.0.102.221 is reachable.
Sat Dec  8 17:22:00 2018 - [info] 
Sat Dec  8 17:22:00 2018 - [info] * Phase 3.3: Determining New Master Phase..
Sat Dec  8 17:22:00 2018 - [info] 
Sat Dec  8 17:22:00 2018 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Sat Dec  8 17:22:00 2018 - [info] All slaves received relay logs to the same position. No need to resync each other.
Sat Dec  8 17:22:00 2018 - [info] Searching new master from slaves..
Sat Dec  8 17:22:00 2018 - [info]  Candidate masters from the configuration file:
Sat Dec  8 17:22:00 2018 - [info]   10.0.102.179(10.0.102.179:3306)  Version=5.7.22-log (oldest major version between slaves) log-bin:enabled
Sat Dec  8 17:22:00 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:22:00 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Dec  8 17:22:00 2018 - [info]  Non-candidate masters:
Sat Dec  8 17:22:00 2018 - [info]   10.0.102.221(10.0.102.221:3306)  Version=5.7.22 (oldest major version between slaves) log-bin:disabled
Sat Dec  8 17:22:00 2018 - [info]     Replicating from 10.0.102.204(10.0.102.204:3306)
Sat Dec  8 17:22:00 2018 - [info]     Not candidate for the new Master (no_master is set)
Sat Dec  8 17:22:00 2018 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Sat Dec  8 17:22:00 2018 - [info] New master is 10.0.102.179(10.0.102.179:3306)
Sat Dec  8 17:22:00 2018 - [info] Starting master failover..
Sat Dec  8 17:22:00 2018 - [info] 
From:
10.0.102.204(10.0.102.204:3306) (current master)
 +--10.0.102.179(10.0.102.179:3306)
 +--10.0.102.221(10.0.102.221:3306)

To:
10.0.102.179(10.0.102.179:3306) (new master)
 +--10.0.102.221(10.0.102.221:3306)
Sat Dec  8 17:22:00 2018 - [info] 
Sat Dec  8 17:22:00 2018 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Sat Dec  8 17:22:00 2018 - [info] 
Sat Dec  8 17:22:00 2018 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sat Dec  8 17:22:00 2018 - [info] Sending binlog..
Sat Dec  8 17:22:01 2018 - [info] scp from local:/data/log/app1/saved_master_binlog_from_10.0.102.204_3306_20181208172159.binlog to root@10.0.102.179:/data/log/masterha/saved_master_binlog_from_10.0.102.204_3306_20181208172159.binlog succeeded.
Sat Dec  8 17:22:01 2018 - [info] 
Sat Dec  8 17:22:01 2018 - [info] * Phase 3.4: Master Log Apply Phase..
Sat Dec  8 17:22:01 2018 - [info] 
Sat Dec  8 17:22:01 2018 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Sat Dec  8 17:22:01 2018 - [info] Starting recovery on 10.0.102.179(10.0.102.179:3306)..
Sat Dec  8 17:22:01 2018 - [info]  Generating diffs succeeded.
Sat Dec  8 17:22:01 2018 - [info] Waiting until all relay logs are applied.
Sat Dec  8 17:22:01 2018 - [info]  done.
Sat Dec  8 17:22:01 2018 - [info] Getting slave status..
Sat Dec  8 17:22:01 2018 - [info] This slave(10.0.102.179)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(test2-bin.000007:154). No need to recover from Exec_Master_Log_Pos.
Sat Dec  8 17:22:01 2018 - [info] Connecting to the target slave host 10.0.102.179, ru