MySQL High-Availability Architecture: Setting Up and Testing MHA (Part 2)

I. MHA Features

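MHA monitors the master server of a replication topology and automatically performs a failover as soon as it detects that the master has failed.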

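Even if some slaves have not received the latest relay log events, MHA automatically identifies the differential relay log events on the most up-to-date slave and applies them to the other slaves, so all slaves end up consistent. MHA usually completes a failover within seconds: 9-12 seconds to detect the master failure, 7-10 seconds to shut down the failed master to avoid split brain, and a few seconds to apply the differential relay logs to the new master, so the whole process finishes in 10-30 seconds. You can also set priorities to designate a particular slave as the candidate master. Because MHA repairs consistency between the slaves, any slave can be promoted to the new master without the consistency problems that would otherwise cause replication to fail.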

II. Things to Note

1. The slave databases must be set to read_only;

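2. Once a failover has happened, the manager process exits; before restarting masterha_manager for another failover, you must manually delete app1.failover.complete from the manager work directory;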
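
The cleanup in note 2 can be scripted. The following is only a sketch: the function name clear_failover_marker is made up for illustration, and it assumes the marker file is named <app>.failover.complete inside the manager work directory, as the note describes.

```shell
#!/bin/sh
# Hypothetical helper: remove the failover marker left behind by the last failover.
# $1 = manager work directory, $2 = application name (marker is <app>.failover.complete)
clear_failover_marker() {
    marker="$1/$2.failover.complete"
    if [ -f "$marker" ]; then
        rm -f "$marker" && echo "removed $marker"
    else
        echo "no marker at $marker"
    fi
}

# Example, using the manager_workdir from the app1.conf shown later in this article:
# clear_failover_marker /var/log/masterha/app1 app1
```

masterha_manager also accepts --ignore_last_failover, which skips the marker check instead of deleting the file.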

3. MHA needs the mysqlbinlog command during a failover; if MySQL was not installed in the standard location, create the symlinks manually;

ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql
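
Before relying on the symlinks, it can help to confirm that the commands really resolve. A minimal sketch follows; missing_cmds is a name invented here, not an MHA tool. MHA runs its helpers over non-interactive SSH, where PATH is often shorter than in a login shell, so running the check through ssh gives a more realistic answer.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that is not found in PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Example: run locally on each MySQL host.
# missing_cmds mysql mysqlbinlog
```

An empty result means both commands were found; any name printed still needs a symlink or a PATH fix.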

4. Both the master and the slave configuration must contain

binlog-do-db=test
replicate-do-db=test
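Normally the master needs binlog-do-db=test and the slaves need replicate-do-db=test, and that is enough for replication to work. With only that configuration, however, the following error is reported: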
All log-bin enabled servers must have same binlog filtering rules (same binlog-do-db and binlog-ignore-db). Check SHOW MASTER STATUS output and set my.cnf correctly.
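The message reads as if the replicated databases must be identical on master and slaves. That is not quite it: what must be identical is the database filtering part of the configuration files.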
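
The rule in note 4 can be checked before running MHA. The sketch below compares the binlog-do-db and binlog-ignore-db lines of two my.cnf files; the function names are hypothetical. MHA itself compares the SHOW MASTER STATUS output of each server, so this only approximates its check.

```shell
#!/bin/sh
# Hypothetical helper: print the normalised binlog filter lines of a my.cnf file.
binlog_filters() {
    # Strip whitespace, accept "_" or "-" in the option name, sort so order does not matter.
    sed -n -e 's/[[:space:]]//g' \
           -e 's/^binlog[-_]do[-_]db=/binlog-do-db=/p' \
           -e 's/^binlog[-_]ignore[-_]db=/binlog-ignore-db=/p' "$1" | sort
}

# Succeeds (exit status 0) only when both files carry the same binlog filter rules.
same_binlog_filters() {
    [ "$(binlog_filters "$1")" = "$(binlog_filters "$2")" ]
}

# Example (file names are placeholders for copies of each server's my.cnf):
# same_binlog_filters master.cnf slave.cnf || echo "binlog filters differ"
```
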
5. On the slaves, add relay_log_purge=0. Without it, a warning is reported: relay_log_purge=0 is not set on slave (a later article will explain and test this).
6. In the MHA configuration file
candidate_master=1                              # prefer this slave when promoting a new master
no_master=1                                     # never promote this server to master
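
To see at a glance which role each server has, the application config can be summarised with a few lines of awk. This is a sketch, not part of MHA: server_roles is a made-up name, and the parser only understands the simple key=value layout used in this article's app1.conf.

```shell
#!/bin/sh
# Hypothetical helper: print "<section> <hostname> <role>" for each [serverN] block.
# role is "candidate" (candidate_master=1), "no_master" (no_master=1) or "normal".
server_roles() {
    awk '
        function emit() { if (sec != "") print sec, host, role }
        /^[ \t]*#/ { next }                    # skip whole-line comments
        /^[ \t]*\[/ {                          # a new [section] starts
            emit()
            sec = $0
            gsub(/[ \t]/, "", sec); sub(/^\[/, "", sec); sub(/\].*$/, "", sec)
            if (sec == "serverdefault") sec = ""   # [server default] is not a host
            host = "-"; role = "normal"
            next
        }
        {
            line = $0
            sub(/#.*/, "", line)               # drop trailing comments
            gsub(/[ \t]/, "", line)
            n = split(line, kv, "=")
            if (n < 2) next
            if (kv[1] == "hostname") host = kv[2]
            else if (kv[1] == "candidate_master" && kv[2] == "1") role = "candidate"
            else if (kv[1] == "no_master" && kv[2] == "1") role = "no_master"
        }
        END { emit() }
    ' "$1"
}

# Example:
# server_roles /etc/masterha/app1.conf
```

If every line ends in no_master, no server is eligible for promotion, which is worth catching before a real failure.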

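7. Checks performed by masterha_check_repl

a. Read the configuration files
b. Check the MySQL servers listed in the configuration files (identify master and slaves)
c. Check the slave configuration
    read_only parameter
    relay_log_purge parameter
    replication filtering rules
d. Verify SSH equivalence (passwordless SSH between the hosts)
e. Check the script that saves the master's binlog (save_binary_logs), which is mainly used to read events from the binlog after the master dies
f. Check whether each slave can apply the differential logs (apply_diff_relay_logs)
g. Check IP failover, if a script is deployed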
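
The output of these checks is long, so a short summary can be handy. The sketch below counts the [warning] and [error] lines and looks for the closing line "MySQL Replication Health is OK." that masterha_check_repl prints on success (it appears in the output later in this article). The function name check_repl_summary is an assumption of this example.

```shell
#!/bin/sh
# Hypothetical helper: summarise masterha_check_repl output read from stdin.
check_repl_summary() {
    awk '
        /\[warning\]/ { warnings++ }
        /\[error\]/   { errors++ }
        /MySQL Replication Health is OK/ { ok = 1 }
        END {
            printf "warnings=%d errors=%d health=%s\n", warnings, errors, (ok ? "OK" : "NOT OK")
        }
    '
}

# Example:
# masterha_check_repl --global_conf=/etc/masterha/masterha_default.conf \
#     --conf=/etc/masterha/app1.conf 2>&1 | check_repl_summary
```
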

8. Conditions for an online master switch

1. IO threads on all slaves are running
2. SQL threads on all slaves are running
3. Seconds_Behind_Master on all slaves is less than or equal to --running_updates_limit seconds
4. On the master, no update query in the SHOW PROCESSLIST output has been running for more than --running_updates_limit seconds
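
Conditions 1 to 3 can be expressed as a small per-slave check. The sketch below takes the values of Slave_IO_Running, Slave_SQL_Running and Seconds_Behind_Master from SHOW SLAVE STATUS as plain arguments. The function name is hypothetical, and condition 4 is not covered because it requires inspecting SHOW PROCESSLIST on the master.

```shell
#!/bin/sh
# Hypothetical helper: decide whether one slave satisfies conditions 1-3.
# $1 = Slave_IO_Running, $2 = Slave_SQL_Running,
# $3 = Seconds_Behind_Master, $4 = running_updates_limit (seconds)
slave_ready_for_switch() {
    [ "$1" = "Yes" ] || { echo "IO thread not running"; return 1; }
    [ "$2" = "Yes" ] || { echo "SQL thread not running"; return 1; }
    case "$3" in
        ''|*[!0-9]*) echo "Seconds_Behind_Master is not a number: $3"; return 1 ;;
    esac
    [ "$3" -le "$4" ] || { echo "lag $3s exceeds limit $4s"; return 1; }
    echo "ok"
}

# Example:
# slave_ready_for_switch Yes Yes 0 1
```

A NULL value for Seconds_Behind_Master, which MySQL reports when replication is not running, is rejected by the numeric check.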


III. Setting Up the MHA Environment

# MySQL version
mysql> select version();
+------------+
| version()  |
+------------+
| 5.6.37-log |
+------------+
1 row in set (0.00 sec)
# Hosts
192.168.18.50  master database
192.168.18.60  slave database
192.168.18.70  slave database

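# Set up SSH trust between the three servers
# Run ssh-keygen on every server, append each server's id_rsa.pub to a single authorized_keys file, then copy that file to /root/.ssh on every server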
[[email protected] ~]# ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:+FbMFmsG8/Yeq9Enfg4fN/0N/Ff0kcu/l0hq6g7YDFI [email protected]
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|     E  o .     .|
|    .  . * o   o.|
|   . .. S @   ..+|
|    . =. * o o o+|
|     . +o . O *.*|
|       ..  * O.*B|
|        o++.+oo.B|

cat id_rsa.pub >> authorized_keys
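
The key-collection step above can be written as a small function. This is a sketch under assumptions: build_authorized_keys is a made-up name, and the public keys of the three servers are assumed to have been copied into one directory first.

```shell
#!/bin/sh
# Hypothetical helper: merge several public key files into one authorized_keys file.
# $1 = output file, remaining arguments = id_rsa.pub files collected from each server
build_authorized_keys() {
    out="$1"; shift
    # Concatenate all keys, drop blank lines and duplicates, keep first-seen order.
    cat "$@" | awk 'NF && !seen[$0]++' > "$out"
    chmod 600 "$out"
}

# Example (the .pub file names are placeholders):
# build_authorized_keys authorized_keys 50.pub 60.pub 70.pub
# for h in 192.168.18.50 192.168.18.60 192.168.18.70; do
#     scp authorized_keys root@"$h":/root/.ssh/authorized_keys
# done
```
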

# Install MySQL, then build a one-master, two-slave topology
# After installation, initialize the MySQL accounts
mysql> delete from mysql.user where user!='root' or host!='localhost';
Query OK, 5 rows affected (0.00 sec)

mysql> truncate table mysql.db;
Query OK, 0 rows affected (0.05 sec)

mysql> drop database test;
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.03 sec)
mysql> grant all privileges on *.* to 'mha'@'%' identified by '123456';
Query OK, 0 rows affected (0.01 sec)

mysql> grant replication slave on *.* to 'repl' identified by 'repl4slave';
Query OK, 0 rows affected (0.03 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.04 sec)
# Install the MHA software
#yum install perl-DBD-MySQL
#yum install perl-Config-Tiny
#yum install perl-Log-Dispatch
#yum install perl-Parallel-ForkManager
rpm -ivh mha4mysql-manager
rpm -ivh mha4mysql-node
# Configuration
[[email protected] masterha]# more masterha_default.conf 
[server default]
# MySQL user and password
user=mha
password=123456

# System SSH user
ssh_user=root

# Replication user
repl_user=repl
repl_password= repl4slave


# Monitoring
ping_interval=1
shutdown_script=""

# Scripts invoked during failover and online switch
master_ip_failover_script= /etc/masterha/master_ip_failover
master_ip_online_change_script= /etc/masterha/master_ip_online_change
[[email protected] masterha]# more app1.conf 
[server default]


# MHA manager working directory
manager_workdir = /var/log/masterha/app1
manager_log = /var/log/masterha/app1/app1.log
remote_workdir = /var/log/masterha/app1

[server1]
hostname=test
port=3307
master_binlog_dir = /mydata/mysql/mysql_3307
candidate_master = 1
# check_repl_delay=0 keeps the failover from getting stuck when a slave is lagging at the moment the master fails
check_repl_delay = 0

[server2]
hostname=test1
port=3307
master_binlog_dir=/mydata/mysql/mysql_3307
no_master =1
check_repl_delay=0


[server3]
hostname=test2
port=3307
master_binlog_dir=/mydata/mysql/mysql_3307
candidate_master=1
check_repl_delay=0

# Test SSH
[[email protected] masterha]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf 
Wed Sep 27 08:53:51 2017 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Wed Sep 27 08:53:51 2017 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Wed Sep 27 08:53:51 2017 - [info] Reading server configuration from /etc/masterha/app1.conf..
Wed Sep 27 08:53:51 2017 - [info] Starting SSH connection tests..
Wed Sep 27 08:53:53 2017 - [debug] 
Wed Sep 27 08:53:51 2017 - [debug]  Connecting via SSH from [email protected](192.168.18.50:22) to [email protected](192.168.18.60:22)..
Wed Sep 27 08:53:51 2017 - [debug]   ok.
Wed Sep 27 08:53:51 2017 - [debug]  Connecting via SSH from [email protected](192.168.18.50:22) to [email protected](192.168.18.70:22)..
Wed Sep 27 08:53:52 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [debug] 
Wed Sep 27 08:53:52 2017 - [debug]  Connecting via SSH from [email protected](192.168.18.60:22) to [email protected](192.168.18.50:22)..
Wed Sep 27 08:53:52 2017 - [debug]   ok.
Wed Sep 27 08:53:52 2017 - [debug]  Connecting via SSH from [email protected](192.168.18.60:22) to [email protected](192.168.18.70:22)..
Warning: Permanently added '192.168.18.70' (RSA) to the list of known hosts.
Wed Sep 27 08:53:53 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [debug] 
Wed Sep 27 08:53:52 2017 - [debug]  Connecting via SSH from [email protected](192.168.18.70:22) to [email protected](192.168.18.50:22)..
Wed Sep 27 08:53:53 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [debug]  Connecting via SSH from [email protected](192.168.18.70:22) to [email protected](192.168.18.60:22)..
Warning: Permanently added '192.168.18.60' (RSA) to the list of known hosts.
Wed Sep 27 08:53:53 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [info] All SSH connection tests passed successfully.

# Test the master-slave replication topology
[[email protected] masterha]# masterha_check_repl --global_conf=masterha_default.conf --conf=app1.conf 
Thu Sep 28 06:14:11 2017 - [info] Reading default configuration from masterha_default.conf..
Thu Sep 28 06:14:11 2017 - [info] Reading application default configuration from app1.conf..
Thu Sep 28 06:14:11 2017 - [info] Reading server configuration from app1.conf..
Thu Sep 28 06:14:11 2017 - [info] MHA::MasterMonitor version 0.56.
Thu Sep 28 06:14:12 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:14:12 2017 - [info] Dead Servers:
Thu Sep 28 06:14:12 2017 - [info] Alive Servers:
Thu Sep 28 06:14:12 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:14:12 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:14:12 2017 - [info] Alive Slaves:
Thu Sep 28 06:14:12 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:14:12 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:14:12 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:14:12 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:14:12 2017 - [info] Current Alive Master: test(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info] Checking slave configurations..
Thu Sep 28 06:14:12 2017 - [warning]  relay_log_purge=0 is not set on slave test1(192.168.18.60:3307).
Thu Sep 28 06:14:12 2017 - [warning]  relay_log_purge=0 is not set on slave test2(192.168.18.70:3307).
Thu Sep 28 06:14:12 2017 - [info] Checking replication filtering settings..
Thu Sep 28 06:14:12 2017 - [info]  binlog_do_db= AAA, binlog_ignore_db= 
Thu Sep 28 06:14:12 2017 - [info]  Replication filtering check ok.
Thu Sep 28 06:14:12 2017 - [info] GTID (with auto-pos) is not supported
Thu Sep 28 06:14:12 2017 - [info] Starting SSH connection tests..
Thu Sep 28 06:14:13 2017 - [info] All SSH connection tests passed successfully.
Thu Sep 28 06:14:13 2017 - [info] Checking MHA Node version..
Thu Sep 28 06:14:14 2017 - [info]  Version check ok.
Thu Sep 28 06:14:14 2017 - [info] Checking SSH publickey authentication settings on the current master..
Thu Sep 28 06:14:14 2017 - [info] HealthCheck: SSH to test is reachable.
Thu Sep 28 06:14:15 2017 - [info] Master MHA Node version is 0.56.
Thu Sep 28 06:14:15 2017 - [info] Checking recovery script configurations on test(192.168.18.50:3307)..
Thu Sep 28 06:14:15 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.56 --start_file=mybinlog.000001 
Thu Sep 28 06:14:15 2017 - [info]   Connecting to [email protected](test:22).. 
  Creating /var/log/masterha/app1 if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /mydata/mysql/mysql_3307, up to mybinlog.000001
Thu Sep 28 06:14:15 2017 - [info] Binlog setting check done.
Thu Sep 28 06:14:15 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Sep 28 06:14:15 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test1 --slave_ip=192.168.18.60 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:14:15 2017 - [info]   Connecting to [email protected](test1:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:14:15 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test2 --slave_ip=192.168.18.70 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:14:15 2017 - [info]   Connecting to [email protected](test2:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:14:16 2017 - [info] Slaves settings check done.
Thu Sep 28 06:14:16 2017 - [info] 
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

Thu Sep 28 06:14:16 2017 - [info] Checking replication health on test1..
Thu Sep 28 06:14:16 2017 - [info]  ok.
Thu Sep 28 06:14:16 2017 - [info] Checking replication health on test2..
Thu Sep 28 06:14:16 2017 - [info]  ok.
Thu Sep 28 06:14:16 2017 - [info] Checking master_ip_failover_script status:
Thu Sep 28 06:14:16 2017 - [info]   /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 
Thu Sep 28 06:14:16 2017 - [info]  OK.
Thu Sep 28 06:14:16 2017 - [warning] shutdown_script is not defined.
Thu Sep 28 06:14:16 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
# First bring up the VIP manually
[[email protected] masterha]# ./init_vip.sh 

# Start MHA
[[email protected] masterha]# masterha_manager --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf >/tmp/mha_manager.log 2>&1 &

# View the MHA startup log
Thu Sep 28 06:27:29 2017 - [info] MHA::MasterMonitor version 0.56.
Thu Sep 28 06:27:29 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:27:29 2017 - [info] Dead Servers:
Thu Sep 28 06:27:29 2017 - [info] Alive Servers:
Thu Sep 28 06:27:29 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:27:29 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:27:29 2017 - [info] Alive Slaves:
Thu Sep 28 06:27:29 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:27:29 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:27:29 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:27:29 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:27:29 2017 - [info] Current Alive Master: test(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info] Checking slave configurations..
Thu Sep 28 06:27:29 2017 - [warning]  relay_log_purge=0 is not set on slave test1(192.168.18.60:3307).
Thu Sep 28 06:27:29 2017 - [warning]  relay_log_purge=0 is not set on slave test2(192.168.18.70:3307).
Thu Sep 28 06:27:29 2017 - [info] Checking replication filtering settings..
Thu Sep 28 06:27:29 2017 - [info]  binlog_do_db= AAA, binlog_ignore_db= 
Thu Sep 28 06:27:29 2017 - [info]  Replication filtering check ok.
Thu Sep 28 06:27:29 2017 - [info] GTID (with auto-pos) is not supported
Thu Sep 28 06:27:29 2017 - [info] Starting SSH connection tests..
Thu Sep 28 06:27:31 2017 - [info] All SSH connection tests passed successfully.
Thu Sep 28 06:27:31 2017 - [info] Checking MHA Node version..
Thu Sep 28 06:27:31 2017 - [info]  Version check ok.
Thu Sep 28 06:27:31 2017 - [info] Checking SSH publickey authentication settings on the current master..
Thu Sep 28 06:27:31 2017 - [info] HealthCheck: SSH to test is reachable.
Thu Sep 28 06:27:31 2017 - [info] Master MHA Node version is 0.56.
Thu Sep 28 06:27:31 2017 - [info] Checking recovery script configurations on test(192.168.18.50:3307)..
Thu Sep 28 06:27:31 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.56 --start_file=mybinlog.000001 
Thu Sep 28 06:27:31 2017 - [info]   Connecting to [email protected](test:22).. 
  Creating /var/log/masterha/app1 if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /mydata/mysql/mysql_3307, up to mybinlog.000001
Thu Sep 28 06:27:31 2017 - [info] Binlog setting check done.
Thu Sep 28 06:27:31 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Sep 28 06:27:31 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test1 --slave_ip=192.168.18.60 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:27:31 2017 - [info]   Connecting to [email protected](test1:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:27:32 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test2 --slave_ip=192.168.18.70 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:27:32 2017 - [info]   Connecting to [email protected](test2:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:27:32 2017 - [info] Slaves settings check done.
Thu Sep 28 06:27:32 2017 - [info] 
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

Thu Sep 28 06:27:32 2017 - [info] Checking master_ip_failover_script status:
Thu Sep 28 06:27:32 2017 - [info]   /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 
Thu Sep 28 06:27:32 2017 - [info]  OK.
Thu Sep 28 06:27:32 2017 - [warning] shutdown_script is not defined.
Thu Sep 28 06:27:32 2017 - [info] Set master ping interval 1 seconds.
Thu Sep 28 06:27:32 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Thu Sep 28 06:27:32 2017 - [info] Starting ping health check on test(192.168.18.50:3307)..
Thu Sep 28 06:27:32 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

# Failover test: stop the mysql service on server test
# View the failover log
Thu Sep 28 06:32:42 2017 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Thu Sep 28 06:32:42 2017 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.56 --binlog_prefix=mybinlog
Thu Sep 28 06:32:42 2017 - [info] HealthCheck: SSH to test is reachable.
Thu Sep 28 06:32:43 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Sep 28 06:32:43 2017 - [warning] Connection failed 2 time(s)..
Thu Sep 28 06:32:44 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Sep 28 06:32:44 2017 - [warning] Connection failed 3 time(s)..
Thu Sep 28 06:32:45 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Sep 28 06:32:45 2017 - [warning] Connection failed 4 time(s)..
Thu Sep 28 06:32:45 2017 - [warning] Master is not reachable from health checker!
Thu Sep 28 06:32:45 2017 - [warning] Master test(192.168.18.50:3307) is not reachable!
Thu Sep 28 06:32:45 2017 - [warning] SSH is reachable.
Thu Sep 28 06:32:45 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha/masterha_default.conf and /etc/masterha/app1.conf again, and trying to connect to all servers to check server status..
Thu Sep 28 06:32:45 2017 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Thu Sep 28 06:32:45 2017 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Thu Sep 28 06:32:45 2017 - [info] Reading server configuration from /etc/masterha/app1.conf..
Thu Sep 28 06:32:45 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:32:45 2017 - [info] Dead Servers:
Thu Sep 28 06:32:45 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info] Alive Servers:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:32:45 2017 - [info] Alive Slaves:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] Checking slave configurations..
Thu Sep 28 06:32:45 2017 - [warning]  relay_log_purge=0 is not set on slave test1(192.168.18.60:3307).
Thu Sep 28 06:32:45 2017 - [warning]  relay_log_purge=0 is not set on slave test2(192.168.18.70:3307).
Thu Sep 28 06:32:45 2017 - [info] Checking replication filtering settings..
Thu Sep 28 06:32:45 2017 - [info]  Replication filtering check ok.
Thu Sep 28 06:32:45 2017 - [info] Master is down!
Thu Sep 28 06:32:45 2017 - [info] Terminating monitoring script.
Thu Sep 28 06:32:45 2017 - [info] Got exit code 20 (Master dead).
Thu Sep 28 06:32:45 2017 - [info] MHA::MasterFailover version 0.56.
Thu Sep 28 06:32:45 2017 - [info] Starting master failover.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:32:45 2017 - [info] Dead Servers:
Thu Sep 28 06:32:45 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info] Checking master reachability via MySQL(double check)...
Thu Sep 28 06:32:45 2017 - [info]  ok.
Thu Sep 28 06:32:45 2017 - [info] Alive Servers:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:32:45 2017 - [info] Alive Slaves:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] Starting Non-GTID based failover.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Sep 28 06:32:45 2017 - [info] Executing master IP deactivation script:
Thu Sep 28 06:32:45 2017 - [info]   /etc/masterha/master_ip_failover --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --command=stopssh --ssh_user=root  
Thu Sep 28 06:32:45 2017 - [info]  done.
Thu Sep 28 06:32:45 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Sep 28 06:32:45 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 3: Master Recovery Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] The latest binary log file/position on all slaves is mybinlog.000001:1048
Thu Sep 28 06:32:45 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] The oldest binary log file/position on all slaves is mybinlog.000001:1048
Thu Sep 28 06:32:45 2017 - [info] Oldest slaves:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] Fetching dead master's binary logs..
Thu Sep 28 06:32:45 2017 - [info] Executing command on the dead master test(192.168.18.50:3307): save_binary_logs --command=save --start_file=mybinlog.000001  --start_pos=1048 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/saved_master_binlog_from_test_3307_20170928063245.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.56
  Creating /var/log/masterha/app1 if not exists..    ok.
 Concat binary/relay logs from mybinlog.000001 pos 1048 to mybinlog.000001 EOF into /var/log/masterha/app1/saved_master_binlog_from_test_3307_20170928063245.binlog ..
 Binlog Checksum enabled
  Dumping binlog format description event, from position 0 to 120.. ok.
  No need to dump effective binlog data from /mydata/mysql/mysql_3307/mybinlog.000001 (pos starts 1048, filesize 1048). Skipping.
sh: mysqlbinlog: command not found
Failed to save binary log: /var/log/masterha/app1/saved_master_binlog_from_test_3307_20170928063245.binlog is broken!
 at /usr/bin/save_binary_logs line 176
Thu Sep 28 06:32:46 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln760] Failed to save binary log events from the orig master. Maybe disks on binary logs are not accessible or binary log itself is corrupt?
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 3.3: Determining New Master Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Thu Sep 28 06:32:46 2017 - [info] All slaves received relay logs to the same position. No need to resync each other.
Thu Sep 28 06:32:46 2017 - [info] Searching new master from slaves..
Thu Sep 28 06:32:46 2017 - [info]  Candidate masters from the configuration file:
Thu Sep 28 06:32:46 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:46 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:46 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:46 2017 - [info]  Non-candidate masters:
Thu Sep 28 06:32:46 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:46 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:46 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:46 2017 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Thu Sep 28 06:32:46 2017 - [info] New master is test2(192.168.18.70:3307)
Thu Sep 28 06:32:46 2017 - [info] Starting master failover..
Thu Sep 28 06:32:46 2017 - [info] 
From:
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

To:
test2(192.168.18.70:3307) (new master)
 +--test1(192.168.18.60:3307)
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 3.4: Master Log Apply Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Thu Sep 28 06:32:46 2017 - [info] Starting recovery on test2(192.168.18.70:3307)..
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. Waiting all logs to be applied.. 
Thu Sep 28 06:32:46 2017 - [info]   done.
Thu Sep 28 06:32:46 2017 - [info]  All relay logs were successfully applied.
Thu Sep 28 06:32:46 2017 - [info] Getting new master's binlog name and position..
Thu Sep 28 06:32:46 2017 - [info]  mybinlog.000003:847
Thu Sep 28 06:32:46 2017 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='test2 or 192.168.18.70', MASTER_PORT=3307, MASTER_LOG_FILE='mybinlog.000003', MASTER_LOG_POS=847, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Sep 28 06:32:46 2017 - [info] Executing master IP activate script:
Thu Sep 28 06:32:46 2017 - [info]   /etc/masterha/master_ip_failover --command=start --ssh_user=root --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --new_master_host=test2 --new_master_ip=192.168.18.70 --new_master_port=3307 --new_master_user='mha' --new_master_password='123456'  
Set read_only=0 on the new master.
Thu Sep 28 06:32:46 2017 - [info]  OK.
Thu Sep 28 06:32:46 2017 - [info] ** Finished master recovery successfully.
Thu Sep 28 06:32:46 2017 - [info] * Phase 3: Master Recovery Phase completed.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 4: Slaves Recovery Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] -- Slave diff file generation on host test1(192.168.18.60:3307) started, pid: 5914. Check tmp log /var/log/masterha/app1/test1_3307_20170928063245.log if it takes time..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Log messages from test1 ...
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Thu Sep 28 06:32:46 2017 - [info] End of log messages from test1.
Thu Sep 28 06:32:46 2017 - [info] -- test1(192.168.18.60:3307) has the latest relay log events.
Thu Sep 28 06:32:46 2017 - [info] Generating relay diff files from the latest slave succeeded.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] -- Slave recovery on host test1(192.168.18.60:3307) started, pid: 5916. Check tmp log /var/log/masterha/app1/test1_3307_20170928063245.log if it takes time..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Log messages from test1 ...
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Starting recovery on test1(192.168.18.60:3307)..
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. Waiting all logs to be applied.. 
Thu Sep 28 06:32:46 2017 - [info]   done.
Thu Sep 28 06:32:46 2017 - [info]  All relay logs were successfully applied.
Thu Sep 28 06:32:46 2017 - [info]  Resetting slave test1(192.168.18.60:3307) and starting replication from the new master test2(192.168.18.70:3307)..
Thu Sep 28 06:32:46 2017 - [info]  Executed CHANGE MASTER.
Thu Sep 28 06:32:46 2017 - [info]  Slave started.
Thu Sep 28 06:32:46 2017 - [info] End of log messages from test1.
Thu Sep 28 06:32:46 2017 - [info] -- Slave recovery on host test1(192.168.18.60:3307) succeeded.
Thu Sep 28 06:32:46 2017 - [info] All new slave servers recovered successfully.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 5: New master cleanup phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Resetting slave info on the new master..
Thu Sep 28 06:32:47 2017 - [info]  test2: Resetting slave info succeeded.
Thu Sep 28 06:32:47 2017 - [info] Master failover to test2(192.168.18.70:3307) completed successfully.
Thu Sep 28 06:32:47 2017 - [info] 

----- Failover Report -----

app1: MySQL Master failover test(192.168.18.50:3307) to test2(192.168.18.70:3307) succeeded

Master test(192.168.18.50:3307) is down!

Check MHA Manager logs at test1:/var/log/masterha/app1/app1.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on test(192.168.18.50:3307)
The latest slave test1(192.168.18.60:3307) has all relay logs for recovery.
Selected test2(192.168.18.70:3307) as a new master.
test2(192.168.18.70:3307): OK: Applying all logs succeeded.
test2(192.168.18.70:3307): OK: Activated master IP address.
test1(192.168.18.60:3307): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
test1(192.168.18.60:3307): OK: Applying all logs succeeded. Slave started, replicating from test2(192.168.18.70:3307)
test2(192.168.18.70:3307): Resetting slave info succeeded.
Master failover to test2(192.168.18.70:3307) completed successfully.
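
The manager log above records the exact `CHANGE MASTER TO` statement that the remaining slaves used, and the same statement is what a repaired old master would need in order to rejoin as a slave. The following is a minimal sketch, not part of MHA itself, that pulls this statement out of a manager log. The function name `extract_change_master` is hypothetical, and it assumes the log line keeps the `Statement should be: CHANGE MASTER TO ...;` wording shown above.

```shell
#!/bin/sh
# Read an MHA manager log on stdin and print the last CHANGE MASTER TO
# statement it recorded.
# MHA writes MASTER_HOST='<hostname> or <ip>'; the second sed keeps only the IP.
# Assumption: the log line format matches the one in the log above.
extract_change_master() {
    sed -n 's/.*Statement should be: \(CHANGE MASTER TO .*;\)[[:space:]]*$/\1/p' \
        | tail -n 1 \
        | sed "s/MASTER_HOST='[^']* or \([^']*\)'/MASTER_HOST='\1'/"
}
```

A possible use is `extract_change_master < /var/log/masterha/app1/app1.log`. MHA masks the replication password as `xxx` in the log, so the real password still has to be filled in by hand before the statement is run on the old master.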

# Check whether the MHA manager is running
[[email protected] masterha]# masterha_check_status --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf 
app1 (pid:11183) is running(0:PING_OK), master:test
# Stop the MHA manager
[[email protected] masterha]# masterha_stop --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
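
Because an online switch must not run while the manager is still monitoring, it can help to confirm the stopped state from the status line before going further. The sketch below parses one line of `masterha_check_status` output. The function names are hypothetical, and the stopped form `app1 is stopped(2:NOT_RUNNING).` is an assumption about the tool's output, since only the running form appears in this article.

```shell
#!/bin/sh
# Print the status name (for example PING_OK) found in one line of
# masterha_check_status output read from stdin.
mha_status_name() {
    sed -n 's/.*(\([0-9][0-9]*\):\([A-Z_][A-Z_]*\)).*/\2/p' | head -n 1
}

# Succeed only when the status line explicitly reports NOT_RUNNING.
# Unparseable input is treated as "not confirmed stopped".
mha_manager_stopped() {
    [ "$(mha_status_name)" = "NOT_RUNNING" ]
}
```

For example, `masterha_check_status --conf=/etc/masterha/app1.conf | mha_manager_stopped && echo "safe to switch"` prints the message only once the manager is reported as stopped.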


# Manual online switch; masterha_manager must be stopped first
[[email protected] masterha]# masterha_master_switch --master_state=alive --global_conf=masterha_default.conf --conf=app1.conf 
Thu Sep 28 09:45:57 2017 - [info] MHA::MasterRotate version 0.56.
Thu Sep 28 09:45:57 2017 - [info] Starting online master switch..
Thu Sep 28 09:45:57 2017 - [info] 
Thu Sep 28 09:45:57 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Sep 28 09:45:57 2017 - [info] 
Thu Sep 28 09:45:57 2017 - [info] Reading default configuration from masterha_default.conf..
Thu Sep 28 09:45:57 2017 - [info] Reading application default configuration from app1.conf..
Thu Sep 28 09:45:57 2017 - [info] Reading server configuration from app1.conf..
Thu Sep 28 09:45:57 2017 - [info] GTID failover mode = 0
Thu Sep 28 09:45:57 2017 - [info] Current Alive Master: test(192.168.18.50:3307)
Thu Sep 28 09:45:57 2017 - [info] Alive Slaves:
Thu Sep 28 09:45:57 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:57 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:57 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 09:45:57 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:57 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:57 2017 - [info]     Primary candidate for the new Master (candidate_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on test(192.168.18.50:3307)? (YES/no): y
Thu Sep 28 09:45:59 2017 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu Sep 28 09:45:59 2017 - [info]  ok.
Thu Sep 28 09:45:59 2017 - [info] Checking MHA is not monitoring or doing failover..
Thu Sep 28 09:45:59 2017 - [info] Checking replication health on test1..
Thu Sep 28 09:45:59 2017 - [info]  ok.
Thu Sep 28 09:45:59 2017 - [info] Checking replication health on test2..
Thu Sep 28 09:45:59 2017 - [info]  ok.
Thu Sep 28 09:45:59 2017 - [info] Searching new master from slaves..
Thu Sep 28 09:45:59 2017 - [info]  Candidate masters from the configuration file:
Thu Sep 28 09:45:59 2017 - [info]   test(192.168.18.50:3307)  Version=5.6.37-log log-bin:enabled
Thu Sep 28 09:45:59 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:59 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:59 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 09:45:59 2017 - [info]  Non-candidate masters:
Thu Sep 28 09:45:59 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:59 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:59 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 09:45:59 2017 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Thu Sep 28 09:45:59 2017 - [info] 
From:
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

To:
test2(192.168.18.70:3307) (new master)
 +--test1(192.168.18.60:3307)

Starting master switch from test(192.168.18.50:3307) to test2(192.168.18.70:3307)? (yes/NO): y
Thu Sep 28 09:46:00 2017 - [info] Checking whether test2(192.168.18.70:3307) is ok for the new master..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] * Phase 2: Rejecting updates Phase..
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] Executing master ip online change script to disable write on the current master:
Thu Sep 28 09:46:00 2017 - [info]   /etc/masterha/master_ip_online_change --command=stop --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --orig_master_user='mha' --orig_master_password='123456' --new_master_host=test2 --new_master_ip=192.168.18.70 --new_master_port=3307 --new_master_user='mha' --new_master_password='123456' --orig_master_ssh_user=root --new_master_ssh_user=root  
Thu Sep 28 09:46:00 2017 120078 Set read_only on the new master.. ok.
Thu Sep 28 09:46:00 2017 124819 drop vip 192.168.18.100..
Thu Sep 28 09:46:00 2017 210095 Set read_only=1 on the orig master.. ok.
Thu Sep 28 09:46:00 2017 211292 Killing all application threads..
Thu Sep 28 09:46:00 2017 211318 done.
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu Sep 28 09:46:00 2017 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] Orig master binlog:pos is mybinlog.000001:514.
Thu Sep 28 09:46:00 2017 - [info]  Waiting to execute all relay logs on test2(192.168.18.70:3307)..
Thu Sep 28 09:46:00 2017 - [info]  master_pos_wait(mybinlog.000001:514) completed on test2(192.168.18.70:3307). Executed 0 events.
Thu Sep 28 09:46:00 2017 - [info]   done.
Thu Sep 28 09:46:00 2017 - [info] Getting new master's binlog name and position..
Thu Sep 28 09:46:00 2017 - [info]  mybinlog.000006:120
Thu Sep 28 09:46:00 2017 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='test2 or 192.168.18.70', MASTER_PORT=3307, MASTER_LOG_FILE='mybinlog.000006', MASTER_LOG_POS=120, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Sep 28 09:46:00 2017 - [info] Executing master ip online change script to allow write on the new master:
Thu Sep 28 09:46:00 2017 - [info]   /etc/masterha/master_ip_online_change --command=start --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --orig_master_user='mha' --orig_master_password='123456' --new_master_host=test2 --new_master_ip=192.168.18.70 --new_master_port=3307 --new_master_user='mha' --new_master_password='123456' --orig_master_ssh_user=root --new_master_ssh_user=root  
Thu Sep 28 09:46:00 2017 410159 Set read_only=0 on the new master.
Thu Sep 28 09:46:00 2017 410846Add vip 192.168.18.100 on eth1..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] * Switching slaves in parallel..
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] -- Slave switch on host test1(192.168.18.60:3307) started, pid: 12970
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] Log messages from test1 ...
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info]  Waiting to execute all relay logs on test1(192.168.18.60:3307)..
Thu Sep 28 09:46:00 2017 - [info]  master_pos_wait(mybinlog.000001:514) completed on test1(192.168.18.60:3307). Executed 0 events.
Thu Sep 28 09:46:00 2017 - [info]   done.
Thu Sep 28 09:46:00 2017 - [info]  Resetting slave test1(192.168.18.60:3307) and starting replication from the new master test2(192.168.18.70:3307)..
Thu Sep 28 09:46:00 2017 - [info]  Executed CHANGE MASTER.
Thu Sep 28 09:46:00 2017 - [info]  Slave started.
Thu Sep 28 09:46:00 2017 - [info] End of log messages from test1 ...
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] -- Slave switch on host test1(192.168.18.60:3307) succeeded.
Thu Sep 28 09:46:00 2017 - [info] Unlocking all tables on the orig master:
Thu Sep 28 09:46:00 2017 - [info] Executing UNLOCK TABLES..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] All new slave servers switched successfully.
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] * Phase 5: New master cleanup phase..
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:02 2017 - [info]  test2: Resetting slave info succeeded.
Thu Sep 28 09:46:02 2017 - [info] Switching master to test2(192.168.18.70:3307) completed successfully.
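
Both logs in this section end with a `... completed successfully.` line naming the new master. As a rough sketch (the function name `new_master_from_log` is hypothetical and the match relies on the exact wording of the two log lines shown here), a script could read the result of a failover or an online switch like this:

```shell
#!/bin/sh
# Read an MHA log on stdin and print the new master, e.g.
# "test2(192.168.18.70:3307)", taken from the final success line.
# Prints nothing if no success line is present.
new_master_from_log() {
    sed -nE 's/.*(Master failover|Switching master) to ([^ ]+) completed successfully\..*/\2/p' \
        | tail -n 1
}
```

An empty result means the log contains no success line, in which case the log should be read by hand rather than assuming the switch worked.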