MySQL + Keepalived Dual-Master Hot Standby for High Availability: A Detailed Walkthrough

Introduction to MySQL + Keepalived dual-master hot standby:

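What is usually called dual-machine hot standby means that both machines are running, but only one of them serves requests at any given time. When the serving machine fails, the other takes over automatically and immediately, and the switchover is very short. MySQL dual-master replication, in which the two servers are master and slave of each other (with only one master accepting writes), gives you a hot standby of the database server, but by itself it cannot switch over dynamically when a master goes down. Keepalived adds a virtual IP that gives the two masters a single external endpoint, together with automatic health checking and failover, which yields a high-availability setup for MySQL. MySQL master-slave and master-master replication were covered earlier; this article walks through deploying the MySQL + Keepalived dual-master hot standby.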

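As its name suggests, Keepalived keeps a service alive, that is, online on the network. This is what is meant by high availability or hot standby: it prevents a single point of failure (a failure at one point that makes the whole system architecture unavailable). Any discussion of Keepalived has to mention the VRRP protocol, which is the foundation Keepalived is built on.
1) Keepalived works on the basis of VRRP (Virtual Router Redundancy Protocol). VRRP has two important pairs of concepts: VRRP routers and virtual routers, and master routers and backup routers.
2) A VRRP router is a router that runs VRRP and is a physical entity; a virtual router is created by the VRRP protocol and is a logical concept. A group of VRRP routers work together to form one virtual router.
VRRP has an election mechanism that selects the router that provides service, the master router; the others become backup routers. When the master router fails, a new master is elected from among the backup routers and continues the work, so that service is not interrupted.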
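
The election described above can be pictured with a small sketch. It is a deliberate simplification, not Keepalived's implementation: the function name elect_master and the "name:priority" argument format are made up for illustration, and real VRRP also applies tie-breaking and timer rules that are ignored here.

```shell
#!/bin/bash
# elect_master: print the name of the node with the highest priority.
# Arguments are hypothetical "name:priority" pairs; only nodes that are
# still alive are passed in.
elect_master() {
    local best_name="" best_prio=-1 pair name prio
    for pair in "$@"; do
        name=${pair%%:*}    # text before the colon
        prio=${pair##*:}    # text after the colon
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

elect_master master1:101 master2:99   # both alive: prints master1
elect_master master2:99               # master1 failed: prints master2
```

The priorities 101 and 99 are the ones used later in this article's keepalived.conf files.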

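Lab deployment

Requirements:

1) First set up master-master replication between the two nodes (each is both master and slave of the other). Master-master replication synchronizes data in both directions; master-slave replication synchronizes in one direction only. Normally, when the master goes down, connections have to be switched to the slave by hand (with Keepalived the switch is automatic).
2) Then add Keepalived, so that a VIP provides a single external endpoint for the two MySQL masters. Clients connect to the database through the VIP; when one machine goes down the VIP floats to the other, and clients barely notice, which gives high availability.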

Lab environment

Role      IP               OS and services
master1   192.168.24.128   CentOS 7, MySQL, Keepalived
master2   192.168.24.130   CentOS 7, MySQL, Keepalived
VIP       192.168.24.188

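Note: make sure the firewall and SELinux are disabled.

Install MySQL on both master1 and master2.

MySQL installation is documented elsewhere on this blog and is not repeated here.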

Deploying MySQL master-master replication

On master1:

Add the following to the [mysqld] section of my.cnf:

[root@linfan ~]# vim /etc/my.cnf
[mysqld]
basedir = /usr/local/mysql
datadir = /opt/data
socket = /tmp/mysql.sock
port = 3306
pid-file = /opt/data/mysql.pid
user = mysql
skip-name-resolve
# Add the following lines
server-id = 1
log-bin = mysql-bin
sync_binlog = 1
binlog_checksum = none
binlog_format = mixed
auto-increment-increment = 2
auto-increment-offset = 1
slave-skip-errors = all         

Restart the MySQL service:

[root@linfan ~]# service mysqld restart
Shutting down MySQL.. SUCCESS!
Starting MySQL.. SUCCESS! 

Grant replication privileges, so that the I/O thread can connect to the master as this user and read its binary log.

mysql>  grant replication slave,replication client on *.* to 'doudou'@'192.168.24.%' identified by "123456";
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> flush privileges;
// reload the privilege tables
Query OK, 0 rows affected (0.00 sec)

mysql>  flush tables with read lock;
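// Lock the tables read-only to keep the data consistent, and unlock once master-master replication is in place.
// While the lock is held, no table can be written to. The lock belongs to this client session, so it is released when the session ends or MySQL is restarted.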
Query OK, 0 rows affected (0.00 sec)

mysql> show master status;
// note the binary log file name and position
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      612 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

On master2:

Add the following to the [mysqld] section of my.cnf:

[root@linfan ~]# vim /etc/my.cnf
[mysqld]
basedir = /usr/local/mysql
datadir = /opt/data
socket = /tmp/mysql.sock
port = 3306
pid-file = /opt/data/mysql.pid
user = mysql
skip-name-resolve
# Add the following lines
server-id = 2
log-bin = mysql-bin
sync_binlog = 1
binlog_checksum = none
binlog_format = mixed
auto-increment-increment = 2
auto-increment-offset = 2
slave-skip-errors = all    
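
The two configurations differ in server-id and auto-increment-offset. With auto-increment-increment = 2 on both nodes, offsets 1 and 2 make the masters hand out interleaved AUTO_INCREMENT values, so concurrent inserts on both sides do not collide. A minimal sketch of the resulting sequences (the helper name auto_inc_ids is hypothetical; it only reproduces the arithmetic offset + n * increment and does not talk to MySQL):

```shell
#!/bin/bash
# auto_inc_ids OFFSET INCREMENT COUNT
# Print the first COUNT values of the sequence OFFSET, OFFSET+INCREMENT, ...
auto_inc_ids() {
    local offset=$1 increment=$2 count=$3 i out=""
    for ((i = 0; i < count; i++)); do
        out+="$((offset + i * increment)) "
    done
    echo "${out% }"   # drop the trailing space
}

auto_inc_ids 1 2 4   # master1 (offset 1): 1 3 5 7
auto_inc_ids 2 2 4   # master2 (offset 2): 2 4 6 8
```

This only protects columns declared AUTO_INCREMENT; ids that the application supplies explicitly can still conflict between the two masters.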

Restart the MySQL service:

[root@linfan ~]# service mysqld restart
Shutting down MySQL.. SUCCESS!
Starting MySQL... SUCCESS! 

Grant replication privileges, so that the I/O thread can connect to the master as this user and read its binary log.

mysql>  grant replication slave,replication client on *.* to 'doudou'@'192.168.24.%' identified by "123456";
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>  flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql>  flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000004 |      150 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

Set up replication on master1:

mysql> unlock tables;     // unlock first, then replicate the peer's data into this database
mysql> stop slave;
mysql> change  master to master_host='192.168.24.130',master_user='doudou',master_password='123456',master_log_file='mysql-bin.000004',master_log_pos=150;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

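Check the replication status. Two "Yes" values (Slave_IO_Running and Slave_SQL_Running) indicate that replication is working. (The `;` after `\G` is redundant; it is what produces the harmless "ERROR: No query specified" at the end of the output.)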
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.24.130
                  Master_User: doudou
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 150
               Relay_Log_File: linfan-relay-bin.000002
                Relay_Log_Pos: 312
        Relay_Master_Log_File: mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 150
              Relay_Log_Space: 512
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
                  Master_UUID: dc702f48-b7b9-11e8-9caa-000c298fc02c
             Master_Info_File: /opt/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified
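
Instead of reading the status listing by eye, the two "Yes" fields can be checked mechanically. The following is a sketch under stated assumptions: replication_ok is a hypothetical helper name, and it only parses text in the `show slave status\G` layout shown above, read from standard input.

```shell
#!/bin/bash
# replication_ok: read `SHOW SLAVE STATUS\G` output on stdin and succeed
# (exit status 0) only if both replication threads report "Yes".
replication_ok() {
    awk -F': ' '
        { sub(/[ \t\r]+$/, "") }                 # strip trailing whitespace
        /^ *Slave_IO_Running:/  { io  = $2 }
        /^ *Slave_SQL_Running:/ { sql = $2 }     # does not match Slave_SQL_Running_State
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example use (needs a reachable MySQL server, so it is left commented out):
# mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication OK"
```

Because the function returns an exit status rather than printing text, it can be dropped into a cron job or a monitoring hook.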

Set up replication on master2:

mysql> unlock tables;     // unlock first, then replicate the peer's data into this database
mysql> stop slave;
mysql> change  master to master_host='192.168.24.128',master_user='doudou',master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=612;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

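Check the replication status; again, two "Yes" values indicate success. On master2 the master fields should point at master1 (192.168.24.128, server id 1).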
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.24.128
                  Master_User: doudou
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 150
               Relay_Log_File: linfan-relay-bin.000002
                Relay_Log_Pos: 312
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 150
              Relay_Log_Space: 512
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: dc702f48-b7b9-11e8-9caa-000c298fc02c
             Master_Info_File: /opt/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

PS: an error may occur at this point:

Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

and Slave_IO_Running suddenly changes to No.

To fix it:

First, on the slave, run

stop slave;

Then check the master's status:

mysql> show master status\G;
*************************** 1. row ***************************
             File: mysql-bin.000113
         Position: 276925387
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 
1 row in set (0.00 sec)

ERROR: 
No query specified

mysql> flush logs;
Query OK, 0 rows affected (0.11 sec)

Flushing the logs (flush logs;) rotates the binary log, so the file number increases by one.

For example, File: mysql-bin.000113 above becomes File: mysql-bin.000114.

Check the master status again:

mysql> show master status\G;
*************************** 1. row ***************************
             File: mysql-bin.000114
         Position: 120
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 
1 row in set (0.00 sec)

ERROR: 
No query specified

The master needs no further changes. Switch to the slave and run:

CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000114',MASTER_LOG_POS=120;

Then run start slave;

Check the slave status:

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 101.200.*.*
                  Master_User: backup
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000114
          Read_Master_Log_Pos: 11314
               Relay_Log_File: mysql-relay.000002
                Relay_Log_Pos: 11477
        Relay_Master_Log_File: mysql-bin.000114
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 11314
              Relay_Log_Space: 11646
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 21
                  Master_UUID: e4a43da7-5b58-11e5-a12f-00163e003632
             Master_Info_File: /home/data/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

ERROR: 
No query specified
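
The manual step of copying File and Position from the master into a CHANGE MASTER TO statement is easy to get wrong, so it can be generated instead. This is only a sketch: change_master_sql is a hypothetical helper that parses text in the `show master status\G` layout shown above and prints the statement; it does not execute anything.

```shell
#!/bin/bash
# change_master_sql: read `SHOW MASTER STATUS\G` output on stdin and print
# the matching CHANGE MASTER TO statement. Fails if either field is missing.
change_master_sql() {
    awk -F': *' -v q="'" '
        { sub(/[ \t\r]+$/, "") }          # strip trailing whitespace
        /^ *File:/     { file = $2 }
        /^ *Position:/ { pos  = $2 }
        END {
            if (file == "" || pos == "") exit 1
            printf "CHANGE MASTER TO MASTER_LOG_FILE=%s%s%s, MASTER_LOG_POS=%s;\n", q, file, q, pos
        }
    '
}

# Example use (run against the master, then paste the result on the slave):
# mysql -uroot -p -e 'SHOW MASTER STATUS\G' | change_master_sql
```

Run `stop slave;` before applying the generated statement and `start slave;` after it, as in the steps above.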

Verifying master-master replication

1) Write data on master1:

mysql> create database tom;
Query OK, 1 row affected (0.01 sec)

mysql> use tom;
Database changed

mysql> create table mary(id int,name varchar(100) not null,age tinyint);
Query OK, 0 rows affected (0.06 sec)

mysql> insert mary values(1,"lisi",10),(2,"zhangshan",28),(3,"wangwu",18);
Query OK, 3 rows affected (0.11 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from mary;
+------+-----------+------+
| id   | name      | age  |
+------+-----------+------+
|    1 | lisi      |   10 |
|    2 | zhangshan |   28 |
|    3 | wangwu    |   18 |
+------+-----------+------+
3 rows in set (0.00 sec)

Then check on master2: the data has been replicated.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| tom                |
+--------------------+
5 rows in set (0.01 sec)

mysql> use tom;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------+
| Tables_in_tom |
+---------------+
| mary          |
+---------------+
1 row in set (0.00 sec)

mysql> select * from mary;
+------+-----------+------+
| id   | name      | age  |
+------+-----------+------+
|    1 | lisi      |   10 |
|    2 | zhangshan |   28 |
|    3 | wangwu    |   18 |
+------+-----------+------+
3 rows in set (0.00 sec)

2) Write new data on master2:


mysql> insert mary values(4,"zhaosi",66),(5,"lida",88);
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from mary;
+------+-----------+------+
| id   | name      | age  |
+------+-----------+------+
|    1 | lisi      |   10 |
|    2 | zhangshan |   28 |
|    3 | wangwu    |   18 |
|    4 | zhaosi    |   66 |
|    5 | lida      |   88 |
+------+-----------+------+
5 rows in set (0.00 sec)

Then check on master1: the data has been replicated there as well.

mysql>  select * from mary;
+------+-----------+------+
| id   | name      | age  |
+------+-----------+------+
|    1 | lisi      |   10 |
|    2 | zhangshan |   28 |
|    3 | wangwu    |   18 |
|    4 | zhaosi    |   66 |
|    5 | lida      |   88 |
+------+-----------+------+
5 rows in set (0.00 sec)

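At this point MySQL master-master replication is in place.

Configuring MySQL + Keepalived failover for high availability

1) Install Keepalived and register it as a system service. Do the following on both master1 and master2: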

[root@linfan ~]# yum install -y openssl-devel
[root@linfan ~]# cd /usr/src/
[root@linfan src]# wget http://www.keepalived.org/software/keepalived-1.3.5.tar.gz   
[root@linfan src]# tar -xf keepalived-1.3.5.tar.gz
[root@linfan src]# cd keepalived-1.3.5
[root@linfan keepalived-1.3.5]#  ./configure --prefix=/usr/local/keepalived 
// warnings appear at this step and can be ignored
[root@linfan keepalived-1.3.5]# make && make install 
[root@linfan keepalived-1.3.5]# cp /usr/src/keepalived-1.3.5/keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
[root@linfan keepalived-1.3.5]# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@linfan keepalived-1.3.5]#  mkdir /etc/keepalived/
[root@linfan keepalived-1.3.5]# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
[root@linfan keepalived-1.3.5]# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
[root@linfan keepalived-1.3.5]# echo "/etc/init.d/keepalived start" >> /etc/rc.local

2) keepalived.conf on master1. (The configuration below does not use the LVS load-balancing feature, so no virtual_server block is needed.)

[root@linfan ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
[root@linfan ~]# vim /etc/keepalived/keepalived.conf  // clear the file and add the following
! Configuration File for keepalived

global_defs {
notification_email {
[email protected]
[email protected]
}

notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id MASTER-HA
}

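vrrp_script chk_mysql_port {     # check that MySQL is running; this can be done in many ways (process check, script, ...)
    script "/opt/chk_mysql.sh"   # here a script does the check
    interval 2                   # run the script every 2 seconds
    weight -5                    # priority change driven by the script result: a failed check (non-zero exit) lowers the priority by 5
    fall 2                       # two consecutive failures count as a real failure; weight then lowers the priority (valid range 1-255)
    rise 1                       # one success counts as healthy again, and the weight deduction is lifted
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0      # network interface that carries the virtual IP
    mcast_src_ip 192.168.24.128
    virtual_router_id 51    # virtual router ID; must be identical on MASTER and BACKUP
    # Priority: the larger the number, the higher the priority. Within one vrrp_instance
    # the MASTER priority must be higher than the BACKUP priority, so that the MASTER
    # can take the VIP back after it recovers from a failure.
    priority 101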
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.24.188
    }

track_script {
   chk_mysql_port
}
}
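
To see how priority and weight interact, the arithmetic can be written out. This is a sketch with a hypothetical helper name (effective_priority), covering only the case of a negative weight as configured above, and it assumes Keepalived keeps running while the check fails (the first check script in this article stops Keepalived instead, in which case the weight never comes into play).

```shell
#!/bin/bash
# effective_priority BASE WEIGHT CHECK_OK
# BASE is the configured priority, WEIGHT the (negative) vrrp_script weight,
# CHECK_OK is 1 while the tracked script succeeds and 0 once it has failed.
effective_priority() {
    local base=$1 weight=$2 check_ok=$3
    if [ "$check_ok" -eq 1 ]; then
        echo "$base"
    else
        echo $((base + weight))
    fi
}

effective_priority 101 -5 1   # master1 healthy: 101
effective_priority 101 -5 0   # master1 check failing: 96, below master2's 99
```

With these numbers a failed check drops master1 from 101 to 96, which is lower than master2's priority of 99, so master2 would win the election.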

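3) Write the failover script. Keepalived does the heartbeat detection: if MySQL on the master dies (port 3306 stops listening), the script stops Keepalived on that node. Keepalived on the other node notices the missing heartbeat and takes over requests to the VIP.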

vim /opt/chk_mysql.sh
#!/bin/bash
counter=$(netstat -na|grep "LISTEN"|grep "3306"|wc -l)
if [ "${counter}" -eq 0 ]; then
    /etc/init.d/keepalived stop
fi  
[root@linfan ~]#  chmod 755 /opt/chk_mysql.sh
[root@linfan ~]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):  Job for keepalived.service failed because a timeout was exceeded. See "systemctl status keepalived.service" and "journalctl -xe" for details.
                                                           [FAILED]
// Startup failed. The cause turned out to be the PID file path, so edit the unit file:
vim /lib/systemd/system/keepalived.service
[Unit]
Description=LVS and VRRP High Availability Monitor
After=syslog.target network-online.target

[Service]
Type=forking
# change this line as shown
PIDFile=/var/run/keepalived.pid
KillMode=process
EnvironmentFile=-/usr/local/keepalived/etc/sysconfig/keepalived
ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
[root@linfan ~]# systemctl daemon-reload   // reload systemd so that it picks up new or changed units
Start Keepalived again:
[root@linfan ~]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):                       [  OK  ]
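
One caveat about the port test in chk_mysql.sh: grep "3306" also matches any other number containing those digits, such as port 33060, so the check can report MySQL as up when only a different listener is present. A stricter variant compares the port field exactly. The sketch below is an assumption-laden illustration: count_listeners is a hypothetical helper, and it expects input in the column layout of `netstat -lnt` (local address in column 4, state in column 6).

```shell
#!/bin/bash
# count_listeners PORT: read `netstat -lnt` style lines on stdin and print
# how many sockets are in LISTEN state on exactly PORT.
count_listeners() {
    local port=$1
    awk -v port="$port" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")      # works for 0.0.0.0:3306 and :::3306
            if (parts[n] == port) count++
        }
        END { print count + 0 }
    '
}

# Example use inside a check script (left commented out):
# counter=$(netstat -lnt | count_listeners 3306)
```

The rest of the check script can stay as it is; only the line that computes counter changes.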

4) keepalived.conf on master2. Compared with master1, only three settings differ: state is BACKUP, priority is 99, and mcast_src_ip is the local IP. nopreempt is not set.

[root@linfan ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
[root@linfan ~]# vim /etc/keepalived/keepalived.conf // clear the file and add the following
! Configuration File for keepalived

global_defs {
notification_email {
[email protected]
[email protected]
}

notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id MASTER-HA
}

vrrp_script chk_mysql_port {
    script "/opt/chk_mysql.sh"
    interval 2
    weight -5
    fall 2
    rise 1
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    mcast_src_ip 192.168.24.130    # local IP
    virtual_router_id 51
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.24.188    # VIP
    }

track_script {
   chk_mysql_port
}
}
vim /opt/chk_mysql.sh // write the check script
#!/bin/bash
counter=$(netstat -na|grep "LISTEN"|grep "3306"|wc -l)
if [ "${counter}" -eq 0 ]; then
    /etc/init.d/keepalived stop
fi  
// To avoid the startup failure seen on master1, fix the unit file in advance
vim /lib/systemd/system/keepalived.service
[Unit]
Description=LVS and VRRP High Availability Monitor
After=syslog.target network-online.target

[Service]
Type=forking
# change this line as shown
PIDFile=/var/run/keepalived.pid
KillMode=process
EnvironmentFile=-/usr/local/keepalived/etc/sysconfig/keepalived
ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
[root@linfan ~]# systemctl daemon-reload   // reload systemd so that it picks up new or changed units
Start Keepalived:
[root@linfan ~]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):                       [  OK  ]

5) On both master1 and master2, allow the root user to log in remotely, so that a client can connect for the tests below.

mysql> grant all on *.* to 'root'@'%' identified by "123456";
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)

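Testing MySQL + Keepalived failover

1) Connect with a MySQL client through the VIP and check that the connection succeeds.
For example, connect from a remote test machine; the VIP address accepts connections normally (the connection privilege must have been granted on the servers beforehand).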

[root@linfan ~]# mysql -h192.168.24.188 -uroot -p123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.22-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>  select * from tom.mary;
+------+-----------+------+
| id   | name      | age  |
+------+-----------+------+
|    1 | lisi      |   10 |
|    2 | zhangshan |   28 |
|    3 | wangwu    |   18 |
|    4 | zhaosi    |   66 |
|    5 | lida      |   88 |
+------+-----------+------+
5 rows in set (0.00 sec)

2) By default the VIP is on master1. Use the "ip addr" command to watch the VIP move.

[root@linfan ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:23:40:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.128/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.188/32 scope global eth0
       valid_lft forever preferred_lft forever  // this VIP address with its /32 mask shows that the resource is currently on master1
    inet 192.168.24.146/24 brd 192.168.24.255 scope global secondary dynamic eth0
       valid_lft 1115sec preferred_lft 1115sec
    inet6 fe80::20c:29ff:fe23:40f6/64 scope link
       valid_lft forever preferred_lft forever
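
Looking for the /32 address in the listing can also be scripted, which is handy when the failover tests below are repeated. A minimal sketch, assuming `ip addr` output in the format shown above; has_vip is a hypothetical helper name.

```shell
#!/bin/bash
# has_vip VIP: read `ip addr` output on stdin and succeed (exit status 0)
# if an "inet VIP/<mask>" line is present.
has_vip() {
    local vip=$1
    grep -qF "inet ${vip}/"
}

# Example use on either node (left commented out):
# ip addr show eth0 | has_vip 192.168.24.188 && echo "VIP is on this node"
```

The trailing slash in the pattern keeps 192.168.24.188 from matching longer addresses that merely start with the same digits.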

Stop MySQL on master1. Per the check script, once MySQL is down Keepalived is stopped as well, so the VIP moves to master2. (While MySQL is not running, Keepalived cannot stay up either.)

[root@linfan ~]# service mysqld stop
Shutting down MySQL............ SUCCESS!
[root@linfan ~]# ps -ef|grep mysql
root      10652   2175  0 03:04 pts/1    00:00:00 grep --color=auto mysql
[root@linfan ~]# ps -ef|grep keepalived
root      10654   2175  0 03:04 pts/1    00:00:00 grep --color=auto keepalived
[root@linfan ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:23:40:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.128/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.146/24 brd 192.168.24.255 scope global secondary dynamic eth0
       valid_lft 998sec preferred_lft 998sec
    inet6 fe80::20c:29ff:fe23:40f6/64 scope link
       valid_lft forever preferred_lft forever

As the output shows, the VIP with the /32 mask is gone, so the VIP is no longer on master1.
The system log on master1 confirms that the VIP has been released:

[root@linfan ~]# tail -f /var/log/messages
Sep 14 03:03:54 linfan systemd: Stopping LVS and VRRP High Availability Monitor...
Sep 14 03:03:54 linfan Keepalived_vrrp[6871]: VRRP_Instance(VI_1) sent 0 priority
Sep 14 03:03:54 linfan Keepalived_vrrp[6871]: VRRP_Instance(VI_1) removing protocol VIPs.
Sep 14 03:03:54 linfan Keepalived_healthcheckers[6869]: Stopped
Sep 14 03:03:55 linfan Keepalived_vrrp[6871]: Stopped
Sep 14 03:03:55 linfan Keepalived[6868]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Sep 14 03:03:55 linfan systemd: Stopped LVS and VRRP High Availability Monitor.
Sep 14 03:04:55 linfan dhclient[3177]: DHCPREQUEST on eth0 to 192.168.24.254 port 67 (xid=0x7f91b51f)
Sep 14 03:04:55 linfan dhclient[3177]: DHCPACK from 192.168.24.254 (xid=0x7f91b51f)
Sep 14 03:04:57 linfan dhclient[3177]: bound to 192.168.24.146 -- renewal in 829 seconds.

On master2, the VIP has indeed moved over:

[root@linfan ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:8f:c0:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.130/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.188/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe8f:c02c/64 scope link
       valid_lft forever preferred_lft forever

Check the system log on master2:

[root@linfan ~]#  tail -f /var/log/messages
Sep 14 03:12:19 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:19 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:19 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:19 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:24 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:24 linfan Keepalived_vrrp[6710]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.24.188
Sep 14 03:12:24 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:24 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:24 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:12:24 linfan Keepalived_vrrp[6710]: Sending gratuitous ARP on eth0 for 192.168.24.188

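3) Start MySQL and Keepalived on master1 again. (Note: if MySQL is restarted with restart, Keepalived has to be started afterwards too, because the check script stops Keepalived while MySQL is down during the restart.)
Note: always start MySQL first and Keepalived second. If Keepalived is started first, then with the configuration above it is shut down automatically because MySQL is not up yet.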

[root@linfan ~]# service mysqld start
Starting MySQL.. SUCCESS!
[root@linfan ~]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):                       [  OK  ]

After both services are started, wait a moment and the VIP moves back from master2.

[root@linfan ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:23:40:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.128/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.188/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.146/24 brd 192.168.24.255 scope global secondary dynamic eth0
       valid_lft 1587sec preferred_lft 1587sec
    inet6 fe80::20c:29ff:fe23:40f6/64 scope link
       valid_lft forever preferred_lft forever
[root@linfan ~]# tail -f /var/log/messages
Sep 14 03:08:26 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:26 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:26 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:26 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:31 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:31 linfan Keepalived_vrrp[11028]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.24.188
Sep 14 03:08:31 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:31 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:31 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188
Sep 14 03:08:31 linfan Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.24.188

On master2, the VIP has been taken back by the recovered master1:

[root@linfan ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:8f:c0:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.130/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe8f:c02c/64 scope link
       valid_lft forever preferred_lft forever
[root@linfan ~]#  tail -f /var/log/messages
Sep 14 03:08:25 linfan Keepalived_vrrp[6710]: VRRP_Instance(VI_1) Received advert with higher priority 101, ours 99
Sep 14 03:08:25 linfan Keepalived_vrrp[6710]: VRRP_Instance(VI_1) Entering BACKUP STATE
Sep 14 03:08:25 linfan Keepalived_vrrp[6710]: VRRP_Instance(VI_1) removing protocol VIPs.

4) Likewise, stopping Keepalived on master1 moves the VIP to master2 automatically. When Keepalived on master1 is started again, it takes the VIP back.

[root@linfan ~]# /etc/init.d/keepalived stop
Stopping keepalived (via systemctl):                       [  OK  ]
[root@linfan ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:23:40:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.128/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.146/24 brd 192.168.24.255 scope global secondary dynamic eth0
       valid_lft 1351sec preferred_lft 1351sec
    inet6 fe80::20c:29ff:fe23:40f6/64 scope link
       valid_lft forever preferred_lft forever

On master2, the VIP has moved over:

[root@linfan ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:8f:c0:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.130/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.188/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe8f:c02c/64 scope link
       valid_lft forever preferred_lft forever

Start Keepalived on master1 again, and the VIP quickly moves back.

[root@linfan ~]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):                       [  OK  ]
[root@linfan ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:23:40:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.128/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.188/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.24.146/24 brd 192.168.24.255 scope global secondary dynamic eth0
       valid_lft 1190sec preferred_lft 1190sec
    inet6 fe80::20c:29ff:fe23:40f6/64 scope link
       valid_lft forever preferred_lft forever

Check master2 once more: the VIP has been taken away.

[root@linfan ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:8f:c0:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.24.130/24 brd 192.168.24.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe8f:c02c/64 scope link
       valid_lft forever preferred_lft forever

Throughout these VIP switches, clients that connect to MySQL through the VIP are almost unaffected.

A note on Keepalived's preemptive and non-preemptive modes

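Keepalived runs as a daemon on Linux hosts, is based on the VRRP protocol, and performs health checks according to its configuration file.
VRRP is an election protocol: it dynamically assigns responsibility for a virtual router to one of the VRRP routers on a LAN.
The VRRP router that controls the virtual router's IP addresses is called the master router; it forwards the packets sent to those virtual IP addresses.
When the master router becomes unavailable, the election process provides dynamic failover, which allows the virtual router's IP address to be used as the default first-hop router by end hosts.

Keepalived elects master and backup over multicast or unicast (configurable). It works in either preemptive or non-preemptive mode, controlled by the nopreempt parameter.
1) Preemptive mode:
While the master works normally, the virtual IP is on the master and the backup does not serve requests. When the master's priority drops below the backup's, the backup takes over the virtual IP automatically; from then on the backup serves requests and the master does not.
In other words, in preemptive mode there is no fixed master or backup role; only the priority matters.

With the configuration above, it does not matter whether state in keepalived.conf is MASTER or BACKUP; only the higher priority counts (normally the node with state MASTER is given the higher priority).
The node with the higher priority takes the VIP back automatically once it recovers from a failure.

2) Non-preemptive mode:
This mode is enabled with the nopreempt parameter (usually placed on the line below advert_int). Regardless of priority, the VIP moves to the BACKUP as soon as the MASTER machine fails.
When the MASTER machine recovers it does not take the VIP back; the VIP returns only when the BACKUP machine itself fails.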

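Important:
nopreempt only takes effect when state is BACKUP, so to get non-preemptive mode set state to BACKUP on both the master and the backup node.

In other words:
a) When one node has state MASTER and the other BACKUP, adding nopreempt makes no difference. Priority decides who holds the VIP; this is preemptive mode.
b) When both nodes have state BACKUP and nopreempt is not configured, priority again decides who holds the VIP; this is also preemptive mode.
c) When both nodes have state BACKUP and nopreempt is configured, priority is no longer considered; this is non-preemptive mode. The other machine takes over the VIP only when the machine currently holding it fails. Even after the higher-priority machine recovers it does not take the VIP back; the VIP returns only when the other side fails.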
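
The three cases reduce to one decision made by a node that has just recovered: does it take the VIP from the current holder? The sketch below is a simplified model of that decision, not Keepalived code; takes_over_vip and its arguments are hypothetical, and the NOPREEMPT flag stands for "state BACKUP on both nodes plus nopreempt".

```shell
#!/bin/bash
# takes_over_vip MY_PRIORITY HOLDER_PRIORITY NOPREEMPT
# Print "yes" if a recovered node would take the VIP from the node that
# currently holds it, "no" otherwise. NOPREEMPT is 1 or 0.
takes_over_vip() {
    local mine=$1 holder=$2 nopreempt=$3
    if [ "$nopreempt" -eq 1 ]; then
        echo no                      # non-preemptive: wait until the holder fails
    elif [ "$mine" -gt "$holder" ]; then
        echo yes                     # preemptive: higher priority wins
    else
        echo no
    fi
}

takes_over_vip 101 99 0   # cases a) and b): yes
takes_over_vip 101 99 1   # case c): no
```

Non-preemptive mode avoids a second VIP move (and a second round of client reconnects) when the failed node comes back.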

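Improving the MySQL check script

The MySQL check script above is rather crude: as soon as it sees that MySQL on the master is down, it stops Keepalived to force the VIP to move.

The script below is an improved version. When it detects that MySQL on the master is down, the VIP moves to the backup (but Keepalived on the master is not killed).
When MySQL on the master recovers, the VIP moves back again.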

[root@linfan keepalived-1.3.5]# vim /opt/chk_mysql.sh 
#!/bin/bash
MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=mysql
MYSQL_PASSWORD=linfan123
CHECK_TIME=3    # number of attempts before MySQL is declared down

# MYSQL_OK is 1 while MySQL is working and 0 when it is down
MYSQL_OK=1

# Run a trivial statement; success means the server accepts connections and queries
function check_mysql_health() {
    if "$MYSQL" -h "$MYSQL_HOST" -u "$MYSQL_USER" -p"${MYSQL_PASSWORD}" -e "show status;" >/dev/null 2>&1; then
        MYSQL_OK=1
    else
        MYSQL_OK=0
    fi
}

while [ "$CHECK_TIME" -ne 0 ]
do
    CHECK_TIME=$((CHECK_TIME - 1))
    check_mysql_health
    if [ "$MYSQL_OK" -eq 1 ]; then
        exit 0    # healthy: keepalived keeps the configured priority
    fi
    if [ "$CHECK_TIME" -eq 0 ]; then
        # All attempts failed. Exit non-zero so that the vrrp_script "weight -5"
        # lowers this node's priority; keepalived itself keeps running.
        exit 1
    fi
    sleep 1
done
