18.1-18.5 Cluster overview; configuring a high-availability cluster with keepalived
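By function, clusters fall into two broad categories: high availability and load balancing.
A high-availability cluster usually consists of two servers with the same function and role: one serves traffic while the other stands by as a redundant spare. When the active machine goes down, the standby takes over and keeps providing the service.
Open-source software for high availability: heartbeat (not recommended: failover communication is slow and the project has seen little maintenance since around 2010) and keepalived (recommended: it offers both high availability and load balancing).
A load-balancing cluster needs one server acting as the dispatcher, which distributes user requests to the back-end servers. Apart from the dispatcher, every machine in the cluster is a server that actually handles user requests, and there must be at least 2 of them.
Principle: incoming requests are spread across several back-end servers. When a single server can no longer meet user demand, capacity is expanded by adding more servers to handle the requests.
Open-source software for load balancing: LVS, keepalived, haproxy, nginx. Commercial products: F5, Netscaler.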
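The dispatching idea described above can be sketched in a few lines of shell. This is only an illustrative round-robin picker, not how LVS, haproxy or nginx are implemented; the function name pick_backend is made up for the example, and the two back-end addresses are simply the lab machines used later in these notes.

```shell
#!/bin/bash
# Illustrative round-robin dispatcher (hypothetical helper, not part of any tool).
# Usage: pick_backend REQUEST_NUMBER BACKEND...
pick_backend() {
    local request_no=$1
    shift
    local backends=("$@")
    # Request i goes to backend number (i mod number_of_backends).
    echo "${backends[$(( request_no % ${#backends[@]} ))]}"
}

# Spread five requests over two assumed back-end servers.
for i in 0 1 2 3 4; do
    echo "request $i -> $(pick_backend "$i" 192.168.189.128 192.168.189.129)"
done
```

The loop alternates between the two addresses. Real load balancers offer other strategies as well (weights, least connections, hashing), which this sketch leaves out.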
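18.2 Introduction to keepalived
Here we use keepalived to build the high-availability cluster, because heartbeat has some problems on CentOS 6 that affect the experiment.
keepalived implements high availability through VRRP (Virtual Router Redundancy Protocol).
The protocol groups several routers with the same function together. Each group has 1 master role and N (N>=1) backup roles.
The master sends VRRP packets to all backups by multicast. When a backup stops receiving VRRP packets from the master, it assumes the master is down. At that point the priorities of the backups decide which one becomes the new master.
Keepalived has three main modules: core, check and vrrp. The core module is the heart of keepalived; it starts and maintains the main process and loads and parses the global configuration file. The check module performs health checks, and the vrrp module implements the VRRP protocol.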
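The election rule described above (the surviving backup with the highest priority becomes the new master) can be modelled as a small shell function. This is a simplified sketch, not keepalived's code: the function name elect_master and the node names are invented, and a tie is broken here by name, whereas real VRRP compares IP addresses.

```shell
#!/bin/bash
# Simplified model of a VRRP election: among the backups that are still
# alive, the one with the highest priority becomes the new master.
# Input on stdin: one "<name> <priority>" pair per line (made-up values).
elect_master() {
    # Sort by priority (field 2) descending, then by name, and keep the winner.
    sort -k2,2nr -k1,1 | head -n 1 | cut -d' ' -f1
}

printf '%s\n' 'backup-a 90' 'backup-b 80' 'backup-c 95' | elect_master
```

With the sample input the function prints backup-c, the node with priority 95.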
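18.3 Configuring a high-availability cluster with keepalived (part 1)
Configuring high availability with keepalived
Preparation: two Linux machines
Prepare two machines, 128 and 129 (192.168.189.128 and 192.168.189.129); 128 acts as the master and 129 as the backup
1 Run yum install -y keepalived on both machines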
[root@centos7-01 src]# yum install -y keepalived
[root@centos7-02 conf]# yum install -y keepalived
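2 Install nginx on both machines (it can also be reused later for load balancing)
In production, many companies use Nginx as their load balancer. That makes it critical: if it goes down, the whole site becomes unreachable. A standby Nginx is therefore needed, and keepalived suits this scenario very well.
3 Edit the keepalived configuration file on 128; the content comes from https://coding.net/u/aminglinux/p/aminglinux-book/git/blob/master/D21Z/master_keepalived.conf
First clear the existing configuration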
[root@centos7-01 src]# > /etc/keepalived/keepalived.conf
Then add the following configuration
[root@centos7-01 src]# vim /etc/keepalived/keepalived.conf
global_defs {
    notification_email {
    }
    notification_email_from [email protected]
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh"
    interval 3
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass aminglinux>com
    }
    virtual_ipaddress {
        192.168.189.100
    }
    track_script {
        chk_nginx
    }
}
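Parameter explanation (the // notes are annotations for the reader, not part of the file):
global_defs {
    notification_email {
        [email protected]    // recipients of alert emails
    }
    notification_email_from [email protected]    // sender address (not actually used here)
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
The block above holds general, user-defined settings.
The next block checks whether the service is healthy:
vrrp_script chk_nginx {    // chk_nginx is a user-chosen name; it is referenced again below
    script "/usr/local/sbin/check_ng.sh"    // custom script that monitors the nginx service; remember the path, it is created in a later step
    interval 3    // run the script every 3 seconds
}
vrrp_instance VI_1 {
    state MASTER    // role is master; use BACKUP on the standby; the value must be uppercase
    interface ens33    // network interface that carries the VIP
    virtual_router_id 51    // ID of the virtual router
    priority 100    // priority 100; the master's value must be higher than the backup's
    advert_int 1
    authentication {    // authentication settings
        auth_type PASS    // authentication method; PASS means password
        auth_pass aminglinux>com    // the password, chosen freely
    }
    virtual_ipaddress {    // VIP settings
        192.168.189.100    // the VIP: a shared address through which clients reach the backup when the master is down
    }
    track_script {
        chk_nginx    // monitoring script to track; must match the name given after vrrp_script above
    }
}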
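About the VIP: the name stands for Virtual IP, and it is also called a floating IP, because keepalived is what configures this address on the server.
The servers provide their service to the outside through the VIP. When the master goes down, the VIP is assigned to the backup, so users do not notice the change.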
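Since the VIP moves between machines, it helps to have a quick way of asking "does this node hold the VIP right now?". The helper below is a sketch for these notes: has_vip is an invented name, and it only inspects text in the format printed by the iproute2 ip command, which is assumed to be installed.

```shell
#!/bin/bash
# has_vip VIP: read `ip addr` style output on stdin and succeed (exit 0)
# if VIP is configured on any interface. Hypothetical helper for these notes.
has_vip() {
    local vip=$1
    # Fixed-string match on "inet <VIP>/", so dots are not treated as wildcards.
    grep -qF "inet ${vip}/"
}

# Example on a real host (interface name and address taken from these notes):
#   ip -o -4 addr show dev ens33 | has_vip 192.168.189.100 && echo "VIP is on this node"
```

Running the commented example on the master and on the backup shows at a glance which of the two currently owns the address.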
4 Create the monitoring script on 128; the content comes from https://coding.net/u/aminglinux/p/aminglinux-book/git/blob/master/D21Z/master_check_ng.sh
#vim /usr/local/sbin/check_ng.sh
#!/bin/bash
# Timestamp used in the log entry
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are no nginx processes, start nginx and count again.
# If the count is still 0, nginx cannot be started, so keepalived must be stopped.
if [ "$n" -eq 0 ]; then
    /etc/init.d/nginx start
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi
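The script above has three possible outcomes, which are easy to lose track of inside the nested if. The sketch below spells them out as a pure function that takes the two process counts as arguments; check_outcome is an invented name used only for illustration, and it does not touch nginx or keepalived.

```shell
#!/bin/bash
# check_outcome BEFORE AFTER: describe what the monitoring script does, given
# the nginx process count before and after the restart attempt.
# (Hypothetical helper, only meant to illustrate the script's logic.)
check_outcome() {
    local before=$1 after=$2
    if [ "$before" -gt 0 ]; then
        echo "nginx running: nothing to do"
    elif [ "$after" -gt 0 ]; then
        echo "nginx was down: restarted, VIP stays on this node"
    else
        echo "nginx cannot start: stop keepalived, VIP moves to the backup"
    fi
}

check_outcome 3 3
check_outcome 0 3
check_outcome 0 0
```

Only the last case gives up the VIP: stopping keepalived is what lets the backup take over.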
5 Give the script 755 permissions
[root@centos7-01 src]# chmod 755 /usr/local/sbin/check_ng.sh
6 Start the keepalived service on 128, then check the keepalived and nginx processes
[root@centos7-01 src]# systemctl start keepalived
[root@centos7-01 src]# ps aux |grep keepalived
root 2370 0.0 0.1 118608 1384 ? Ss 16:12 0:00 /usr/sbin/keepalived -D
root 2371 0.0 0.3 127468 3292 ? S 16:12 0:00 /usr/sbin/keepalived -D
root 2372 0.1 0.2 127408 2828 ? S 16:12 0:00 /usr/sbin/keepalived -D
root 2414 0.0 0.0 112676 984 pts/0 R+ 16:12 0:00 grep --color=auto keepalived
[root@centos7-01 src]# ps aux |grep nginx
root 1153 0.0 0.0 24944 864 ? Ss 14:52 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 1156 0.0 0.3 27388 3696 ? S 14:52 0:00 nginx: worker process
nobody 1157 0.0 0.3 27388 3444 ? S 14:52 0:00 nginx: worker process
root 2434 0.0 0.0 112676 984 pts/0 R+ 16:12 0:00 grep --color=auto nginx
7 Stop the nginx service first; it is brought back up whenever it is needed
[root@centos7-01 src]# /etc/init.d/nginx stop
Stopping nginx (via systemctl): [ OK ]
Although nginx was stopped, it is running again, because the keepalived check script started it
[root@centos7-01 src]# ps aux |grep nginx
root 2799 0.0 0.0 24944 872 ? Ss 16:15 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 2803 0.0 0.3 27388 3452 ? S 16:15 0:00 nginx: worker process
nobody 2804 0.0 0.3 27388 3452 ? S 16:15 0:00 nginx: worker process
root 2812 0.0 0.0 112676 980 pts/0 R+ 16:15 0:00 grep --color=auto nginx
18.4 Configuring a high-availability cluster with keepalived (part 2)
keepalived writes its log to /var/log/messages
1 View the VIP with ip add (ifconfig does not show the VIP)
[root@centos7-01 src]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:15:53:53 brd ff:ff:ff:ff:ff:ff
inet 192.168.189.128/24 brd 192.168.189.255 scope global ens33
valid_lft forever preferred_lft forever
inet 192.168.189.100/32 scope global ens33    // this is the VIP
valid_lft forever preferred_lft forever
inet 192.168.189.150/24 brd 192.168.189.255 scope global secondary ens33:0
valid_lft forever preferred_lft forever
inet6 fe80::243c:86d7:d85e:224d/64 scope link
valid_lft forever preferred_lft forever
3: ens37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:15:53:5d brd ff:ff:ff:ff:ff:ff
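2 Before configuring the backup, make sure that the firewalls (iptables, firewalld) and SELinux are turned off on both the master and the backup.
[root@centos7-01 src]# iptables -nvL
If rules are still listed or cannot be flushed, stop firewalld.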
[root@centos7-01 src]# getenforce
Disabled
[root@centos7-01 src]# systemctl stop firewalld.service
[root@centos7-02 conf]# systemctl stop firewalld.service
[root@centos7-02 conf]# setenforce 0
[root@centos7-02 conf]# getenforce
Permissive
[root@centos7-02 conf]# iptables -nvL
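The commands above only last until the next reboot: setenforce 0 changes the running mode, and systemctl stop does not keep a service from starting at boot. For a lab setup that should survive a reboot, something like the sketch below could be used. The function name set_selinux_mode is invented, and the sketch assumes the usual layout of /etc/selinux/config with a line starting with SELINUX=.

```shell
#!/bin/bash
# set_selinux_mode FILE MODE: rewrite the SELINUX= line of an SELinux config
# file (hypothetical helper; on a real host FILE would be /etc/selinux/config).
set_selinux_mode() {
    local file=$1 mode=$2
    # Only the SELINUX= line is touched; SELINUXTYPE= does not match the pattern.
    sed "s/^SELINUX=.*/SELINUX=${mode}/" "$file" > "${file}.new" &&
        mv "${file}.new" "$file"
}

# Example on a real lab host (run as root, takes effect after a reboot):
#   set_selinux_mode /etc/selinux/config disabled
#   systemctl disable firewalld
```

Turning both off is acceptable for this experiment; on production machines it is usually better to allow VRRP and HTTP traffic explicitly instead.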
3 Edit the configuration file on 129; the content comes from https://coding.net/u/aminglinux/p/aminglinux-book/git/blob/master/D21Z/backup_keepalived.conf
[root@centos7-02 conf]# > /etc/keepalived/keepalived.conf
[root@centos7-02 conf]# vim !$
vim /etc/keepalived/keepalived.conf
global_defs {
    notification_email {
    }
    notification_email_from [email protected]
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh"
    interval 3
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass aminglinux>com
    }
    virtual_ipaddress {
        192.168.189.100
    }
    track_script {
        chk_nginx
    }
}
4 Create the monitoring script on 129; the content comes from https://coding.net/u/aminglinux/p/aminglinux-book/git/blob/master/D21Z/backup_check_ng.sh
[root@centos7-02 conf]# vim /usr/local/sbin/check_ng.sh
[root@centos7-02 conf]# !vim
vim /usr/local/sbin/check_ng.sh
#!/bin/bash
# Timestamp used in the log entry
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are no nginx processes, start nginx and count again.
# If the count is still 0, nginx cannot be started, so keepalived must be stopped.
if [ "$n" -eq 0 ]; then
    /etc/init.d/nginx start
    # If nginx was installed with yum, use "systemctl start nginx" instead
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi
5 Give the script 755 permissions
[root@centos7-02 conf]# chmod 755 !$
chmod 755 /usr/local/sbin/check_ng.sh
6 Start the service on 129 as well: systemctl start keepalived
[root@centos7-02 conf]# systemctl start keepalived
[root@centos7-02 conf]# ps aux |grep keepalived
root 8027 0.0 0.1 118652 1400 ? Ss 16:51 0:00 /usr/sbin/keepalived -D
root 8028 0.0 0.3 127516 3296 ? S 16:51 0:00 /usr/sbin/keepalived -D
root 8029 0.0 0.2 127456 2848 ? S 16:51 0:00 /usr/sbin/keepalived -D
root 8070 0.0 0.0 112720 972 pts/0 S+ 16:51 0:00 grep --color=auto keepalived
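7 Tell the master's nginx and the backup's nginx apart
Open 128's page in a browser: 192.168.189.128
Open 129's page in a browser: 192.168.189.129
Open the VIP (192.168.189.100) in a browser: the request is answered by the master (192.168.189.128)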
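18.5 Configuring a high-availability cluster with keepalived (part 3)
Testing high availability
1 Test 1: stop the nginx service on the master
Less than 3 seconds after being stopped, nginx is started again, because keepalived's check script kicks in.
2 Test 2: add an iptables rule on the master
that drops outgoing VRRP packets
[root@centos7-01 111.com]# iptables -I OUTPUT -p vrrp -j DROP
Check the iptables rules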
[root@centos7-01 111.com]# iptables -nvL
Chain INPUT (policy ACCEPT 43 packets, 3932 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 27 packets, 3268 bytes)
pkts bytes target prot opt in out source destination
42 1680 DROP 112 -- * * 0.0.0.0/0 0.0.0.0/0
Check the log on the master; VRRP-related error messages appear frequently
[root@centos7-01 111.com]# tail /var/log/messages
May 21 17:33:09 centos7-01 NetworkManager[642]: <info> [1526895189.1381] device (ens37): Activation: starting connection '有線連接 1' (d44e77b3-03bc-3209-8d77-782475a5a763)
May 21 17:33:09 centos7-01 NetworkManager[642]: <info> [1526895189.1383] device (ens37): state change: disconnected -> prepare (reason 'none') [30 40 0]
May 21 17:33:09 centos7-01 NetworkManager[642]: <info> [1526895189.1386] device (ens37): state change: prepare -> config (reason 'none') [40 50 0]
May 21 17:33:09 centos7-01 NetworkManager[642]: <info> [1526895189.1431] device (ens37): state change: config -> ip-config (reason 'none') [50 70 0]
May 21 17:33:09 centos7-01 NetworkManager[642]: <info> [1526895189.1435] dhcp4 (ens37): activation: beginning transaction (timeout in 45 seconds)
May 21 17:33:09 centos7-01 NetworkManager[642]: <info> [1526895189.1559] dhcp4 (ens37): dhclient started with pid 12636
May 21 17:33:09 centos7-01 dhclient[12636]: DHCPDISCOVER on ens37 to 255.255.255.255 port 67 interval 8 (xid=0x5f29b3dd)
May 21 17:33:17 centos7-01 dhclient[12636]: DHCPDISCOVER on ens37 to 255.255.255.255 port 67 interval 8 (xid=0x5f29b3dd)
May 21 17:33:25 centos7-01 dhclient[12636]: DHCPDISCOVER on ens37 to 255.255.255.255 port 67 interval 14 (xid=0x5f29b3dd)
May 21 17:33:39 centos7-01 dhclient[12636]: DHCPDISCOVER on ens37 to 255.255.255.255 port 67 interval 16 (xid=0x5f29b3dd)
Check the log on the backup, where the exchange with the other machine (the master) can be seen
[root@centos7-02 conf]# tail -10 /var/log/messages
May 21 17:34:12 centos7-02 dhclient[11315]: DHCPDISCOVER on ens37 to 255.255.255.255 port 67 interval 15 (xid=0x20d5f4a4)
May 21 17:34:27 centos7-02 dhclient[11315]: DHCPDISCOVER on ens37 to 255.255.255.255 port 67 interval 14 (xid=0x20d5f4a4)
May 21 17:34:36 centos7-02 NetworkManager[537]: <warn> [1526895276.4276] dhcp4 (ens37): request timed out
May 21 17:34:36 centos7-02 NetworkManager[537]: <info> [1526895276.4277] dhcp4 (ens37): state changed unknown -> timeout
May 21 17:34:36 centos7-02 NetworkManager[537]: <info> [1526895276.4295] dhcp4 (ens37): canceled DHCP transaction, DHCP client pid 11315
May 21 17:34:36 centos7-02 NetworkManager[537]: <info> [1526895276.4296] dhcp4 (ens37): state changed timeout -> done
May 21 17:34:36 centos7-02 NetworkManager[537]: <info> [1526895276.4297] device (ens37): state change: ip-config -> failed (reason 'ip-config-unavailable') [70 120 5]
May 21 17:34:36 centos7-02 NetworkManager[537]: <info> [1526895276.4299] policy: disabling autoconnect for connection '有線連接 1'.
May 21 17:34:36 centos7-02 NetworkManager[537]: <warn> [1526895276.4300] device (ens37): Activation: failed for connection '有線連接 1'
May 21 17:34:36 centos7-02 NetworkManager[537]: <info> [1526895276.4302] device (ens37): state change: failed -> disconnected (reason 'none') [120 30 0]
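Visiting the VIP in a browser still works without problems.
The experiment shows that blocking VRRP packets on the master alone does not achieve a clean switchover: keepalived is still running on the master, so the master keeps the VIP.
To restore VRRP, simply flush the rules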
[root@centos7-01 111.com]# iptables -F
[root@centos7-01 111.com]# iptables -nvL
Chain INPUT (policy ACCEPT 19 packets, 1332 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 17 packets, 1400 bytes)
pkts bytes target prot opt in out source destination
3 Test 3: stop the keepalived service on the master
The most brute-force test is to stop the keepalived service on the master
[root@centos7-01 111.com]# systemctl stop keepalived
On the backup, run ip add to see whether the VIP has moved over to it
[root@centos7-02 conf]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:73:7c:4c brd ff:ff:ff:ff:ff:ff
inet 192.168.189.129/24 brd 192.168.189.255 scope global ens33
valid_lft forever preferred_lft forever
inet 192.168.189.100/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::b485:96d0:c537:251e/64 scope link
valid_lft forever preferred_lft forever
3: ens37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:73:7c:56 brd ff:ff:ff:ff:ff:ff
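The VIP 192.168.189.100 has moved to the backup
The backup's log also shows the VIP being added
4 Visit the VIP in a browser; it now shows the backup's default home page
This proves that the VIP is now on the backup machine.
5 To recover, start the keepalived service on the master again and repeat the test.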
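Besides ip add, the failover can be confirmed from the log. The filter below is a sketch: vrrp_transitions is an invented name, and the pattern assumes that state changes are logged by a process tagged Keepalived_vrrp with wording such as "Entering MASTER STATE", which matches the keepalived 1.x messages seen on CentOS 7 but may differ in other versions.

```shell
#!/bin/bash
# vrrp_transitions: keep only keepalived VRRP state-change lines from syslog
# text on stdin. The pattern is an assumption about the message wording
# ("Keepalived_vrrp ... MASTER|BACKUP|FAULT STATE"); adjust it if needed.
vrrp_transitions() {
    grep -E 'Keepalived_vrrp.*(MASTER|BACKUP|FAULT) STATE'
}

# Example on a real host:
#   vrrp_transitions < /var/log/messages | tail -n 5
```

Running the commented example on both machines right after stopping or starting keepalived on the master should show which node entered which state, and when.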