Shared storage: drbd in detail, and highly available shared storage with pacemaker (Part 1)

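drbd overview
    Distributed Replicated Block Device (DRBD) is a software-based, shared-nothing, replicated storage solution that mirrors block devices (hard disks, partitions, logical volumes and so on) between servers.
    DRBD runs inside the kernel, much like a driver module. It sits between the file system's buffer cache and the disk scheduler: writes are sent over TCP/IP to the peer host, where the peer's DRBD instance receives them and stores them on its own local disk. The effect is similar to RAID-1 over the network.
    In a high-availability (HA) setup, DRBD can take the place of a shared disk array. Data on the local (primary) node and the remote (standby) node is kept synchronized in real time, so when the local system fails, the remote host still holds an identical copy of the data and service can continue.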

The DRBD architecture is shown in the figure below:

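Installing drbd
Prerequisites:
1) This setup uses two test nodes, node1.samlee.com and node2.samlee.com, with IP addresses 172.16.100.6 and 172.16.100.7 respectively;
2) node1 and node2 each provide a partition of the same size to back the drbd device; here it is /dev/sda3 on both nodes, 5G in size;
3) The system is CentOS 6.5 on x86_64;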

1. Preparation
1) Host name resolution must work for every node, and each node's host name must match the output of "uname -n". To guarantee this, /etc/hosts on both nodes should contain the following:

# vim /etc/hosts
172.16.100.6   node1.samlee.com node1
172.16.100.7   node2.samlee.com node2

To keep these host names across reboots, also run commands like the following on each node:
On Node1:

# sed -i 's@\(HOSTNAME=\).*@\1node1.samlee.com@g'  /etc/sysconfig/network
# hostname node1.samlee.com

On Node2:

# sed -i 's@\(HOSTNAME=\).*@\1node2.samlee.com@g' /etc/sysconfig/network
# hostname node2.samlee.com
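
The requirement above (the name printed by "uname -n" must resolve to the node's address through /etc/hosts) can be checked with a small helper. This is only a sketch: the function name hosts_entry_ok is made up for this illustration, and the host names and addresses are the ones used in this article.

```shell
# Succeed when the hosts file maps the given host name to the expected IP.
# Arguments: <hosts-file> <hostname> <expected-ip>
# (hosts_entry_ok is a hypothetical helper, not part of any package.)
hosts_entry_ok() {
    awk -v host="$2" -v ip="$3" '
        $1 !~ /^#/ {
            # Field 1 is the address, the remaining fields are names/aliases.
            for (i = 2; i <= NF; i++)
                if ($i == host && $1 == ip) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example, run on node1:
# hosts_entry_ok /etc/hosts "$(uname -n)" 172.16.100.6 && echo "hosts entry OK"
```

Running the same check on node2 with 172.16.100.7 covers the other half of the pair.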

2) Set up key-based ssh communication between the two nodes, which can be done with the following commands:
On Node1:

# ssh-keygen -t rsa -P ''
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
# ssh node2.samlee.com 'date';date

On Node2:

# ssh-keygen -t rsa -P ''
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
# ssh node1.samlee.com 'date';date
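
The "ssh ... 'date';date" commands above compare the two clocks by eye. A rough scripted version could compare epoch seconds instead. This is a sketch under assumptions: skew_ok is a hypothetical helper, the 2-second tolerance is arbitrary, and the ssh round trip itself adds a little to the measured difference.

```shell
# Succeed when two epoch timestamps differ by at most <max-seconds>.
# Arguments: <local-epoch> <remote-epoch> <max-seconds>
# (skew_ok is a hypothetical helper name.)
skew_ok() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))          # absolute value of the difference
    fi
    [ "$d" -le "$3" ]
}

# Example, run on node1:
# skew_ok "$(date +%s)" "$(ssh node2.samlee.com 'date +%s')" 2 && echo "clocks agree"
```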

3) Synchronize the clock automatically every 5 minutes (configure on both node1 and node2)

# crontab -e
*/5 * * * * /usr/sbin/ntpdate 172.16.100.10 &> /dev/null

4) Disable SELinux (configure on both node1 and node2)

# setenforce 0
# vim /etc/selinux/config
SELINUX=disabled

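2. About the packages
    drbd consists of two parts: a kernel module and user-space management tools. The drbd kernel module has been part of the mainline Linux kernel since 2.6.33, so if your kernel is at least that version you only need to install the management tools; otherwise you need to install both the kernel module and the management tools, and the versions of the two packages must match.

    The drbd versions currently available for CentOS 5 are mainly 8.0, 8.2 and 8.3; their rpm packages are named drbd, drbd82 and drbd83, and the corresponding kernel module packages are kmod-drbd, kmod-drbd82 and kmod-drbd83. For CentOS 6 the version is 8.4, packaged as drbd and drbd-kmdl. When choosing packages, keep two points in mind: the drbd and drbd-kmdl versions must match, and the drbd-kmdl version must match the kernel version of the running system. Features and configuration differ slightly between versions. Our test platform is x86_64 running CentOS 6.5, so both the kernel module and the management tools have to be installed. Here we use the latest 8.4 release (drbd-8.4.3-33.el6.x86_64.rpm and drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm), available from ftp://rpmfind.net/linux/atrpms/; download what you need.

    In practice you need to download the package versions that fit your own platform; download links for each version are not provided here.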
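
The two matching rules above can be sketched in shell. The helper names needs_drbd_kmod and kmdl_rpm_name are made up for this illustration; the first applies the 2.6.33 rule stated above to the upstream version number only, and the second assumes the drbd-kmdl file naming pattern seen in the package used in this article. It relies on GNU "sort -V".

```shell
# Print "yes" when the kernel release predates 2.6.33 and therefore needs a
# separate DRBD kernel module package, otherwise "no".
# (Both function names are hypothetical helpers.)
needs_drbd_kmod() {
    ver=${1%%-*}                       # "2.6.32-431.el6.x86_64" -> "2.6.32"
    oldest=$(printf '%s\n' "2.6.33" "$ver" | sort -V | head -n 1)
    if [ "$oldest" = "2.6.33" ]; then echo no; else echo yes; fi
}

# Build the drbd-kmdl RPM file name for a kernel release and a DRBD build.
# Arguments: <kernel-release> <drbd-version-release>
kmdl_rpm_name() {
    arch=${1##*.}                      # text after the last dot, e.g. "x86_64"
    kernel=${1%.*}                     # release without the arch suffix
    echo "drbd-kmdl-${kernel}-${2}.${arch}.rpm"
}

# Example:
# needs_drbd_kmod "$(uname -r)"
# kmdl_rpm_name "$(uname -r)" 8.4.3-33.el6
```

A distribution kernel may still lack the module even when its version number is high enough, so checking with "modinfo drbd" on the target system remains the safer test.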

3. Installing the packages
Once downloaded, install them directly:

# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm

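4. Configuring drbd
The main drbd configuration file is /etc/drbd.conf. For easier management, the configuration is usually split into several files kept in the /etc/drbd.d directory, and the main file only uses "include" directives to pull these fragments together. Typically, /etc/drbd.d contains global_common.conf and any number of files ending in .res. global_common.conf mainly defines the global and common sections, while each .res file defines one resource.

The global section may appear only once in the configuration. If all configuration is kept in a single file rather than split up, the global section must come at the very beginning. The only parameters currently allowed in the global section are minor-count, dialog-refresh, disable-ip-verification and usage-count.

The common section defines parameters that every resource inherits by default; any parameter usable in a resource definition can also be set in common. The common section is not mandatory, but placing parameters shared by several resources there is recommended because it keeps the configuration simpler.

The resource section defines a drbd resource; each resource is usually defined in its own file ending in .res under /etc/drbd.d. A resource must be given a name, which may consist of non-whitespace ASCII characters. Each resource definition must contain at least two host subsections naming the nodes the resource is bound to; all other parameters can be inherited from the common section or from drbd's defaults and need not be defined.

The following steps are performed on node1.samlee.com.

(1) Configure /etc/drbd.d/global_common.conf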

# vim /etc/drbd.d/global_common.conf
global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}

common {
        protocol C;

        handlers {
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }

        startup {
                #wfc-timeout 120;
                #degr-wfc-timeout 120;
        }

        disk {
                on-io-error detach;
                #fencing resource-only;
        }

        net {
                cram-hmac-alg "sha1";
                shared-secret "mydrbdlab";
        }

        syncer {
                rate 1000M;
        }
}

(2) Create a 5G partition for the shared storage (the partition must be created on both node1 and node2).

# fdisk /dev/sda

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (7675-15665, default 7675): 
Using default value 7675
Last cylinder, +cylinders or +size{K,M,G} (7675-15665, default 15665): +5G

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

# Make the kernel re-read the partition table
# partx -a /dev/sda
# cat /proc/partitions 
major minor  #blocks  name

   8        0  125829120 sda
   8        1     204800 sda1
   8        2   61440000 sda2
   8        3    5248836 sda3

(3) Define the resource file /etc/drbd.d/mystore.res with the following content:

# vim /etc/drbd.d/mystore.res
resource mystore {
        on node1.samlee.com {
                device  /dev/drbd0;
                disk    /dev/sda3;
                address 172.16.100.6:7789;
                meta-disk internal;
        }
        on node2.samlee.com {
                device  /dev/drbd0;
                disk    /dev/sda3;
                address 172.16.100.7:7789;
                meta-disk internal;
        }
}

The files above must be identical on both nodes, so the files just configured can all be copied to the other node over ssh.

# scp /etc/drbd.d/* node2:/etc/drbd.d/
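
As noted earlier, each resource needs at least two host subsections. A quick sanity check of a .res file could look like the sketch below. The helper name res_host_count is hypothetical, and the pattern is a simple text match rather than a real parse of the drbd configuration syntax.

```shell
# Count the "on <host> {" subsections in a DRBD resource file.
# Argument: <resource-file>
# (res_host_count is a hypothetical helper; it matches text, it does not
# validate the configuration.)
res_host_count() {
    grep -c -E '^[[:space:]]*on[[:space:]]+[^[:space:]]+[[:space:]]*\{' "$1"
}

# Example:
# [ "$(res_host_count /etc/drbd.d/mystore.res)" -ge 2 ] && echo "host sections OK"
#
# To confirm both nodes hold identical files, the checksums can be compared:
# md5sum /etc/drbd.d/*; ssh node2 'md5sum /etc/drbd.d/*'
```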

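(4) Initialize the defined resource and start the service on both nodes:

1) Initialize the resource; run on both Node1 and Node2:

# drbdadm create-md mystore

2) Start the service; run on both Node1 and Node2:

# service drbd start

3) Check the startup status: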

# cat /proc/drbd 
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:5248636

The drbd-overview command can also be used:

# drbd-overview 
  0:mystore/0  Connected Secondary/Secondary Inconsistent/Inconsistent C r-----
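
For scripting, the cs (connection state), ro (roles) and ds (disk states) fields can be pulled out of /proc/drbd. The following is a minimal sketch that assumes the 8.4 output layout shown above; drbd_field is a hypothetical helper name.

```shell
# Print one status field (cs, ro or ds) for a DRBD minor, reading
# /proc/drbd-style text on standard input.
# Arguments: <field> [minor]   (minor defaults to 0)
# (drbd_field is a hypothetical helper.)
drbd_field() {
    awk -v key="$1" -v minor="${2:-0}" '
        $1 == (minor ":") {
            # Fields look like "cs:Connected"; print the part after "key:".
            for (i = 2; i <= NF; i++)
                if (index($i, key ":") == 1) {
                    print substr($i, length(key) + 2)
                    exit
                }
        }
    '
}

# Example:
# drbd_field ro < /proc/drbd
```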

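The output above shows that both nodes are currently in the Secondary state, so next one of them has to be made Primary. Run the following command on the node that is to become Primary: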

# drbdadm primary --force mystore

Note: the following command, run on the node that is to become Primary, can also be used to set the primary node:

# drbdadm -- --overwrite-data-of-peer primary mystore

Check the status again and you will see that data synchronization has started:

# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:3689532 nr:0 dw:0 dr:3694240 al:0 bm:225 lo:3 pe:3 ua:7 ap:0 ep:1 wo:f oos:1561212
    [=============>......] sync'ed: 70.3% (1524/5124)M
    finish: 0:00:42 speed: 36,400 (40,080) K/sec
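
The progress figure can be approximated from the oos (out-of-sync) counter. In the first status output oos was 5248636 while everything was still out of sync, so that value serves as the device size here. This is only a sketch: sync_percent is a hypothetical helper, and DRBD derives its own figure internally, so small differences are possible.

```shell
# Percentage already synchronized, given the out-of-sync amount and the
# device size in the same unit (KiB in /proc/drbd).
# Arguments: <oos> <total>
# (sync_percent is a hypothetical helper.)
sync_percent() {
    awk -v oos="$1" -v total="$2" \
        'BEGIN { printf "%.1f\n", 100 * (1 - oos / total) }'
}

# With the numbers from the outputs above:
# sync_percent 1561212 5248636
```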

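Once synchronization has finished, check the status again: both nodes are now up to date, and they have taken on the Primary and Secondary roles: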

# drbd-overview 
  0:mystore/0  Connected Primary/Secondary UpToDate/UpToDate C r----- 

### Primary/Secondary  -- the left side shows the role of the local host, the right side the role of the peer node

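(5) Create the file system
A file system can only be mounted on the Primary node, so the drbd device can only be formatted after a Primary has been set. The following is done on node1.samlee.com: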

# mke2fs -t ext4 /dev/drbd0 
# mkdir /mydata
# mount /dev/drbd0 /mydata/
# cd /mydata/
# vim node1.conf
Weblcome to node1.....

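(6) Switch the Primary and Secondary nodes
In the Primary/Secondary model, only one node can be Primary at any given time. To swap the roles of the two nodes, the current Primary must first be demoted to Secondary; only then can the former Secondary be promoted to Primary:

On Node1, demote to the standby role: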

# umount /mydata/
# drbdadm secondary mystore

Check the node status:

# drbd-overview 
  0:mystore/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----

On Node2, promote to the primary role:

# drbdadm primary mystore
# mkdir /mydata
# mount /dev/drbd0 /mydata/

Use the following commands to check that the file created earlier on the former primary node exists, then create a new file node2.conf:

# ls /mydata/
lost+found  node1.conf
# cat /mydata/node1.conf 
Weblcome to node1.....
# vim /mydata/node2.conf
Welcome to Node2....

Switch back to test the other direction:

On Node2, demote to the standby role:

# umount /mydata/
# drbdadm secondary mystore

Check the node status:

# drbd-overview 
  0:mystore/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----

On Node1, promote to the primary role:

# drbdadm primary mystore
# mount /dev/drbd0 /mydata/

Use the following command to check that the files created earlier on each primary node exist:

# ls /mydata/
lost+found  node1.conf  node2.conf
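
The manual switch-over above always follows the same order: unmount and demote on the old Primary, then promote and mount on the new one. It could be wrapped in two small functions, sketched below. The names run, drbd_demote and drbd_promote are hypothetical, and the DRY_RUN variable is an assumption of this sketch that lets the commands be printed instead of executed.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
# (run, drbd_demote and drbd_promote are hypothetical helper names.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Demote the local node: release the mount point first, then drop to Secondary.
# Arguments: <resource> <mount-point>
drbd_demote() {
    run umount "$2" && run drbdadm secondary "$1"
}

# Promote the local node, then mount the device.
# Arguments: <resource> <device> <mount-point>
drbd_promote() {
    run drbdadm primary "$1" && run mkdir -p "$3" && run mount "$2" "$3"
}

# Example: on the current Primary
#   drbd_demote mystore /mydata
# then on the other node
#   drbd_promote mystore /dev/drbd0 /mydata
```

Previewing with DRY_RUN=1 first is a cheap way to confirm the order of operations before touching a live device.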

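Pacemaker + drbd: highly available shared storage

Prerequisites:
1) This setup uses two test nodes, node1.samlee.com and node2.samlee.com, with IP addresses 172.16.100.6 and 172.16.100.7 respectively;
2) node1 and node2 have already been configured as a corosync-based cluster, and the Primary/Secondary drbd device /dev/drbd0 is already configured on both, with the resource name mystore. If your setup differs, adjust these details in the commands that follow to match your own configuration;

3) Demote all nodes to Secondary and stop the drbd service: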

# drbdadm secondary mystore
# drbd-overview 
  0:mystore/0  Connected Secondary/Secondary UpToDate/UpToDate C r----- 
# chkconfig drbd off
# service drbd stop

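4) The system is CentOS 6.5 on x86_64;

The procedure is as follows:

1. Inspect the current cluster configuration and make sure the global properties are set appropriately for a two-node cluster: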

# crm configure show
node node1.samlee.com
node node2.samlee.com
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-14.el6-368c726" \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes="2"
# crm configure property stonith-enabled=false
# crm configure property no-quorum-policy=ignore
# crm configure rsc_defaults resource-stickiness=100
# crm configure show
node node1.samlee.com
node node2.samlee.com
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-14.el6-368c726" \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
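
Whether the three settings above actually landed in the configuration can be checked by scanning the output of "crm configure show". This is a minimal sketch that assumes the key="value" output format shown above; missing_cluster_settings is a hypothetical helper name.

```shell
# Read "crm configure show" output on standard input and print every expected
# key="value" setting that is missing. Prints nothing when all are present.
# (missing_cluster_settings is a hypothetical helper.)
missing_cluster_settings() {
    cfg=$(cat)
    for want in 'stonith-enabled="false"' \
                'no-quorum-policy="ignore"' \
                'resource-stickiness="100"'; do
        printf '%s\n' "$cfg" | grep -q -F -- "$want" || echo "$want"
    done
}

# Example:
# crm configure show | missing_cluster_settings
```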

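2. Define the already configured drbd device /dev/drbd0 as a cluster service;

1) As cluster services require, first make sure the drbd service is stopped on both nodes and will not start automatically at boot: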

# drbd-overview
 0:mystore Unconfigured . . . . 
# chkconfig drbd off

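2) Configure drbd as a cluster resource:
The RA that provides drbd is currently classified under the OCF provider linbit, at /usr/lib/ocf/resource.d/linbit/drbd. The following commands show this RA and its meta information:

List the resource agent classes: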

# crm ra classes
lsb
ocf / heartbeat linbit pacemaker
service
stonith

List the resource agents available under a given class and provider:

# crm ra list ocf linbit
drbd

Show the help for the drbd resource agent:

# crm ra info ocf:linbit:drbd

Manages a DRBD device as a Master/Slave resource (ocf:linbit:drbd)

This resource agent manages a DRBD resource as a master/slave resource.
DRBD is a shared-nothing replicated storage device.
Note that you should configure resource level fencing in DRBD,
this cannot be done from this resource agent.
See the DRBD User's Guide for more information.
http://www.drbd.org/docs/applications/

Parameters (* denotes required, [] the default):

drbd_resource* (string): drbd resource name
The name of the drbd resource from the drbd.conf file.

drbdconf (string, [/etc/drbd.conf]): Path to drbd.conf
Full path to the drbd.conf file.

stop_outdates_secondary (boolean, [false]): outdate a secondary on stop
Recommended setting: until pacemaker is fixed, leave at default (disabled).
Note that this feature depends on the passed in information in
OCF_RESKEY_CRM_meta_notify_master_uname to be correct, which unfortunately is
not reliable for pacemaker versions up to at least 1.0.10 / 1.1.4.
If a Secondary is stopped (unconfigured), it may be marked as outdated in the
drbd meta data, if we know there is still a Primary running in the cluster.
Note that this does not affect fencing policies set in drbd config,
but is an additional safety feature of this resource agent only.
You can enable this behaviour by setting the parameter to true.
If this feature seems to not do what you expect, make sure you have defined
fencing policies in the drbd configuration as well.

Operations' defaults (advisory minimum):

start timeout=240
promote timeout=90
demote timeout=90
notify timeout=90
stop timeout=100
monitor_Slave timeout=20 interval=20
monitor_Master timeout=20 interval=10

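drbd must run on both nodes at the same time, but in the primary/secondary model only one node can be Master while the other is Slave. It is therefore a special kind of cluster resource: a multi-state clone, in which nodes are distinguished as Master and Slave, and both nodes are required to be in the Slave state when the service first starts.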

# crm configure
crm(live)configure# primitive mysqlstore ocf:linbit:drbd params drbd_resource=mystore op monitor role=Master interval=30s timeout=30s op monitor role=Slave interval=60s timeout=20s op start timeout=240s op stop timeout=100s
crm(live)configure# verify 
crm(live)configure# master ms_mysqlstore mysqlstore meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify="true"
crm(live)configure# verify 
crm(live)configure# commit 

-- Show the configuration
# crm configure show
node node1.samlee.com
node node2.samlee.com
primitive mysqlstore ocf:linbit:drbd \
    params drbd_resource="mystore" \
    op monitor role="Master" interval="30s" timeout="30s" \
    op monitor role="Slave" interval="60s" timeout="20s" \
    op start timeout="240s" interval="0" \
    op stop timeout="100s" interval="0"
ms ms_mysqlstore mysqlstore \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-14.el6-368c726" \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"

Check the current cluster status:

crm(live)# status
Last updated: Thu Aug 18 12:26:47 2016
Last change: Thu Aug 18 12:24:02 2016 via cibadmin on node1.samlee.com
Stack: classic openais (with plugin)
Current DC: node1.samlee.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
2 Resources configured


Online: [ node1.samlee.com node2.samlee.com ]

 Master/Slave Set: ms_mysqlstore [mysqlstore]
     Masters: [ node1.samlee.com ]
     Slaves: [ node2.samlee.com ]

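The output above shows that the Primary node of the drbd service is currently node1.samlee.com and the Secondary node is node2.samlee.com. You can also run the following command on node1 to verify that this host has become the Primary node of the mystore resource: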

# drbdadm role mystore
Primary/Secondary
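
The Master node can also be read from the "crm status" output in a script. The sketch below assumes the "Master/Slave Set" layout printed by the pacemaker version shown above; ms_masters is a hypothetical helper name.

```shell
# Print the node(s) listed as Masters for one master/slave set, reading
# "crm status" text on standard input.
# Argument: <ms-resource-name>
# (ms_masters is a hypothetical helper.)
ms_masters() {
    awk -v ms="$1" '
        # " Master/Slave Set: ms_mysqlstore [mysqlstore]" -> the name is field 3
        /Master\/Slave Set:/ { in_set = ($3 == ms) }
        # "     Masters: [ node1.samlee.com ]" -> nodes are fields 3 .. NF-1
        in_set && $1 == "Masters:" {
            for (i = 3; i < NF; i++)
                printf "%s%s", $i, (i < NF - 1 ? " " : "\n")
        }
    '
}

# Example:
# crm status | ms_masters ms_mysqlstore
```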

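The Master node of ms_mysqlstore is the Primary node of the drbd mystore resource. The device /dev/drbd0 on this node can be mounted and used, and within a clustered service the mount needs to happen automatically. Suppose the mystore resource provides the shared file system that holds the web pages for a web server cluster, and that it needs to be mounted at /mydata (this directory must already exist on both nodes).

In addition, the cluster resource that performs this mount must run on the Master node of the drbd service, and may only start after drbd has promoted that node to Primary. A colocation constraint and an order constraint are therefore needed between the two resources.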

crm(live)configure# primitive mysqlfs ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/mydata fstype=ext4 op monitor interval=30s timeout=40s op start timeout=60s op stop timeout=60s on-fail=restart
crm(live)configure# verify 
crm(live)configure# colocation mysqlfs_with_ms_mysqlstore_master inf: mysqlfs ms_mysqlstore:Master
crm(live)configure# verify 
crm(live)configure# order mysqlfs_after_ms_mysqlstore_master mandatory: ms_mysqlstore:promote mysqlfs:start
crm(live)configure# verify 
crm(live)configure# show
node node1.samlee.com
node node2.samlee.com
primitive mysqlfs ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/mydata" fstype="ext4" \
    op monitor interval="30s" timeout="40s" \
    op start timeout="60s" interval="0" \
    op stop timeout="60s" on-fail="restart" interval="0"
primitive mysqlstore ocf:linbit:drbd \
    params drbd_resource="mystore" \
    op monitor role="Master" interval="30s" timeout="30s" \
    op monitor role="Slave" interval="60s" timeout="20s" \
    op start timeout="240s" interval="0" \
    op stop timeout="100s" interval="0"
ms ms_mysqlstore mysqlstore \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation mysqlfs_with_ms_mysqlstore_master inf: mysqlfs ms_mysqlstore:Master
order mysqlfs_after_ms_mysqlstore_master inf: ms_mysqlstore:promote mysqlfs:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-14.el6-368c726" \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"

-- Commit the configuration
crm(live)configure# commit

The cluster status is now as follows:

crm(live)# status
Last updated: Thu Aug 18 13:27:28 2016
Last change: Thu Aug 18 13:25:31 2016 via cibadmin on node1.samlee.com
Stack: classic openais (with plugin)
Current DC: node1.samlee.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
3 Resources configured


Online: [ node1.samlee.com node2.samlee.com ]

 Master/Slave Set: ms_mysqlstore [mysqlstore]
     Masters: [ node1.samlee.com ]
     Slaves: [ node2.samlee.com ]
 mysqlfs    (ocf::heartbeat:Filesystem):    Started node1.samlee.com

Check the drbd status:

[root@node1 ~]# drbd-overview 
  0:mystore/0  Connected Primary/Secondary UpToDate/UpToDate C r----- /mydata ext4 5.0G 139M 4.6G 3%
[root@node1 ~]# ls /mydata/
lost+found  node1.conf  node2.conf
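
A simple way to exercise the setup is to put the current Master into standby, watch the resources move, and bring the node back online. The sketch below uses the crmsh commands "crm node standby" and "crm node online"; the function name failover_test, the DRY_RUN variable and the default 10-second wait are assumptions of this sketch.

```shell
# Put a node into standby, wait, show the cluster status, then bring the node
# back online. With DRY_RUN set to a non-empty value the commands are only
# printed, because ${DRY_RUN:+echo} then expands to "echo".
# Arguments: <node-name> [wait-seconds]
# (failover_test is a hypothetical helper.)
failover_test() {
    ${DRY_RUN:+echo} crm node standby "$1" &&
    ${DRY_RUN:+echo} sleep "${2:-10}" &&
    ${DRY_RUN:+echo} crm status &&
    ${DRY_RUN:+echo} crm node online "$1"
}

# Example:
# DRY_RUN=1 failover_test node1.samlee.com    # preview only
# failover_test node1.samlee.com              # real run
```

With resource-stickiness set to 100 as configured earlier, the resources would be expected to stay on node2 after node1 comes back online.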

This article comes from the "Opensamlee" blog; please keep this attribution: http://gzsamlee.blog.51cto.com/9976612/1839914
