CDH High-Availability Deployment

  1. Preliminary preparation
    1.1. Configure hosts

192.168.245.105  scm-node1

192.168.245.106  scm-node2

192.168.245.107  scm-node3

    1.2. Set the hostname

Run on 192.168.245.105:

sudo hostnamectl --static --transient set-hostname scm-node1

Run on 192.168.245.106:

sudo hostnamectl --static --transient set-hostname scm-node2

Run on 192.168.245.107:

sudo hostnamectl --static --transient set-hostname scm-node3
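
The result can be checked on each host; the static hostname reported should match the value just set:

hostnamectl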

    1.3. Disable the firewall
      1.3.1. CentOS-6

Run the following commands on scm-node2 and scm-node3:

sudo chkconfig iptables off

sudo service iptables stop

      1.3.2. CentOS-7

Run the following commands on scm-node2 and scm-node3:

sudo systemctl disable firewalld

sudo systemctl stop firewalld

    1.4. Disable SELinux

Run the following command on scm-node2 and scm-node3:

sudo sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config
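This edit takes effect at the next reboot; to also switch SELinux to permissive mode for the current session, run:

sudo setenforce 0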

    1.5. Install software

CDH and MySQL are already installed on scm-node2 and scm-node3.

    1.6. Other configuration

NTP time synchronization and passwordless SSH trust between the hosts have already been set up.
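For reference, a typical way to establish the SSH trust (run on each host, accepting the default key path; hostnames as configured above) is:

ssh-keygen -t rsa

ssh-copy-id scm-node1

ssh-copy-id scm-node2

ssh-copy-id scm-node3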

  2. Install NFS
    2.1. Online installation

Run the following command on scm-node1:

sudo yum -y install nfs-utils rpcbind

    2.2. Offline installation
      2.2.1. CentOS-6 packages

keyutils-1.4-5.el6.x86_64.rpm

libevent-1.4.13-4.el6.x86_64.rpm

libgssglue-0.1-11.el6.x86_64.rpm

libtirpc-0.2.1-15.el6.x86_64.rpm

nfs-utils-1.2.3-78.el6_10.1.x86_64.rpm

nfs-utils-lib-1.1.5-13.el6.x86_64.rpm

python-argparse-1.2.1-2.1.el6.noarch.rpm

rpcbind-0.2.0-16.el6.x86_64.rpm

      2.2.2. CentOS-7 packages

gssproxy-0.7.0-17.el7.x86_64.rpm

keyutils-1.5.8-3.el7.x86_64.rpm

libbasicobjects-0.1.1-29.el7.x86_64.rpm

libevent-2.0.21-4.el7.x86_64.rpm

libini_config-1.3.1-29.el7.x86_64.rpm

libcollection-0.7.0-29.el7.x86_64.rpm

libpath_utils-0.2.1-29.el7.x86_64.rpm

libnfsidmap-0.25-19.el7.x86_64.rpm

libref_array-0.1.5-29.el7.x86_64.rpm

libtirpc-0.2.4-0.10.el7.x86_64.rpm

libverto-libevent-0.2.5-4.el7.x86_64.rpm

nfs-utils-1.3.0-0.54.el7.x86_64.rpm

quota-nls-4.01-17.el7.noarch.rpm

rpcbind-0.2.0-44.el7.x86_64.rpm

quota-4.01-17.el7.x86_64.rpm

tcp_wrappers-7.6-77.el7.x86_64.rpm

      2.2.3. Install all RPM packages

On scm-node1, change to the directory containing the packages and run:

sudo rpm -ivh *.rpm

    2.3. Export options

ro: the shared directory is read-only;

rw: the shared directory is readable and writable;

all_squash: map all client users to the anonymous user/group;

no_all_squash (default): match client users against local users first; map to the anonymous user/group only when matching fails;

root_squash (default): map a client's root user to the anonymous user/group;

no_root_squash: a client's root user keeps root privileges;

anonuid=<UID>: UID of the local user used for anonymous access; defaults to nfsnobody (65534);

anongid=<GID>: GID of the local group used for anonymous access; defaults to nfsnobody (65534);

secure (default): require clients to connect from TCP/IP ports below 1024;

insecure: allow clients to connect from TCP/IP ports above 1024;

sync: write data synchronously to both the memory buffer and disk; slower, but guarantees consistency;

async: keep data in the memory buffer and write it to disk only when necessary;

wdelay (default): check whether related writes are pending and commit them together, which improves efficiency;

no_wdelay: commit writes immediately; should be used together with sync;

subtree_check (default): if the exported directory is a subdirectory, the NFS server checks permissions on its parent directory;

no_subtree_check: skip the parent-directory check even when the exported directory is a subdirectory, which improves efficiency;
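Combining several of these options, a typical /etc/exports entry (the same form used later in this guide) looks like:

/media/mysql scm-node*(rw,async,no_root_squash,no_subtree_check)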

    2.4. Start NFS

Run the following commands on scm-node1:

sudo service rpcbind start

sudo service nfs start

sudo chkconfig rpcbind on

sudo chkconfig nfs on
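To confirm that the NFS services are up and registered with rpcbind:

rpcinfo -p | grep -E 'nfs|mountd'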

  3. Install corosync + pacemaker
    3.1. Online installation

Run the following command on scm-node2 and scm-node3:

sudo yum install -y corosync pacemaker

    3.2. Offline installation
      3.2.1. CentOS-6 packages

corosync-1.4.7-6.el6.x86_64.rpm

corosynclib-1.4.7-6.el6.x86_64.rpm

ConsoleKit-0.4.1-6.el6.x86_64.rpm

ConsoleKit-libs-0.4.1-6.el6.x86_64.rpm

avahi-libs-0.6.25-17.el6.x86_64.rpm

cifs-utils-4.8.1-20.el6.x86_64.rpm

clusterlib-3.0.12.1-84.el6.x86_64.rpm

cman-3.0.12.1-84.el6.x86_64.rpm

cvs-1.11.23-16.el6.x86_64.rpm

cyrus-sasl-md5-2.1.23-15.el6_6.2.x86_64.rpm

dbus-1.2.24-9.el6.x86_64.rpm

dmidecode-2.12-7.el6.x86_64.rpm

eggdbus-0.6-3.el6.x86_64.rpm

fence-agents-4.0.15-13.el6_9.2.x86_64.rpm

fence-virt-0.2.3-24.el6.x86_64.rpm

gettext-0.17-18.el6.x86_64.rpm

gnutls-2.12.23-22.el6.x86_64.rpm

gnutls-utils-2.12.23-22.el6.x86_64.rpm

hal-0.5.14-14.el6.x86_64.rpm

hal-info-20090716-5.el6.noarch.rpm

hal-libs-0.5.14-14.el6.x86_64.rpm

hdparm-9.43-4.el6.x86_64.rpm

ipmitool-1.8.15-2.el6.x86_64.rpm

libgomp-4.4.7-23.el6.x86_64.rpm

libqb-0.17.1-2.el6.x86_64.rpm

libtalloc-2.1.5-1.el6_7.x86_64.rpm

libtdb-1.3.8-3.el6_8.2.x86_64.rpm

libtevent-0.9.26-2.el6_7.x86_64.rpm

libvirt-client-0.10.2-64.el6.x86_64.rpm

libxslt-1.1.26-2.el6_3.1.x86_64.rpm

libibverbs-1.1.8-4.el6.x86_64.rpm

libnl-1.1.4-2.el6.x86_64.rpm

librdmacm-1.0.21-0.el6.x86_64.rpm

lm_sensors-libs-3.1.1-17.el6.x86_64.rpm

modcluster-0.16.2-35.el6.x86_64.rpm

nc-1.84-24.el6.x86_64.rpm

net-snmp-utils-5.5-60.el6.x86_64.rpm

net-snmp-libs-5.5-60.el6.x86_64.rpm

numactl-2.0.9-2.el6.x86_64.rpm

oddjob-0.30-6.el6.x86_64.rpm

openais-1.1.1-7.el6.x86_64.rpm

openaislib-1.1.1-7.el6.x86_64.rpm

pacemaker-1.1.18-3.el6.x86_64.rpm

pacemaker-cli-1.1.18-3.el6.x86_64.rpm

pacemaker-cluster-libs-1.1.18-3.el6.x86_64.rpm

pacemaker-libs-1.1.18-3.el6.x86_64.rpm

parted-2.1-29.el6.x86_64.rpm

pciutils-3.1.10-4.el6.x86_64.rpm

perl-Net-Telnet-3.03-11.el6.noarch.rpm

perl-TimeDate-1.16-13.el6.noarch.rpm

pexpect-2.3-6.el6.noarch.rpm

pm-utils-1.2.5-11.el6.x86_64.rpm

polkit-0.96-11.el6.x86_64.rpm

pyOpenSSL-0.13.1-2.el6.x86_64.rpm

python-suds-0.4.1-3.el6.noarch.rpm

quota-3.17-23.el6.x86_64.rpm

rdma-6.9_4.1-3.el6.noarch.rpm

resource-agents-3.9.5-46.el6.x86_64.rpm

ricci-0.16.2-87.el6.x86_64.rpm

samba-common-3.6.23-51.el6.x86_64.rpm

samba-winbind-3.6.23-51.el6.x86_64.rpm

samba-winbind-clients-3.6.23-51.el6.x86_64.rpm

sg3_utils-1.28-13.el6.x86_64.rpm

sg3_utils-libs-1.28-13.el6.x86_64.rpm

tcp_wrappers-7.6-58.el6.x86_64.rpm

telnet-0.17-48.el6.x86_64.rpm

yajl-1.0.7-3.el6.x86_64.rpm

      3.2.2. CentOS-7 packages

corosync-2.4.3-2.el7_5.1.x86_64.rpm

corosynclib-2.4.3-2.el7_5.1.x86_64.rpm

bc-1.06.95-13.el7.x86_64.rpm

cifs-utils-6.2-10.el7.x86_64.rpm

cups-libs-1.6.3-35.el7.x86_64.rpm

libldb-1.2.2-1.el7.x86_64.rpm

libtalloc-2.1.10-1.el7.x86_64.rpm

libtevent-0.9.33-2.el7.x86_64.rpm

libtdb-1.3.15-1.el7.x86_64.rpm

libwbclient-4.7.1-9.el7_5.x86_64.rpm

libcgroup-0.41-15.el7.x86_64.rpm

libxslt-1.1.28-5.el7.x86_64.rpm

libqb-1.0.1-6.el7.x86_64.rpm

pacemaker-cluster-libs-1.1.18-11.el7_5.3.x86_64.rpm

perl-TimeDate-2.30-2.el7.noarch.rpm

pacemaker-1.1.18-11.el7_5.3.x86_64.rpm

pacemaker-cli-1.1.18-11.el7_5.3.x86_64.rpm

psmisc-22.20-15.el7.x86_64.rpm

samba-common-4.7.1-9.el7_5.noarch.rpm

resource-agents-3.9.5-124.el7.x86_64.rpm

samba-common-libs-4.7.1-9.el7_5.x86_64.rpm

pacemaker-libs-1.1.18-11.el7_5.3.x86_64.rpm

samba-client-libs-4.7.1-9.el7_5.x86_64.rpm

      3.2.3. Install all RPM packages

On scm-node2 and scm-node3, change to the directory containing the packages and run:

sudo rpm -ivh *.rpm

  4. Install crmsh
    4.1. Download crmsh
      4.1.1. CentOS-6 download URL

http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/noarch/

      4.1.2. CentOS-7 download URL

http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/noarch/

    4.2. Offline installation
      4.2.1. CentOS-6 packages

crmsh-3.0.0-6.1.noarch.rpm

crmsh-scripts-3.0.0-6.1.noarch.rpm

python-parallax-1.0.1-28.1.noarch.rpm

python-lxml-2.2.3-1.1.el6.x86_64.rpm

python-six-1.9.0-2.el6.noarch.rpm

python-dateutil-1.4.1-7.el6.noarch.rpm

redhat-rpm-config-9.0.3-51.el6.centos.noarch.rpm

      4.2.2. CentOS-7 packages

libxslt-1.1.28-5.el7.x86_64.rpm

python-dateutil-1.5-7.el7.noarch.rpm

python-lxml-3.2.1-4.el7.x86_64.rpm

python-parallax-1.0.1-29.1.noarch.rpm

crmsh-scripts-3.0.0-6.2.noarch.rpm

crmsh-3.0.0-6.2.noarch.rpm

      4.2.3. Install all RPM packages

On scm-node2 and scm-node3, change to the directory containing the packages and run:

sudo rpm -ivh *.rpm

  5. Configure the corosync + pacemaker cluster
    5.1. Configure corosync
      5.1.1. Configuration for version 1.x (CentOS-6)

Run the following command on scm-node2:

sudo vi /etc/corosync/corosync.conf

Content of /etc/corosync/corosync.conf:

compatibility: whitetank

totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: scm-node2
                }
                member {
                        memberaddr: scm-node3
                }
                ringnumber: 0
                bindnetaddr: scm-node2
                mcastport: 5405
        }
        transport: udpu
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

service {
        name: pacemaker
        ver:  0
        use_mgmtd: yes
}

On scm-node3, edit the same file:

sudo vi /etc/corosync/corosync.conf

The content is identical to the file on scm-node2, except that bindnetaddr is set to scm-node3 instead of scm-node2.

      5.1.2. Configuration for version 2.x (CentOS-7)

Run the following command on scm-node2 and scm-node3:

sudo vi /etc/corosync/corosync.conf

Content of /etc/corosync/corosync.conf:

totem {
       version: 2
       secauth: off
       cluster_name: cmf
       transport: udpu
}

nodelist {
          node {
                ring0_addr: scm-node2
                nodeid: 1
          }
          node {
                ring0_addr: scm-node3
                nodeid: 2
          }
}

logging {
        fileline: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

quorum {
        provider: corosync_votequorum
        two_node: 1
}

    5.2. Start the cluster
      5.2.1. Start version 1.x (CentOS-6)

Start the service on scm-node2 and scm-node3:

sudo service corosync start

Enable it at boot:

sudo chkconfig corosync on

      5.2.2. Start version 2.x (CentOS-7)

Start the services on scm-node2 and scm-node3 (corosync must be running before pacemaker):

sudo systemctl start corosync

sudo systemctl start pacemaker

Enable them at boot:

sudo systemctl enable corosync

sudo systemctl enable pacemaker
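Once corosync is running, the ring status can be checked on either node:

sudo corosync-cfgtool -s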

    5.3. Configure the cluster
      5.3.1. Disable quorum checking

Run the following command on scm-node2:

sudo crm configure property no-quorum-policy=ignore

With only two nodes, the cluster cannot reach a quorum once one node fails (there is no odd number of votes to decide a majority), so under the default policy the surviving node would stop managing resources. Setting the policy to ignore lets the healthy node take over instead.

      5.3.2. Disable STONITH

Run the following command on scm-node2:

sudo crm configure property stonith-enabled=false

STONITH is disabled here because this setup has no STONITH (fencing) device.

      5.3.3. Change the default resource stickiness

Run the following command on scm-node2:

sudo crm configure rsc_defaults resource-stickiness=100

Some environments require that resources move between nodes as little as possible. A move usually means a period of downtime, and for heavyweight services such as an Oracle database that period can be long. To support this, Pacemaker has the notion of resource stickiness, which controls how strongly a service (resource) prefers to stay on the node where it is currently running; you can think of it as the "cost" of the outage a move causes. Pacemaker defaults the value to 0 so that it can place resources optimally across the nodes. Stickiness can be set per resource, but changing the default is usually sufficient.

      5.3.4. Verify the configuration

Run the following command on scm-node2 and scm-node3:

sudo crm_verify -L -V

If the command produces no output, the configuration is correct.

    5.4. Common operations
      5.4.1. Show cluster status

Run the following command on scm-node2:

sudo crm status

      5.4.2. Show the resource configuration

Run the following command on scm-node2:

sudo crm configure show

      5.4.3. List resource agent classes

Run the following command on scm-node2:

sudo crm ra classes

      5.4.4. Move a resource

Run the following command on scm-node2:

sudo crm resource move cloudera-scm-server scm-node3

      5.4.5. Define a clone

Define a clone with two instances:

sudo crm configure clone mysql-cluster mysql meta clone-max=2 clone-node-max=2 notify=true

Stop the clone:

sudo crm resource stop mysql-cluster

Delete the clone:

sudo crm configure delete mysql-cluster

Notes:

1. mysql-cluster is a user-defined clone name.

2. The primitive mysql resource must be defined first; see section 6.2.2.

      5.4.6. Define a resource group

Defining a group guarantees that mysql and cloudera-scm-server run on the same node:

sudo crm configure group server-group mysql cloudera-scm-server

Stop the group:

sudo crm resource stop server-group

Delete the group:

sudo crm configure delete server-group

Note: server-group is a user-defined group name.

      5.4.7. Define resource constraints

Define a colocation constraint: mysql and cloudera-scm-server must run on the same node:

sudo crm configure colocation mysql-with-cloudera-scm-server inf: mysql cloudera-scm-server

Define an ordering constraint: cloudera-scm-server starts only after mysql:

sudo crm configure order mysql_before_cloudera-scm-server mandatory: mysql cloudera-scm-server

Define a location constraint: the vip resource is pinned to scm-node2:

sudo crm configure location vip_pref_node2 vip inf: scm-node2

Delete the constraints:

sudo crm configure delete mysql-with-cloudera-scm-server

sudo crm configure delete mysql_before_cloudera-scm-server

sudo crm configure delete vip_pref_node2

Note: mysql-with-cloudera-scm-server, mysql_before_cloudera-scm-server, and vip_pref_node2 are user-defined constraint names.

      5.4.8. Delete a cluster resource

sudo crm resource stop vip

sudo crm configure delete vip

Note: a resource must be stopped before it can be deleted.

  6. Configure the CDH HA cluster
    6.1. Configure NFS
      6.1.1. Configure mysql

Create the directory on scm-node1:

sudo mkdir -p /media/mysql

Set its permissions on scm-node1 (directories need the execute bit to be traversable, hence 777 rather than 666):

sudo chmod 777 /media/mysql

Configure exports on scm-node1:

sudo vi /etc/exports

Add the following to /etc/exports:

/media/mysql scm-node*(rw,async,no_root_squash,no_subtree_check)

Note: scm-node* allows any host whose hostname begins with scm-node to access /media/mysql.

Apply the export configuration:

sudo exportfs -r

Mount the directory on scm-node2 and scm-node3:

sudo vi /etc/fstab

Append the following to /etc/fstab:

scm-node1:/media/mysql /var/lib/mysql nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
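The new entry can then be mounted without a reboot (the same applies to the fstab entries added in the next two sections):

sudo mount -a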

      6.1.2. Configure cloudera-scm-server

Create the directory on scm-node1:

sudo mkdir -p /media/cloudera-scm-server

Set its permissions on scm-node1:

sudo chmod 777 /media/cloudera-scm-server

Configure exports:

sudo vi /etc/exports

Add the following to /etc/exports:

/media/cloudera-scm-server scm-node*(rw,async,no_root_squash,no_subtree_check)

Apply the export configuration:

sudo exportfs -r

Set ownership on scm-node2 and scm-node3:

sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server

Mount the directory on scm-node2 and scm-node3:

sudo vi /etc/fstab

Append the following to /etc/fstab:

scm-node1:/media/cloudera-scm-server /var/lib/cloudera-scm-server nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0

      6.1.3. Configure cloudera-scm-agent

Create the directories on scm-node1:

sudo mkdir -p /media/cloudera-scm-agent

sudo mkdir -p /media/cloudera-host-monitor

sudo mkdir -p /media/cloudera-scm-eventserver

sudo mkdir -p /media/cloudera-service-monitor

Set their permissions on scm-node1:

sudo chmod 777 /media/cloudera-scm-agent

sudo chmod 777 /media/cloudera-host-monitor

sudo chmod 777 /media/cloudera-scm-eventserver

sudo chmod 777 /media/cloudera-service-monitor

Configure exports:

sudo vi /etc/exports

Add the following to /etc/exports:

/media/cloudera-scm-agent scm-node*(rw,async,no_root_squash,no_subtree_check)

/media/cloudera-host-monitor scm-node*(rw,async,no_root_squash,no_subtree_check)

/media/cloudera-scm-eventserver scm-node*(rw,async,no_root_squash,no_subtree_check)

/media/cloudera-service-monitor scm-node*(rw,async,no_root_squash,no_subtree_check)

Apply the export configuration:

sudo exportfs -r
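All of the exports should now be visible from scm-node2 and scm-node3:

showmount -e scm-node1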

Set ownership on scm-node2 and scm-node3:

sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-scm-agent

sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-host-monitor

sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-scm-eventserver

sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-service-monitor

Mount the directories on scm-node2 and scm-node3:

sudo vi /etc/fstab

Append the following to /etc/fstab:

scm-node1:/media/cloudera-scm-agent /var/lib/cloudera-scm-agent nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0

scm-node1:/media/cloudera-host-monitor /var/lib/cloudera-host-monitor nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0

scm-node1:/media/cloudera-scm-eventserver /var/lib/cloudera-scm-eventserver nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0

scm-node1:/media/cloudera-service-monitor /var/lib/cloudera-service-monitor nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0

Perform the following in the CDH admin console:

1. Stop the Cloudera Management Service, then delete it.

2. Disable the HTTP Referer check.

    6.2. Add resources
      6.2.1. Add the VIP resource

Run the following command on scm-node2:

sudo crm configure primitive vip ocf:heartbeat:IPaddr2 params ip='192.168.245.165' op monitor interval=5s timeout=20s on-fail=restart

Note: vip is a user-defined resource name, and the IP must be in the same subnet as the hosts.
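Once the resource is started, the address should appear on whichever node currently holds it:

ip addr | grep 192.168.245.165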

      6.2.2. Add the mysql resource

Run the following command on scm-node2 and scm-node3 (CentOS-6):

sudo chkconfig mysql off

Run the following command on scm-node2 and scm-node3 (CentOS-7):

sudo systemctl disable mysqld

Run the following command on scm-node2 (CentOS-6):

sudo crm configure primitive mysql lsb:mysql op monitor interval=20s timeout=100s on-fail=restart

Run the following command on scm-node2 (CentOS-7):

sudo crm configure primitive mysql systemd:mysqld op monitor interval=20s timeout=100s on-fail=restart

      6.2.3. Add the cloudera-scm-server resource

Run the following command on scm-node2 and scm-node3 (the same on CentOS-6 and CentOS-7, since Cloudera Manager ships a SysV init script):

sudo chkconfig cloudera-scm-server off

Edit db.properties on scm-node2 and scm-node3:

sudo vi /etc/cloudera-scm-server/db.properties

Change the following in /etc/cloudera-scm-server/db.properties:

com.cloudera.cmf.db.host=192.168.245.165

Add the cloudera-scm-server resource on scm-node2:

sudo crm configure primitive cloudera-scm-server lsb:cloudera-scm-server op monitor interval=20s timeout=40s on-fail=restart

      6.2.4. Add the cloudera-scm-agent resource

Run the following command on scm-node2 and scm-node3 (the same on CentOS-6 and CentOS-7):

sudo chkconfig cloudera-scm-agent off

Create the directory on scm-node2 and scm-node3:

sudo mkdir -p /usr/lib/ocf/resource.d/cm

Create the file on scm-node2 and scm-node3:

sudo vi /usr/lib/ocf/resource.d/cm/cloudera-scm-agent

Content of /usr/lib/ocf/resource.d/cm/cloudera-scm-agent (CentOS-6):

#!/bin/sh
#######################################################################
# CM Agent OCF script
#######################################################################
#######################################################################
# Initialization:
: ${__OCF_ACTION=$1}
OCF_SUCCESS=0
OCF_ERROR=1
OCF_STOPPED=7
OCF_ERR_UNIMPLEMENTED=3
#######################################################################

meta_data() {
        cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Cloudera Manager Agent" version="1.0">
<version>1.0</version>

<longdesc lang="en">
 This OCF agent handles simple monitoring, start, stop of the Cloudera
 Manager Agent, intended for use with Pacemaker/corosync for failover.
</longdesc>
<shortdesc lang="en">Cloudera Manager Agent OCF script</shortdesc>

<parameters />

<actions>
<action name="start"        timeout="20" />
<action name="stop"         timeout="20" />
<action name="monitor"      timeout="20" interval="10" depth="0"/>
<action name="meta-data"    timeout="5" />
</actions>
</resource-agent>
END
}

#######################################################################

agent_usage() {
cat <<END
 usage: $0 {start|stop|monitor|meta-data}
 Cloudera Manager Agent HA OCF script - used for managing Cloudera Manager Agent and managed processes lifecycle for use with Pacemaker.
END
}

agent_start() {
    service cloudera-scm-agent start
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}

agent_stop() {
    service cloudera-scm-agent hard_stop_confirmed
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}

agent_monitor() {
        # Monitor _MUST!_ differentiate correctly between running
        # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
        # That is THREE states, not just yes/no.
        service cloudera-scm-agent status
        if [ $? = 0 ]; then
            return $OCF_SUCCESS
        fi
        return $OCF_STOPPED
}

case $__OCF_ACTION in
meta-data)      meta_data
                exit $OCF_SUCCESS
                ;;
start)          agent_start;;
stop)           agent_stop;;
monitor)        agent_monitor;;
usage|help)     agent_usage
                exit $OCF_SUCCESS
                ;;
*)              agent_usage
                exit $OCF_ERR_UNIMPLEMENTED
                ;;
esac
rc=$?
exit $rc

The CentOS-7 version of /usr/lib/ocf/resource.d/cm/cloudera-scm-agent is identical, except that agent_stop becomes:

agent_stop() {
    service cloudera-scm-agent next_stop_hard
    service cloudera-scm-agent stop
    if [ $? = 0 ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_ERROR
}

Set the script's permissions on scm-node2 and scm-node3:

sudo chmod 770 /usr/lib/ocf/resource.d/cm/cloudera-scm-agent
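Before handing the script to Pacemaker, it can be sanity-checked by invoking it directly; monitor should return 0 while the agent is running:

sudo /usr/lib/ocf/resource.d/cm/cloudera-scm-agent monitor

echo $?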

Edit config.ini on scm-node2 and scm-node3:

sudo vi /etc/cloudera-scm-agent/config.ini

Change the following in /etc/cloudera-scm-agent/config.ini:

server_host=192.168.245.165

lib_dir=/var/lib/cloudera-scm-agent

Add the cloudera-scm-agent resource on scm-node2:

sudo crm configure primitive cloudera-scm-agent ocf:cm:cloudera-scm-agent op monitor interval=20s timeout=40s on-fail=restart

Note: the cloudera-scm-agent resource exists mainly so that the Cloudera Management Service can fail over; it can be omitted if that is not needed.

    6.3. Define constraints

Define the CDH group on scm-node2:

sudo crm configure group cdh-group vip mysql cloudera-scm-server cloudera-scm-agent

Define start-order constraints on scm-node2:

sudo crm configure order vip_before_mysql mandatory: vip mysql

 

sudo crm configure order mysql_before_cloudera-scm-server mandatory: mysql cloudera-scm-server

 

sudo crm configure order cloudera-scm-server_before_cloudera-scm-agent mandatory: cloudera-scm-server cloudera-scm-agent
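Finally, verify on scm-node2 that all four resources are started together on the same node:

sudo crm status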