
Ceph Cluster Deployment Manual


Ceph Cluster Setup

I. Environment Preparation (identical configuration on all three servers)

Operating system: CentOS 7.3

1. Disable firewalld and SELinux
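A minimal sketch of the commands, run on every node (`setenforce 0` only switches SELinux to permissive for the running system; the config change becomes effective after a reboot):

systemctl stop firewalld
systemctl disable firewalld
setenforce 0                                                   # permissive for the running system
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config   # persistent across reboots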

2. Add three 100 GB disks to each server
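The OSD steps later on assume the new disks show up as /dev/sdb, /dev/sdc and /dev/sdd; a quick check:

lsblk -d -o NAME,SIZE,TYPE   # the three new 100G disks should appear as sdb, sdc and sdd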

3. Configure the IP addresses (a sketch follows the list below)

centos-01: 192.168.0.118
centos-02: 192.168.0.119
centos-03: 192.168.0.120
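A minimal static-IP sketch for centos-01; the interface name (ens33) and the gateway (192.168.0.1) are assumptions and must be adjusted to the real environment:

# Hypothetical NIC name "ens33" and gateway 192.168.0.1 -- adjust to the actual environment.
# IPADDR shown is for centos-01; use .119 / .120 on the other two nodes.
cat > /etc/sysconfig/network-scripts/ifcfg-ens33 <<EOF
TYPE=Ethernet
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.0.118
PREFIX=24
GATEWAY=192.168.0.1
EOF
systemctl restart network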

4. Change the yum sources. The official mirrors can be slow, so the Aliyun mirrors can be used instead:

[root@centos-01 ~]# yum clean all
[root@centos-01 ~]# curl http://mirrors.aliyun.com/repo/Centos-7.repo >/etc/yum.repos.d/CentOS-Base.repo
[root@centos-01 ~]# curl http://mirrors.aliyun.com/repo/epel-7.repo >/etc/yum.repos.d/epel.repo
[root@centos-01 ~]# sed -i '/aliyuncs/d' /etc/yum.repos.d/CentOS-Base.repo
[root@centos-01 ~]# sed -i '/aliyuncs/d' /etc/yum.repos.d/epel.repo
[root@centos-01 ~]# sed -i 's/$releasever/7/g' /etc/yum.repos.d/CentOS-Base.repo
[root@centos-01 ~]# yum makecache

5. Edit the static name-resolution file /etc/hosts
[root@centos-01 ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.118 centos-01
192.168.0.119 centos-02
192.168.0.120 centos-03
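It is common to keep the same /etc/hosts entries on centos-02 and centos-03 as well; one way to copy the file over (assuming root SSH by password is still acceptable at this stage):

scp /etc/hosts root@192.168.0.119:/etc/hosts
scp /etc/hosts root@192.168.0.120:/etc/hosts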

II. Cluster Setup

1. The cluster layout is as follows:

Host        IP              Roles
centos-01   192.168.0.118   deploy, mon*1, osd*3
centos-02   192.168.0.119   mon*1, osd*3
centos-03   192.168.0.120   mon*1, osd*3

2. Clean up the environment

If a previous deployment failed, there is no need to remove the ceph packages or rebuild the virtual machines. Running the following commands on every node restores the environment to the state right after the ceph packages were installed. Cleaning up thoroughly before redeploying on an old cluster is strongly recommended; otherwise all kinds of anomalies can occur.
[root@centos-01 cluster]# ps aux | grep ceph | awk '{print $2}' | xargs kill -9
[root@centos-01 cluster]# ps aux | grep ceph    # make sure every ceph process has exited
ps -ef | grep ceph
# Make sure all ceph processes are gone at this point! If some are still running, repeat the kill command a few more times.
umount /var/lib/ceph/osd/*
rm -rf /var/lib/ceph/osd/*
rm -rf /var/lib/ceph/mon/*
rm -rf /var/lib/ceph/mds/*
rm -rf /var/lib/ceph/bootstrap-mds/*
rm -rf /var/lib/ceph/bootstrap-osd/*
rm -rf /var/lib/ceph/bootstrap-rgw/*
rm -rf /var/lib/ceph/tmp/*
rm -rf /etc/ceph/*
rm -rf /var/run/ceph/*

If at any point you run into trouble and want to start over, run the following to purge the Ceph packages and wipe all data and configuration:

ceph-deploy purge node1 node2

ceph-deploy purgedata node1 node2

ceph-deploy forgetkeys && rm ceph.*

3. On every node, make sure the yum sources from the environment preparation are in place; in particular the $releasever substitution:

sed -i 's/$releasever/7/g' /etc/yum.repos.d/CentOS-Base.repo

4. Add the ceph repository
vim /etc/yum.repos.d/ceph.repo
Add the following content:
[ceph]
name=ceph
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.163.com/ceph/rpm-jewel/el7/noarch/
gpgcheck=0
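Since each node runs mons and OSDs, the ceph packages and therefore the same repo file are needed on all three nodes. If the file was only created on centos-01, one possible way to distribute it:

scp /etc/yum.repos.d/ceph.repo root@centos-02:/etc/yum.repos.d/
scp /etc/yum.repos.d/ceph.repo root@centos-03:/etc/yum.repos.d/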

5. Install the ceph client packages:

yum makecache
yum install ceph ceph-radosgw -y

6. Start the deployment

On the deploy node (centos-01), generate an SSH key pair and copy the public key to centos-02 and centos-03:

[root@centos-01 ~]# ssh-keygen
[root@centos-01 ~]# ssh-copy-id root@centos-02
[root@centos-01 ~]# ssh-copy-id root@centos-03

Install ceph-deploy on the deploy node (centos-01); from here on, "the deploy node" always means centos-01:
[root@centos-01 ~]# yum -y install ceph-deploy
[root@centos-01 ~]# ceph-deploy --version
1.5.39
[root@centos-01 ~]# ceph -v
ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)

7. Create a deployment directory on the deploy node and start the deployment. (The sample console output in the following steps was captured on a cluster named ceph-1/ceph-2/ceph-3 on 192.168.57.0/24; with the setup described in this guide the names and addresses will be centos-01/02/03 on 192.168.0.0/24.)

[root@centos-01 ~]# cd
[root@centos-01 ~]# mkdir cluster
[root@centos-01 ~]# cd cluster/
[root@centos-01 cluster]# ceph-deploy new centos-01 centos-02 centos-03
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.34): /usr/bin/ceph-deploy new ceph-1 ceph-2 ceph-3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] func : <function new at 0x7f91781f96e0>
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f917755ca28>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] ssh_copykey : True
[ceph_deploy.cli][INFO ] mon : ['ceph-1', 'ceph-2', 'ceph-3']
..
..
[ceph_deploy.new][WARNIN] could not connect via SSH
[ceph_deploy.new][INFO ] will connect again with password prompt
The authenticity of host 'ceph-2 (192.168.57.223)' can't be established.
ECDSA key fingerprint is ef:e2:3e:38:fa:47:f4:61:b7:4d:d3:24:de:d4:7a:54.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ceph-2,192.168.57.223' (ECDSA) to the list of known hosts.
root
root@ceph-2's password:
[ceph-2][DEBUG ] connected to host: ceph-2

..
..
[ceph_deploy.new][DEBUG ] Resolving host ceph-3
[ceph_deploy.new][DEBUG ] Monitor ceph-3 at 192.168.57.224
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-1', 'ceph-2', 'ceph-3']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['192.168.57.222', '192.168.57.223', '192.168.57.224']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...

At this point, the directory contains:

[root@centos-01 cluster]# ls
ceph.conf ceph-deploy-ceph.log ceph.mon.keyring

8. Add public_network to ceph.conf according to your own IP range, and slightly increase the allowed clock drift between mons (default 0.05 s, raised here to 2 s):

[root@centos-01 cluster]# echo public_network=192.168.0.0/24 >> ceph.conf
[root@centos-01 cluster]# echo mon_clock_drift_allowed = 2 >> ceph.conf
[root@centos-01 cluster]# cat ceph.conf
[global]
fsid = 0248817a-b758-4d6b-a217-11248b098e10
mon_initial_members = ceph-1, ceph-2, ceph-3
mon_host = 192.168.57.222,192.168.57.223,192.168.57.224
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network=192.168.57.0/24
mon_clock_drift_allowed = 2

9. Deploy the monitors:
[root@centos-01 cluster]# ceph-deploy mon create-initial
..
.. (a fair amount of log output)

If ceph-deploy mon create-initial fails with "[Errno 2] No such file or directory":

Fix: correct the hostname (a minimal sketch follows below), clean up the environment (see step 2), and then push the config file again:

ceph-deploy --overwrite-conf config push node1 node2 node3
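A minimal sketch of correcting a node's hostname so it matches the name used by ceph-deploy (centos-02 below is only an example; run it on whichever node has the wrong name):

hostnamectl set-hostname centos-02   # example hostname; use the node's real name from /etc/hosts
hostname                             # verify the change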


[root@centos-01 cluster]# ls
ceph.bootstrap-mds.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring
ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log

10. Check the cluster status:
[root@centos-01 cluster]# ceph -s
cluster 0248817a-b758-4d6b-a217-11248b098e10
health HEALTH_ERR
no osds
Monitor clock skew detected
monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
election epoch 6, quorum 0,1,2 ceph-1,ceph-2,ceph-3
osdmap e1: 0 osds: 0 up, 0 in
flags sortbitwise
pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
0 kB used, 0 kB / 0 kB avail
64 creating

11. Deploy the OSDs:
ceph-deploy --overwrite-conf osd prepare centos-01:/dev/sdb centos-01:/dev/sdc centos-01:/dev/sdd centos-02:/dev/sdb centos-02:/dev/sdc centos-02:/dev/sdd centos-03:/dev/sdb centos-03:/dev/sdc centos-03:/dev/sdd --zap-disk
ceph-deploy --overwrite-conf osd activate centos-01:/dev/sdb1 centos-01:/dev/sdc1 centos-01:/dev/sdd1 centos-02:/dev/sdb1 centos-02:/dev/sdc1 centos-02:/dev/sdd1 centos-03:/dev/sdb1 centos-03:/dev/sdc1 centos-03:/dev/sdd1

The cluster status should now look like this:
[root@centos-01 cluster]# ceph -s
cluster 0248817a-b758-4d6b-a217-11248b098e10
health HEALTH_WARN
too few PGs per OSD (21 < min 30)
monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3
osdmap e45: 9 osds: 9 up, 9 in
flags sortbitwise
pgmap v82: 64 pgs, 1 pools, 0 bytes data, 0 objects
273 MB used, 16335 GB / 16336 GB avail
64 active+clean

12. To clear this WARN, just increase the number of PGs in the rbd pool. With 9 OSDs and a replica size of 3, 128 PGs gives roughly 128 × 3 / 9 ≈ 42 PGs per OSD, comfortably above the minimum of 30:
[root@centos-01 cluster]# ceph osd pool set rbd pg_num 128
set pool 0 pg_num to 128
[root@centos-01 cluster]# ceph osd pool set rbd pgp_num 128
set pool 0 pgp_num to 128
[root@centos-01 cluster]# ceph -s
cluster 0248817a-b758-4d6b-a217-11248b098e10
health HEALTH_ERR
19 pgs are stuck inactive for more than 300 seconds
12 pgs peering
19 pgs stuck inactive
monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3
osdmap e49: 9 osds: 9 up, 9 in
flags sortbitwise
pgmap v96: 128 pgs, 1 pools, 0 bytes data, 0 objects
308 MB used, 18377 GB / 18378 GB avail
103 active+clean
12 peering
9 creating
4 activating

[root@centos-01 cluster]# ceph -s
cluster 0248817a-b758-4d6b-a217-11248b098e10
health HEALTH_OK
monmap e1: 3 mons at {ceph-1=192.168.57.222:6789/0,ceph-2=192.168.57.223:6789/0,ceph-3=192.168.57.224:6789/0}
election epoch 22, quorum 0,1,2 ceph-1,ceph-2,ceph-3
osdmap e49: 9 osds: 9 up, 9 in
flags sortbitwise
pgmap v99: 128 pgs, 1 pools, 0 bytes data, 0 objects
310 MB used, 18377 GB / 18378 GB avail
128 active+clean

At this point, the cluster deployment is complete.

13. Pushing the config

Do not change settings by editing /etc/ceph/ceph.conf directly on individual nodes; edit the copy in the deployment directory on the deploy node instead (here centos-01:/root/cluster/ceph.conf). Once a cluster grows to dozens of nodes it is not feasible to edit them one by one; pushing the config is quicker and safer.
After editing, run the following to push the conf file to all nodes:

[root@centos-01 cluster]# ceph-deploy --overwrite-conf config push centos-01 centos-02 centos-03

After pushing, the monitor service on each node needs to be restarted; see the next step.

14. How to start and stop the mon & osd services

# centos-01 is the hostname of the node the monitor runs on.
systemctl start ceph-mon@centos-01.service
systemctl restart ceph-mon@centos-01.service
systemctl stop ceph-mon@centos-01.service


# 0 is the id of an OSD on that node, which can be looked up with `ceph osd tree`.
systemctl start/stop/restart ceph-osd@0.service
[root@centos-01 cluster]# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 17.94685 root default
-2  5.98228     host ceph-1
 0  1.99409         osd.0       up  1.00000          1.00000
 1  1.99409         osd.1       up  1.00000          1.00000
 8  1.99409         osd.8       up  1.00000          1.00000
-3  5.98228     host ceph-2
 2  1.99409         osd.2       up  1.00000          1.00000
 3  1.99409         osd.3       up  1.00000          1.00000
 4  1.99409         osd.4       up  1.00000          1.00000
-4  5.98228     host ceph-3
 5  1.99409         osd.5       up  1.00000          1.00000
 6  1.99409         osd.6       up  1.00000          1.00000
 7  1.99409         osd.7       up  1.00000          1.00000
