Troubleshooting "mds cluster is degraded" on a Ceph cluster

Ceph cluster version:

ceph -v
ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)

Service status as shown by ceph -w:

mds cluster is degraded
     monmap e1: 3 mons at {ceph-6-11=172.16.6.11:6789/0,ceph-6-12=172.16.6.12:6789/0,ceph-6-13=172.16.6.13:6789/0}
            election epoch 454, quorum 0,1,2 ceph-6-11,ceph-6-12,ceph-6-13
      fsmap e1928: 1/1/1 up {0=ceph-6-13=up:rejoin}, 2 up:standby
     osdmap e4107: 90 osds: 90 up, 90 in
            flags sortbitwise,require_jewel_osds
      pgmap v24380658: 5120 pgs, 4 pools, 14837 GB data, 5031 kobjects
            44476 GB used, 120 TB / 163 TB avail
                5120 active+clean

Service log:

fault with nothing to send, going to standby
2017-05-08 00:21:32.423571 7fb859159700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2017-05-08 00:21:32.423578 7fb859159700  1 mds.beacon.ceph-6-12 _send skipping beacon, heartbeat map not healthy
2017-05-08 00:21:33.006114 7fb85e264700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2017-05-08 00:21:34.902990 7fb858958700 -1 mds.ceph-6-12 *** got signal Terminated ***
2017-05-08 00:21:36.423632 7fb859159700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
2017-05-08 00:21:36.423640 7fb859159700  1 mds.beacon.ceph-6-12 _send skipping beacon, heartbeat map not healthy
2017-05-08 00:21:36.904448 7fb85c260700  1 mds.0.1929 rejoin_joint_start
2017-05-08 00:21:36.906440 7fb85995a700  1 heartbeat_map reset_timeout 'MDSRank' had timed out after 15
2017-05-08 00:21:36.906502 7fb858958700  1 mds.ceph-6-12 suicide.  wanted state up:rejoin
2017-05-08 00:21:37.906842 7fb858958700  1 mds.0.1929 shutdown: shutting down rank 0
2017-05-08 01:04:36.411123 7f2886f60180  0 set uid:gid to 167:167 (ceph:ceph)
2017-05-08 01:04:36.411140 7f2886f60180  0 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185), process ceph-mds, pid 1132028
2017-05-08 01:04:36.411734 7f2886f60180  0 pidfile_write: ignore empty --pid-file
2017-05-08 01:04:37.291720 7f2880f40700  1 mds.ceph-6-12 handle_mds_map standby
2017-05-08 01:04:44.618574 7f2880f40700  1 mds.0.1955 handle_mds_map i am now mds.0.1955
2017-05-08 01:04:44.618588 7f2880f40700  1 mds.0.1955 handle_mds_map state change up:boot --> up:replay
2017-05-08 01:04:44.618602 7f2880f40700  1 mds.0.1955 replay_start
2017-05-08 01:04:44.618627 7f2880f40700  1 mds.0.1955  recovery set is
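
To get a quick sense of how often the MDS heartbeat is timing out, the log can be scanned for the "had timed out" message. The following is a minimal sketch; the function name and the log path in the usage comment are assumptions for illustration, not something taken from this cluster.

```shell
# Count heartbeat_map "had timed out" lines in an MDS log file.
# $1: path to the log file (hypothetical; adjust to your deployment).
count_heartbeat_timeouts() {
    # grep -c prints the number of matching lines (0 if none match).
    grep -c "heartbeat_map .* had timed out after" "$1"
}

# Example usage (the path is an assumption):
#   count_heartbeat_timeouts /var/log/ceph/ceph-mds.ceph-6-12.log
```

A count that keeps growing while the MDS sits in up:rejoin suggests the daemon is too busy to send its beacons in time.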

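Symptoms:

At this point the directory where CephFS is mounted can still be entered and its contents listed, but no files can be created.

Troubleshooting and fix:

References: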
http://tracker.ceph.com/issues/19118
http://tracker.ceph.com/issues/18730

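These reports show that this is a bug in the new release. We had recently upgraded from 10.2.5 to 10.2.7, and the upgrade had been completed less than a week earlier.

Root cause, in outline: when CephFS holds a large amount of data, the multiple MDS nodes have to synchronize state and exchange data. MDS nodes are monitored through messages, with a default timeout of 15 seconds; if no message arrives within 15 seconds, the node is kicked out of the cluster. This default is short, so a node that is under heavy load and slow to respond is treated as failed and removed over and over. Right after it is removed, the heartbeat finds that the node is alive and adds it back; shortly afterwards it is removed again, and the cycle repeats. While this is going on, the Ceph cluster reports "mds cluster is degraded" and the service log shows "heartbeat_map is_healthy 'MDSRank' had timed out after 15".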
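
The effect of the timeout can be illustrated with a deliberately simplified model: a daemon counts as laggy once the time since its last beacon exceeds the grace period. This is only a sketch of the idea described above, not Ceph's actual implementation; the function name and the 40-second silence are made up for the example.

```shell
# Simplified model: a daemon is treated as laggy when the time since its
# last beacon exceeds the grace period. All values are whole seconds.
# Illustration only, not Ceph's real logic.
is_laggy() {
    last_beacon=$1   # time the last beacon was received
    now=$2           # current time
    grace=$3         # grace period, e.g. 15 (default) or 300
    [ $((now - last_beacon)) -gt "$grace" ]
}

# A busy MDS that stays silent for 40 seconds:
if is_laggy 0 40 15;  then echo "grace 15: laggy";  else echo "grace 15: ok";  fi
if is_laggy 0 40 300; then echo "grace 300: laggy"; else echo "grace 300: ok"; fi
```

With a grace of 15 the silent daemon is flagged, while with 300 it is left alone, which is why raising the threshold stops the remove-and-rejoin loop.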

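Solutions:

Solution 1:

This is an emergency workaround. Leave one MDS node running and temporarily stop the MDS service on the other nodes. With a single node working on its own there is no longer any heartbeat monitoring between MDS daemons, so the problem is avoided. Once this step is done, apply solution 2 to fix the problem for good.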
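
The workaround could be prepared with a small helper that prints, without executing, the stop command for every MDS except the one to keep. This is a sketch under assumptions: it presumes the daemons run as systemd units named ceph-mds@<name>, which should be checked on the actual hosts, and the function name is hypothetical.

```shell
# Print (do not execute) the commands that would stop every MDS except one.
# $1: name of the MDS to keep running; remaining arguments: all MDS names.
# Assumes systemd units named ceph-mds@<name>; verify this on your hosts.
print_stop_commands() {
    keep=$1
    shift
    for name in "$@"; do
        if [ "$name" != "$keep" ]; then
            # Each printed command is meant to be run on that daemon's host.
            echo "systemctl stop ceph-mds@$name"
        fi
    done
}

# Example: keep ceph-6-13 and stop ceph-6-12.
print_stop_commands ceph-6-13 ceph-6-12 ceph-6-13
```

Printing the commands first makes it easy to review which daemons will be stopped before touching a degraded cluster.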

Solution 2: raise the timeout threshold to 300 seconds. The parameter is as follows.

Run on all MDS nodes:

mds beacon grace

Description:	How long without receiving a beacon message before the MDS is considered laggy (and possibly replaced).
Type:	Float
Default:	15

Reference:
http://docs.ceph.org.cn/cephfs/mds-config-ref/

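How to change the parameter:

The value can be written to the Ceph configuration file, but we did not get this method to work in our tests.

Check the current setting: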

[[email protected] ~]# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-6-11.asok config show |grep mds|grep beacon_grace
    "mds_beacon_grace": "15",

Changing it at runtime with the online configuration command succeeded:

[[email protected] ~]# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-6-11.asok config set mds_beacon_grace 300
{
    "success": "mds_beacon_grace = '300' (unchangeable) "
}

Verify:

[[email protected] ~]# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-6-11.asok config show |grep mds|grep beacon_grace
    "mds_beacon_grace": "300",  #  <<=== the parameter has been changed

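After the parameter has been changed, all the MDS nodes that were stopped can be started again. If any one active MDS node in the cluster is then shut down, its state is synchronized to the other nodes, another node takes over serving requests, and CephFS usage is not affected.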
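
The verification step can also be scripted so that it is easy to repeat on every node. The sketch below reads "config show" output on standard input and checks the mds_beacon_grace value; both function names are hypothetical helpers, and only the ceph --admin-daemon command in the usage comment comes from the steps above.

```shell
# Extract the mds_beacon_grace value from "config show" output on stdin.
beacon_grace_value() {
    sed -n 's/^[[:space:]]*"mds_beacon_grace":[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed only if the value read from stdin equals the expected one ($1).
grace_is() {
    [ "$(beacon_grace_value)" = "$1" ]
}

# Example usage on a node (socket path as used earlier in this article):
#   ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-6-11.asok config show \
#       | grace_is 300 && echo "mds_beacon_grace updated"
```

Because a setting changed through the admin socket is applied to the running daemon only, rerunning this check after a daemon restart shows whether the value is still in place.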


This article is from the blog of “康建華”. Reposting is not permitted.