Using the CephFS client evict Subcommand
When using Ceph's CephFS, every client establishes a connection to an MDS in order to obtain CephFS metadata. With multiple active MDSes, a single client may hold connections to several MDSes at once.
Ceph provides the client/session subcommands to query and manage these connections. Among them is a command for manually disconnecting clients when a CephFS client misbehaves. For example, running:

# ceph tell mds.2 client evict

disconnects all clients connected to mds rank 2.
So what are the effects of running client evict, and can an evicted client be recovered? This article focuses on exactly these questions.
Command format
Reference: http://docs.ceph.com/docs/master/cephfs/eviction/
Test environment: Ceph Mimic 13.2.1
1. List all clients/sessions
The client/session ls command lists all clients that have established a connection with mds rank [id]:
# ceph tell mds.0 client ls
2018-09-05 10:00:15.986 7f97f0ff97000 client.25196 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 10:00:16.002 7f97f1ffb7000 client.25199 ms_handle_reset on 192.168.0.26:6800/1856812761
[
    {
        "id": 25085,
        "num_leases": 0,
        "num_caps": 5,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.25085 192.168.0.26:0/265326503",
        "client_metadata": {
            "ceph_sha1": "5533ecdc0fda920179d7ad84e0aa65a127b20d77",
            "ceph_version": "ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)",
            "entity_id": "admin",
            "hostname": "mimic3",
            "mount_point": "/mnt/cephfuse",
            "pid": "44876",
            "root": "/"
        }
    }
]
The important fields are:
- id: the client's unique id
- num_caps: the number of caps the client holds
- inst: the client's IP address and port
- ceph_version: the client's ceph-fuse version (for a kernel client, kernel_version instead)
- hostname: the client's hostname
- mount_point: the mount point on the client host
- pid: the pid of the client's ceph-fuse process
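With many mounted clients, picking the right session id out of the client ls output by hand is tedious. Below is a minimal Python sketch that filters the JSON shown above by hostname or mount point; the field names come from the sample output, while the function name and the trimmed-down test data are purely illustrative:

```python
import json

def find_session_ids(sessions, hostname=None, mount_point=None):
    """Return the ids of sessions matching the given hostname/mount point."""
    ids = []
    for s in sessions:
        meta = s.get("client_metadata", {})
        if hostname is not None and meta.get("hostname") != hostname:
            continue
        if mount_point is not None and meta.get("mount_point") != mount_point:
            continue
        ids.append(s["id"])
    return ids

# Trimmed-down sample shaped like the `client ls` output above
sessions = json.loads("""[
  {"id": 25085,
   "inst": "client.25085 192.168.0.26:0/265326503",
   "client_metadata": {"hostname": "mimic3", "mount_point": "/mnt/cephfuse"}}
]""")
print(find_session_ids(sessions, hostname="mimic3"))  # [25085]
```

In practice you would feed it the output of `ceph tell mds.0 client ls` captured to a file.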
2. Evict a specific client
A specific client connection can be evicted by specifying its id.
With multiple active MDSes, an evict issued against a single MDS rank is propagated to the other active MDSes as well.
# ceph tell mds.0 client evict id=25085
After the evict, the client's mount point on the corresponding host is no longer accessible:
root@mimic3:/mnt/cephfuse# ls
ls: cannot open directory '.': Cannot send after transport endpoint shutdown
root@mimic3:~# vim /var/log/ceph/ceph-client.admin.log
...
2018-09-05 10:02:54.829 7fbe732d7700 -1 client.25085 I was blacklisted at osd epoch 519
3. Inspect the OSD blacklist
After a client is evicted, it is added to the OSD blacklist (see the code analysis below):
root@mimic1:~# ceph osd blacklist ls
listed 1 entries
192.168.0.26:0/265326503 2018-09-05 11:02:54.696345
The blacklist entry prevents the evicted client's in-flight writes from being applied, which would otherwise harm data consistency; it is valid for 1 hour.
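The 1-hour lifetime corresponds to the default OSD blacklist expiry (the mon_osd_blacklist_default_expire option, 3600 s by default). A small sketch of the arithmetic, consistent with the listings above where a client evicted at 10:02:54 stays blacklisted until 11:02:54:

```python
from datetime import datetime, timedelta

# Default OSD blacklist lifetime (mon_osd_blacklist_default_expire): 1 hour
BLACKLIST_EXPIRE = timedelta(seconds=3600)

# Timestamp from the experiment above: the client was evicted at 10:02:54,
# and `ceph osd blacklist ls` shows the entry expiring at 11:02:54
evicted_at = datetime(2018, 9, 5, 10, 2, 54)
expires_at = evicted_at + BLACKLIST_EXPIRE
print(expires_at.strftime("%Y-%m-%d %H:%M:%S"))  # 2018-09-05 11:02:54
```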
4. Try to recover an evicted client
Remove the entry for the just-evicted client from the OSD blacklist:
root@mimic1:~# ceph osd blacklist rm 192.168.0.26:0/265326503
un-blacklisting 192.168.0.26:0/265326503
Check the client on the corresponding host: it works again!
root@mimic3:~# cd /mnt/cephfuse
root@mimic3:/mnt/cephfuse# ls
perftest
When testing on Ceph Luminous 12.2.7, however, the client could not recover immediately after the evict; it only recovered after some time!
("mds_session_autoclose": "300.000000")
root@luminous2:~# ceph osd blacklist rm 192.168.213.25:0/1534097905
un-blacklisting 192.168.213.25:0/1534097905
root@luminous2:~# ceph osd blacklist ls
listed 0 entries
root@luminous2:/mnt/cephfuse# ls
ls: cannot open directory '.': Cannot send after transport endpoint shutdown
After waiting for a while (300 s), the session becomes normal again!
root@luminous2:/mnt/cephfuse# ls
perftest
Testing evict with the CephFS kernel client: the client cannot recover!
root@mimic3:~# cd /mnt/cephfs
-bash: cd: /mnt/cephfs: Permission denied
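The recovery attempts above all begin by locating the blacklist entry that belongs to the evicted client. A hedged Python sketch that builds the matching ceph osd blacklist rm commands from ceph osd blacklist ls output (the one-entry-per-line addr/timestamp format is taken from the listings above; the helper name is made up, and the generated commands would still have to be run against the cluster):

```python
def blacklist_rm_commands(blacklist_ls_output, client_ip):
    """Build `ceph osd blacklist rm` commands for entries from one client IP."""
    cmds = []
    for line in blacklist_ls_output.splitlines():
        parts = line.split()
        # Entry lines look like: "192.168.0.26:0/265326503 2018-09-05 11:02:54.696345"
        if parts and parts[0].startswith(client_ip + ":"):
            cmds.append("ceph osd blacklist rm " + parts[0])
    return cmds

output = """listed 1 entries
192.168.0.26:0/265326503 2018-09-05 11:02:54.696345"""
print(blacklist_rm_commands(output, "192.168.0.26"))
# ['ceph osd blacklist rm 192.168.0.26:0/265326503']
```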
5. Evict all clients
If no client id is given to the evict command, all clients connected to that MDS rank are evicted.
With multiple active MDSes, an evict issued against a single MDS rank is propagated to the other active MDSes as well.
# ceph tell mds.0 client evict
Use this command with great caution and never run it by accident; its impact is substantial!
6. The session kill command
The session subcommand also offers kill, which is even more drastic than evict:
root@mimic1:~# ceph tell mds.0 session kill 104704
2018-09-05 15:57:45.897 7ff2157fa7000 client.25742 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 15:57:45.917 7ff2167fc7000 client.25745 ms_handle_reset on 192.168.0.26:6800/1856812761
root@mimic1:~# ceph tell mds.0 session ls
2018-09-05 15:57:50.709 7f44eeffd7000 client.95370 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 15:57:50.725 7f44effff7000 client.95376 ms_handle_reset on 192.168.0.26:6800/1856812761
[]
root@mimic1:~# ceph osd blacklist ls
listed 1 entries
192.168.0.26:0/1613295381 2018-09-05 16:57:45.920138
Remove the OSD blacklist entry:
root@mimic1:~# ceph osd blacklist rm 192.168.0.26:0/1613295381
un-blacklisting 192.168.0.26:0/1613295381
root@mimic1:~# ceph osd blacklist ls
listed 0 entries
Even so, the client connection never recovers!
root@mimic3:~# cd /mnt/cephfuse
root@mimic3:/mnt/cephfuse# ls
ls: cannot open directory '.': Cannot send after transport endpoint shutdown
After session kill, the session cannot be recovered at all! Use this command with caution too!
Code analysis
Based on the Ceph Mimic 13.2.1 code.
The implementation of client evict is shown below; note that it adds an OSD blacklist entry:
bool MDSRank::evict_client(int64_t session_id,
    bool wait, bool blacklist, std::stringstream& err_ss,
    Context *on_killed)
{
  ...
  // Look up the session with the given id
  Session *session = sessionmap.get_session(
      entity_name_t(CEPH_ENTITY_TYPE_CLIENT, session_id));

  // Function that kills the MDS session
  auto kill_mds_session = [this, session_id, on_killed]() {
    assert(mds_lock.is_locked_by_me());
    Session *session = sessionmap.get_session(
        entity_name_t(CEPH_ENTITY_TYPE_CLIENT, session_id));
    if (session) {
      if (on_killed) {
        server->kill_session(session, on_killed);
      } else {
        C_SaferCond on_safe;
        server->kill_session(session, &on_safe);
        mds_lock.Unlock();
        on_safe.wait();
        mds_lock.Lock();
      }
    }
    ...
  };

  // Function that adds the OSD blacklist entry
  auto background_blacklist = [this, session_id, cmd](std::function<void ()> fn) {
    ...
    Context *on_blacklist_done = new FunctionContext([this, session_id, fn](int r) {
      objecter->wait_for_latest_osdmap(
          new C_OnFinisher(
            new FunctionContext(...),
            finisher)
          );
    });
    ...
    monc->start_mon_command(cmd, {}, nullptr, nullptr, on_blacklist_done);
  };

  auto blocking_blacklist = [this, cmd, &err_ss, background_blacklist]() {
    C_SaferCond inline_ctx;
    background_blacklist([&inline_ctx]() { inline_ctx.complete(0); });
    mds_lock.Unlock();
    inline_ctx.wait();
    mds_lock.Lock();
  };

  // Depending on the arguments, kill the MDS session and add the OSD blacklist entry
  if (wait) {
    if (blacklist) {
      blocking_blacklist();
    }
    // We dropped mds_lock, so check that session still exists
    session = sessionmap.get_session(entity_name_t(CEPH_ENTITY_TYPE_CLIENT, session_id));
    ...
    kill_mds_session();
  } else {
    if (blacklist) {
      background_blacklist(kill_mds_session);
    } else {
      kill_mds_session();
    }
  }
  ...
}
This function is called from the following places:
Cscope tag: evict_client
#   line   filename / context / line
1   1965   mds/MDSRank.cc <<handle_asok_command>>
           bool evicted = evict_client(strtol(client_id.c_str(), 0, 10), true,
2   2120   mds/MDSRank.cc <<evict_clients>>
           evict_client(s->info.inst.name.num(), false,
3   782    mds/Server.cc <<find_idle_sessions>>
           mds->evict_client(session->info.inst.name.num(), false, true,
4   1058   mds/Server.cc <<reconnect_tick>>
           mds->evict_client(session->info.inst.name.num(), false, true, ss,
1. handle_asok_command: handles the client evict admin command
2. evict_clients: evicts clients in bulk
3. find_idle_sessions: evicts clients whose sessions have gone stale
4. reconnect_tick: after MDS recovery, waits for clients to reconnect and evicts those that have not done so within the 45 s timeout
Related parameters
The configuration parameters related to MDS sessions are:
# ceph daemon mgr.luminous2 config show | grep mds_session_
    "mds_session_autoclose": "300.000000",
    "mds_session_blacklist_on_evict": "true",
    "mds_session_blacklist_on_timeout": "true",
    "mds_session_timeout": "60.000000",
Plus some client-related ones:
    "client_reconnect_stale": "false",
    "client_tick_interval": "1.000000",
    "mon_client_ping_interval": "10.000000",
    "mon_client_ping_timeout": "30.000000",
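Taken together, mds_session_timeout and mds_session_autoclose define a session's lifecycle on the MDS side: a client that stops renewing its caps is marked stale after mds_session_timeout, and a stale session becomes a candidate for automatic eviction after mds_session_autoclose. The following is a rough Python model of that timeline under the default values above; it is an illustration of the relationship between the two parameters, not actual Ceph code:

```python
# Defaults from the `config show` output above (seconds)
MDS_SESSION_TIMEOUT = 60.0     # session marked stale after this long without renewal
MDS_SESSION_AUTOCLOSE = 300.0  # stale session becomes a candidate for auto-eviction

def session_state(seconds_since_last_renewal):
    """Rough model of a session's state on the MDS as time passes without renewal."""
    if seconds_since_last_renewal < MDS_SESSION_TIMEOUT:
        return "open"
    if seconds_since_last_renewal < MDS_SESSION_AUTOCLOSE:
        return "stale"
    return "evicted"

print(session_state(30))   # open
print(session_state(120))  # stale
print(session_state(400))  # evicted
```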
Handling after a client evict
As the experiments above show, an evicted client is added to the OSD blacklist with a timeout of 1 hour; during that window the client cannot access CephFS.
But once the blacklist entry is removed with ceph osd blacklist rm <entry>,
the client can immediately access CephFS again, exactly as before!
Method 1: remove the blacklist entry
root@mimic1:~# ceph tell mds.0 client evict id=25085
2018-09-05 11:07:43.580 7f80d37fe7000 client.25364 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:07:44.292 7f80e8ff97000 client.25370 ms_handle_reset on 192.168.0.26:6800/1856812761
root@mimic1:~# ceph tell mds.0 client ls
2018-09-05 11:05:23.527 7f5005ffb7000 client.25301 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:05:23.539 7f5006ffd7000 client.94941 ms_handle_reset on 192.168.0.26:6800/1856812761
[]
root@mimic1:~# ceph osd blacklist rm 192.168.0.26:0/265326503
un-blacklisting 192.168.0.26:0/265326503
root@mimic1:~# ceph tell mds.0 client ls
2018-09-05 11:07:57.884 7fe07b7f67000 client.95022 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:07:57.900 7fe07c7f87000 client.25400 ms_handle_reset on 192.168.0.26:6800/1856812761
[]
Then, after accessing the mount point directory again on the client host, the session returns to normal:
root@mimic1:~# ceph tell mds.0 client ls
2018-09-05 11:06:31.484 7f6c6bfff7000 client.94971 ms_handle_reset on 192.168.0.26:6800/1856812761
2018-09-05 11:06:31.496 7f6c717fa7000 client.94977 ms_handle_reset on 192.168.0.26:6800/1856812761
[
    {
        "id": 25085,
        "num_leases": 0,
        "num_caps": 4,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.25085 192.168.0.26:0/265326503",
        "client_metadata": {
            "ceph_sha1": "5533ecdc0fda920179d7ad84e0aa65a127b20d77",
            "ceph_version": "ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)",
            "entity_id": "admin",
            "hostname": "mimic3",
            "mount_point": "/mnt/cephfuse",
            "pid": "44876",
            "root": "/"
        }
    }
]
Method 2: wait 1 hour
By default the blacklist entry added on evict expires after 1 hour; once that hour has passed, the session can return to normal:
root@mimic1:~# ceph osd blacklist ls
listed 0 entries
Then, after accessing the mount point directory again on the client host, the session returns to normal:
root@mimic3:~# cd /mnt/cephfuse/
root@mimic3:/mnt/cephfuse# ls
perftest
Check the MDS sessions:
root@mimic1:~# ceph tell mds.0 session ls
2018-09-05 13:56:26.630 7fae7f7fe7000 client.95118 ms_handle_reset on 192.168.0.26:6801/1541744746
2018-09-05 13:56:26.642 7fae94ff97000 client.25496 ms_handle_reset on 192.168.0.26:6801/1541744746
[
    {
        "id": 25085,
        "num_leases": 0,
        "num_caps": 1,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.25085 192.168.0.26:0/265326503",
        "client_metadata": {
            "ceph_sha1": "5533ecdc0fda920179d7ad84e0aa65a127b20d77",
            "ceph_version": "ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)",
            "entity_id": "admin",
            "hostname": "mimic3",
            "mount_point": "/mnt/cephfuse",
            "pid": "44876",
            "root": "/"
        }
    }
]