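HBase archive directory on HDFS takes up too much space
HBase version: 1.1.2, Hadoop version: 2.7.3
The HBase directory /apps/hbase/data/archive on HDFS had grown so large that HDFS space-usage alerts kept firing.
【Problem】
Alert message: alert: datanode_storage is triggered. It indicates that HDFS storage usage on one or more DataNodes has exceeded the threshold (we set it to 80%) and that space needs to be freed.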
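To see which DataNodes are actually over the limit, the output of `hdfs dfsadmin -report` can be filtered. The following is only a rough sketch: `over_threshold` is a hypothetical helper name, and it assumes the report prints a `Name:` line and a `DFS Used%:` line for each DataNode.

```shell
#!/usr/bin/env bash
# Print DataNodes whose "DFS Used%" exceeds a limit (default 80).
# Reads `hdfs dfsadmin -report` output on stdin.
# Assumption: each DataNode block has a "Name:" line followed by "DFS Used%:".
over_threshold() {
  local limit="${1:-80}"
  awk -v limit="$limit" '
    /^Name:/ { node = $2 }
    # The cluster-wide summary comes before any "Name:" line, so skip it.
    /^DFS Used%:/ && node != "" {
      pct = $3
      sub(/%/, "", pct)
      if (pct + 0 > limit + 0) print node, pct "%"
    }'
}

# Usage on a cluster:
#   hdfs dfsadmin -report | over_threshold 80
```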
[hdfs@master-2 root]$ hdfs dfs -du -h /apps/hbase/data/archive/data/
19.1 M /apps/hbase/data/archive/data/default
12.6 T /apps/hbase/data/archive/data/good_namespace  # this directory takes up too much space
[hdfs@master-2 root]$ hdfs dfs -du -h /apps/hbase/data/archive/data/default
19.1 M /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
[hdfs@master-2 root]$ hdfs dfs -du -h /apps/hbase/data/archive/data/good_namespace
4.8 M /apps/hbase/data/archive/data/good_namespace/url_history_30
1.3 M /apps/hbase/data/archive/data/good_namespace/url_history_7
5.8 M /apps/hbase/data/archive/data/good_namespace/user_statistic
12.6 T /apps/hbase/data/archive/data/good_namespace/users  # this table is the one taking up the space
90.8 M /apps/hbase/data/archive/data/good_namespace/weekly_stat
23.5 G /apps/hbase/data/archive/data/good_namespace/android_active_user_info
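The `-h` listings above are easy to read but awkward to sort. As a minimal sketch, the raw byte output of `hdfs dfs -du` (without `-h`) can be ranked so the largest entry comes first. `top_usage` is a hypothetical helper name; the sketch assumes the first column is the size in bytes and the last column is the path.

```shell
#!/usr/bin/env bash
# Rank `hdfs dfs -du` output (raw bytes, no -h) by size, largest first.
# Reads stdin, so it works on saved output as well as on a live pipe.
# Assumption: column 1 is the size in bytes, the last column is the path.
top_usage() {
  local n="${1:-5}"
  sort -k1,1nr | head -n "$n" |
    awk '{ printf "%.1f GiB\t%s\n", $1 / (1024 * 1024 * 1024), $NF }'
}

# Usage on a cluster (path taken from the listing above):
#   hdfs dfs -du /apps/hbase/data/archive/data/good_namespace | top_usage 3
```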
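【Analysis】
I checked every place on the HDFS cluster where data commonly piles up, such as oldWALs and .Trash, and cleaned out the related files, but it made little difference.
Then it occurred to me that the space might be held by archived backup data, so I looked at the snapshots taken of the HBase tables, and sure enough there were snapshots of one large table. HBase moves store files that a table no longer uses (for example after a compaction) into the archive directory, and they cannot be cleaned up while any snapshot still references them.
【Solution】
Clean up the snapshots of the table that takes up the most space: keep only the latest one and delete the older ones.
# Each subdirectory under /apps/hbase/data/archive/data turns out to correspond to an HBase table that has snapshots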
hbase(main):002:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
KYLIN_YEDCQ82BF3_snapshot_20180315 KYLIN_YEDCQ82BF3 (Thu Mar 15 18:19:56 +0800 2018)
url_history_30-snapshot_20180313 good_namespace:url_history_30 (Tue Mar 13 16:41:54 +0800 2018)
url_history_7-snapshot_20180313 good_namespace:url_history_7 (Tue Mar 13 17:07:04 +0800 2018)
user_statistic_Snapshot_20180209 good_namespace:user_statistic (Fri Feb 09 18:01:21 +0800 2018)
user_statistic_snapshot_20180313 good_namespace:user_statistic (Tue Mar 13 16:36:06 +0800 2018)
users_snapshot_20180209 good_namespace:users (Fri Feb 09 17:26:51 +0800 2018)
users_snapshot_20180313 good_namespace:users (Tue Mar 13 15:39:20 +0800 2018)
users_snapshot_20180408 good_namespace:users (Sun Apr 08 16:16:32 +0800 2018)
weekly_stat_snapshot_20180313 good_namespace:weekly_stat (Tue Mar 13 16:17:33 +0800 2018)
android_active_user_info_Snapshot_20180212 good_namespace:android_active_user_info (Mon Feb 12 11:35:33 +0800 2018)
10 row(s) in 0.2800 seconds
=> ["KYLIN_YEDCQ82BF3_snapshot_20180315", "url_history_30-snapshot_20180313", "url_history_7-snapshot_20180313", "user_statistic_Snapshot_20180209", "user_statistic_snapshot_20180313", "users_snapshot_20180209", "users_snapshot_20180313", "users_snapshot_20180408", "weekly_stat_snapshot_20180313", "android_active_user_info_Snapshot_20180212"]
hbase(main):003:0>
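To spot tables that carry several snapshots, the listing above can be summarized per table. This is a sketch only: `snapshots_per_table` is a hypothetical helper, and it assumes each snapshot row has the form `NAME TABLE (creation time)` as in the output shown here.

```shell
#!/usr/bin/env bash
# Count snapshots per table from `list_snapshots` output read on stdin.
# Assumption: snapshot rows look like "NAME TABLE (creation time)";
# header, summary and prompt lines do not have "(" starting field 3.
snapshots_per_table() {
  awk '$3 ~ /^\(/ { count[$2]++ }
       END { for (t in count) print count[t], t }' |
    sort -k1,1nr -k2,2
}

# Usage on a cluster:
#   echo "list_snapshots" | hbase shell | snapshots_per_table
```

Tables at the top of the result are the first candidates to check against the large archive directories.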
# Delete the old snapshots first, then take new ones, keeping only the latest snapshots
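The exact commands used for this step are not shown, so the following is only a sketch of one way to do it. The hypothetical helper `gen_delete_cmds` turns `list_snapshots` output into `delete_snapshot` commands for a single table; it deletes nothing itself, so the generated commands can be reviewed before being fed to the HBase shell.

```shell
#!/usr/bin/env bash
# Emit HBase shell commands that drop every existing snapshot of one table.
# Reads `list_snapshots` output on stdin; prints commands only.
# Assumption: snapshot rows look like "NAME TABLE (creation time)".
gen_delete_cmds() {
  local table="$1"
  # \047 is a single quote, needed around the snapshot name.
  awk -v t="$table" '$2 == t && $3 ~ /^\(/ {
    printf "delete_snapshot \047%s\047\n", $1
  }'
}

# Usage on a cluster (review drop.txt before running it):
#   echo "list_snapshots" | hbase shell | gen_delete_cmds good_namespace:users > drop.txt
#   hbase shell drop.txt
# Then, inside the HBase shell, take the fresh snapshot:
#   snapshot 'good_namespace:users', 'users_snapshot_20180412'
```

The space in the archive directory is reclaimed by the master's periodic cleaner, so the drop in usage may not be visible immediately after the snapshots are deleted.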
hbase(main):002:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
KYLIN_YEDCQ82BF3_snapshot_20180315 KYLIN_YEDCQ82BF3 (Thu Mar 15 18:19:56 +0800 2018)
url_history_30_snapshot_20180412 good_namespace:url_history_30 (Thu Apr 12 15:05:08 +0800 2018)
url_history_7-snapshot_20180412 good_namespace:url_history_7 (Thu Apr 12 15:06:10 +0800 2018)
user_statistic_snapshot_20180412 good_namespace:user_statistic (Thu Apr 12 14:56:33 +0800 2018)
users_snapshot_20180412 good_namespace:users (Thu Apr 12 14:59:40 +0800 2018)
weekly_stat_snapshot_20180412 good_namespace:weekly_stat (Thu Apr 12 15:03:20 +0800 2018)
android_active_user_info_20180412 good_namespace:android_active_user_info (Thu Apr 12 15:02:15 +0800 2018)
android_active_user_info_Snapshot_20180212 good_namespace:android_active_user_info (Mon Feb 12 11:35:33 +0800 2018)
8 row(s) in 0.1540 seconds
=> ["KYLIN_YEDCQ82BF3_snapshot_20180315", "url_history_30_snapshot_20180412", "url_history_7-snapshot_20180412", "user_statistic_snapshot_20180412", "users_snapshot_20180412", "weekly_stat_snapshot_20180412", "android_active_user_info_20180412", "android_active_user_info_Snapshot_20180212"]
# Check the space usage of the directory on HDFS again: the space held by the snapshot-related files has been released
[hdfs@master-2 root]$ hdfs dfs -du -h /apps/hbase/data/archive/data/
19.1 M /apps/hbase/data/archive/data/default
119.5 G /apps/hbase/data/archive/data/good_namespace  # usage of this directory dropped from 12.6 TB to 119.5 GB