
Fixing ceph pg inconsistent

ceph pg inconsistent

Symptoms

1. The following error was reported:

HEALTH_ERR 37 scrub errors; Possible data damage: 1 pg inconsistent

2. Check the details:

# ceph health detail
HEALTH_ERR 37 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 37 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
pg 1.dbc is active+clean+inconsistent, acting [55,71,25]
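
To see exactly which objects in pg 1.dbc failed scrub, the rados tool can list the inconsistencies directly. This is a minimal sketch using standard rados subcommands; the pool name rbd below is only an assumption for illustration, substitute the name of the pool whose id is 1 in your cluster.

# list every PG in a pool that currently has inconsistencies
# (pool name "rbd" is an assumption; use the pool with id 1)
rados list-inconsistent-pg rbd
# list the inconsistent objects inside the damaged PG
rados list-inconsistent-obj 1.dbc --format=json-pretty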

3. First attempt at a fix

The usual approach is to run ceph pg repair [pgid], but after watching the cluster for a while this did not clear the errors here.
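
For reference, that standard first attempt is simply asking the PG's primary OSD to repair it and then checking whether the scrub errors clear; a minimal sketch with the pgid from ceph health detail above:

# ask the PG's primary OSD to repair it
ceph pg repair 1.dbc
# follow the cluster log while the repair / deep scrub runs
ceph -w
# check again once it finishes
ceph health detail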

Reference solution

https://ceph.com/geen-categorie/ceph-manually-repair-object/

Just move the object away with the following:

  • stop the OSD that has the wrong object responsible for that PG
  • flush the journal (ceph-osd -i <id> --flush-journal)
  • move the bad object to another location
  • start the OSD again
  • call ceph pg repair 17.1c1
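
The key part of that procedure is the third step, moving the damaged copy out of the OSD's data directory while the OSD is stopped. A rough sketch of that step under two assumptions: the OSD uses FileStore with the default data path, and the object name comes from rados list-inconsistent-obj (the name below is purely hypothetical):

# on the host of the OSD holding the bad copy, after stopping it
# the object name 'rbd_data.abc123*' is hypothetical; take the real one from
# rados list-inconsistent-obj 1.dbc --format=json-pretty
find /var/lib/ceph/osd/ceph-55/current/1.dbc_head/ -name 'rbd_data.abc123*'
# move the matching file somewhere safe (e.g. /root/), then start the OSD
# again and run ceph pg repair on the PG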

My procedure

Find the inconsistent PG, then repair it on the host where its primary OSD lives.


root@CLTQ-064-070:~# ceph osd find 55
{
    "osd": 55,
    "ip": "172.29.64.76:6817/789571",
    "crush_location": {
        "host": "CLTQ-064-076",
        "root": "default"
    }
}

This shows the primary OSD is on host CLTQ-064-076.
Then log in to that host and carry out the repair steps below.
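
Before logging in, the acting set and the primary OSD can also be cross-checked straight from the PG; ceph pg map is a standard command and 1.dbc is the damaged PG from above:

# show the up/acting sets and the primary OSD for this PG
ceph pg map 1.dbc
# the acting set should match [55,71,25] from ceph health detail, with 55 as primary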

1. Stop the OSD

systemctl stop ceph-osd@55

2. Flush the journal

ceph-osd -i 55 --flush-journal
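
Note that --flush-journal is only meaningful for FileStore OSDs, which is what the use of this flag here suggests the cluster ran. A small sanity check before flushing, assuming the default data path and the default cluster name ceph:

# make sure the OSD daemon is really stopped before flushing
systemctl status ceph-osd@55
# on FileStore the journal is a file or symlink inside the OSD data dir
ls -l /var/lib/ceph/osd/ceph-55/journal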

3. Start the OSD

systemctl start ceph-osd@55

4. Repair (usually not needed at this point)

ceph pg repair 1.dbc
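
If the repair is run, its progress and result can be watched from any node with admin access; these are standard commands using the PG from this incident:

# watch cluster events for the deep scrub / repair of this PG
ceph -w | grep 1.dbc
# the scrub errors should disappear once the repair completes
ceph health detail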

5. Check which OSDs the PG is on

# ceph pg ls | grep 1.dbc

1.dbc      3695                  0        0         0       0 12956202159 1578     1578                active+clean 2018-04-03 19:34:45.924642  2489'4678 2494:19003 [55,71,25]         55 [55,71,25]             55  2489'4678 2018-04-03 18:32:56.365327       2489'4678 2018-04-03 18:32:56.365327

This confirms the cluster has recovered and is healthy again. The PG is still primary on osd.55.
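
As a final check (standard commands, nothing specific to this incident), overall health should be back to HEALTH_OK and the PG should report active+clean without the inconsistent flag:

# overall cluster status
ceph -s
# per-PG detail if needed
ceph pg 1.dbc query | grep '"state"'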
