
A Summary of MongoDB Replica Set Initialization Failures I Have Run Into


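Preface:

While building MongoDB clusters I ran into several small problems that caused replica set initialization to fail. They were all pitfalls from my early days with MongoDB, so here is a short summary.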

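1. A wrong IP address causes MongoDB replica set initialization to fail

This error is already described in another post, so I will not repeat it here.
For details, see the blog post: MongoDB replica set initialization failure caused by a wrong IP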

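2. The mongodb-keyfile differs between the PRIMARY and a SECONDARY host, so adding the member on the PRIMARY fails

Problem description:

I was building another MongoDB replica set with the following hosts and roles: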

Host IP          Role         OS
131.10.11.106    PRIMARY      centos7
131.10.11.111    SECONDARY    centos7
131.10.11.114    SECONDARY    centos7

MongoDB server version: 3.4.10.1

Adding the SECONDARY host 131.10.11.111 on the PRIMARY produced the following error:

mongotest:PRIMARY> rs.add("131.10.11.111:27017")
{
    "ok" : 0,
    "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 131.10.11.106:27017; the following nodes did not respond affirmatively: 131.10.11.111:27017 failed with Authentication failed.",
    "code" : 74,
    "codeName" : "NodeNotFound"
}

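Cause analysis:

Investigation showed that the mongodb-keyfile on 131.10.11.111 differed from the one on the primary, and that mongo.conf on 131.10.11.111 had no security (authentication) settings at all. Replica set members authenticate to each other with the keyfile, so the primary's request was rejected with "Authentication failed" and the member could not be added.

Solution:

1. Copy the mongodb-keyfile from the PRIMARY to the secondary 131.10.11.111 and set its permissions to 400.
2. Edit the configuration file /etc/mongodb/mongo.conf as follows: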

[root@mongodb111 mongodb]# cat mongo.conf
systemLog:
   destination: file
   path: "/opt/mongodbdata/mongod.log"
   logAppend: true
storage:
   journal:
      enabled: true
   dbPath: /opt/mongodbdata
setParameter:
   enableLocalhostAuthBypass: true
processManagement:
   fork: true
   pidFilePath: "/opt/mongodbdata/mongod.pid"
replication:                          
   replSetName: mongotest  
#Add the following lines:
security:
   authorization: enabled
   keyFile: "/etc/mongodb/mongodb-keyfile"
[root@mongodb111 mongodb]#

Restart mongodb on 131.10.11.111, then run rs.add("131.10.11.111:27017") on the PRIMARY again; this time it succeeds.
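
A keyfile mismatch like this one can be caught before calling rs.add() by comparing checksums of the keyfile on each member. The following is only a minimal sketch, assuming bash and sha256sum (coreutils) are available on the hosts; the function names keyfile_digest and keyfiles_match are made up for illustration and are not part of MongoDB.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of MongoDB) for comparing keyfiles.

# Print only the SHA-256 digest of a file.
keyfile_digest() {
    sha256sum "$1" | awk '{print $1}'
}

# Succeed (exit 0) only when both files are readable and have identical content.
keyfiles_match() {
    [ -r "$1" ] && [ -r "$2" ] &&
        [ "$(keyfile_digest "$1")" = "$(keyfile_digest "$2")" ]
}
```

Running keyfile_digest /etc/mongodb/mongodb-keyfile on each member and comparing the printed digests by eye is enough; keyfiles_match is for the case where both copies are on the same machine. Once the digests agree, chmod 400 /etc/mongodb/mongodb-keyfile applies the permission used in the fix above.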

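3. The secondary's configuration file has no replSet setting, so adding the member fails

Problem description:

This problem came up in the same environment as problem 2. Adding host 114 from host 106 produced the following error: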

mongotest:PRIMARY> rs.add("131.10.11.114:27017")
{
    "ok" : 0,
    "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 131.10.11.106:27017; the following nodes did not respond affirmatively: 131.10.11.114:27017 failed with not running with --replSet",
    "code" : 74,
    "codeName" : "NodeNotFound"
}

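Cause analysis:

Following the hint "the following nodes did not respond affirmatively: 131.10.11.114:27017 failed with not running with --replSet", I checked mongo.conf on host 114 and found that the secondary's configuration file had no replica set configured, so the host could not be added.

Solution:

Edit /etc/mongodb/mongo.conf on the secondary as follows, adding the replica set configuration: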

[root@mongodb114 mongodb]# cat mongo.conf
systemLog:
   destination: file
   path: "/opt/mongodbdata/mongod.log"
   logAppend: true
storage:
   journal:
      enabled: true
   dbPath: /opt/mongodbdata
setParameter:
   enableLocalhostAuthBypass: true
processManagement:
   fork: true
   pidFilePath: "/opt/mongodbdata/mongod.pid"
security:
   authorization: enabled
   keyFile: "/etc/mongodb/mongodb-keyfile"
replication:                           #add the replica set configuration
   replSetName: mongotest        #the name must match the one on the primary
[root@mongodb114 mongodb]#

Restart mongodb on 131.10.11.114, then run rs.add("131.10.11.114:27017") on the PRIMARY again; this time it succeeds.
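
To check up front that every member has a replSetName and that the names agree, the value can be pulled out of each config file. This is a rough sketch, assuming a YAML config laid out like the ones in this post, with an unquoted value on the same line as the key; repl_set_name is a hypothetical helper name, not a MongoDB tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MongoDB): print the replSetName value
# found in a YAML mongod config file, or nothing if the key is absent.
# Commented-out lines (starting with '#') are ignored, and a trailing
# '#comment' after the value is stripped.
repl_set_name() {
    sed -n 's/^[[:space:]]*replSetName:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" | head -n 1
}
```

Running repl_set_name /etc/mongodb/mongo.conf on each host should print the same name on all of them (mongotest in this environment). Empty output means the setting is missing, which is the situation host 114 was in.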

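4. bindIp defaults to 127.0.0.1, so MongoDB replica set initialization fails

Problem description:

On another occasion I was building a MongoDB replica set with the following hosts and roles: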

Host IP       Role         OS
10.0.0.101    PRIMARY      centos7
10.0.0.102    SECONDARY    centos7
10.0.0.103    SECONDARY    centos7

MongoDB server version: 4.0.2

Adding the SECONDARY host 10.0.0.102 on the PRIMARY host 10.0.0.101 failed with this error:

CrystalTest:PRIMARY> rs.add("10.0.0.102:27017") 
{
    "operationTime" : Timestamp(1539054715, 1),
    "ok" : 0,
    "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 10.0.0.101:27017; the following nodes did not respond affirmatively: 10.0.0.102:27017 failed with Error connecting to 10.0.0.102:27017 :: caused by :: Connection refused",
    "code" : 74,
    "codeName" : "NodeNotFound",
    "$clusterTime" : {
        "clusterTime" : Timestamp(1539054715, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

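Cause analysis:

The message "failed with Error connecting to 10.0.0.102:27017 :: caused by :: Connection refused" was puzzling at first: port 27017 on host 10.0.0.102 was up, the service itself worked normally, and the firewall was turned off. Trying telnet from the PRIMARY host 10.0.0.101 showed that the port could not be reached: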

[root@test101 ~]# telnet 10.0.0.102 27017
Trying 10.0.0.102...
telnet: connect to address 10.0.0.102: Connection refused

Checking the port on host 102 then showed that mongod was bound to 127.0.0.1, which had to be the problem. Because bindIp was 127.0.0.1, host 10.0.0.101 could not connect:

[root@test102 ~]# netstat -tlunp|grep mongo
tcp        0      0 127.0.0.1:27017         0.0.0.0:*               LISTEN      1065/mongod    #shows 127.0.0.1:27017
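
The same check can be scripted so that a loopback-only listener stands out immediately. The sketch below assumes netstat output in the column layout shown above (local address in the fourth column, state in the sixth); listen_scope is a hypothetical helper name used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `netstat -tlnp` style lines on stdin and report
# how the given TCP port is bound. Prints one of:
#   loopback-only  - listening on 127.x.x.x or ::1 only
#   non-loopback   - listening on at least one other address
#   not-listening  - no LISTEN line for that port
listen_scope() {
    local port="$1"
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            found = 1
            # Any local address outside 127.x.x.x and ::1 accepts remote peers.
            if ($4 !~ /^127\./ && $4 !~ /^::1:/) remote = 1
        }
        END {
            if (!found)      print "not-listening"
            else if (remote) print "non-loopback"
            else             print "loopback-only"
        }'
}
```

On host 102, netstat -tlnp | listen_scope 27017 would print loopback-only for the output shown above, which points straight at the bindIp setting.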

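Solution:

Add "bindIp: 0.0.0.0" to mongo.conf on host 102, then restart MongoDB on host 102. Note that 0.0.0.0 makes mongod listen on every network interface; bindIp also accepts a comma-separated list of specific addresses if you want to restrict it.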

[root@test102 bin]# cat /etc/mongodb/mongo.conf         
systemLog:
   destination: file
   path: "/opt/mongodbdata/mongod.log"
   logAppend: true
storage:
   journal:
      enabled: true
   dbPath: /opt/mongodbdata
setParameter:
   enableLocalhostAuthBypass: true
processManagement:
   fork: true
   pidFilePath: "/opt/mongodbdata/mongod.pid"
replication:
   replSetName: CrystalTest
security:
   authorization: enabled
   keyFile: "/etc/mongodb/mongodb-keyfile"
net:
  port: 27017
  bindIp: 0.0.0.0     #add this line

Check the port again:

[root@test102 mongodb]# netstat -tlunp|grep 27017
tcp        0      0 0.0.0.0:27017           0.0.0.0:*               LISTEN      3433/mongod           #now 0.0.0.0:27017
[root@test102 mongodb]#

Then telnet from host 101 connects:

[root@test101 ~]# telnet 10.0.0.102 27017
Trying 10.0.0.102...
Connected to 10.0.0.102.
Escape character is '^]'.
^C^C

Connection closed by foreign host.
[root@test101 ~]# 
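
Where telnet is not installed, the same reachability test can be done with bash alone. This is only a sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are present; port_open is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if a TCP connection to host:port can
# be opened within 3 seconds, fail otherwise. Uses bash's /dev/tcp feature,
# so the inner command is run with bash explicitly.
port_open() {
    # In `bash -c 'script' a b`, $0 is a (the host) and $1 is b (the port).
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

From host 101, port_open 10.0.0.102 27017 && echo reachable || echo unreachable gives the same answer as the telnet session above without needing an interactive exit.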

Adding host 102 on the PRIMARY host 10.0.0.101 again now succeeds:

CrystalTest:PRIMARY> rs.add("10.0.0.102:27017")
{
    "ok" : 1,
    "operationTime" : Timestamp(1539056959, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1539056959, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
