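A Summary of Several MongoDB Replica Set Initialization Failures
While building MongoDB clusters in the past, I ran into several replica set initialization failures caused by small mistakes. They were all pitfalls I hit as a beginner, so here is a short summary.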
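1. A wrong IP address causes MongoDB replica set initialization to fail
This error was already described in another post, so I will not repeat it here.
For details, see the blog post: A Wrong IP Address Causes MongoDB Replica Set Initialization to Fail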
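2. The mongodb-keyfile contents differ between PRIMARY and SECONDARY, so adding the member on the PRIMARY fails
Problem description:
I was setting up another MongoDB replica set, with hosts and roles assigned as follows:
Host IP | Role | OS |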
---|---|---|
131.10.11.106 | PRIMARY | centos7 |
131.10.11.111 | SECONDARY | centos7 |
131.10.11.114 | SECONDARY | centos7 |
MongoDB server version: 3.4.10.1
Adding the SECONDARY host 131.10.11.111 on the PRIMARY produced the following error:
mongotest:PRIMARY> rs.add("131.10.11.111:27017")
{
    "ok" : 0,
    "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 131.10.11.106:27017; the following nodes did not respond affirmatively: 131.10.11.111:27017 failed with Authentication failed.",
    "code" : 74,
    "codeName" : "NodeNotFound"
}
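Cause analysis:
After investigating, I found that the mongodb-keyfile on host 131.10.11.111 did not match the one on the primary, and that the configuration file mongo.conf on 131.10.11.111 had no security (authentication) section. Together these caused the add to fail.
Solution:
1. Copy the mongodb-keyfile from the PRIMARY to the secondary node 131.10.11.111 and set its permissions to 400.
2. Edit the configuration file /etc/mongodb/mongo.conf as follows: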
[root@mongodb111 mongodb]# cat mongo.conf
systemLog:
  destination: file
  path: "/opt/mongodbdata/mongod.log"
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /opt/mongodbdata
setParameter:
  enableLocalhostAuthBypass: true
processManagement:
  fork: true
  pidFilePath: "/opt/mongodbdata/mongod.pid"
replication:
  replSetName: mongotest
# add the following lines:
security:
  authorization: enabled
  keyFile: "/etc/mongodb/mongodb-keyfile"
[root@mongodb111 mongodb]#
Restart mongodb on 131.10.11.111, then run rs.add("131.10.11.111:27017") on the PRIMARY again. This time it succeeds.
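Before restarting, it can help to confirm that the two keyfile copies really are identical and that the permissions are 400. The following is a minimal sketch, assuming both copies are readable from one machine (for example after fetching the secondary's copy). The function names are hypothetical, and the byte-for-byte comparison is simply a strict way to check equality.

```python
import hashlib
import os
import stat


def keyfile_digest(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def keyfiles_match(path_a, path_b):
    """True if the two files have byte-identical content."""
    return keyfile_digest(path_a) == keyfile_digest(path_b)


def keyfile_mode_ok(path):
    """True if the permission bits are exactly 0o400 (owner read only)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o400
```

For example, `keyfiles_match("/etc/mongodb/mongodb-keyfile", "/tmp/keyfile-from-111")` would compare the local keyfile with a fetched copy; the second path is an assumption made up for illustration.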
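3. The secondary's configuration file has no replSet setting, so adding the member fails
Problem description:
This problem occurred in the same environment as problem 2. Adding host 114 from host 106 reported the following error: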
mongotest:PRIMARY> rs.add("131.10.11.114:27017")
{
    "ok" : 0,
    "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 131.10.11.106:27017; the following nodes did not respond affirmatively: 131.10.11.114:27017 failed with not running with --replSet",
    "code" : 74,
    "codeName" : "NodeNotFound"
}
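Cause analysis:
Following the hint "the following nodes did not respond affirmatively: 131.10.11.114:27017 failed with not running with --replSet", I checked mongo.conf on host 114 and found that the secondary's configuration file had no replica set setting, so the node could not be added.
Solution:
Edit /etc/mongodb/mongo.conf on the secondary as follows, adding the replica set configuration: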
[root@mongodb114 mongodb]# cat mongo.conf
systemLog:
  destination: file
  path: "/opt/mongodbdata/mongod.log"
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /opt/mongodbdata
setParameter:
  enableLocalhostAuthBypass: true
processManagement:
  fork: true
  pidFilePath: "/opt/mongodbdata/mongod.pid"
security:
  authorization: enabled
  keyFile: "/etc/mongodb/mongodb-keyfile"
replication:                 # add the replica set configuration
  replSetName: mongotest     # the name must match the one on the primary
[root@mongodb114 mongodb]#
Restart mongodb on 131.10.11.114, then run rs.add("131.10.11.114:27017") on the PRIMARY again. This time it succeeds.
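A quick way to catch this before calling rs.add() is to check that the member's config file actually sets replication.replSetName. The sketch below is a deliberately simple line-based reader, not a real YAML parser. It assumes a config with unindented top-level sections and indented keys, no `#` inside values, and it does not track nesting depth. The function name `find_setting` is hypothetical.

```python
def find_setting(config_text, section, key):
    """Return the first value of `key` found under top-level `section`, or None.

    Nesting depth is not tracked: any indented `key:` line that appears
    after `section:` and before the next top-level line is accepted.
    """
    in_section = False
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop trailing comments
        if not line.strip():
            continue
        if line[0] not in " \t":
            # An unindented line starts a new top-level section.
            in_section = line.strip() == section + ":"
            continue
        if in_section:
            name, sep, value = line.strip().partition(":")
            if sep and name == key:
                return value.strip().strip('"') or None
    return None
```

Calling `find_setting(text, "replication", "replSetName")` on each member's config and comparing the results shows both a missing setting (`None`) and a name that differs from the primary's.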
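4. bindIp defaults to 127.0.0.1, causing MongoDB replica set initialization to fail
Problem description:
On another occasion I was setting up a MongoDB replica set, with hosts and roles assigned as follows:
Host IP | Role | OS |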
---|---|---|
10.0.0.101 | PRIMARY | centos7 |
10.0.0.102 | SECONDARY | centos7 |
10.0.0.103 | SECONDARY | centos7 |
MongoDB server version: 4.0.2
Adding the SECONDARY host 10.0.0.102 on the PRIMARY host 10.0.0.101 failed with this error:
CrystalTest:PRIMARY> rs.add("10.0.0.102:27017")
{
    "operationTime" : Timestamp(1539054715, 1),
    "ok" : 0,
    "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 10.0.0.101:27017; the following nodes did not respond affirmatively: 10.0.0.102:27017 failed with Error connecting to 10.0.0.102:27017 :: caused by :: Connection refused",
    "code" : 74,
    "codeName" : "NodeNotFound",
    "$clusterTime" : {
        "clusterTime" : Timestamp(1539054715, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
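Cause analysis:
The message "failed with Error connecting to 10.0.0.102:27017 :: caused by :: Connection refused" puzzled me at first, because port 27017 on host 10.0.0.102 was up, the service worked normally on that host, and the firewall was turned off. I then tried telnet from the PRIMARY host 10.0.0.101 and found that it could not connect: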
[root@test101 ~]# telnet 10.0.0.102 27017
Trying 10.0.0.102...
telnet: connect to address 10.0.0.102: Connection refused
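I then checked the listening port on host 102 and found that mongod was bound to 127.0.0.1. Since MongoDB 3.6, mongod binds to localhost by default unless net.bindIp is set, so it only accepted local connections and host 10.0.0.101 could not reach it: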
[root@test102 ~]# netstat -tlunp|grep mongo
tcp 0 0 127.0.0.1:27017 0.0.0.0:* LISTEN 1065/mongod # listening on 127.0.0.1:27017 only
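The telnet test above can also be scripted, which is convenient when several members need checking. This is a minimal sketch using only the standard library. The function name `can_connect` and the 3-second timeout are assumptions for illustration. It only tests whether a TCP connection can be opened and says nothing about authentication or replica set state.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False
```

Run from the primary, `can_connect("10.0.0.102", 27017)` would be expected to return False while the member listens only on 127.0.0.1.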
Solution:
Add "bindIp: 0.0.0.0" to mongo.conf on host 102, then restart MongoDB on host 102:
[root@test102 bin]# cat /etc/mongodb/mongo.conf
systemLog:
  destination: file
  path: "/opt/mongodbdata/mongod.log"
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /opt/mongodbdata
setParameter:
  enableLocalhostAuthBypass: true
processManagement:
  fork: true
  pidFilePath: "/opt/mongodbdata/mongod.pid"
replication:
  replSetName: CrystalTest
security:
  authorization: enabled
  keyFile: "/etc/mongodb/mongodb-keyfile"
net:
  port: 27017
  bindIp: 0.0.0.0   # add this line
Check the port again:
[root@test102 mongodb]# netstat -tlunp|grep 27017
tcp 0 0 0.0.0.0:27017 0.0.0.0:* LISTEN 3433/mongod # now listening on 0.0.0.0:27017
[root@test102 mongodb]#
Telnet from host 101 now connects:
[root@test101 ~]# telnet 10.0.0.102 27017
Trying 10.0.0.102...
Connected to 10.0.0.102.
Escape character is '^]'.
^C^C
Connection closed by foreign host.
[root@test101 ~]#
Adding host 102 again on the PRIMARY host 10.0.0.101 then succeeded:
CrystalTest:PRIMARY> rs.add("10.0.0.102:27017")
{
    "ok" : 1,
    "operationTime" : Timestamp(1539056959, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1539056959, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
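Problems 2 to 4 all returned the same "Quorum check failed" error with code 74, and the only distinguishing part is the text after "failed with". As a rough sketch, that reason can be pulled out of the errmsg and mapped to what to check first. The pattern handles only the single-node message form seen in this post. The `diagnose` function and the hint texts are my own illustration, not part of MongoDB.

```python
import re

# Matches the single-node form: "...affirmatively: HOST:PORT failed with REASON"
_PATTERN = re.compile(
    r"did not respond affirmatively: (?P<node>\S+) failed with (?P<reason>.+)$"
)

# Reason fragment -> first thing to check (taken from the cases in this post).
HINTS = [
    ("Authentication failed", "compare the keyFile contents and the security section"),
    ("not running with --replSet", "set replication.replSetName on the new member"),
    ("Connection refused", "check net.bindIp and the listening address"),
]


def diagnose(errmsg):
    """Return (node, hint) for a quorum-check errmsg, or None if it does not match."""
    match = _PATTERN.search(errmsg)
    if match is None:
        return None
    reason = match.group("reason")
    for fragment, hint in HINTS:
        if fragment in reason:
            return match.group("node"), hint
    return match.group("node"), "unrecognized reason: " + reason
```

Passing the `errmsg` string from any of the three failed rs.add() calls returns the failing member together with the matching hint.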