1. 程式人生 > >kafka連zk報錯:Unable to read additional data from server sessionid 0x0...

kafka連zk報錯:Unable to read additional data from server sessionid 0x0...

問題描述:

主機資訊:

IP hostname
10.0.0.10 host10
10.0.0.12 host12
10.0.0.13 host13

在這三臺主機上部署一套zookeeper&kafka叢集環境的時候,zk叢集程序和埠都起來了。然後在啟動kafka的時候,報錯了,提示連不上zk。
因為該環境要求必須開啟防火牆,所以想到應該是因為2181埠沒有開放,所以kafka去連zk,提示連不上。於是在修改了防火牆,打開了2181埠和kafka的9092埠:

[[email protected] ~]#  firewall-cmd --list-ports
2181/tcp  9092/tcp 
[[email protected] ~]#

然後重新啟動kafka,依舊報錯,報錯資訊如下:

[[email protected] ~]# /usr/local/kafka_2.11-0.11.0.0/bin/kafka-server-start.sh /usr/local/kafka_2.11-0.11.0.0/config/server.properties
[2018-06-15 11:33:25,244] INFO KafkaConfig values:
        advertised.host.name = null
......   省略若干資訊
[2018-06-15 11:33:26,693] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:28,219] INFO Opening socket connection to server 10.0.0.12/10.0.0.12181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:28,221] INFO Socket connection established to 10.0.0.12/10.0.0.12181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:28,222] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:28,910] INFO Opening socket connection to server 10.0.0.13/10.0.0.13181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:28,911] INFO Socket connection established to 10.0.0.13/10.0.0.13181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:28,913] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:29,901] INFO Opening socket connection to server 10.0.0.10/10.0.0.10181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:29,902] INFO Socket connection established to 10.0.0.10/10.0.0.10181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:29,905] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:31,133] INFO Opening socket connection to server 10.0.0.12/10.0.0.12181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:31,135] INFO Socket connection established to 10.0.0.12/10.0.0.12181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:31,136] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:31,379] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2018-06-15 11:33:31,553] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
[2018-06-15 11:33:31,555] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server '10.0.0.10:2181,10.0.0.12:2181,10.0.0.13:2181' with timeout of 6000 ms
        at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1233)
        at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
        at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
        at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:103)
        at kafka.utils.ZkUtils$.apply(ZkUtils.scala:85)
        at kafka.server.KafkaServer.initZk(KafkaServer.scala:338)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:191)
        at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
        at kafka.Kafka$.main(Kafka.scala:65)
        at kafka.Kafka.main(Kafka.scala)
[2018-06-15 11:33:31,555] INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn)
[2018-06-15 11:33:31,562] INFO shutting down (kafka.server.KafkaServer)
[2018-06-15 11:33:31,568] INFO shut down completed (kafka.server.KafkaServer)
[2018-06-15 11:33:31,568] FATAL Exiting Kafka. (kafka.server.KafkaServerStartable)
[2018-06-15 11:33:31,571] INFO shutting down (kafka.server.KafkaServer)

檢視kafka的配置,並未發現任何異常。於是檢查zk自身是否OK。但是程序、埠都一切正常。
使用./zkCli.sh去檢查zk是否正常啟動,出現了報錯:

[[email protected] bin]# ./zkCli.sh
Connecting to localhost:2181
2018-06-15 14:44:05,215 [myid:] - INFO  [main:[email protected]] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
......    #省略若干資訊
2018-06-15 14:44:05,225 [myid:] - INFO  [main:
[email protected]
] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 [email protected] Welcome to ZooKeeper! 2018-06-15 14:44:05,255 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error) JLine support is enabled 2018-06-15 14:44:05,334 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session 2018-06-15 14:44:05,348 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect [zk: localhost:2181(CONNECTING) 0] 2018-06-15 14:44:05,564 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) 2018-06-15 14:44:05,565 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Socket connection established to localhost/127.0.0.1:2181, initiating session 2018-06-15 14:44:05,573 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 2018-06-15 14:44:07,626 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error) 2018-06-15 14:44:07,627 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session 2018-06-15 14:44:07,629 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 2018-06-15 14:44:07,929 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) 2018-06-15 14:44:07,930 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Socket connection established to localhost/127.0.0.1:2181, initiating session 2018-06-15 14:44:07,932 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 2018-06-15 14:44:09,329 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error) 2018-06-15 14:44:09,330 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session 2018-06-15 14:44:09,332 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect 2018-06-15 14:44:09,464 [myid:] - INFO [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)

分析:

查看了網上很多的文章,有以下幾種說法:
1、zk的myid檔案和配置檔案server.xx的編號對不上:
2、zk只啟動了一個節點,其他節點沒有起來

排查:
1、檢查myid編號和配置檔案,是OK的,能對上:

[[email protected] ~]# cat /data/zookeeper/myid
1
[[email protected] ~]#
[[email protected] ~]# vim /usr/local/zookeeper-3.4.10/conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
......  #省略若干
server.1=10.0.0.10:2888:3888   #server.1和myid檔案的1是能對上的
server.2=10.0.0.12:2888:3888
server.3=10.0.0.13:2888:3888

2、zookeeper的配置檔案zoo.cfg裡面配置了三個server,實際上其他兩個節點沒有起來,因此根據zookeeper的選舉演算法,當整個叢集超過半數機器宕機,zookeeper會認為叢集處於不可用狀態。

檢查了三個節點的服務,都是起來了的。
試著將其中一臺zk節點10.0.0.10,修改zoo.cfg,註釋掉配置的其他兩個server,只留本機的server配置:

[[email protected] ~]# vim /usr/local/zookeeper-3.4.10/conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
......  #省略若干
server.1=10.0.0.10:2888:3888   #註釋掉server.2和server.3,只剩下本機節點
#server.2=10.0.0.12:2888:3888
#server.3=10.0.0.13:2888:3888

然後使用zkCli.sh去檢測zk狀態,就是OK的:

[[email protected] ~]# cd /usr/local/zookeeper-3.4.10/bin/
[[email protected] bin]# ./zkCli.sh
Connecting to localhost:2181
2018-06-15 15:58:10,586 [myid:] - INFO  [main:[email protected]] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
......  #省略若干
2018-06-15 15:58:10,597 [myid:] - INFO  [main:[email protected]] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 [email protected]
Welcome to ZooKeeper!
2018-06-15 15:58:10,627 [myid:] - INFO  [main-SendThread(localhost:2181):[email protected]] - Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-06-15 15:58:10,708 [myid:] - INFO  [main-SendThread(localhost:2181):[email protected]] - Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
[zk: localhost:2181(CONNECTING) 0] 2018-06-15 15:58:10,733 [myid:] - INFO  [main-SendThread(localhost:2181):[email protected]] - Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x164023cac190003, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

[zk: localhost:2181(CONNECTED) 0]

於是想到是防火牆只開了zookeeper的2181埠。zk有三個埠:
3個埠的作用
1、2181:對cline端提供服務
2、3888:選舉leader使用
3、2888:叢集內機器通訊使用(Leader監聽此埠)

因為防火牆的作用,三個節點之間2888和3888埠都是互相不通的,因此叢集之前沒有辦法通訊,可能就會以為是剩下兩個節點都掛掉了。於是就出現了上面的報錯

解決:

1、重新修改防火牆,開啟2888和3888埠

[[email protected] ~]#  firewall-cmd --list-ports
 3888/tcp 2181/tcp 2888/tcp  9092/tcp 
[[email protected] ~]#

2、為保證萬一,把整個zk叢集全部按照順序重啟了一遍
3、三個幾點依次都用zkCli.sh去連了一下,狀態都是OK的
4、重新啟動kafka,就OK啦