An httpclient connection pool hang caused by a remote service restart
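Let's start with locks. The built-in lock, synchronized, and the j.u.c.l (java.util.concurrent.locks) Lock differ in many ways. One important difference shows up when you export a thread dump with jstack: with synchronized it is easy to see which thread holds the lock, but with a j.u.c.l Lock it is not. In the previous post I wrote a connection pool with httpclient and configured one socket option, setSoLinger(60). During a test I restarted the remote service, and the system hung. Checking the connections on the local machine showed that many of them were in the CLOSE_WAIT state. Since the remote service restarted, all existing connections had to be torn down, and tearing down a connection takes the four-way close handshake. So what exactly is the CLOSE_WAIT state?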
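As the diagram above shows, the remote server was restarting, so it actively closed the connection and sent a FIN. The local machine should acknowledge it with an ACK right away and then, once the application closes its socket, send its own FIN. That FIN was never sent, so the connection stayed in CLOSE_WAIT. This brought to mind the SocketConfig setSoLinger(60) setting, which might be the cause. While the system was hung, I used jstack to look at the thread dump: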
"Thread-2158" #2171 prio=5 os_prio=0 tid=0x00000000193fe800 nid=0xa3c80 waiting on condition [0x000000003e70e000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000081eab2a8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.apache.http.pool.AbstractConnPool.getTotalStats(AbstractConnPool.java:509)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.formatStats(PoolingHttpClientConnectionManager.java:227)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.requestConnection(PoolingHttpClientConnectionManager.java:265)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:176)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at http.connectionPool.TestConnectionPool.lambda$main$1(TestConnectionPool.java:85)
    at http.connectionPool.TestConnectionPool$$Lambda$3/670035812.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)
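Many threads were waiting to acquire the lock, so surely it is enough to look up who holds 0x0000000081eab2a8? Yet after searching the whole thread dump file I could not find which thread held it, because the lock is a ReentrantLock, which is not reported the way synchronized is. The jstack command accepts extra options: -l prints lock details and -e prints thread details. Trying jstack -l pid adds a section like this: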
Locked ownable synchronizers:
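The same ownership information that jstack -l prints can also be read from inside the JVM through ThreadMXBean. The following is only a minimal sketch, assuming a JVM that supports monitoring of ownable synchronizers (HotSpot does); the class name LockOwnerFinder, the method names, and the thread names "holder" and "waiter" are made up for illustration and are not part of httpclient or of the original test program.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class LockOwnerFinder {

    /** Returns the name of the thread owning the synchronizer that waiterName is parked on, or null. */
    static String findOwnerOfLockBlocking(String waiterName) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Second argument "true" asks for locked ownable synchronizers, like jstack -l.
        ThreadInfo[] infos = mx.dumpAllThreads(true, true);

        // Step 1: find what the waiting thread is parked on ("parking to wait for <...>").
        LockInfo wanted = null;
        for (ThreadInfo ti : infos) {
            if (ti != null && ti.getThreadName().equals(waiterName)) {
                wanted = ti.getLockInfo();
            }
        }
        if (wanted == null) {
            return null;
        }

        // Step 2: find the thread whose "Locked ownable synchronizers" contains that object.
        for (ThreadInfo ti : infos) {
            if (ti == null) {
                continue;
            }
            for (LockInfo held : ti.getLockedSynchronizers()) {
                if (held.getIdentityHashCode() == wanted.getIdentityHashCode()
                        && held.getClassName().equals(wanted.getClassName())) {
                    return ti.getThreadName();
                }
            }
        }
        return null;
    }

    /** Hypothetical scenario: "holder" keeps a ReentrantLock while "waiter" parks on it. */
    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            lock.lock();
            lock.unlock();
        }, "waiter");
        try {
            holder.start();
            held.await();
            waiter.start();
            // Wait until the waiter is really queued and parked on the lock.
            while (!lock.hasQueuedThread(waiter) || waiter.getState() != Thread.State.WAITING) {
                Thread.sleep(5);
            }
            String owner = findOwnerOfLockBlocking("waiter");
            release.countDown();
            holder.join();
            waiter.join();
            return owner;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lock owner: " + demo());
    }
}
```

In this sketch the waiter plays the role of the many threads stuck in ReentrantLock.lock(), and the lookup in step 2 corresponds to searching the dump for the address under "Locked ownable synchronizers".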
Searching the dump again, a regular expression helps exclude most of the entries:
^.*(?<!wait for )<0x0000000081eb5388>.*$
This finds the lines that mention the lock address but where it is not preceded by "wait for ".
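To make the effect of the negative lookbehind concrete, here is a small sketch that applies the same pattern to a few lines of the kind found in a dump. The class name DumpFilter and the method name are hypothetical; the sample lines are shortened copies of lines from the dumps in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class DumpFilter {

    /** Keeps lines that mention <address> unless the address directly follows "wait for ". */
    static List<String> linesNotWaitingFor(List<String> lines, String address) {
        // Same shape as the pattern above; Pattern.quote guards the address text.
        Pattern p = Pattern.compile("^.*(?<!wait for )<" + Pattern.quote(address) + ">.*$");
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (p.matcher(line).matches()) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "- parking to wait for <0x0000000081eb5388> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)",
            "- locked <0x00000000839b0280> (a java.lang.Object)",
            "- <0x0000000081eb5388> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)");
        for (String line : linesNotWaitingFor(sample, "0x0000000081eb5388")) {
            System.out.println(line);
        }
    }
}
```

Only the last sample line survives the filter: the first is rejected by the lookbehind and the second refers to a different address.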
Search result in the thread dump:
"Thread-2623" #2636 prio=5 os_prio=0 tid=0x000000001aa7f000 nid=0xba4c8 runnable [0x000000005daed000]
   java.lang.Thread.State: RUNNABLE
    at java.net.DualStackPlainSocketImpl.close0(Native Method)
    at java.net.DualStackPlainSocketImpl.socketClose0(DualStackPlainSocketImpl.java:167)
    at java.net.AbstractPlainSocketImpl.socketPreClose(AbstractPlainSocketImpl.java:693)
    at java.net.AbstractPlainSocketImpl.close(AbstractPlainSocketImpl.java:530)
    - locked <0x00000000839b0280> (a java.lang.Object)
    at java.net.PlainSocketImpl.close(PlainSocketImpl.java:237)
    at java.net.SocksSocketImpl.close(SocksSocketImpl.java:1075)
    at java.net.Socket.close(Socket.java:1495)
    - locked <0x00000000839b00e0> (a java.lang.Object)
    - locked <0x00000000839b00c0> (a java.net.Socket)
    at sun.security.ssl.BaseSSLSocketImpl.close(BaseSSLSocketImpl.java:624)
    - locked <0x00000000839aff68> (a sun.security.ssl.SSLSocketImpl)
    at sun.security.ssl.SSLSocketImpl.closeSocket(SSLSocketImpl.java:1585)
    at sun.security.ssl.SSLSocketImpl.closeInternal(SSLSocketImpl.java:1723)
    at sun.security.ssl.SSLSocketImpl.close(SSLSocketImpl.java:1612)
    at org.apache.http.impl.BHttpConnectionBase.close(BHttpConnectionBase.java:334)
    at org.apache.http.impl.conn.LoggingManagedHttpClientConnection.close(LoggingManagedHttpClientConnection.java:81)
    at org.apache.http.impl.conn.CPoolEntry.closeConnection(CPoolEntry.java:70)
    at org.apache.http.impl.conn.CPoolEntry.close(CPoolEntry.java:96)
    at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:318)
    at org.apache.http.pool.AbstractConnPool.access$200(AbstractConnPool.java:69)
    at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:246)
    - locked <0x00000000d814f998> (a org.apache.http.pool.AbstractConnPool$2)
    at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:193)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:303)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:191)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at http.connectionPool.TestConnectionPool.lambda$main$1(TestConnectionPool.java:90)
    at http.connectionPool.TestConnectionPool$$Lambda$3/670035812.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - <0x0000000081eb5388> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
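Found a thread! It is in the middle of a close and it holds the connection pool's lock. Because setSoLinger was configured, the close will keep waiting for up to 60 s, so the threads requesting a connection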