1. 程式人生 > >記一次線上應用連線池滿的處理

記一次線上應用連線池滿的處理

記一次線上應用dubbo-claim連線池滿的處理

這裡寫圖片描述
首先看到dubbo-claim應用突然大面積報錯,基本反饋是凍結預算出問題了,看了看凍結預算的程式碼,發現寫的非常複雜,果斷放棄看程式碼來排查問題。

Could not open jdbc connection for transaction: Pool empty. Unable to fetch a connection in 10 seconds, none available[size:20,busy:20; idle:0, lastwait:10000]

報錯中能獲得的資訊是資料庫的連線池滿了。因為是突然報錯,所以排查的主要兩個方向:
1. 呼叫量增大導致響應慢
2. 資料庫資料量增大導致響應慢

其他可能的小概率事件:
1. 資料庫連線未正常關閉
2. 新上的功能比較耗時拖慢資料庫
3. 資料庫有異常,或者機器網路等硬體問題(概率極小,不到最後不要想)
4. 資料庫連線是否配置小了?(20*4臺一共80。假設平均處理時間在200ms,能夠支撐400qps,足夠了)

接下來就需要驗證猜測,第一反應想看看jvm裡的執行緒都幹嘛呢,到底是誰在拖後腿,用 jstack 來打印出執行緒的活動狀態,我是用jvisualVM來直接 dump 執行緒。
接下來就是看日誌中的執行緒都幹嘛呢,首先看狀態為 BLOCK,和 WAITING 狀態的執行緒,看他們是因為什麼導致不執行。

"DubboServerHandler-10.3.0.87:20888-thread-152"
- Thread t@229 java.lang.Thread.State: BLOCKED at com.nfsq.xs.claim.biz.budget.BudgetService.freezeBudget(BudgetService.java:85) - waiting to lock <7153f861> (a com.nfsq.xs.claim.biz.budget.BudgetService) owned by "DubboServerHandler-10.3.0.87:20888-thread-176" t@253 at com.nfsq.xs
.claim.biz.budget.BudgetService$$FastClassBySpringCGLIB$$8dee689c.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)

接下來發現日誌裡一個非常重要資訊執行緒152在等執行緒176的一個鎖(7153f861),且其他很多執行緒也是在等待這把鎖
接下來看執行緒176這傢伙拿著這把鎖幹嘛呢

"DubboServerHandler-10.3.0.87:20888-thread-176" - Thread t@253
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
    at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
    at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
    - locked <776bc7e0> (a com.mysql.jdbc.util.ReadAheadInputStream)
    at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3014)
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3467)
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3456)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3997)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2468)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2629)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2719)
    - locked <54dd815d> (a com.mysql.jdbc.JDBC4Connection)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)
    - locked <54dd815d> (a com.mysql.jdbc.JDBC4Connection)
    at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)
    - locked <54dd815d> (a com.mysql.jdbc.JDBC4Connection)
    at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:56)
    at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:70)
    at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:57)
    at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:267)
    at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:141)
    at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:105)
    at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:81)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:101)
    at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:95)
    at sun.reflect.GeneratedMethodAccessor215.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:355)
    at com.sun.proxy.$Proxy21.selectList(Unknown Source)
    at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:195)
    at org.apache.ibatis.binding.MapperMethod.executeForMany(MapperMethod.java:124)
    at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:90)
    at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:40)
    at com.sun.proxy.$Proxy32.listActivityWalletDTO(Unknown Source)
    at com.nfsq.xs.claim.service.budget.ActivityWalletOperationService.listActivityWalletDTO(ActivityWalletOperationService.java:363)
    at com.nfsq.xs.claim.biz.budget.BudgetService.signContract(BudgetService.java:610)
    at com.nfsq.xs.claim.biz.budget.BudgetService.freezeBudget(BudgetService.java:95)
    - locked <7153f861> (a com.nfsq.xs.claim.biz.budget.BudgetService)

看到了執行緒176拿了鎖後最後一步執行的操作

ActivityWalletOperationService.listActivityWalletDTO

去看了程式碼發現對應的 SQL:

 select * from ACTIVITY_WALLET
 where 
    source_id = '240'
    and source = '1';   

這麼簡單的 SQL ,去資料看了資料量28萬條,一看索引竟然只有 id 索引,source_id 查詢走的全表掃描。找少炳來加了索引後,應用報錯全部都沒了,持續觀察了兩天問題解決。
最後在這裡感謝@張少炳,@趙統國 @凡小夢 @馬瑾 提供的幫助。