1. 程式人生 > >JDK的sql設計不合理導致的驅動類初始化死鎖問題

JDK的sql設計不合理導致的驅動類初始化死鎖問題

問題描述
當我們一個系統既需要mysql驅動,也需要oracle驅動的時候,在併發載入初始化這些驅動類的過程中產生死鎖的可能性非常大,下面是一個模擬的例子,對於Thread2的實現其實是jdk裡java.sql.DriverService的邏輯,也是我們第一次呼叫java.sql.DriverManager.registerDriver註冊一個驅動例項要走的邏輯(jdk1.6下),不過這篇文章是使用我們生產環境的一個系統的執行緒dump和記憶體dump為基礎進行分析展開的。

import java.util.Iterator;

import sun.misc.Service;

public class Main {
    public static void main(String[] args) throws ClassNotFoundException {
        Thread1 thread1 = new Thread1();
        Thread2 thread2 = new Thread2();
        thread1.start();
        thread2.start();
    }
}

class Thread1 extends Thread {
    public void run() {
        try {
            Class clazz = Class.forName("com.mysql.jdbc.Driver", true, Thread.currentThread()
                .getContextClassLoader());
            System.out.println(clazz);
        } catch (ClassNotFoundException e) {
        }
    }
}

class Thread2 extends Thread {
    public void run() {
        Iterator ps = Service.providers(java.sql.Driver.class);
        try {
            while (ps.hasNext()) {
                System.out.println(ps.next());
            }
        } catch (Throwable t) {

        }
    }
}

如果以上程式碼執行過程中發現有執行緒一直卡死在Class.forName的呼叫裡,那麼說明問題已經重現了。

先上兩張圖
thread_2
heap_thread_1

下面是問題程序執行緒dump的分析”thread dump”,還有這個程序的記憶體dump分析”heap dump”,這個後面展開分析的基礎

存疑點
仔細看看上面的執行緒dump分析和記憶體dump分析裡的執行緒分析模組,您可能會有如下兩個疑惑:

【為什麼執行緒”Thread-0″一直卡在Class.forName的位置】:這有點出乎意料,做一個類載入要麼找不到丟擲ClassNotFoundException,要麼找到直接返回,為什麼會一直卡在這個位置呢?

【明明”Thread-0″註冊的是mysql驅動為什麼會去載入Odbc的驅動類】:通過”Thread-0″在棧上看倒數第二幀展開看到傳入Class.forName的引數是com.mysql.jdbc.Driver,然後展開棧上順序第二幀,看到傳入的引數是sun.jdbc.odbc.JdbcOdbcDriver,這意味著在對mysql驅動類做載入初始化的過程中又觸發了JdbcOdbc驅動類的載入

疑惑點解釋
疑惑二:
第一個疑惑我們先留著,先解釋下第二個疑惑,大家可以對照堆疊通過反編譯rt.jar還有ojdbc6-11.2.0.3.0.jar看具體的程式碼

驅動類載入過程簡要介紹:

當要註冊某個sql驅動的時候是通過呼叫java.sql.DriverManager.registerDriver來實現的(注意這個方法加了synchronized關鍵字,後面解釋第一個疑惑的時候是關鍵),而這個方法在第一次執行過程中,會在當前執行緒classloader的classpath下尋找所有/META-INF/services/java.sql.Driver檔案,這個檔案在mysql和oracle驅動jar裡都有,裡面寫的是對應的驅動實現類名,這種機制是jdk提供的spi實現,找到這些檔案之後,依次使用Class.forName(driverClassName, true, this.loader)來對這些驅動類進行載入,其中第二個引數是true,意味著不僅僅做一次loadClass的動作,還會初始化該類,即呼叫包含靜態塊的< clinit >方法,執行完之後才會返回,這樣就解釋了第二個疑惑,在mysql驅動註冊過程中還會對odbc驅動類進行載入並初始化

感想:

其實我覺得這種設計有點傻,為什麼要乾和自己不相關的事情呢,畫蛇添足的設計,首先類初始化的開銷是否放到一起做並沒有多大區別,其次正由於這種設計導致了今天這個死鎖的發生

疑惑一:
現在來說第一個疑惑,為什麼會一直卡在Class.forName呢,到底卡在哪裡,於是再通過jstack -m 命令將jvm裡的堆疊也打印出來,如下所示

----------------- 5738 -----------------
0x003f67a2      _dl_sysinfo_int80 + 0x2
0xb79a71ae      _ZN2os13PlatformEvent4parkEv + 0xee
0xb7997acb      _ZN13ObjectMonitor4waitExbP6Thread + 0x5fb
0xb7a73c53      _ZN18ObjectSynchronizer19waitUninterruptiblyE6HandlexP6Thread + 0x53
0xb777eb34      _ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread + 0x74
0xb777e288      _ZN13instanceKlass10initializeEP6Thread + 0x58
0xb7821ad9      _Z28find_class_from_class_loaderP7JNIEnv_12symbolHandleh6HandleS2_hP6Thread + 0xb9
0xb7807d99      JVM_FindClassFromClassLoader + 0x269
0xb734c236      Java_java_lang_Class_forName0 + 0x116
0xb433064a      * java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader) bci:0 (Interpreted frame)
0xb4328fa7      * java.lang.Class.forName(java.lang.String, boolean, java.lang.ClassLoader) bci:32 line:247 (Interpreted frame)
0xb4328fa7      * sun.misc.Service$LazyIterator.next() bci:31 line:271 (Interpreted frame)
0xb4329483      * java.sql.DriverService.run() bci:26 line:664 (Interpreted frame)
0xb43263e6
0xb77a4e31      _ZN9JavaCalls11call_helperEP9JavaValueP12methodHandleP17JavaCallArgumentsP6Thread + 0x1c1
0xb79a6418      _ZN2os20os_exception_wrapperEPFvP9JavaValueP12methodHandleP17JavaCallArgumentsP6ThreadES1_S3_S5_S7_ + 0x18
0xb77a4c5f      _ZN9JavaCalls4callEP9JavaValue12methodHandleP17JavaCallArgumentsP6Thread + 0x2f
0xb780aace      JVM_DoPrivileged + 0x40e
0xb734b95d      Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedAction_2 + 0x3d
0xb433064a      * java.security.AccessController.doPrivileged(java.security.PrivilegedAction) bci:0 (Interpreted frame)
0xb4328fa7      * java.sql.DriverManager.loadInitialDrivers() bci:31 line:506 (Interpreted frame)
0xb432910d      * java.sql.DriverManager.initialize() bci:11 line:612 (Interpreted frame)
0xb432910d      * java.sql.DriverManager.registerDriver(java.sql.Driver) bci:6 line:281 (Interpreted frame)
0xb432910d      * com.mysql.jdbc.Driver.() bci:7 line:65 (Interpreted frame)
0xb43263e6
0xb77a4e31      _ZN9JavaCalls11call_helperEP9JavaValueP12methodHandleP17JavaCallArgumentsP6Thread + 0x1c1
0xb79a6418      _ZN2os20os_exception_wrapperEPFvP9JavaValueP12methodHandleP17JavaCallArgumentsP6ThreadES1_S3_S5_S7_ + 0x18
0xb77a4c5f      _ZN9JavaCalls4callEP9JavaValue12methodHandleP17JavaCallArgumentsP6Thread + 0x2f
0xb77800c1      _ZN13instanceKlass27call_class_initializer_implE19instanceKlassHandleP6Thread + 0xa1
0xb777ed8e      _ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread + 0x2ce
0xb777e288      _ZN13instanceKlass10initializeEP6Thread + 0x58
0xb7821ad9      _Z28find_class_from_class_loaderP7JNIEnv_12symbolHandleh6HandleS2_hP6Thread + 0xb9
0xb7807d99      JVM_FindClassFromClassLoader + 0x269
0xb734c236      Java_java_lang_Class_forName0 + 0x116
0xb433064a      * java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader) bci:0 (Interpreted frame)
0xb4328fa7      * java.lang.Class.forName(java.lang.String, boolean, java.lang.ClassLoader) bci:32 line:247 (Interpreted frame)
0xb4328fa7      * Thread1.run() bci:9 line:17 (Interpreted frame)

我們看到其實正在做類的初始化動作,並且執行緒正在呼叫ObjectSynchronizer::waitUninterruptibly一直沒返回,在看這方法的呼叫者instanceKlass1::initialize_impl,我們找到原始碼位置如下:

void instanceKlass::initialize_impl(instanceKlassHandle this_oop, TRAPS) {
  // Make sure klass is linked (verified) before initialization
  // A class could already be verified, since it has been reflected upon.
  this_oop->link_class(CHECK);

  DTRACE_CLASSINIT_PROBE(required, instanceKlass::cast(this_oop()), -1);

  bool wait = false;

  // refer to the JVM book page 47 for description of steps
  // Step 1
  { ObjectLocker ol(this_oop, THREAD);

    Thread *self = THREAD; // it's passed the current thread

    // Step 2
    // If we were to use wait() instead of waitInterruptibly() then
    // we might end up throwing IE from link/symbol resolution sites
    // that aren't expected to throw.  This would wreak havoc.  See 6320309.
    while(this_oop->is_being_initialized() && !this_oop->is_reentrant_initialization(self)) {
        wait = true;
      ol.waitUninterruptibly(CHECK);
    }

    // Step 3
    if (this_oop->is_being_initialized() && this_oop->is_reentrant_initialization(self)) {
      DTRACE_CLASSINIT_PROBE_WAIT(recursive, instanceKlass::cast(this_oop()), -1,wait);
      return;
    }

    // Step 4
    if (this_oop->is_initialized()) {
      DTRACE_CLASSINIT_PROBE_WAIT(concurrent, instanceKlass::cast(this_oop()), -1,wait);
      return;
    }

    // Step 5
    if (this_oop->is_in_error_state()) {
      DTRACE_CLASSINIT_PROBE_WAIT(erroneous, instanceKlass::cast(this_oop()), -1,wait);
      ResourceMark rm(THREAD);
      const char* desc = "Could not initialize class ";
      const char* className = this_oop->external_name();
      size_t msglen = strlen(desc) + strlen(className) + 1;
      char* message = NEW_RESOURCE_ARRAY(char, msglen);
      if (NULL == message) {
        // Out of memory: can't create detailed error message
        THROW_MSG(vmSymbols::java_lang_NoClassDefFoundError(), className);
      } else {
        jio_snprintf(message, msglen, "%s%s", desc, className);
        THROW_MSG(vmSymbols::java_lang_NoClassDefFoundError(), message);
      }
    }

    // Step 6
    this_oop->set_init_state(being_initialized);
    this_oop->set_init_thread(self);
  }
  ...
}

類的初始化過程:

當某個執行緒獲得機會對某個類進行初始化的時候(請看上面的Step 6),會設定這個類的init_state屬性為being_initialized(如果初始化好了會設定為fully_initialized,異常的話會設定為initialization_error),還會設定init_thread屬性為當前執行緒,在這個設定過程中是有針對這個類提供了一把互斥鎖的,因此當有別的執行緒進來的時候會被攔截在外面,如果設定完了,這把互斥鎖也釋放了,但是因為這個類的狀態被設定了,因此併發問題也得到了解決,當另外一個執行緒也嘗試初始化這個類的時候會判斷這個類的狀態是不是being_initialized,並且其init_thread不是當前執行緒,那麼就會一直卡在那裡,也就是此次執行緒dump的執行緒所處的狀態,正在初始化類的執行緒會呼叫< clinit >方法,如果正常結束了,那麼就設定其狀態為fully_initialized,並且通知之前卡在那裡等待初始化完成的執行緒,然他們繼續往下走(下一個動作就是再判斷下狀態,發現完成了就直接return了)

猜想:

在瞭解了上面的過程之後,於是我們猜測兩種可能

第一,這個類的狀態還是being_intialized,還在while迴圈裡沒有跳出來
第二,事件通知機制出現了問題,也就是pthread_cond_wait和pthread_cond_signal之間的通訊過程出現了問題。
不過第二種可能性非常小,比較linux久經考驗了,那接下來我們驗證其實是第一個猜想

驗證

我們通過GDB attach的方式連到了問題機器上(好在機器沒有掛),首先我們要找到具體的問題執行緒,我們通過上面的jstack -m命令看到了執行緒ID是5738,然後通過info threads找到對應的執行緒,並得到它的序號14

(gdb) info threads
17 process 5724 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
16 process 6878 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
15 process 5739 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
14 process 5738 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
13 process 5737 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
12 process 5736 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
11 process 5735 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
10 process 5734 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
9 process 5733 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
8 process 5732 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
7 process 5731 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
6 process 5730 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
5 process 5729 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
4 process 5728 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
3 process 5727 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
2 process 5726 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
1 process 5725 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2

然後通過thread 14切換到對應的執行緒,並通過bt看到了如下的堆疊,正如我們想象的那樣,正在做類的初始化,一直卡在那裡

(gdb) thread 14
[Switching to thread 14 (process 5738)]#0 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0 0x003f67a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x005e0d76 in [email protected]@GLIBC_2.3.2 () from /lib/tls/i686/nosegneg/libpthread.so.0
#2 0x005e13ee in [email protected]_2.0 () from /lib/tls/i686/nosegneg/libpthread.so.0
#3 0xb79a71ae in os::PlatformEvent::park () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
#4 0xb7997acb in ObjectMonitor::wait () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
#5 0xb7a73c53 in ObjectSynchronizer::waitUninterruptibly () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
#6 0xb777eb34 in instanceKlass::initialize_impl () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
#7 0xb777e288 in instanceKlass::initialize () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
#8 0xb7821ad9 in find_class_from_class_loader () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
#9 0xb7807d99 in JVM_FindClassFromClassLoader () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/server/libjvm.so
#10 0xb734c236 in Java_java_lang_Class_forName0 () from /home/opt/taobao/install/jdk1.6.0_33/jre/lib/i386/libjava.so
#11 0xb433064a in ?? ()
#12 0x0813b120 in ?? ()
#13 0x70aaa690 in ?? ()
#14 0x70aaa6a0 in ?? ()
#15 0x00000001 in ?? ()
#16 0x70aaa698 in ?? ()
#17 0x00000000 in ?? ()

我們通過f 6選擇第7幀,在通過disassemble反彙編該幀,也就是對instanceKlass::initialize_impl ()這個方法反彙編

0xb777eaed <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+45>:  lea    0xfffffff4(%ebp),%esp    //將%ebp偏移0xfffffff4位置的值存到%esp棧頂,然後下面的pop操作存到%ebx
0xb777eaf0 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+48>:  pop    %ebx
0xb777eaf1 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+49>:  pop    %esi
0xb777eaf2 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+50>:  pop    %edi
0xb777eaf3 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+51>:  pop    %ebp
0xb777eaf4 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+52>:  ret
0xb777eaf5 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+53>:  push   $0x1
0xb777eaf7 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+55>:  lea    0xffffffd8(%ebp),%edx
0xb777eafa <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+58>:  push   %esi
0xb777eafb <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+59>:  push   %ebx
0xb777eafc <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+60>:  push   %edx
0xb777eafd <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+61>:  call   0xb7a73a80 <_ZN12ObjectLockerC1E6HandleP6Threadb>
0xb777eb02 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+66>:  add    $0x10,%esp
0xb777eb05 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+69>:  xor    %eax,%eax
0xb777eb07 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+71>:  test   %ebx,%ebx
0xb777eb09 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+73>:  je     0xb777eb0d <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+77>
0xb777eb0b <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+75>:  mov    (%ebx),%eax      //將%ebx的值移到%eax
0xb777eb0d <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+77>:  cmpl   $0x4,0xe0(%eax)  //對比%eax偏移0xe0位置的值和0x4(這個值其實就是上面提到的being_initialized狀態,這就說明了%eax偏移0xe0位置其實存的就是初始化類的初始化狀態)
0xb777eb14 <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+84>:  jne    0xb777eb4f <_ZN13instanceKlass15initialize_implE19instanceKlassHandleP6Thread+143>

從上面的註釋我們其實得出了,我們要看當前類的初始化狀態,那就是看eax暫存器偏移0xe0的位置的值,而eax其實就是ebp暫存器偏移0xfffffff4位置的值,於是我們通過如下地址記憶體查到得到是4

(gdb) x $ebp + 0xfffffff4
0x70aaa45c: 0x71af2180
(gdb) x/3w 0x71af2180 + 0xe0
0x71af2260: 0x00000004  0x0813c800  0x0000001a

而4其實代表的就是being_initialized這個狀態,程式碼如下

enum ClassState {
unparsable_by_gc = 0,
allocated,
loaded,
linked,
being_initialized,
fully_initialized,
initialization_error
};

從這於是我們驗證了第一個猜想,其實是狀態一直沒有變更,因此一直卡在那裡,為了更進一步確認這個問題,要是我們能找到該類的init_thread執行緒id就更清楚了,拿到這個ID我們就能看到這個執行緒棧,就知道它在幹什麼了,但是很遺憾,這個很難獲取到,至少我一直沒有找到辦法,因為執行緒ID線上程物件裡一直沒有存,都是呼叫的os函式來獲取的,得換個思路。

突然發現instanceKlass.hpp程式碼中得知兩個屬性原來是相鄰的(init_state和init_thread),於是斷定下一個地址的值就代表是這個執行緒物件了,但是其屬性何其多,找到想要的太不易了,最主要的是還擔心自己看的程式碼和伺服器上的jvm程式碼不一致,這樣更蛋疼了,於是繼續檢視Thread.hpp中的JavaThread類,找到個關鍵字0xDEAD-2=0xDEAB,這個有可能是volatile TerminatedTypes _terminated屬性的值,於是把執行緒物件打印出來,果然查到了關鍵字0xDEAB

(gdb) x/100w 0x0813c800
0x813c800: 0xb7bc06e8 0x00000000 0x00000000 0x00000000
0x813c810: 0x0813c488 0x0813d2c8 0x00000000 0x00000000
0x813c820: 0x080f9bf8 0x080f8b50 0x70a59b60 0x00000000
0x813c830: 0x00000000 0x00000000 0x00000000 0x00000000
0x813c840: 0x00014148 0x00000505 0x00000000 0x00000000
0x813c850: 0x00000000 0x00000000 0x00000000 0x3f800021
0x813c860: 0x00000001 0x00000023 0x3f800021 0x0001b530
0x813c870: 0x00000000 0x00000000 0x00000000 0x080ffdc0
0x813c880: 0x00000001 0x00000000 0x080ffe24 0x00000014
0x813c890: 0x00000031 0x00000000 0x00000000 0x0813dab0
0x813c8a0: 0x0813c428 0x0813ce98 0x70a5b000 0x00051000
0x813c8b0: 0x00000000 0xffffffff 0x00000000 0x080ffdc0
0x813c8c0: 0x00002bad 0x0813d400 0x0813d500 0x0813d700
0x813c8d0: 0x0813d800 0x00000000 0x00000000 0x104aa1ad
0x813c8e0: 0x544a5ab2 0x32378fc7 0x00008767 0x00000000
0x813c8f0: 0x00000000 0x00000000 0x0ee9547d 0x00000000
0x813c900: 0x00000000 0x00000000 0x0813b000 0x75878760
0x813c910: 0x70a59a94 0x00000000 0x70a59abc 0xb7829020
0x813c920: 0xb7bb7100 0x00000000 0x00000000 0x00000000
0x813c930: 0x00000000 0x00000000 0x00000000 0x00000000
0x813c940: 0x00000000 0x00000000 0x00000000 0x00000000
0x813c950: 0x00000000 0x00000000 0x00000000 0x0000000a
0x813c960: 0x0813da98 0x00000000 0x0000deab 0x00000001
0x813c970: 0x00000000 0x00000000 0x00000002 0x00000000
0x813c980: 0x00000000 0x00000000 0x00000000 0x00000000

因此順著這個屬性繼續往上找,找到了_thread_state表示執行緒狀態的值(向上偏移三個字),0x0000000a,即10,然後檢視程式碼知道原來執行緒是出於block狀態

public:
volatile JavaThreadState _thread_state;
private:
ThreadSafepointState *_safepoint_state;
address _saved_exception_pc;
volatile TerminatedTypes _terminated;
enum JavaThreadState {
_thread_uninitialized = 0, // should never happen (missing initialization)
_thread_new = 2, // just starting up, i.e., in process of being initialized
_thread_new_trans = 3, // corresponding transition state (not used, included for completness)
_thread_in_native = 4, // running in native code
_thread_in_native_trans = 5, // corresponding transition state
_thread_in_vm = 6, // running in VM
_thread_in_vm_trans = 7, // corresponding transition state
_thread_in_Java = 8, // running in Java or in stub code
_thread_in_Java_trans = 9, // corresponding transition state (not used, included for completness)
_thread_blocked = 10, // blocked in vm
_thread_blocked_trans = 11, // corresponding transition state
_thread_max_state = 12 // maximum thread state+1 - used for statistics allocation
};

這樣一來檢視下執行緒dump,發現”Thread-1″正好處於BLOCKED狀態,也就是說Thread-1就是那個正在對mysql驅動類做初始化的執行緒,這說明Thread-0和Thread-1成功互鎖了

於是我們展開【Thread-1】,看到- waiting to lock (a java.lang.Class for java.sql.DriverManager),該執行緒正在等待java.sql.DriverManager型別鎖,而blocked在那裡,而這個型別鎖是被【Thread-0】執行緒持有的,從【Thread-1】這個執行緒堆疊來看它其實也是在做Class.forName動作,並且通過”Thread-1″,展開第四幀我們可以看到其正在對載入sun.jdbc.odbc.JdbcOdbcDriver

問題現場遐想:

於是我們大膽設想一個場景,【Thread-1】先獲取到初始化sun.jdbc.odbc.JdbcOdbcDriver的機會,然後在執行sun.jdbc.odbc.JdbcOdbcDriver這個類的靜態塊的時候呼叫DriverManager.registerDriver(new Driver());,而該方法之前已經提到了是會加同步鎖的,再想象一下,在這個這個靜態塊之前,並且設定了sun.jdbc.odbc.JdbcOdbcDriver類的初始化狀態為being_initialized之後,【Thread-0】這個執行緒執行到了卡在的那個位置,並且我們從其堆疊可以看出它已經持有了java.sql.DriverManager這個型別的鎖,因此這兩個執行緒陷入了互鎖狀態

解決方案
解決方案目前想到的是將驅動類的載入過程變成單執行緒載入,不存在併發情況就沒問題了。