Kerberos authentication code analysis: Can't get Kerberos realm

1. Can't get Kerberos realm

Root cause analysis

The original code is:

org.apache.hadoop.security.UserGroupInformation.setConfiguration(conf);
sun.security.krb5.Config.refresh();

  

First, UserGroupInformation (UGI) is configured from the Hadoop configuration conf that is passed in. The call chain is shown below (unrelated code has been removed):

public static void setConfiguration(Configuration conf) {
  initialize(conf, true);
}

The initialize method is as follows:

private static synchronized void initialize(Configuration conf,
                                            boolean overrideNameRules) {
  authenticationMethod = SecurityUtil.getAuthenticationMethod(conf);
  if (overrideNameRules || !HadoopKerberosName.hasRulesBeenSet()) {
    try {
      HadoopKerberosName.setConfiguration(conf);
    } catch (IOException ioe) {
      throw new RuntimeException(
          "Problem with Kerberos auth_to_local name configuration", ioe);
    }
  }
  ......
}

  

HadoopKerberosName.setConfiguration is as follows:

public static void setConfiguration(Configuration conf) throws IOException {
  final String defaultRule;
  switch (SecurityUtil.getAuthenticationMethod(conf)) {
    case KERBEROS:
    case KERBEROS_SSL:
      try {
        KerberosUtil.getDefaultRealm();
      } catch (Exception ke) {
        throw new IllegalArgumentException("Can't get Kerberos realm", ke);
      }
      ......
  }
  ......
}

  

getDefaultRealm uses reflection so that it works with both JDKs: IBM (com.ibm.security.krb5.internal.Config) and Oracle (sun.security.krb5.Config).

public static String getDefaultRealm()
    throws ClassNotFoundException, NoSuchMethodException,
    IllegalArgumentException, IllegalAccessException,
    InvocationTargetException {
  Object kerbConf;
  Class<?> classRef;
  Method getInstanceMethod;
  Method getDefaultRealmMethod;
  if (System.getProperty("java.vendor").contains("IBM")) {
    classRef = Class.forName("com.ibm.security.krb5.internal.Config"); // class reference for the IBM JDK
  } else {
    classRef = Class.forName("sun.security.krb5.Config"); // class reference for the Oracle JDK
  }
  getInstanceMethod = classRef.getMethod("getInstance", new Class[0]);
  kerbConf = getInstanceMethod.invoke(classRef, new Object[0]);
  getDefaultRealmMethod = classRef.getDeclaredMethod("getDefaultRealm", new Class[0]);
  return (String)getDefaultRealmMethod.invoke(kerbConf, new Object[0]);
}

  

In the code above, the Config class reference is obtained first; getInstanceMethod then resolves the getInstance method, and getDefaultRealmMethod resolves the getDefaultRealm method.
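The same reflective call sequence can be tried in isolation. The sketch below is only an illustration: StandInConfig and ReflectiveRealmLookup are hypothetical names that stand in for the vendor Config classes, and the realm value is made up. Reflection failures are wrapped in an IllegalArgumentException, similar in spirit to what HadoopKerberosName.setConfiguration does above.

```java
import java.lang.reflect.Method;

/** Hypothetical stand-in for a vendor-specific Config class; not a JDK type. */
final class StandInConfig {
    private static final StandInConfig INSTANCE = new StandInConfig();

    private StandInConfig() {
    }

    public static StandInConfig getInstance() {
        return INSTANCE;
    }

    public String getDefaultRealm() {
        return "EXAMPLE.COM"; // made-up realm, for illustration only
    }
}

final class ReflectiveRealmLookup {
    /** Resolves the class by name, then calls getInstance().getDefaultRealm() reflectively. */
    static String defaultRealmOf(String className) {
        try {
            Class<?> classRef = Class.forName(className);
            Method getInstance = classRef.getMethod("getInstance");
            Object config = getInstance.invoke(null); // static method: no receiver needed
            Method getDefaultRealm = classRef.getDeclaredMethod("getDefaultRealm");
            return (String) getDefaultRealm.invoke(config);
        } catch (ReflectiveOperationException e) {
            // Any lookup or invocation failure ends up as one unchecked exception.
            throw new IllegalArgumentException("Can't get realm from " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultRealmOf(StandInConfig.class.getName()));
    }
}
```

Because the class is looked up by name at run time, the calling code compiles without a direct dependency on either vendor's internal class.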

So, assuming we are running on the Oracle JDK, the call that finally runs is sun.security.krb5.Config.getDefaultRealm(). Let's look at how it is implemented.

public String getDefaultRealm() throws KrbException {
  if (this.defaultRealm != null) { // defaultRealm is already set: return it directly
    return this.defaultRealm;
  } else { // defaultRealm is null: work it out
    KrbException var1 = null;
    String var2 = this.getDefault("default_realm", "libdefaults");
    if (var2 == null && this.useDNS_Realm()) {
      try {
        var2 = this.getRealmFromDNS();
      } catch (KrbException var4) {
        var1 = var4;
      }
    }
    ......
  }
}

Suppose defaultRealm is null, and look at how var2 = this.getRealmFromDNS(); obtains it:

private String getRealmFromDNS() throws KrbException {
  String var1 = null;
  String var2 = null;

  try {
    var2 = InetAddress.getLocalHost().getCanonicalHostName(); // 1. get the local host name
  } catch (UnknownHostException var7) {
    KrbException var4 = new KrbException(60, "Unable to locate Kerberos realm: " + var7.getMessage());
    var4.initCause(var7);
    throw var4;
  }

  String var3 = PrincipalName.mapHostToRealm(var2); // 2. map the local host name to a realm
  ....

The mapHostToRealm() method is as follows:

static String mapHostToRealm(String var0) {
  String var1 = null;

  try {
    String var2 = null;
    Config var3 = Config.getInstance(); // get the Config singleton
    if ((var1 = var3.getDefault(var0, "domain_realm")) != null) {
      return var1;
    }
    .......
  } catch (KrbException var5) {
    ;
  }
  return var1;
}

  

This is where the Config singleton is obtained:

public static synchronized Config getInstance() throws KrbException {
  if (singleton == null) {
    singleton = new Config();
  }

  return singleton;
}

 

All Config.getInstance() does is check whether the singleton is null: if it is not, it is returned as-is; if it is, a new Config object is created.

The Config class also has a refresh method:

public static synchronized void refresh() throws KrbException {
  singleton = new Config();
  KdcComm.initStatic();
}

  

As the code shows, every call to refresh() rebuilds the Config singleton. This refresh() method is also the one our own code is supposed to call.

Here is our original code again:

org.apache.hadoop.security.UserGroupInformation.setConfiguration(conf);
sun.security.krb5.Config.refresh();

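Back to getInstance(): if singleton is null, a Config singleton is created. From then on, every call to getInstance() returns that same object, and nothing creates a new one. One might object: doesn't Config.refresh() create a new Config? It does, but if UserGroupInformation.setConfiguration(conf) throws an exception first, Config.refresh() is never reached. That is exactly where our bug lies. How UserGroupInformation.setConfiguration(conf) comes to throw is analyzed next.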
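The effect of that call order can be reproduced with a small stand-in. Everything in this sketch is hypothetical (CachedConfig, RefreshOrderDemo, the Supplier used as the configuration source); it is not the JDK's Config class, only a model of "snapshot on first use, rebuilt only by refresh()". The refreshThenUse variant is one possible reordering, shown for contrast, not a fix taken from the original code.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a cached-configuration singleton; not the JDK's Config. */
final class CachedConfig {
    private static CachedConfig singleton;
    private static Supplier<String> source = () -> null; // where the realm is read from
    private final String realm;

    private CachedConfig() {
        this.realm = source.get(); // snapshot taken once, at construction time
    }

    static synchronized void setSource(Supplier<String> newSource) {
        source = newSource;
    }

    static synchronized CachedConfig getInstance() {
        if (singleton == null) {
            singleton = new CachedConfig();
        }
        return singleton;
    }

    static synchronized void refresh() {
        singleton = new CachedConfig();
    }

    String defaultRealm() {
        if (realm == null) {
            throw new IllegalStateException("Can't get realm");
        }
        return realm;
    }
}

final class RefreshOrderDemo {
    /** Mirrors the original order: use the configuration first, refresh afterwards. */
    static String useThenRefresh() {
        String realm = CachedConfig.getInstance().defaultRealm(); // may throw
        CachedConfig.refresh(); // skipped whenever the line above throws
        return realm;
    }

    /** Alternative order: refresh first, so a stale snapshot is discarded before use. */
    static String refreshThenUse() {
        CachedConfig.refresh();
        return CachedConfig.getInstance().defaultRealm();
    }
}
```

In this model, once the first snapshot is taken while the source is unavailable, useThenRefresh keeps failing even after the source recovers, because the line that would rebuild the snapshot is never reached.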

 

Now let's look at what new Config() actually does.

private Config() throws KrbException {
  String var1 = getProperty("java.security.krb5.kdc"); // KDC address from the system property; assume it was not set when the JVM started
  if (var1 != null) {
    this.defaultKDC = var1.replace(':', ' ');
  } else {
    this.defaultKDC = null;
  }

  this.defaultRealm = getProperty("java.security.krb5.realm"); // realm from the system property; assume it was not set either
  if ((this.defaultKDC != null || this.defaultRealm == null) && (this.defaultRealm != null || this.defaultKDC == null)) {
    try {
      String var3 = this.getJavaFileName(); // looks for krb5.conf via the JVM option java.security.krb5.conf and <java-home>/lib/security/krb5.conf
      Vector var2;
      if (var3 != null) {
        var2 = this.loadConfigFile(var3);
        this.stanzaTable = this.parseStanzaTable(var2);
        if (DEBUG) {
          System.out.println("Loaded from Java config");
        }
      } else { // assume neither java.security.krb5.conf nor <java-home>/lib/security/krb5.conf yielded a krb5.conf file
        boolean var4 = false;
        if (isMacosLionOrBetter()) {
          try {
            this.stanzaTable = SCDynamicStoreConfig.getConfig();
            if (DEBUG) {
              System.out.println("Loaded from SCDynamicStoreConfig");
            }

            var4 = true;
          } catch (IOException var6) {
            ;
          }
        }

        if (!var4) {
          var3 = this.getNativeFileName(); // on our CentOS machine this returns /etc/krb5.conf
          var2 = this.loadConfigFile(var3); // load /etc/krb5.conf
          this.stanzaTable = this.parseStanzaTable(var2);
          if (DEBUG) {
            System.out.println("Loaded from native config");
          }
        }
      }
    } catch (IOException var7) {
      ;
    }

  } else {
    throw new KrbException("System property java.security.krb5.kdc and java.security.krb5.realm both must be set or neither must be set.");
  }
}
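The long condition that guards the file-loading branch in the constructor is easier to read as "both properties are set, or neither is". The sketch below checks that reading against the condition as written; KdcRealmCheck is a hypothetical helper name used only for this illustration.

```java
final class KdcRealmCheck {
    /** The condition in the same shape as in the constructor listing. */
    static boolean asWritten(String kdc, String realm) {
        return (kdc != null || realm == null) && (realm != null || kdc == null);
    }

    /** A shorter form that is equivalent for all four null/non-null combinations. */
    static boolean bothOrNeither(String kdc, String realm) {
        return (kdc == null) == (realm == null);
    }

    public static void main(String[] args) {
        String[] values = {null, "set"};
        for (String kdc : values) {
            for (String realm : values) {
                System.out.println("kdc=" + kdc + ", realm=" + realm
                        + " -> " + asWritten(kdc, realm));
            }
        }
    }
}
```

Setting only one of java.security.krb5.kdc and java.security.krb5.realm therefore takes the else branch and raises the KrbException shown at the end of the constructor.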

  

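Our problem is at var2 = this.loadConfigFile(var3);. When /etc/krb5.conf was being loaded, the file happened not to exist: we replace /etc/krb5.conf with a modified krb5.conf, and loadConfigFile() ran right inside the replacement window, so it failed with a FileNotFoundException. The failure propagates up to the caller of UserGroupInformation.setConfiguration(conf), which means our code never reaches Config.refresh(). (In the constructor as listed, the IOException itself is swallowed by the empty catch block, so what the caller sees is presumably the follow-on failure to find a default realm in the resulting empty configuration, reported as "Can't get Kerberos realm".)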
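One way to avoid a window in which the file is missing is to write the new content to a temporary file in the same directory and then rename it over the target. This is a sketch of that general technique, not what the original deployment did; AtomicConfigWriter is a hypothetical name, and whether an atomic move replaces an existing target is platform dependent (it does on typical POSIX file systems).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicConfigWriter {
    /** Writes newContent next to target, then renames the temporary file over target. */
    static void replace(Path target, String newContent) {
        try {
            Path dir = target.toAbsolutePath().getParent();
            // Same directory as the target, so the final rename stays on one file system.
            Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
            Files.write(tmp, newContent.getBytes(StandardCharsets.UTF_8));
            // Note: the temporary file may have owner-only permissions; adjust them
            // before the move if other users must be able to read the result.
            Files.move(tmp, target,
                    StandardCopyOption.REPLACE_EXISTING,
                    StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this approach a concurrent reader opens either the old file or the new one, rather than finding no file at the path.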

 

 

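2. Error: com.google.common.util.concurrent.UncheckedTimeoutException: java.util.concurrent.TimeoutException

Root cause analysis: this exception came up while debugging the error above, so its cause is analyzed here as well.

The error above was Can't get Kerberos realm. A quick search suggests it usually means the KDC and realm cannot be determined.

So I added the following three options to the JVM startup arguments:

-Djava.security.krb5.conf=/etc/krb5.conf \
-Djava.security.krb5.kdc=node1:8080 \
-Djava.security.krb5.realm=KFC.com \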

These specify the krb5.conf file, the KDC address, and the realm. After restarting, the program worked normally, so I deleted /etc/krb5.conf (I already suspected the previous error was caused by krb5.conf being unreadable).

To my surprise the program then failed with java.util.concurrent.TimeoutException, so I took a jstack dump. The thread that timed out looks like this:

"builtin-checker-serviceId-58" prio=10 tid=0x00007f678800e800 nid=0x4084 waiting for monitor entry [0x00007f672fffe000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1074)
        - waiting to lock <0x00000000a8b940d0> (a java.lang.Class for org.apache.hadoop.security.UserGroupInformation)
        ......
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

 

 

The call to UserGroupInformation.loginUserFromKeytabAndReturnUGI is blocked, waiting for the class lock <0x00000000a8b940d0>.

Further up in the jstack output is the thread that holds that lock:

"builtin-checker-serviceId-59" prio=10 tid=0x00007f67680b3800 nid=0x4097 runnable [0x00007f672f2ee000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainDatagramSocketImpl.receive0(Native Method)
        - locked <0x000000009a0076e0> (a java.net.PlainDatagramSocketImpl)
        at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:146)
        - locked <0x000000009a0076e0> (a java.net.PlainDatagramSocketImpl)
        at java.net.DatagramSocket.receive(DatagramSocket.java:816)
        - locked <0x000000009a017848> (a java.net.DatagramPacket)
        - locked <0x000000009a0076a0> (a java.net.DatagramSocket)
        at sun.security.krb5.internal.UDPClient.receive(NetClient.java:207)  // stuck here
        at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:390)
        at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:343)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.security.krb5.KdcComm.send(KdcComm.java:327)
        at sun.security.krb5.KdcComm.send(KdcComm.java:219)
        at sun.security.krb5.KdcComm.send(KdcComm.java:191)
        at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:319)
        at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:364)
        at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:735)
        at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at javax.security.auth.login.LoginContext.invoke(LoginContext.java:762)
        at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
        at javax.security.auth.login.LoginContext.login(LoginContext.java:595)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1092)
        - locked <0x00000000a8b940d0> (a java.lang.Class for org.apache.hadoop.security.UserGroupInformation)
        ........
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
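The two dumps fit the usual behavior of a lock on a java.lang.Class object: while one thread is inside a section guarded by that lock, any other thread that needs the same lock shows up as BLOCKED. The sketch below reproduces this with static synchronized methods; ClassLockDemo and its methods are hypothetical stand-ins, and a latch plays the role of the slow network call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ClassLockDemo {
    /** Stands in for a login that keeps the class lock while waiting on the network. */
    private static synchronized void slowLogin(CountDownLatch entered, CountDownLatch release) {
        entered.countDown();
        try {
            release.await(); // simulated wait for a reply that is slow to arrive
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** A second static synchronized method: it needs the same monitor, ClassLockDemo.class. */
    private static synchronized void quickLogin() {
    }

    /** Returns the state a second caller reaches while the first still holds the lock. */
    static Thread.State observeSecondCaller() throws InterruptedException {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> slowLogin(entered, release), "holder");
        Thread waiter = new Thread(ClassLockDemo::quickLogin, "waiter");
        holder.start();
        try {
            entered.await(); // holder now owns the class lock
            waiter.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (waiter.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return waiter.getState();
        } finally {
            release.countDown(); // let both threads finish
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observeSecondCaller());
    }
}
```

In the dumps above, thread 59 plays the role of the holder (it has locked <0x00000000a8b940d0>) and thread 58 the role of the waiter.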

  

The jstack shows that UDPClient.receive is stuck, but not why. A more experienced colleague suggested adding the JVM debug option -Dsun.security.krb5.debug=true, which prints Kerberos debug logs to the console. The console then showed:

Ordering keys wrt default_tkt_enctypes list
default etypes for default_tkt_enctypes: 3 1 16.
default etypes for default_tkt_enctypes: 3 1 16.
>>> KrbAsReq creating message
>>> KrbKdcReq send: kdc=node1    UDP:88, timeout=30000, number of retries =3, #bytes=134
>>> KDCCommunication: kdc=node1   UDP:88, timeout=30000,Attempt =1, #bytes=134
SocketTimeOutException with attempt: 1
>>> KDCCommunication: kdc=node1   UDP:88, timeout=30000,Attempt =2, #bytes=134
SocketTimeOutException with attempt: 2
>>> KDCCommunication: kdc=node1   UDP:88, timeout=30000,Attempt =3, #bytes=134
SocketTimeOutException with attempt: 3
>>> KrbKdcReq send: error trying node1
java.net.SocketTimeoutException: Receive timed out
        at java.net.PlainDatagramSocketImpl.receive0(Native Method)
        at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:146)
        at java.net.DatagramSocket.receive(DatagramSocket.java:816)
        at sun.security.krb5.internal.UDPClient.receive(NetClient.java:207)
        at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:390)
        at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:343)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.security.krb5.KdcComm.send(KdcComm.java:327)
        at sun.security.krb5.KdcComm.send(KdcComm.java:219)
        at sun.security.krb5.KdcComm.send(KdcComm.java:191)
        at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:319)
        at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:364)
        at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:735)
        at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at javax.security.auth.login.LoginContext.invoke(LoginContext.java:762)
        at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
        at javax.security.auth.login.LoginContext.login(LoginContext.java:595)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1092)
        ........
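The pattern in this log (send over UDP, wait for a reply with a receive timeout, retry a fixed number of times) can be reproduced with plain java.net classes. The sketch below is an illustration under stated assumptions: UdpRetryDemo is a hypothetical name, the payload is a dummy byte rather than a Kerberos request, and a bound socket that never answers plays the part of the unresponsive KDC port.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

final class UdpRetryDemo {
    /** Sends a datagram and waits for a reply, retrying on timeout; returns the number of timeouts. */
    static int timedOutAttempts(InetAddress host, int port, int timeoutMillis, int retries)
            throws IOException {
        byte[] payload = {0}; // dummy payload, not a real request
        int timeouts = 0;
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMillis); // receive() gives up after this long
            for (int attempt = 1; attempt <= retries; attempt++) {
                socket.send(new DatagramPacket(payload, payload.length, host, port));
                try {
                    socket.receive(new DatagramPacket(new byte[512], 512));
                    break; // a reply arrived, stop retrying
                } catch (SocketTimeoutException e) {
                    timeouts++;
                }
            }
        }
        return timeouts;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A bound socket that never replies stands in for the silent port.
        try (DatagramSocket silent = new DatagramSocket(0, loopback)) {
            int n = timedOutAttempts(loopback, silent.getLocalPort(), 200, 3);
            System.out.println("timed out attempts: " + n);
        }
    }
}
```

With the values from the log (timeout=30000 and 3 retries) a single login attempt can wait about 90 seconds, and according to the jstack output it does so while holding the UserGroupInformation class lock, which would explain why other callers time out.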

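The log shows the client connecting to port 88, the default KDC port, whereas our KDC's port had been changed to 1088, so the request got no reply and timed out. I have heard that no JVM parameter can set the KDC port; I have not confirmed this, but specifying the port in -Djava.security.krb5.kdc had no effect.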
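A likely reason the port in -Djava.security.krb5.kdc is ignored can be read off the Config constructor listed earlier: it replaces every ':' in the property with a space. The sketch below applies the same substitution; splitting the result on whitespace is an assumption about how the value is consumed afterwards, and KdcPropertyDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

final class KdcPropertyDemo {
    /** Applies the ':' to ' ' substitution from the constructor, then splits on whitespace. */
    static List<String> kdcEntries(String kdcProperty) {
        String defaultKdc = kdcProperty.replace(':', ' ');
        return Arrays.asList(defaultKdc.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(kdcEntries("node1:8080")); // prints [node1, 8080]
    }
}
```

Under that assumption the colon acts as a separator between entries rather than between host and port, which is consistent with the log line kdc=node1 UDP:88.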

 

 

References: https://docs.oracle.com/javase/7/docs/technotes/guides/security/jgss/tutorials/KerberosReq.html and the source code