CRS startup failure caused by security hardening (CRS-1612: Network communication xxx timeout, but "PING" was fine)
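Security has been in the spotlight for the past few years, with a steady stream of news about websites leaking sensitive information, and the Cybersecurity Law has since given the topic a legal basis. At a recent cybersecurity awareness session I saw how Article 286 of the Criminal Law is interpreted with respect to the directly responsible person, which left me worried for myself and for every DBA and operations engineer. Organizations that run "critical information infrastructure" are now also subject to security reviews by their group headquarters and by two ministries. As a result, much of this year has gone into security-related work, and that is how the following incident came about.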
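One evening, availability alerts fired for five databases at almost the same moment: on every two-node RAC, node 2 had crashed. A scan of the logs pointed to split-brain. All of the environments were 11.2.0.3 two-node RAC on AIX, and the choice of victim follows from the algorithm in that version: before 12c, when only the network heartbeat fails, the sub-cluster containing the lowest node number is kept. 12c changed this by introducing node weight, so that when split-brain occurs the sub-cluster with the higher weight survives.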
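The survivor rule described above can be sketched in a few lines. This is a simplified illustration that covers only equally sized sub-clusters (the two-node case in this incident), not Oracle's actual implementation; the function name and the shape of the `weights` mapping are hypothetical.

```python
def surviving_subcluster(subclusters, weights=None):
    """Pick the surviving sub-cluster among equally sized ones.

    subclusters: list of lists of node numbers, e.g. [[1], [2]].
    weights: optional {node_number: weight} mapping (hypothetical stand-in
             for the 12c node weight); omitted for the pre-12c behaviour.
    """
    weights = weights or {}

    def rank(nodes):
        total_weight = sum(weights.get(n, 0) for n in nodes)
        # Higher weight wins; ties fall back to the lowest node number.
        return (-total_weight, min(nodes))

    return min(subclusters, key=rank)


if __name__ == "__main__":
    print(surviving_subcluster([[1], [2]]))                  # pre-12c style
    print(surviving_subcluster([[1], [2]], weights={2: 1}))  # weighted style
```

Without weights the sub-cluster holding node 1 is kept, which matches node 2 being the one that crashed on all five clusters.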
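We were told that no network policy or hardware had changed at that time, yet CRS could not be started. Below are some logs from the restart attempts.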
# Node2 GI alert log
2018-09-18 17:30:34.985 [gpnpd(5571146)]CRS-2328:GPNPD started on node anbob2.
2018-09-18 17:30:38.430 [cssd(4326132)]CRS-1713:CSSD daemon is started in clustered mode
2018-09-18 17:30:39.944 [ohasd(4784458)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2018-09-18 17:30:58.605 [cssd(4326132)]CRS-1707:Lease acquisition for node anbob2 number 2 completed
2018-09-18 17:31:00.017 [cssd(4326132)]CRS-1605:CSSD voting file is online: /dev/rlv_vote2; details in /oracle/app/11.2.0.3/grid/log/anbob2/cssd/ocssd.log.
2018-09-18 17:31:00.020 [cssd(4326132)]CRS-1605:CSSD voting file is online: /dev/rlv_vote3; details in /oracle/app/11.2.0.3/grid/log/anbob2/cssd/ocssd.log.
2018-09-18 17:31:00.032 [cssd(4326132)]CRS-1605:CSSD voting file is online: /dev/rlv_vote1; details in /oracle/app/11.2.0.3/grid/log/anbob2/cssd/ocssd.log.
2018-09-18 17:31:20.572 [cssd(4326132)]CRS-1612:Network communication with node anbob1 (1) missing for 50% of timeout interval.Removal of this node from cluster in 14.376 seconds
2018-09-18 17:31:27.573 [cssd(4326132)]CRS-1611:Network communication with node anbob1 (1) missing for 75% of timeout interval.Removal of this node from cluster in 7.375 seconds
2018-09-18 17:31:32.573 [cssd(4326132)]CRS-1610:Network communication with node anbob1 (1) missing for 90% of timeout interval.Removal of this node from cluster in 2.375 seconds
2018-09-18 17:31:34.955 [cssd(4326132)]CRS-1609:This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /oracle/app/11.2.0.3/grid/log/anbob2/cssd/ocssd.log.
2018-09-18 17:31:34.955 [cssd(4326132)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /oracle/app/11.2.0.3/grid/log/anbob2/cssd/ocssd.log
2018-09-18 17:31:35.020 [cssd(4326132)]CRS-1603:CSSD on node anbob2 shutdown by user.
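The three countdown warnings in this alert log (CRS-1612, CRS-1611, CRS-1610) each state how many seconds remain before the node is removed. A minimal sketch, assuming each log entry has its timestamp directly in front of the `[cssd(...)]` tag as in the excerpt above, adds the remaining seconds to the timestamp to get the predicted eviction time. The function and constant names are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches the CSSD countdown warnings as they appear in the excerpt above.
WARNING = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+\[cssd\(\d+\)\]"
    r"CRS-(1612|1611|1610):Network communication with node (\S+) \(\d+\) "
    r"missing for (\d+)% of timeout interval\.\s*"
    r"Removal of this node from cluster in ([\d.]+) seconds"
)

SAMPLE = """\
2018-09-18 17:31:20.572 [cssd(4326132)]CRS-1612:Network communication with node anbob1 (1) missing for 50% of timeout interval.Removal of this node from cluster in 14.376 seconds
2018-09-18 17:31:27.573 [cssd(4326132)]CRS-1611:Network communication with node anbob1 (1) missing for 75% of timeout interval.Removal of this node from cluster in 7.375 seconds
2018-09-18 17:31:32.573 [cssd(4326132)]CRS-1610:Network communication with node anbob1 (1) missing for 90% of timeout interval.Removal of this node from cluster in 2.375 seconds
"""


def predicted_evictions(log_text):
    """Return (message code, peer node, percent elapsed, predicted eviction time)."""
    results = []
    for ts, code, node, pct, left in WARNING.findall(log_text):
        logged_at = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")
        deadline = logged_at + timedelta(seconds=float(left))
        # Trim microseconds to milliseconds to match the log's precision.
        results.append((f"CRS-{code}", node, int(pct),
                        deadline.strftime("%H:%M:%S.%f")[:-3]))
    return results


if __name__ == "__main__":
    for row in predicted_evictions(SAMPLE):
        print(row)
```

All three warnings predict 17:31:34.948, a few milliseconds before the CRS-1609 entry at 17:31:34.955, so the network heartbeat from anbob1 never resumed during the countdown.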
# Node2 crsd log
2018-09-10 15:32:21.234: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:21.434: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:21.635: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:21.672: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 4294967295 ms, node 110ef48f0 { host 'anbob1', haName '420d-6a69-ed3b-01e1', srcLuid 0d64970f-8036598f, dstLuid 00000000-00000000 numInf 1, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [998 : 998], createTime 1975300868, sentRegister 1, localMonitor 0, flags 0x4 }
2018-09-10 15:32:21.835: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:22.035: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:22.235: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:22.436: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:22.636: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:22.836: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:23.036: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:23.236: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:23.436: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:23.637: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:23.837: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:24.037: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2018-09-10 15:32:24.237: [OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
# node 1 crsd log
2018-09-10 15:25:44.586: [GIPCHALO][2314] gipchaLowerDropMsg: dropping because of sequence timeout, waited 30006, msg 116893738 { len 1160, seq 572, type gipchaHdrTypeRecvEstablish (5), lastSeq 0, lastAck 0, minAck 571, flags 0x1, srcLuid 0d64970f-8036598f, dstLuid 00000000-00000000, msgId 570 }, node 11177a490 { host 'anbob2', haName '0e69-b2a9-e176-7ec8', srcLuid 535d9395-30941506, dstLuid 0d64970f-8036598f numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [59 : 59], createTime 2526153249, sentRegister 1, localMonitor 0, flags 0x0 }
2018-09-10 15:25:45.586: [GIPCHALO][2314] gipchaLowerDropMsg: dropping because of sequence timeout, waited 30006, msg 1168a2898 { len 1160, seq 573, type gipchaHdrTypeRecvEstablish (5), lastSeq 0, lastAck 0, minAck 572, flags 0x1, srcLuid 0d64970f-8036598f, dstLuid 00000000-00000000, msgId 571 }, node 11177a490 { host 'anbob2', haName '0e69-b2a9-e176-7ec8', srcLuid 535d9395-30941506, dstLuid 0d64970f-8036598f numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [60 : 60], createTime 2526153249, sentRegister 1, localMonitor 0, flags 0x0 }
2018-09-10 15:25:45.586: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2526214261 ms, node 11177a490 { host 'anbob2', haName '0e69-b2a9-e176-7ec8', srcLuid 535d9395-30941506, dstLuid 0d64970f-8036598f numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [60 : 60], createTime 2526153249, sentRegister 1, localMonitor 0, flags 0x4 }
...
2018-09-10 15:25:49.587: [GIPCHALO][2314] gipchaLowerDropMsg: dropping because of sequence timeout, waited 30006, msg 1168af158 { len 1160, seq 577, type gipchaHdrTypeRecvEstablish (5), lastSeq 0, lastAck 0, minAck 576, flags 0x1, srcLuid 0d64970f-8036598f, dstLuid 00000000-00000000, msgId 575 }, node 11177a490 { host 'anbob2', haName '0e69-b2a9-e176-7ec8', srcLuid 535d9395-30941506, dstLuid 0d64970f-8036598f numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [64 : 64], createTime 2526153249, sentRegister 1, localMonitor 0, flags 0x0 }
2018-09-10 15:31:36.663: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2526565337 ms, node 111775ef0 { host 'anbob2', haName '0e69-b2a9-e176-7ec8', srcLuid 535d9395-9e9ed2e1, dstLuid 0d64970f-8036598f numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [50 : 50], createTime 2526514328, sentRegister 1, localMonitor 0, flags 0x4 }
2018-09-10 15:31:37.663: [GIPCHALO][2314] gipchaLowerDropMsg: dropping because of sequence timeout, waited 30006, msg 11688a4b8 { len 1160, seq 925, type gipchaHdrTypeRecvEstablish (5), lastSeq 0, lastAck 0, minAck 924, flags 0x1, srcLuid 0d64970f-8036598f, dstLuid 00000000-00000000, msgId 923 }, node 111775ef0 { host 'anbob2', haName '0e69-b2a9-e176-7ec8', srcLuid 535d9395-9e9ed2e1, dstLuid 0d64970f-8036598f numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [51 : 51], createTime 2526514328, sentRegister 1, localMonitor 0, flags 0x0 }
2018-09-10 15:31:38.663: [GIPCHALO][2314] gipchaLowerDropMsg: dropping because of sequence timeout, waited 30007
# node 2 cssd log
2018-09-18 17:31:02.553: [CSSD][1029]clssgmClientConnectMsg: msg flags 0x0000
2018-09-18 17:31:03.032: [CSSD][2587]clssnmvDHBValidateNcopy: node 1, anbob1, has a disk HB, but no network HB, DHB has rcfg 432128556, wrtcnt, 37948455, LATS 2674619339, lastSeqNo 37948452, uniqueness 1536633719, timestamp 1537263062/3224929250
2018-09-18 17:31:03.052: [CSSD][4900]clssgmWaitOnEventValue: after CmInfo Stateval 3, eval 1 waited 0
2018-09-18 17:31:03.067: [CSSD][4129]clssnmvDHBValidateNcopy: node 1, anbob1, has a disk HB, but no network HB, DHB has rcfg 432128556, wrtcnt, 37948456, LATS 2674619374, lastSeqNo 37948453, uniqueness 1536633719, timestamp 1537263062/3224929768
2018-09-18 17:31:03.687: [CSSD][5928]clssnmConnSetNames: hostname anbob1 privname 192.168.43.21 con 60d
2018-09-18 17:31:03.687: [CSSD][5928]clssnmSetNodeProperties: properties node 1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17
# node2 gipc log
[OCRMSG][1]GIPC error [29] msg [gipcretConnectionRefused]
[OCRMSG][1]GIPC error [29] msg [gipcretConnectionRefused]
[OCRMSG][1]GIPC error [29] msg [gipcretConnectionRefused]
...
Note:
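Up to this point every symptom pointed to a problem with heartbeat communication over the interconnect, yet manual pings between the private IPs and the HAIPs worked in both directions, and traceroute was fine as well. I had previously handled a network-related case, <Crsd start fail and crsd.log show "Policy Engine is not initialized yet" & evmd.log show "[gipcretConnectionRefused] [29]">, and many of the logs this time looked similar.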
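A successful ping only proves that ICMP gets through; a packet filter can still refuse TCP or UDP traffic on specific ports, which is consistent with `gipcretConnectionRefused` appearing while ping works. A minimal sketch of a port-level check follows. The function name is hypothetical, and the demo uses a throwaway loopback listener; on a real cluster the host and port would have to be taken from what the Grid Infrastructure processes actually listen on.

```python
import socket


def tcp_port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Self-contained demo against a temporary listener on the loopback interface.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print("listener up:", tcp_port_reachable("127.0.0.1", port))
    listener.close()
    print("listener closed:", tcp_port_reachable("127.0.0.1", port))
```

Running a check like this from each node against its peer's private address would separate "host reachable" from "port reachable", which ping and traceroute alone cannot do.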
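Then I remembered that a whitelist had been added at the host (OS) layer a few days earlier, so I asked the host engineers to disable it first. Before long the host administrator reported that it was disabled and that we could test. A reminder to DBAs: when troubleshooting, always verify things yourself and do not take anyone's word for it. We manually shut down node 2 again and restarted it, and something worse happened: node 2 started, node 1 was evicted, and then node 2 failed to start after all, leaving no healthy node and no service. I still have not found a good way to avoid this. A manual restart of CRS on node 1 then failed as well; to restore service as quickly as possible we rebooted the OS, after which node 1 came up normally on every cluster. Further analysis became harder from there, because any attempt to start node 2 could affect node 1. Fortunately it was the middle of the night, and one of the databases could tolerate a service pause for a while. A MOS note about AIX also mentions:
"IBM PowerSC disables some UDP/TCP related features and has network packet filtering feature, it blocks the private network layer communication, causing CRSD can not communicate with each other and the 2nd node CRSD can not join the cluster."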
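The host team offered no explanation either. Then word came that the trigger had been found: at that very time the security office had been running a port scan across the whole network. They said the scan targeted hosts with NFS enabled, but one of the affected databases had no NFS and its node was evicted all the same. After many attempts, a colleague reported that once the entire IPsec service was stopped, startup succeeded. The host engineer then claimed that only the recently added port restrictions had been reverted; I did not see it myself, so I was not inclined to believe it. In the end, to restore service as quickly as possible, IPsec was stopped entirely on all hosts, with testing deferred to a separate environment later. How to stop IPsec (ipsec_v4):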
Start/Stop IP Security
Stop IP Security
KEEP definition in database [yes]
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
ipsec_v4 Defined
anbob1:/> lsdev -l ipsec_v4
ipsec_v4 Defined IP Version 4 Security Extension
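About IPsec
IPsec is a protocol for establishing encrypted communication channels between servers; such a channel is often called a tunnel or VPN tunnel. This article does not discuss IPsec in detail. If you want to use IPsec in your environment, make sure the following packages are installed: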
bos.msg.LANG.net.ipsec
bos.net.ipsec.websm
bos.crypto-priv
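lsfilt: lists the filter rules in the table. Each rule is assigned a number when it is created, and this command makes those numbers easy to see.
genfilt: adds a filter rule to the table. This is the command used to create a new filter. If no position is given with the -n argument, the new rule is appended to the end of the table.
chfilt: changes an existing filter rule. You must supply the rule ID to indicate which rule to modify. Rule 1 is the default rule and cannot be modified with this command.
rmfilt: the rm prefix should look familiar to every UNIX administrator. Use this command at any time to delete a filter rule by its rule ID.
mkfilt: an important command that activates or deactivates the filter rules in the table, enables or disables filter logging, and changes the default rule. For changes to the filter table to take effect, this command must be run with the appropriate arguments.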
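When it comes to policy in TCP/IP filtering, there are usually two possible security approaches:
Deny all traffic by default and allow only what you permit.
Allow all traffic by default and deny only what you restrict.
The policy here is to allow all traffic by default, then block all communication on specific ports, and then allow particular IP ranges to use those ports, for example SSH on port 22. We applied the same host whitelist rules to a test host; the ports configured on the hosts that day were 22, 2049, and 123.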
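To make the "default allow, then whitelist per port" policy concrete, here is a minimal sketch of first-match rule evaluation. It assumes that rules are checked top to bottom and that the first matching rule decides, with a default permit at the end; the `FilterRule` class and `first_match` function are hypothetical models, not the AIX implementation. The two sample rules mirror the port 123 whitelist discussed in this post.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class FilterRule:
    action: str     # "permit" or "deny"
    source: str     # source network in CIDR form; "0.0.0.0/0" matches any source
    protocol: str   # "udp", "tcp" or "all"
    dest_port: int  # destination port the rule applies to


def first_match(rules, src_ip, protocol, dest_port, default_action="permit"):
    """Return the action of the first matching rule, else the default action."""
    for rule in rules:
        if (rule.protocol in ("all", protocol)
                and rule.dest_port == dest_port
                and ip_address(src_ip) in ip_network(rule.source)):
            return rule.action
    return default_action


# Whitelist for UDP 123: one permitted range, everything else denied.
WHITELIST = [
    FilterRule("permit", "133.96.60.0/24", "udp", 123),
    FilterRule("deny", "0.0.0.0/0", "udp", 123),
]

if __name__ == "__main__":
    print(first_match(WHITELIST, "133.96.60.5", "udp", 123))     # whitelisted range
    print(first_match(WHITELIST, "192.168.43.22", "udp", 123))   # private IP, not whitelisted
    print(first_match(WHITELIST, "192.168.43.22", "udp", 5000))  # port without any rule
```

Under this model the peer's private address is denied on the restricted port even though all other traffic from it is still allowed, which is why the whitelist has to name the private subnet explicitly.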
Testing showed that restricting port 22 does not affect CRS startup. To confirm that IPsec is enabled:
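anbob1:/> lsdev -l ipsec_v4
ipsec_v4 Available IP Version 4 Security Extension
anbob1:/> lsfilt -v4
Beginning of IPv4 filter rules.
Rule 1:
Rule action: permit
Source Address: 0.0.0.0
Source Mask: 0.0.0.0
Destination Address : 0.0.0.0
Destination Mask: 0.0.0.0
Source Routing: no
Protocol: udp
Source Port: eq 4001
Destination Port: eq 4001
Scope: both
Direction: both
Logging control: no
Fragment control: all packets
Tunnel ID number: 0
Interface: all
Auto-Generated: yes
Expiration Time: 0
Description: Default Rule

genfilt -v 4|6 [ -n fid] [ -a D|P|I|L|E|H|S ] -s s_addr -m s_mask [-d d_addr] [ -M d_mask] [ -g Y|N ] [ -c protocol] [ -o s_opr] [ -p s_port] [ -O d_opr] [ -P d_port] [ -r R|L|B ] [ -w I|O|B ] [ -l Y|N ] [ -f Y|N|O|H ] [ -t tid] [ -i interface] [-D description] [-e expiration_time] [-x quoted_pattern] [-X pattern_filename ] [-C antivirus_filename]
-C antivirus_filename   Specifies the antivirus file name. The -C flag implies some version of the ClamAV virus database.
-D description   A description of the rule.
-v 4|6   The IP version.
-n fid   The new rule is inserted before rule number fid.
-a Action   D(eny) | P(ermit) | I(f) | (e)L(se) | E(ndif). Every IF rule must be closed by a matching ENDIF rule.
-s s_addr   Source address.
-m s_mask   Source address mask.
-d d_addr   Destination address.
-M d_mask   Destination address mask.
-g Y|N   Applies to permit rules. Default Y, meaning the filter rule can apply to IP packets that use source routing.
-c protocol   Protocol, default all. Valid values: udp/icmp/icmpv6/tcp/tcp.ack/ospf/ipip/esp/ah/all.
-o s_opr | ICMP Code Operation   Operation on the source port or ICMP type. Valid values: lt/le/gt/ge/eq/neq/any. Default any; must be any when -c ospf is used.
-p s_port   Source port or ICMP type.
-O d_opr | ICMP Code Operation   Operation on the destination port or ICMP type. Valid values: lt/le/gt/ge/eq/neq/any. Default any; must be any when -c ospf is used.
-P d_port   Destination port or ICMP type.
-r R|L|B   Routing, default B. Specifies whether the rule applies to R (forwarded packets), L (packets destined to or originating from the local host), or B (both).
-w I|O|B   Default B. Specifies whether the rule applies to I (inbound packets), O (outbound packets), or B (both). When a pattern is given with -x, -X, or -C, the O option is not valid; B is accepted but only inbound packets are checked.
-l Y|N   Whether to log packets that match the rule. Default N.
-f Y|N|O|H   Fragment control, default Y (all packets). N (unfragmented packets only), O (fragments and fragment headers only), H (fragment headers and unfragmented packets only).
-t tid   The ID of the tunnel associated with this rule; all matching packets must go through this tunnel. If omitted, the rule applies only to non-tunnel traffic.
-i interface   The network interface (adapter). Default all.
-e expiration_time   Expiration time in seconds.
-x pattern   Match pattern.
-X patternfile   Pattern file, one pattern per line.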
Enable Logging
## Backup syslog.conf file before modifying it.
cp /etc/syslog.conf /etc/syslog.conf.bak20180918
## Append entry for IP filters logs.
echo "local4.debug /var/adm/ipsec.log" >> /etc/syslog.conf
## Create log file and set permissions (permissions may depend on
## company policies)
touch /var/adm/ipsec.log
chmod 644 /var/adm/ipsec.log
## Refresh the syslog subsystem to activate the new configuration.
refresh -s syslogd
0513-095 The request for subsystem refresh was completed successfully.
Common operations
oracle@anbob1:/home/oracle:11G> netstat -in
Name  Mtu    Network     Address            Ipkts Ierrs Opkts Oerrs Coll
en12  1500   link#2      34.40.b5.a8.cd.ce  1405459250 8295299820
en12  1500   133.96.43   133.96.43.21       1405459250 8295299820
en12  1500   133.96.43   133.96.43.221      1405459250 8295299820
en12  1500   133.96.43   133.96.43.121      1405459250 8295299820
en13  1500   link#3      34.40.b5.a8.cd.66  106713420197252320
en13  1500   192.168.43  192.168.43.21      106713420197252320
en13  1500   169.254     169.254.47.5       106713420197252320
lo0   16896  link#1                         603001390 6029982500
lo0   16896  127         127.0.0.1          603001390 6029982500
lo0   16896  ::1%1                          603001390 6029982500
anbob1:/var/adm> ifconfig en13
en13: flags=1e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>
inet 192.168.43.21 netmask 0xffffff00 broadcast 192.168.43.255
inet 169.254.47.5 netmask 0xffff0000 broadcast 169.254.255.255
tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0

For example, to create a rule:
anbob2:/> genfilt -v 4 -a P -s 192.168.43.0 -m 255.255.255.0 -d 0 -M 0 -g Y -c udp -O eq -P 123 -w I -l Y -i en12
Filter rule 17 for IPv4 has been added successfully.

Activate the rules:
anbob2:/> mkfilt -v 4 -g stop -u

List the rules:
anbob2:/> lsfilt -v 4 -O|grep 123
13|deny|0.0.0.0|0.0.0.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
16|permit|133.96.60.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
17|permit|192.168.43.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|en12|0|||

Change the network interface that rule 17 applies to:
anbob2:/> chfilt -v 4 -n 17 -i en13
Filter rule 17 for IPv4 has been changed successfully.
anbob2:/> mkfilt -v 4 -g start -u
anbob2:/> lsfilt -v 4 -O|grep 123
13|deny|0.0.0.0|0.0.0.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
16|permit|133.96.60.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
17|permit|192.168.43.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|en13|0|||

Remove a rule:
anbob2:/> rmfilt -v 4 -n 13
Filter rule 13 for IPv4 has been removed successfully.
anbob2:/> mkfilt -v 4 -u

Mind the order: the permit rules must come before the deny rule.
anbob2:/> lsfilt -v4 -O|grep 123
14|permit|133.96.60.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|en12|0|||
15|permit|192.168.43.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|en13|0|||
17|deny|0.0.0.0|0.0.0.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
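The ordering requirement can be checked mechanically from the pipe-delimited `lsfilt -O` output. The sketch below is a hypothetical helper; the field positions (rule ID, action, source, protocol, destination port, interface) are inferred from the sample lines in this post rather than taken from IBM documentation, so treat them as an assumption.

```python
def parse_lsfilt(output):
    """Parse pipe-delimited `lsfilt -O` lines into dictionaries."""
    rules = []
    for line in output.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 18 or not fields[0].isdigit():
            continue  # skip blank lines and anything that is not a rule
        rules.append({
            "id": int(fields[0]),
            "action": fields[1],
            "source": fields[2],
            "protocol": fields[7],
            "dest_port": fields[11],
            "interface": fields[17],
        })
    return rules


def denies_before_permits(rules, port):
    """Return IDs of deny rules for `port` that are listed ahead of a permit for it."""
    port = str(port)
    flagged = []
    for index, rule in enumerate(rules):
        if rule["action"] != "deny" or rule["dest_port"] != port:
            continue
        later = rules[index + 1:]
        if any(r["action"] == "permit" and r["dest_port"] == port for r in later):
            flagged.append(rule["id"])
    return flagged


BEFORE = """\
13|deny|0.0.0.0|0.0.0.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
16|permit|133.96.60.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
17|permit|192.168.43.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|en13|0|||
"""

AFTER = """\
14|permit|133.96.60.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|en12|0|||
15|permit|192.168.43.0|255.255.255.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|en13|0|||
17|deny|0.0.0.0|0.0.0.0|0.0.0.0|0.0.0.0|yes|udp|any|0|eq|123|both|inbound|yes|all packets|0|all|0|||
"""

if __name__ == "__main__":
    print(denies_before_permits(parse_lsfilt(BEFORE), 123))
    print(denies_before_permits(parse_lsfilt(AFTER), 123))
```

For the first listing the helper flags rule 13, the deny that sits ahead of both permits; for the final listing it flags nothing.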
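Summary:
This incident happened because a whitelist had previously been configured on the hosts, and a security scan then caused CRS on node 2 to crash. During the automatic CRS restart, the same whitelist broke network communication, so the CRS processes could not start; at that point, manually starting node 2 could even crash node 1. I have not found any official documentation describing the role of port 123. Port 123 is used by NTP, and these database hosts do use NTP for time synchronization, but the IP range of the NTP servers was permitted and the problem occurred anyway; I do not know whether some port check is built into the code. In the SR, Oracle Support only stated that Oracle RAC does not support any network firewall restrictions on the private network. Likewise, with tcpdump I saw no traffic on port 123 between the nodes.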
tcpdump -i en13 -vnn 'dst host 192.168.43.22 and dst port 123'
tcpdump -i en13 -vnn 'dst port 123'
tcpdump -i en13 -v 'port 123'
tcpdump -i en13 'port 123' or 'port 2049'
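In our tests, CRS eviction occurred only when the whitelist was in place while a security scan was running; hosts without the whitelist were unaffected. If you need to use IPsec or Linux iptables (just two days ago another customer, who uses iptables on Linux, consulted me about the same kind of problem), two approaches worked in our tests:
1. Apply the port restrictions only on the public network and do not filter the private network.
2. Apply the port restrictions on all network interfaces, but permit communication between the private IPs on the private network.
The ports we currently restrict do not include a whitelist entry for the 169.254 (HAIP) range, and this does not affect CRS startup. If port restrictions that touch HAIP traffic are introduced later, a HAIP whitelist entry must likewise be added on the private network.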
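A quick way to see which interconnect addresses a whitelist leaves out is to compare the addresses configured on the private interface with the permitted source networks. This is a minimal sketch with a hypothetical function name; the addresses are the en13 values shown earlier in this post, and the permitted networks are illustrative.

```python
from ipaddress import ip_address, ip_network


def uncovered_addresses(interface_addrs, permitted_sources):
    """Return the interface addresses not covered by any permitted source network."""
    networks = [ip_network(net) for net in permitted_sources]
    return [addr for addr in interface_addrs
            if not any(ip_address(addr) in net for net in networks)]


if __name__ == "__main__":
    en13_addrs = ["192.168.43.21", "169.254.47.5"]  # private IP and HAIP on en13
    print(uncovered_addresses(en13_addrs, ["192.168.43.0/24"]))
    print(uncovered_addresses(en13_addrs, ["192.168.43.0/24", "169.254.0.0/16"]))
```

With only the private subnet permitted, the HAIP address is reported as uncovered; adding the 169.254.0.0/16 range clears the list.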