
Oracle RAC CRS-0184 --Cannot communicate with the CRS daemon

After installing RAC on Oracle 11gR2, starting CRS fails with the following error:

[[email protected] bin]# ./crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online

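The output shows that CRS (Cluster Ready Services) failed to start. CRS is a critical process: if it cannot start, Clusterware cannot start either. This problem has many possible causes.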
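When the status output is long, or collected from several nodes, it can help to filter it down to the components that are not online. The following is a minimal sketch, not part of the original troubleshooting: `offline_components` is a hypothetical helper name, and it assumes only the `CRS-nnnn: ... is online` message format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the CRS-nnnn lines that do NOT end in "is online".
# Input: the text of `crsctl check crs` on stdin.
offline_components() {
    # Keep only CRS-nnnn message lines, then drop the healthy ones.
    grep -E '^CRS-[0-9]+:' | grep -v 'is online[[:space:]]*$' || true
}

# Example using the output captured above (no cluster needed):
offline_components <<'EOF'
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
EOF
```

On a live node the same function could be fed directly, for example `./crsctl check crs | offline_components`.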

The log is as follows:

[[email protected] rac1]# tail -50 /u01/app/11.2.0/grid/log/rac1/crsd/crsd.log

ORA-15077: could not locate ASM instance serving a required diskgroup

2010-11-16 17:13:44.286: [  OCRASM][3046411024]proprasmo: kgfoCheckMount returned [7]

2010-11-16 17:13:44.286: [  OCRASM][3046411024]proprasmo: The ASM instance is down

2010-11-16 17:13:44.287: [  OCRRAW][3046411024]proprioo: Failed to open [+CRS]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.

2010-11-16 17:13:44.287: [  OCRRAW][3046411024]proprioo: No OCR/OLR devices are usable

2010-11-16 17:13:44.287: [  OCRASM][3046411024]proprasmcl: asmhandle is NULL

2010-11-16 17:13:44.287: [  OCRRAW][3046411024]proprinit: Could not open raw device

2010-11-16 17:13:44.287: [  OCRASM][3046411024]proprasmcl: asmhandle is NULL

2010-11-16 17:13:44.287: [  OCRAPI][3046411024]a_init:16!: Backend init unsuccessful : [26]

2010-11-16 17:13:44.288: [  CRSOCR][3046411024] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge

ORA-15077: could not locate ASM instance serving a required diskgroup

] [7]

2010-11-16 17:13:44.288: [    CRSD][3046411024][PANIC] CRSD exiting: Could not init OCR, code: 26

2010-11-16 17:13:44.288: [    CRSD][3046411024] Done.
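A quick way to get oriented in an excerpt like this is to list the distinct Oracle error codes it contains. The sketch below is an illustration only: `error_codes` is a hypothetical helper, and the `ORA-`, `PROC-` and `CRS-` prefixes are simply the ones that appear in this article, not a complete list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct ORA-/PROC-/CRS- code found on stdin.
error_codes() {
    # -o prints only the matching part of each line; sort -u removes repeats.
    grep -oE '(ORA|PROC|CRS)-[0-9]+' | sort -u
}

# Example with two lines taken from the log above:
error_codes <<'EOF'
2010-11-16 17:13:44.288: [  CRSOCR][3046411024] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage ASM error
ORA-15077: could not locate ASM instance serving a required diskgroup
EOF
```

Against the real file this would be something like `tail -50 /u01/app/11.2.0/grid/log/rac1/crsd/crsd.log | error_codes`, using the log path from this article.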

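       The log shows that the failure was caused by the ASM instance not being up. The issues behind this are fairly complex.

       This article does not analyze that problem in detail. Oracle Support has a note that explains it very thoroughly, which I have reposted on my blog. See:

       How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]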

In this Document

  Goal

  Solution

     Start up sequence:

     Cluster status

     Case 1: OHASD.BIN does not start

     Case 2: OHASD Agents does not start

     Case 3: CSSD.BIN does not start

     Case 4: CRSD.BIN does not start

     Case 5: GPNPD.BIN does not start

     Case 6: Various other daemons does not start

     Case 7: CRSD Agents does not start

     Network and Naming Resolution Verification

     Log File Location, Ownership and Permission

     Network Socket File Location, Ownership and Permission

     Diagnostic file collection

  References

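Here is how I approach the problem:

1. Check the logs to see whether they reveal the cause. If they do not pinpoint the problem, continue with the analysis below.

2. Work through the CRS startup sequence.

       The ASM instance has to start first, which brings storage into the picture.

       (1) Is the network working?

       (2) Is the storage mapped to the expected location? My test setup uses multipath, which maps the storage under /dev/mapper/*. When a problem occurs, I check whether the expected mappings are present in that directory.

       (3) Storage permissions. After mapping, the devices are owned by root by default. I added a script to /etc/rc.d/rc.local that changes the ownership, so the mapped device files are handed to the oracle user at boot.

3. If all of these check out, try restarting CRS or rebooting the operating system.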
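The ownership check in step 2 (3) can be scripted. This is a minimal sketch under stated assumptions: `check_owner` is a hypothetical helper, it relies on GNU `stat` (the `-c` option, as found on Linux), and the expected user name is whatever your installation uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every path whose owner is not the expected user.
# Usage: check_owner EXPECTED_USER PATH...
# Returns 0 if all paths match, 1 otherwise.
check_owner() {
    local expected="$1"
    shift
    local path owner rc=0
    for path in "$@"; do
        # -L follows symlinks, since /dev/mapper entries are often links.
        owner="$(stat -L -c '%U' "$path" 2>/dev/null)" || owner="(missing)"
        if [ "$owner" != "$expected" ]; then
            echo "$path: owner is $owner, expected $expected"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (assumes the devices should belong to a user named oracle):
#   check_owner oracle /dev/mapper/*
```

A non-zero return code means at least one device still has the wrong owner, which would point back to the rc.local script not having run or not covering that device.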

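Addendum:

       I also found online another cause, one that makes CSSD fail to start. What caught my attention is that it explains the purpose of the two directories /tmp/.oracle and /var/tmp/.oracle. Each time the server restarts, lock information is stored in these two directories. If the files in them cannot be deleted after a restart, the locks cannot be refreshed, and the daemon cannot start.

This also explains why these two directories have to be deleted when removing Clusterware.

My article on removing RAC mentions deleting these two directories during uninstallation. See:

       RAC Uninstallation Notes

Contents of crs.log: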

2007-04-11 14:37:34.020: [ COMMCRS][1693]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))

2007-04-11 14:37:34.020: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9

2007-04-11 14:37:34.021: [ CRSRTI][1] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2007-04-11 14:37:35.740: [ COMMCRS][1695]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))

2007-04-11 14:37:35.740: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9

When we checked ocssd.log, it contained the following:

[ CSSD]2007-04-11 12:53:56.211 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rdsk/c5t8d0s5)
[ CSSD]2007-04-11 12:53:56.211 [10] >TRACE: clssnmvKillBlockThread: spawned for disk 1 (/dev/rdsk/c5t9d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.211 [11] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/rdsk/c5t8d0s5) initial sleep interval (1000)ms
[ CSSD]2007-04-11 12:53:56.228 [1] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2007-04-11 12:53:56.269 [13] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=drdb1-priv)(PORT=49895))

[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))

[ CSSD]2007-04-11 12:53:56.279 [14] >ERROR: clssgmclientlsnr: listening failed for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1)) (3)
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))
[ CSSD]2007-04-11 13:07:36.516 >USER: Oracle Database 10g CSS Release 10.2.0.2.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[ clsdmt]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=drdb1DBG_CSSD))
[ CSSD]2007-04-11 13:07:36.516 >USER: CSS daemon log for node drdb1, number 1, in cluster crs
[ clsdmt]Terminating clsdm listening thread
[ CSSD]2007-04-11 13:07:36.536 [1] >TRACE: clssscmain: local-only set to false
[ CSSD]2007-04-11 13:07:36.545 [1] >TRACE: clssnmReadNodeInfo: added node 1 (drdb1) to cluster
[ CSSD]2007-04-11 13:07:36.588 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[ CSSD]2007-04-11 13:07:36.588 [1] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor

Solution:

       From the logs above, we realized that the listener of the CSS daemon was unable to start.

       The reason is that each time the server reboots, it creates a socket in the /tmp/.oracle or /var/tmp/.oracle directory.

       Sockets left over from a previous boot cannot be reused, and they are not deleted automatically from the .oracle directory.

       The problem was therefore solved by deleting all the files inside the .oracle directory in /var/tmp or /tmp.

After that, CRS started and the cluster came up.
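Rather than deleting the socket files outright, a more cautious variant is to rename the whole .oracle directory so it can be restored if needed. The sketch below is an assumption-laden illustration: `set_aside_oracle_dir` is a hypothetical helper, and it should only be run as root with Clusterware fully stopped on that node, because running daemons depend on these sockets.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename PARENT/.oracle to PARENT/.oracle.bak.SUFFIX
# and print the new path. SUFFIX defaults to a timestamp.
# Returns 1 and changes nothing if PARENT/.oracle is missing or the
# backup name is already taken.
set_aside_oracle_dir() {
    local parent="$1"
    local suffix="${2:-$(date +%Y%m%d%H%M%S)}"
    local src="$parent/.oracle"
    local dst="$parent/.oracle.bak.$suffix"
    if [ ! -d "$src" ]; then
        return 1
    fi
    if [ -e "$dst" ]; then
        return 1
    fi
    mv -- "$src" "$dst" && echo "$dst"
}

# Example calls (stop Clusterware on the node first):
#   set_aside_oracle_dir /tmp
#   set_aside_oracle_dir /var/tmp
```

Renaming instead of deleting keeps the old socket files around for inspection; the directory is expected to be recreated when the stack starts again, as described above.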

2

 Resolved: CRS-0184: Cannot communicate with the CRS daemon. 2013-05-09 08:59:09
 This morning I started RAC, and node 1 reported the CRS-0184 error:
 [[email protected] ~]$ crs_stat -t -v
CRS-0184: Cannot communicate with the CRS daemon.
Node 2 was completely normal.



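I found a fairly simple method online: there is a .oracle directory under /tmp and /var/tmp, and deleting it may solve the problem.

I logged in to node 1 and found a .oracle directory under both /tmp and /var/tmp.
In each of those directories I created a backup directory, moved .oracle into it, and restarted node 1.

Fix: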

mv  /var/tmp/.oracle  /var/tmp/.oracle.bak

After the restart, everything was back to normal.



I used the second method, and it worked well.
