How to troubleshoot a Kubernetes pod stuck in the ContainerCreating state
阿新 • Published: 2018-12-31
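After a lot of effort I finally got a k8s environment running locally, and while debugging yesterday I ran into a pod stuck in the ContainerCreating state.
This problem has several possible causes. In my case the image pull failed, with an error like "image pull failed for gcr.io/google_containers/pause:2.0". It turns out k8s pulls images from gcr.io/google_containers by default, which is not reachable from networks in mainland China. I had simply forgotten to connect to the VPN...
The problem itself is fairly trivial; what I mainly want to share is how to track it down. The key is to run "kubectl describe pod PodName" to view the events recorded for the pod. The error message can be found in that event list.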
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl run foo --image=hello-world
deployment "foo" created
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl get pods
NAME READY STATUS RESTARTS AGE
foo-928603113-igh2x 0/1 ContainerCreating 0 4m
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl describe pod foo
Name: foo-928603113-igh2x
Namespace: default
Node: 127.0.0.1/127.0.0.1
Start Time: Mon, 11 Apr 2016 15:11:49 +0000
Labels: pod-template-hash=928603113,run=foo
Status: Pending
IP:
Controllers: ReplicaSet/foo-928603113
Containers:
foo:
Container ID:
Image: hello-world
Image ID:
Port:
QoS Tier:
memory: BestEffort
cpu: BestEffort
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment Variables:
Conditions:
Type Status
Ready False
Volumes:
default-token-fbasq:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-fbasq
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
7m 7m 1 {default-scheduler } Normal Scheduled Successfully assigned foo-928603113-igh2x to 127.0.0.1
4m 4m 1 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for gcr.io/google_containers/pause:2.0, this may be because there are no credentials on this request. details: (API error (500): unable to ping registry endpoint https://gcr.io/v0/\nv2 ping attempt failed with error: Get https://gcr.io/v2/: dial tcp 74.125.203.82:443: i/o timeout\n v1 ping attempt failed with error: Get https://gcr.io/v1/_ping: dial tcp 74.125.203.82:443: i/o timeout\n)"
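In the event list above, the Warning row is the one that carries the actual error. As a minimal sketch, the helper below reads "kubectl describe pod" output on stdin and prints only the Warning rows from the Events section. The name warning_events is made up for illustration and is not part of kubectl; it also assumes the "Events:" header starts at the beginning of a line, as it does in the output shown above.

```shell
# Hypothetical helper (not a kubectl feature): print only the Warning rows
# that appear after the "Events:" header in `kubectl describe pod` output.
warning_events() {
  awk '
    /^Events:/ { in_events = 1; next }   # start filtering after the header
    in_events && /Warning/ { print }     # keep only Warning rows
  '
}

# Usage against a live cluster (PodName is a placeholder):
#   kubectl describe pod PodName | warning_events
```

This keeps the long describe output out of the way when the only thing of interest is why the container has not started.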
That evening I ran into a similar problem while trying to start kube-dns. Checking the kube-dns Service, everything looked normal:
vagrant@vagrant-ubuntu-trusty-64:~/work/k8s-foo$ kubectl get services kube-dns --namespace=kube-system
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns 10.0.0.10 <none> 53/UDP,53/TCP 56m
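However, after starting a Service, resolving it by its Service name through DNS failed. Running "kubectl get pods --namespace=kube-system" showed that the kube-dns pods had failed to start.
Inspecting the events of those pods with "kubectl describe" then showed that kube-dns also needs to download new images when it starts. I promptly turned on the VPN and restarted the cluster, and that was that.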
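The check described above, finding which pods in a namespace have not started, can be sketched as a small filter over "kubectl get pods" output. The helper name not_running is hypothetical, and the sketch assumes the column layout shown earlier in this post (NAME READY STATUS RESTARTS AGE), so STATUS is the third field.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# the name and status of every pod whose STATUS column is not "Running".
not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'   # NR > 1 skips the header row
}

# Usage against a live cluster:
#   kubectl get pods --namespace=kube-system | not_running
```

Each pod it prints is a candidate for "kubectl describe pod", which is where the image pull error showed up in both cases above.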