k8s installed with yum on CentOS 7: Pod stuck in ContainerCreating, and how to fix it

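Problem description

I installed a Kubernetes cluster with CentOS 7's yum package manager. After successfully creating a service with kubectl, I ran kubectl get pods and found that although AGE kept increasing, the status never moved past ContainerCreating.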

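In this post

  • Analyze the cause of the problem
  • Give a direct fix for it (not a perfect one)
  • Give an alternative approach

Let me walk you through it~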

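Analysis and fix

kubectl provides the describe subcommand, which prints detailed information about one or more specified resources.

Running kubectl describe pod mytomcat-9lcq5 to inspect the state of the problem Pod gives the following output: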

[root@kube-master app]# kubectl describe pod mytomcat-9lcq5
Name:		mytomcat-9lcq5
Namespace:	default
Node:		kube-node-2/192.168.87.145
Start Time:	Fri, 17 Apr 2020 15:53:50 +0800
Labels:		app=mytomcat
Status:		Pending
IP:		
Controllers:	ReplicationController/mytomcat
Containers:
  mytomcat:
    Container ID:		
    Image:			tomcat:9-jre8-alpine
    Image ID:			
    Port:			8080/TCP
    State:			Waiting
      Reason:			ContainerCreating
    Ready:			False
    Restart Count:		0
    Volume Mounts:		<none>
    Environment Variables:	<none>
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
No volumes.
QoS Class:	BestEffort
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  5m		5m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned mytomcat-9lcq5 to kube-node-2
  4m		4m		1	{kubelet kube-node-2}			Warning		FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.  details: (Get https://registry.access.redhat.com/v1/_ping: net/http: TLS handshake timeout)"

  3m	3m	1	{kubelet kube-node-2}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.  details: (Network timed out while trying to connect to https://registry.access.redhat.com/v1/repositories/rhel7/pod-infrastructure/images. You may want to check your internet connection or if you are behind a proxy.)"

  2m	2m	1	{kubelet kube-node-2}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.  details: (Error: image rhel7/pod-infrastructure:latest not found)"

  3m	1m	3	{kubelet kube-node-2}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/rhel7/pod-infrastructure:latest\""

Reading the events at the bottom of the output, Successfully assigned mytomcat-9lcq5 to kube-node-2 tells us the Pod was scheduled onto the host kube-node-2, and that creating the Pod on that host then failed.

The reported reason is image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.
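To pick the failing image names out of a long Events table, the describe output can be piped through a small filter. This is only a sketch and was not part of the original troubleshooting session: the function name failed_pull_images is made up for illustration, and it assumes the event messages use the "image pull failed for <image>," wording shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl describe pod` text on stdin and print
# each distinct image named in an "image pull failed for ..." message.
failed_pull_images() {
  # grep -o keeps only the matching fragment; the image is its 5th word.
  grep -o 'image pull failed for [^ ,]*' | awk '{print $5}' | sort -u
}

# Usage, with the Pod name from this article:
#   kubectl describe pod mytomcat-9lcq5 | failed_pull_images
```

For the output above this would print a single line, the pod-infrastructure image, which makes it clear that the Pod's own tomcat image was never even attempted.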

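From this we infer that pulling an image from Red Hat's own docker registry requires a CA certificate for verification before the pull can succeed.

Docker keeps its certificates under /etc/docker/certs.d. The domain in the error message is registry.access.redhat.com, so the certificate lives in the subdirectory of that name.

Checking with ll shows that /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt is a symbolic link pointing to /etc/rhsm/ca/redhat-uep.pem.

Anyone familiar with symlinks knows that a target shown in blinking red does not exist, so the certificate file /etc/rhsm/ca/redhat-uep.pem has to be generated.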
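The blinking-red check can also be done without relying on terminal colors. A minimal sketch, assuming a POSIX shell; the helper name is_dangling_symlink is hypothetical and not from the original post:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if $1 is a symlink whose target is missing.
is_dangling_symlink() {
  # -L tests the link itself; -e follows the link and fails if the target is absent.
  [ -L "$1" ] && [ ! -e "$1" ]
}

# Usage, with the path from this article:
#   crt=/etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt
#   is_dangling_symlink "$crt" && echo "missing target: $(readlink "$crt")"
```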

Generate the certificate:

# openssl s_client -showcerts -servername registry.access.redhat.com -connect registry.access.redhat.com:443 </dev/null 2>/dev/null | openssl x509 -text > /etc/rhsm/ca/redhat-uep.pem

This command sometimes fails with unable to load certificate 139930742028176:error:0906D06C:PEM routines:PEM_read_bio:no start line:pem_lib.c:707:Expecting: TRUSTED CERTIFICATE; simply run it again.

Once the command has finished, check the certificate file the symlink points to:
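Because the redirect in the command above truncates the destination file even when openssl fails, a failed run can leave an empty .pem behind. The sketch below wraps the same openssl pipeline in a retry loop and only installs the result once it contains a PEM block. The function names, the default of three attempts, and the two-second pause are assumptions made for illustration, not part of the original fix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file contains a PEM certificate header.
has_pem_block() {
  grep -q -- '-----BEGIN CERTIFICATE-----' "$1" 2>/dev/null
}

# Hypothetical helper: fetch the server certificate of $1 into $2,
# retrying up to $3 times (default 3).
fetch_registry_cert() {
  local host="$1" dest="$2" tries="${3:-3}" tmp i
  tmp=$(mktemp) || return 1
  i=1
  while [ "$i" -le "$tries" ]; do
    # Same pipeline as in the article, written to a temp file first.
    openssl s_client -showcerts -servername "$host" -connect "$host:443" \
      </dev/null 2>/dev/null | openssl x509 -text >"$tmp" 2>/dev/null
    if has_pem_block "$tmp"; then
      mkdir -p "$(dirname "$dest")"
      install -m 0644 "$tmp" "$dest"   # world-readable, like the file shown in the article
      rm -f "$tmp"
      return 0
    fi
    sleep 2
    i=$((i + 1))
  done
  rm -f "$tmp"
  return 1
}

# Usage, with the host and path from this article:
#   fetch_registry_cert registry.access.redhat.com /etc/rhsm/ca/redhat-uep.pem
```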

[root@kube-node-2 registry.access.redhat.com]# ll /etc/rhsm/ca/redhat-uep.pem
-rw-r--r-- 1 root root 9233 Apr 17 16:55 /etc/rhsm/ca/redhat-uep.pem

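Now that the certificate file exists, go to the k8s master node kube-master, delete the Pods from before, and wait for them to be recreated successfully (the second node failed to pull the image because of network problems...).

That completes the creation of the Pod.

Some problems remain, however. From networks inside China, access to outside networks is occasionally unreliable, so Pod creation can still fail. describe shows the same message as before, even though the certificate file exists and has content.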

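Root cause and an alternative approach

The k8s master schedules the Pod onto a worker node. Once there, the node pulls the Pod base image pod-infrastructure:latest from Red Hat's docker registry. That registry uses https and the certificate must be verified, so a missing certificate makes the pull fail.

In addition, because the image lives in Red Hat's docker registry, the handshake can fail from networks inside China and the image cannot be downloaded.

So the problem becomes how to work around the failed pull of the k8s pod-infrastructure image. Here is one approach, step by step: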

  • Pull a pod-infrastructure image that someone else has uploaded to the official docker registry: docker pull tianyebj/pod-infrastructure

  • Tag it with your private registry address, for example: docker tag tianyebj/pod-infrastructure 10.2.7.70:5000/dev/pod-infrastructure

  • Push the image to the private registry, for example: docker push 10.2.7.70:5000/dev/pod-infrastructure

  • Edit /etc/kubernetes/kubelet on every worker node, replacing registry.access.redhat.com/rhel7/pod-infrastructure with the tag you just set

    sed -i "s#registry.access.redhat.com/rhel7/pod-infrastructure#<private-registry pod-infrastructure image tag>#" /etc/kubernetes/kubelet
    

  • Restart kubelet on every worker node with systemctl restart kubelet, and you are done
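The config edit in the steps above can be wrapped in a small function that keeps a backup of the file. This is a sketch under assumptions: the function name rewrite_infra_image is hypothetical, and the sample config line in the comment is what the file is assumed to contain rather than something quoted from the original post. Note that the sed pattern leaves the trailing :latest in place, so the replacement should be given without a tag suffix.

```shell
#!/usr/bin/env bash
DEFAULT_INFRA_IMAGE='registry.access.redhat.com/rhel7/pod-infrastructure'

# Hypothetical helper: point the kubelet config at a different infra image.
#   $1 = new image name without a tag suffix (must not contain '#')
#   $2 = config file (defaults to the path used in this article)
rewrite_infra_image() {
  local new_image="$1" conf="${2:-/etc/kubernetes/kubelet}"
  # -i.bak edits in place and keeps the original as <conf>.bak
  sed -i.bak "s#${DEFAULT_INFRA_IMAGE}#${new_image}#" "$conf"
}

# Assumed shape of the line being edited:
#   KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest"
#
# The full sequence from the steps above, using the article's example registry address:
#   docker pull tianyebj/pod-infrastructure
#   docker tag tianyebj/pod-infrastructure 10.2.7.70:5000/dev/pod-infrastructure
#   docker push 10.2.7.70:5000/dev/pod-infrastructure
#   rewrite_infra_image 10.2.7.70:5000/dev/pod-infrastructure   # on each worker node
#   systemctl restart kubelet                                   # on each worker node
```

Keeping the .bak copy makes it easy to return to the Red Hat image later by restoring the file and restarting kubelet.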

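Notes:

  • The uploaded image must be public, otherwise kubelet itself has no permission to pull it. Alternatively, ssh into each worker node, log in to the registry, and run docker pull <private-registry pod-infrastructure image tag>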

References

https://github.com/CentOS/sig-atomic-buildscripts/issues/329
https://cloud.tencent.com/developer/article/1156329

This article is licensed under CC BY 4.0. When reposting, please credit the author and cite the source.
https://www.cnblogs.com/hellxz/p/k8s-pod-always-container-creating-status-problem.h