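Setting Up a Kubernetes (1.12.2) Cluster with Kubeadm
Kubeadm is the official Kubernetes tool for quickly installing a cluster. It is updated in step with every Kubernetes release and is expected to reach GA in 2018, which means it is getting close to being usable in production.
Setting up a cluster with Kubeadm should be simple, but k8s.gcr.io cannot be reached from mainland China, so the official tutorial cannot be followed as written. The tutorials available in China vary in quality and most of them do not run to completion. I hit many pitfalls before my deployment succeeded, so I am sharing the steps here.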
Prerequisites
- Several machines running Ubuntu 16.04+, CentOS 7, or HypriotOS v1.0.1+.
- At least 2 GB of RAM and 2 CPUs per machine.
- Full network connectivity between all machines in the cluster.
- The required ports opened; see Check required ports.
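- Disable the firewall and SELinux.
    # Stop and disable the firewall
    systemctl stop firewalld
    systemctl disable firewalld
    # Disable SELinux for the current boot
    setenforce 0
    # Make it permanent: edit the config file and set SELINUX=disabled
    vim /etc/selinux/config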
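- Turn off swap. Since Kubernetes 1.8 swap must be disabled; otherwise kubelet will not start with its default configuration.
    # Turn swap off for the running system
    sudo swapoff -a
    # To keep it off after a reboot, edit /etc/fstab and comment out the lines that reference swap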
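- Verify that the MAC address and product_uuid are unique.
  Kubernetes requires every machine in the cluster to have a different MAC address, product UUID, and hostname. You can check them with:
    # Product UUID
    sudo cat /sys/class/dmi/id/product_uuid
    # MAC address
    ip link
    # Hostname
    cat /etc/hostname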
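The preparation steps above can also be scripted instead of done by hand. The following is a minimal sketch and not part of the original walkthrough: the function names are hypothetical, and each file-editing function takes a path argument so it can be tried on a copy before touching anything under /etc.

```shell
# Hypothetical helpers for the preparation steps; the names are illustrative only.

# Set SELINUX=disabled in an SELinux config file (default: /etc/selinux/config).
set_selinux_disabled() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "${1:-/etc/selinux/config}"
}

# Comment out every active swap entry in an fstab-style file (default: /etc/fstab).
# Lines that are already comments are left untouched.
comment_out_swap() {
    sed -i -E '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/;}' "${1:-/etc/fstab}"
}

# Read one value per line on stdin (for example one product_uuid per host)
# and print each value that occurs more than once.
find_duplicates() {
    sort | uniq -d
}
```

On a real node these would run as root, for example `comment_out_swap /etc/fstab && swapoff -a`. An empty result from `find_duplicates` means the collected values are unique.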
This example uses two Ubuntu 18.04 hosts:
    cat /etc/hosts
    192.168.0.8 ubuntu1
    192.168.0.7 ubuntu2
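Installing Docker
Kubernetes has used the CRI (Container Runtime Interface) since 1.6. The default container runtime is still Docker, supported through the dockershim CRI implementation built into kubelet.
For installing Docker, see my earlier post: Docker初體驗.
Note that Kubernetes 1.12 has been validated against Docker 1.11.1, 1.12.1, 1.13.1, 17.03, 17.06, 17.09, and 18.06. The minimum supported version is 1.11.1 and the maximum is 18.06, while the latest Docker release is already 18.09, so we pin the version to 18.06.1-ce when installing:
    sudo apt install docker-ce=18.06.1~ce~3-0~ubuntu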
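Before going further it can help to confirm that the installed Docker falls within the validated range. The sketch below is an illustration only: the function name is hypothetical, and it assumes that any patch release within a listed release line is acceptable, which is looser than the exact versions named above.

```shell
# Succeeds if the given Docker version belongs to a release line that
# Kubernetes 1.12 was validated against (list taken from the text above).
# Assumption: any patch release within a listed line is treated as acceptable.
is_validated_docker_version() {
    case "$1" in
        1.11.*|1.12.*|1.13.*|17.03.*|17.06.*|17.09.*|18.06.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A possible use is `is_validated_docker_version "$(docker version --format '{{.Server.Version}}')" || echo "Docker version not validated for Kubernetes 1.12"`.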
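Installing kubeadm, kubelet, and kubectl
Before deploying, we need to install the following three packages:
- kubeadm: the command-line tool that bootstraps the k8s cluster.
- kubelet: the core component that runs on every node in the cluster and performs operations such as starting pods and containers.
- kubectl: the command-line tool for operating the cluster.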
First add the apt key:
    sudo apt update && sudo apt install -y apt-transport-https curl
    curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
Add the Kubernetes source:
    sudo vim /etc/apt/sources.list.d/kubernetes.list
    # Add this line to the file:
    deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
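If the setup is to be repeated on several machines, the source file can be written without opening an editor. This is a minimal sketch with a hypothetical function name; the target path is passed as an argument so it can be tried on a scratch file first.

```shell
# Write the Kubernetes apt source line used in this article to the given file.
# Run as root when the target is under /etc/apt/sources.list.d/.
write_k8s_apt_source() {
    printf 'deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main\n' > "$1"
}
```

On a node this would be called as root, for example `write_k8s_apt_source /etc/apt/sources.list.d/kubernetes.list`.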
Install:
    sudo apt update
    sudo apt install -y kubelet kubeadm kubectl
    sudo apt-mark hold kubelet kubeadm kubectl
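Creating a Single-Master Cluster with kubeadm
Initializing the Master Node
The k8s control plane components, including etcd and the API server (which kubectl uses to talk to k8s), run on the Master node.
Before running the initialization, there are three points to note: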
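- Choose a network plugin and check whether it needs any parameters set when the Master is initialized; for example, depending on the plugin we may need to set --pod-network-cidr. See: Installing a pod network add-on.
- kubeadm uses the network interface associated with the default gateway (usually the one with the internal IP, such as eth0) as the Master's advertise address. To use a different interface, set --apiserver-advertise-address=<ip-address>. With IPv6, an IPv6 address must be given, for example: --apiserver-advertise-address=fd00::101.
- Because of network problems in China, it is advisable to run kubeadm config images pull beforehand to pull the images needed for initialization and to check whether the gcr.io registries are reachable.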
Obviously gcr.io cannot be reached from China. My previous post, 使用kubeadm搭建Kubernetes(1.10.2)叢集(國內環境), worked around this by re-tagging images; this time we do it by modifying the configuration file.
kubeadm v1.11+ adds a kubeadm config print-default command, which makes it easy to dump kubeadm's default configuration to a file:
    kubeadm config print-default > kubeadm.conf
Then we change the image repository address in kubeadm.conf:
    sed -i "s/imageRepository: .*/imageRepository: registry.aliyuncs.com\/google_containers/g" kubeadm.conf
Set the version we want explicitly, so that initialization does not try to read it from https://dl.k8s.io/release/stable-1.12.txt:
    sed -i "s/kubernetesVersion: .*/kubernetesVersion: v1.12.2/g" kubeadm.conf
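The two substitutions can be wrapped in one small function, which also makes it easy to rehearse them on a copy of the file. This is a sketch under assumptions: the function name is hypothetical, and the repository value must not contain the `|` character because it is used as the sed delimiter.

```shell
# Point a kubeadm config file at another image repository and pin the version.
# Arguments: <config file> <image repository> <kubernetes version>
# Using '|' as the sed delimiter avoids escaping the slashes in the repository.
patch_kubeadm_conf() {
    sed -i \
        -e "s|imageRepository: .*|imageRepository: $2|" \
        -e "s|kubernetesVersion: .*|kubernetesVersion: $3|" \
        "$1"
}
```

For the values used in this article the call would be `patch_kubeadm_conf kubeadm.conf registry.aliyuncs.com/google_containers v1.12.2`.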
Now we can run kubeadm's images pull command, passing the kubeadm.conf file with the --config parameter:
    kubeadm config images pull --config kubeadm.conf
    W1103 06:10:18.782958   23149 common.go:105] WARNING: Detected resource kinds that may not apply: [InitConfiguration MasterConfiguration JoinConfiguration NodeConfiguration]
    [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1alpha3, Kind=JoinConfiguration
    [config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.12.2
    [config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.12.2
    [config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.12.2
    [config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.12.2
    [config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.1
    [config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.2.24
    [config/images] Pulled registry.aliyuncs.com/google_containers/coredns:1.2.2
As shown, the required images were pulled successfully.
There is one more pitfall, though. The pull address of the pause base image has to be set separately; otherwise kubelet still pulls it from k8s.gcr.io, which makes init hang and eventually fail:
    [init] this might take a minute or longer if the control plane images have to be pulled
    Unfortunately, an error has occurred:
        timed out waiting for the condition
    This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
There are two ways to fix this.
The simplest is to tag the mirrored image as k8s.gcr.io/pause:3.1:
    docker tag registry.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
Alternatively, set the base image through the nodeRegistration.kubeletExtraArgs.pod-infra-container-image parameter of InitConfiguration in kubeadm.conf (around line 14). After the change it looks like this:
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        pod-infra-container-image: registry.aliyuncs.com/google_containers/pause:3.1
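For the tagging approach, the k8s.gcr.io name can be derived from the mirror name instead of being typed out, which is handy when the same step is repeated on every node. A minimal sketch follows; both function names are hypothetical, and it assumes the image name after the last slash is the same on the mirror and on k8s.gcr.io, as it is for pause:3.1 here.

```shell
# Map a mirrored image reference to its k8s.gcr.io name, e.g.
# registry.aliyuncs.com/google_containers/pause:3.1 -> k8s.gcr.io/pause:3.1
gcr_name() {
    printf 'k8s.gcr.io/%s\n' "${1##*/}"
}

# Pull an image from the mirror and tag it with its k8s.gcr.io name.
pull_and_retag() {
    docker pull "$1" && docker tag "$1" "$(gcr_name "$1")"
}
```

With the mirror used in this article the call is `pull_and_retag registry.aliyuncs.com/google_containers/pause:3.1`.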
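Normally we might also pass parameters such as --apiserver-advertise-address and --pod-network-cidr to init. Because we initialize from the kubeadm.conf configuration file here, other parameters can no longer be given on the command line, so they have to be set in kubeadm.conf.
Below we change advertiseAddress in kubeadm.conf, which corresponds to --apiserver-advertise-address. My VM's IP is 192.168.0.8; adjust it to your own environment:
    sed -i "s/advertiseAddress: .*/advertiseAddress: 192.168.0.8/g" kubeadm.conf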
This example uses the Canal network plugin, so --pod-network-cidr must be 10.244.0.0/16. In kubeadm.conf the corresponding field is podSubnet:
    sed -i "s/podSubnet: .*/podSubnet: \"10.244.0.0\/16\"/g" kubeadm.conf
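Since sed reports nothing when a pattern fails to match, it is worth reading the values back before running init. The helper below is a sketch with a hypothetical name; it assumes each key appears as a simple `key: value` line, which holds for the fields edited above.

```shell
# Print the value of the first "key: value" line in a kubeadm config file,
# with double quotes removed. Arguments: <key> [config file, default kubeadm.conf]
conf_value() {
    awk -v key="$1:" '$1 == key { gsub(/"/, "", $2); print $2; exit }' "${2:-kubeadm.conf}"
}
```

For example, `conf_value advertiseAddress` and `conf_value podSubnet` should print the address and subnet that were just set.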
Now we can run the initialization command:
    sudo kubeadm init --config kubeadm.conf
The output is as follows:
    W1109 17:01:47.071494   42929 common.go:105] WARNING: Detected resource kinds that may not apply: [InitConfiguration MasterConfiguration JoinConfiguration NodeConfiguration]
    [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1alpha3, Kind=JoinConfiguration
    [init] using Kubernetes version: v1.12.2
    [preflight] running pre-flight checks
    [preflight/images] Pulling images required for setting up a Kubernetes cluster
    [preflight/images] This might take a minute or two, depending on the speed of your internet connection
    [preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
    [kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [preflight] Activating the kubelet service
    [certificates] Generated ca certificate and key.
    [certificates] Generated apiserver certificate and key.
    [certificates] apiserver serving cert is signed for DNS names [ubuntu1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.8]
    [certificates] Generated apiserver-kubelet-client certificate and key.
    [certificates] Generated front-proxy-ca certificate and key.
    [certificates] Generated front-proxy-client certificate and key.
    [certificates] Generated etcd/ca certificate and key.
    [certificates] Generated etcd/server certificate and key.
    [certificates] etcd/server serving cert is signed for DNS names [ubuntu1 localhost] and IPs [127.0.0.1 ::1]
    [certificates] Generated apiserver-etcd-client certificate and key.
    [certificates] Generated etcd/peer certificate and key.
    [certificates] etcd/peer serving cert is signed for DNS names [ubuntu1 localhost] and IPs [192.168.0.8 127.0.0.1 ::1]
    [certificates] Generated etcd/healthcheck-client certificate and key.
    [certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
    [certificates] Generated sa key and public key.
    [kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
    [kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
    [kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
    [kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
    [controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
    [controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
    [controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
    [etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
    [init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
    [init] this might take a minute or longer if the control plane images have to be pulled
    [apiclient] All control plane components are healthy after 57.002438 seconds
    [uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [kubelet] Creating a ConfigMap "kubelet-config-1.12" in namespace kube-system with the configuration for the kubelets in the cluster
    [markmaster] Marking the node ubuntu1 as master by adding the label "node-role.kubernetes.io/master=''"
    [markmaster] Marking the node ubuntu1 as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
    [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ubuntu1" as an annotation
    [bootstraptoken] using token: abcdef.0123456789abcdef
    [bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy
    Your Kubernetes master has initialized successfully!
    To start using your cluster, you need to run the following as a regular user:
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    You can now join any number of machines by running the following on each node as root:
      kubeadm join 192.168.0.8:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:67ea537411822fe684d1ddb984802da62a4f22aa1c32fefe7c3404bb8f3f52e0
To operate kubectl as a non-root user, run the following commands, which are also part of the kubeadm init output:
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
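Installing a Network Plugin
For pods to communicate with each other we must install a network plugin, and it has to be installed before any application is deployed. CoreDNS also starts only after the network plugin is installed.
For the full list of network plugins, see Networking and Network Policy.
Before installing, let's look at the current state of the pods:
    kubectl get pods --all-namespaces
    # Output
    NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
    kube-system   coredns-5c545769d8-j9vzw          0/1     Pending   0          110s
    kube-system   coredns-5c545769d8-wqrlm          0/1     Pending   0          111s
    kube-system   etcd-ubuntu1                      1/1     Running   0          75s
    kube-system   kube-apiserver-ubuntu1            1/1     Running   0          87s
    kube-system   kube-controller-manager-ubuntu1   1/1     Running   0          96s
    kube-system   kube-proxy-snhqr                  1/1     Running   0          111s
    kube-system   kube-scheduler-ubuntu1            1/1     Running   0          98s
As shown, CoreDNS is Pending precisely because no network plugin has been installed yet.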
I would normally recommend the Calico network plugin, but my VMs are on the 192.168.0.x network, which overlaps Calico's default pod CIDR of 192.168.0.0/16, so I used Canal instead. Canal combines Calico and Flannel. The pod subnet 10.244.0.0/16 that we configured for kubeadm init above (the equivalent of --pod-network-cidr=10.244.0.0/16) is what Canal requires.
Install the Canal plugin with the following commands:
    # Original: https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/rbac.yaml
    kubectl apply -f http://mirror.faasx.com/k8s/canal/v3.3/rbac.yaml
    # Original: https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/canal.yaml
    # The only change is that quay.io is replaced with a mirror in China
    kubectl apply -f http://mirror.faasx.com/k8s/canal/v3.3/canal.yaml
For more about Canal, see Installing Calico for policy and flannel for networking.
Wait a moment, then run kubectl get pods --all-namespaces again to check the network plugin installation:
    NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
    kube-system   canal-frf6b                       3/3     Running   3          25m
    kube-system   coredns-5c545769d8-j9vzw          1/1     Running   2          9h
    kube-system   coredns-5c545769d8-wqrlm          1/1     Running   2          9h
    kube-system   etcd-ubuntu1                      1/1     Running   1          9h
    kube-system   kube-apiserver-ubuntu1            1/1     Running   1          9h
    kube-system   kube-controller-manager-ubuntu1   1/1     Running   1          9h
    kube-system   kube-proxy-snhqr                  1/1     Running   1          9h
    kube-system   kube-scheduler-ubuntu1            1/1     Running   1          9h
Every STATUS is now Running, which means the installation succeeded. Next we can join the other nodes and deploy applications.
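Rather than scanning the table by eye, the pods that are not yet Running can be filtered out. This is a small sketch with a hypothetical function name; it relies on the column order shown above (STATUS is the fourth column) and on kubectl's `--no-headers` flag to drop the header row.

```shell
# Read `kubectl get pods --all-namespaces --no-headers` output on stdin and
# print namespace/name for every pod whose STATUS column is not "Running".
pods_not_running() {
    awk '$4 != "Running" { print $1 "/" $2 }'
}
```

It would be used as `kubectl get pods --all-namespaces --no-headers | pods_not_running`; no output means every listed pod is Running. Pods of finished jobs would also be listed, because their status is not Running.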
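Master Isolation
By default, for security reasons, the cluster does not schedule pods on the Master node. In a development environment we may have only a single Master node, in which case the restriction can be lifted with:
    kubectl taint nodes --all node-role.kubernetes.io/master-
    # Output
    node/ubuntu1 untainted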
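Joining Worker Nodes
To add a worker node to the cluster, do the following on each machine:
- SSH into the machine.
- Become root (for example: sudo su -).
- Run the command printed by kubeadm init above: kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>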
If we have forgotten the Master's join token, we can list it with:
    kubeadm token list
    # Output:
    # TOKEN                     TTL   EXPIRES                USAGES                   DESCRIPTION   EXTRA GROUPS
    # abcdef.0123456789abcdef   22h   2018-11-10T14:24:51Z   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token
By default a token is valid for 24 hours. If the token has expired, generate a new one with:
    kubeadm token create
    # Output:
    # 9w6mbu.3k2z7pprl3eaozk9
If we do not have the --discovery-token-ca-cert-hash value either, it can be computed with:
    openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
    # Output:
    # 9fcb02a0f4ab216866f87986106437b7305474850f0de81b9ac9c36a468f7c67
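Once the token and the hash are known, the three pieces have to be assembled into a join command. The sketch below shows one way to do that with a basic sanity check on the token; the function names are hypothetical. kubeadm can also print a ready-made command with `kubeadm token create --print-join-command`, which avoids the manual steps.

```shell
# Succeeds if the argument has the bootstrap token shape: 6 characters, a dot,
# then 16 characters, all lowercase letters or digits.
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Print the join command. Arguments: <master-ip:port> <token> <ca-cert-hash>
build_join_command() {
    is_bootstrap_token "$2" || { echo "invalid token: $2" >&2; return 1; }
    printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' "$1" "$2" "$3"
}
```

For example, `build_join_command 192.168.0.8:6443 "$(kubeadm token create)" "$hash"`, where `hash` holds the output of the openssl pipeline.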
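Now we log in to the worker node server and get ready to join the cluster.
The most important point again is that the pause base image has to be handled separately; otherwise it is still pulled from k8s.gcr.io. We could modify a configuration file as we did for init, but since this is the only image that fails to pull, simply tagging it is enough:
    docker pull registry.aliyuncs.com/google_containers/pause:3.1
    docker tag registry.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
Then run the following command to join the cluster:
    sudo kubeadm join 192.168.0.8:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:67ea537411822fe684d1ddb984802da62a4f22aa1c32fefe7c3404bb8f3f52e0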
The output is as follows:
    [preflight] running pre-flight checks
            [WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{} ip_vs_rr:{}]
    you can solve this problem with following methods:
     1. Run 'modprobe -- ' to load missing kernel modules;
     2. Provide the missing builtin kernel ipvs support
    [discovery] Trying to connect to API Server "192.168.0.8:6443"
    [discovery] Created cluster-info discovery client, requesting info from "https://192.168.0.8:6443"
    [discovery] Requesting info from "https://192.168.0.8:6443" again to validate TLS against the pinned public key
    [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.0.8:6443"
    [discovery] Successfully established connection with API Server "192.168.0.8:6443"
    [kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
    [kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [preflight] Activating the kubelet service
    [tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
    [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ubuntu2" as an annotation
    This node has joined the cluster:
    * Certificate signing request was sent to apiserver and a response was received.
    * The Kubelet was informed of the new secure connection details.
    Run 'kubectl get nodes' on the master to see this node join the cluster.
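After a short wait, run kubectl get nodes on the Master node to check the node status:
    kubectl get nodes
    # Output:
    # NAME      STATUS   ROLES    AGE     VERSION
    # ubuntu1   Ready    master   9h      v1.12.2
    # ubuntu2   Ready    <none>   2m24s   v1.12.2
Both nodes are Ready, so the cluster is up. We can now run a few commands to test it.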
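The "wait and check again" step can be automated with a small retry loop. The following is a sketch, not something the original setup used: all three function names are hypothetical, and the parsing assumes the column order shown above (STATUS is the second column).

```shell
# Run a command repeatedly until it succeeds.
# Arguments: <max attempts> <seconds between attempts> <command...>
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
wait_until() {
    _attempts=$1
    _delay=$2
    shift 2
    _i=0
    while [ "$_i" -lt "$_attempts" ]; do
        "$@" && return 0
        _i=$((_i + 1))
        sleep "$_delay"
    done
    return 1
}

# Succeeds when stdin lists at least one node and every node's STATUS is "Ready".
all_nodes_ready() {
    awk '$2 != "Ready" { bad = 1 } END { exit ((NR == 0 || bad) ? 1 : 0) }'
}

# One check of the live cluster, suitable as the command for wait_until.
nodes_ready_now() {
    kubectl get nodes --no-headers | all_nodes_ready
}
```

On the Master, `wait_until 30 10 nodes_ready_now` would poll every 10 seconds for up to 30 attempts.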
Testing
First verify that kube-apiserver, kube-controller-manager, kube-scheduler, and the pod network work:
    # Deploy an Nginx Deployment with two pods
    # https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
    kubectl create deployment nginx --image=nginx:alpine
    kubectl scale deployment nginx --replicas=2
    # Verify that the Nginx pods are running and were assigned pod IPs starting with 10.244.
    kubectl get pods -l app=nginx -o wide
    # Output:
    # NAME                     READY   STATUS    RESTARTS   AGE   IP           NODE      NOMINATED NODE
    # nginx-65d5c4f7cc-7pzgp   1/1     Running   0          88s   10.244.1.2   ubuntu2   <none>
    # nginx-65d5c4f7cc-l2h26   1/1     Running   0          82s   10.244.1.3   ubuntu2   <none>
Next verify that kube-proxy works:
    # Expose the service externally as a NodePort
    # https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/
    kubectl expose deployment nginx --port=80 --type=NodePort
    # Look up the port reachable from outside the cluster
    kubectl get services nginx
    # Output:
    # NAME    TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
    # nginx   NodePort   10.110.142.125   <none>        80:30092/TCP   7s
    # The service is reachable from outside the cluster at any NodeIP:Port.
    # The two nodes in this example are 192.168.0.8 and 192.168.0.7.
    curl http://192.168.0.8:30092
    curl http://192.168.0.7:30092
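The node port is assigned by the cluster, so it differs between runs. A minimal sketch for extracting it from the PORT(S) value shown above follows; the function name is hypothetical. As an alternative, kubectl can return the field directly with `kubectl get service nginx -o jsonpath='{.spec.ports[0].nodePort}'`.

```shell
# Extract the node port from a PORT(S) value such as "80:30092/TCP".
node_port() {
    _p=${1#*:}                  # drop the service port and the colon
    printf '%s\n' "${_p%%/*}"   # drop the protocol suffix
}
```

For example, `curl "http://192.168.0.8:$(node_port 80:30092/TCP)"` requests the service through the first node.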
Finally verify that DNS and the pod network work:
    # Run Busybox and enter interactive mode
    kubectl run -it curl --image=radial/busyboxplus:curl
    # Run `nslookup nginx` to check that the name resolves to the in-cluster IP, which verifies DNS
    [ root@curl-5cc7b478b6-tlf46:/ ]$ nslookup nginx
    # Output:
    # Server:    10.96.0.10
    # Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
    #
    # Name:      nginx
    # Address 1: 10.110.142.125 nginx.default.svc.cluster.local
    # Access the service by name to verify kube-proxy
    [ root@curl-5cc7b478b6-tlf46:/ ]$ curl http://nginx/
    # Output:
    # <!DOCTYPE html> ---omitted
    # Access the pod IPs of the two pods directly to verify cross-node networking
    [ root@curl-5cc7b478b6-tlf46:/ ]$ curl http://10.244.1.2/
    [ root@curl-5cc7b478b6-tlf46:/ ]$ curl http://10.244.1.3/
All checks pass and the cluster is set up. From here we can follow the official documentation to deploy other services and start experimenting.
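Tearing Down the Cluster
To undo what kubeadm did, first drain the node and make sure it is empty before shutting it down.
On the Master node, run:
    kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
    kubectl delete node <node name>
Then, on the node being removed, reset the state installed by kubeadm:
    sudo kubeadm reset
To reconfigure the cluster, simply run kubeadm init or kubeadm join again with the new parameters.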