Quickly Setting Up a Kubernetes Cluster with Kubeadm (1.13+)

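Kubeadm is an important tool for managing the cluster lifecycle, from creation through configuration to upgrades. It bootstraps production clusters on existing hardware and configures the core Kubernetes components according to best practices, so that new nodes can join through a secure yet simple flow and upgrades stay easy. With the release of Kubernetes 1.13, kubeadm has officially reached GA.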

Preparation

First, prepare two virtual machines (at least 2 CPU cores each). I used Hyper-V to create two Ubuntu 18.04 VMs with the following IPs and hostnames:

172.17.20.210 master

172.17.20.211 node1

Disabling Swap

Starting with Kubernetes 1.8, swap must be disabled; if it is left on, the kubelet will not start under its default configuration.

Edit the /etc/fstab file:

sudo vim /etc/fstab

UUID=8be04efd-f7c5-11e8-be8b-00155d000500 / ext4 defaults 0 0
UUID=C0E3-6A72 /boot/efi vfat defaults 0 0
#/swap.img      none    swap    sw      0       0

As shown above, comment out the line containing /swap.img, then run:

sudo swapoff -a
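The manual edit above can also be previewed non-interactively. The following is a minimal sketch, not part of the original walkthrough: `comment_out_swap` is a hypothetical helper that prints an fstab-style file with its active swap entries commented out, without modifying the file itself.

```shell
# Hypothetical helper (an assumption, not from the original post): print an
# fstab-style file with every active swap entry commented out.
# It only writes to stdout, so the real file is never touched.
comment_out_swap() {
  # A line counts as a swap entry when it is not already a comment and its
  # third field (the filesystem type) is "swap".
  awk '!/^[[:space:]]*#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Try it on a throwaway sample rather than the real /etc/fstab.
sample=$(mktemp)
printf '%s\n' 'UUID=C0E3-6A72 /boot/efi vfat defaults 0 0' \
              '/swap.img none swap sw 0 0' > "$sample"
comment_out_swap "$sample"
rm -f "$sample"
```

Once the preview looks right, the same edit can be applied to /etc/fstab by hand. After `sudo swapoff -a`, running `swapon --show` should print nothing if no swap is active.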

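(Optional) DNS Configuration

On Ubuntu 18.04 and later, DNS is fully managed by systemd-resolved, which listens on 127.0.0.53:53; its configuration file is /etc/systemd/resolved.conf.

This can sometimes cause domain names to fail to resolve. Either of the following two approaches fixes it:

1. The simplest is to stop and disable the systemd-resolved service: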

sudo systemctl stop systemd-resolved
sudo systemctl disable systemd-resolved

Then edit /etc/resolv.conf by hand.

2. The more recommended approach is to change the systemd-resolved settings:

sudo vim /etc/systemd/resolved.conf

# Change it to the following
[Resolve]
DNS=1.1.1.1 1.0.0.1
#FallbackDNS=
#Domains=
LLMNR=no
#MulticastDNS=no
#DNSSEC=no
#Cache=yes
#DNSStubListener=yes

DNS= sets the IP addresses of the DNS servers; here they are 1.1.1.1 and 1.0.0.1.
LLMNR=no disables LLMNR (Link-Local Multicast Name Resolution); otherwise systemd-resolved listens on port 5355.
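To see which of the two approaches a host is currently using, it can help to check whether /etc/resolv.conf still points at the 127.0.0.53 stub. This is a small sketch; `uses_stub_resolver` is a hypothetical helper name, not a systemd tool.

```shell
# Hypothetical helper (an assumption for illustration): succeed if a
# resolv.conf-style file names the systemd-resolved stub at 127.0.0.53.
uses_stub_resolver() {
  grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.0\.0\.53([[:space:]]|$)' "$1"
}

if uses_stub_resolver /etc/resolv.conf 2>/dev/null; then
  echo "resolv.conf still points at the systemd-resolved stub"
else
  echo "resolv.conf does not use the 127.0.0.53 stub"
fi
```

After editing /etc/systemd/resolved.conf, the change takes effect once the service is restarted, for example with `sudo systemctl restart systemd-resolved`.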

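Installing Docker

Since 1.6, Kubernetes has used CRI (Container Runtime Interface). The default container runtime is still Docker, which is enabled through the dockershim CRI implementation built into the kubelet.

For installing Docker, see my earlier post, "Docker初體驗" (A First Look at Docker).

Note that Kubernetes 1.13 has been validated against Docker 1.11.1, 1.12.1, 1.13.1, 17.03, 17.06, 17.09, and 18.06. The lowest supported Docker version is 1.11.1 and the highest is 18.06, whereas the latest Docker release is already 18.09, so we pin the version to 18.06.1-ce when installing: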

sudo apt install docker-ce=18.06.1~ce~3-0~ubuntu
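As a quick sanity check, an installed Docker version can be compared against the validated list quoted above. The sketch below hard-codes that list; `is_validated_docker` is a hypothetical helper, and the list would need updating for other Kubernetes releases.

```shell
# Hypothetical helper (an assumption for illustration): succeed if a Docker
# version string belongs to one of the releases the text lists as validated
# for Kubernetes 1.13.
is_validated_docker() {
  case "$1" in
    1.11.1|1.11.1-*|1.12.1|1.12.1-*|1.13.1|1.13.1-*) return 0 ;;
    17.03|17.03.*|17.06|17.06.*|17.09|17.09.*|18.06|18.06.*) return 0 ;;
    *) return 1 ;;
  esac
}

for v in 18.06.1-ce 18.09.0; do
  if is_validated_docker "$v"; then
    echo "$v: validated"
  else
    echo "$v: not in the validated list"
  fi
done
```

On a real host the version string could come from `docker version --format '{{.Server.Version}}'`.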

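Installing kubeadm, kubelet, and kubectl

Before deploying, we need to install three packages:

  • kubeadm: the command-line tool that bootstraps the Kubernetes cluster.

  • kubelet: the core component that runs on every node in the cluster and performs operations such as starting pods and containers.

  • kubectl: the command-line tool for operating the cluster.

First, add the apt key: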

sudo apt update && sudo apt install -y apt-transport-https curl
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -

Add the Kubernetes repository:

sudo vim /etc/apt/sources.list.d/kubernetes.list

deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main

Install:

sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
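The install command above takes whatever version the repository currently offers, which may be newer than the v1.13.1 used in the rest of this walkthrough. One possible way to keep them aligned is to pin the package versions. This is a sketch only: `k8s_packages` is a hypothetical helper, and the `-00` revision suffix is an assumption about how this repository names its packages.

```shell
# Hypothetical helper (an assumption for illustration): build the apt package
# arguments for one Kubernetes version. The "-00" suffix is assumed; check
# the versions the repository really offers before relying on it.
k8s_packages() {
  echo "kubelet=$1-00 kubeadm=$1-00 kubectl=$1-00"
}

k8s_packages 1.13.1
```

The result could then be passed to apt, for example `sudo apt install -y $(k8s_packages 1.13.1)`; `apt-cache madison kubeadm` lists the versions that are actually available.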

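Creating a Single-Master Cluster with kubeadm

Initializing the Master Node

The Kubernetes control plane components run on the master node, including etcd and the API server (kubectl talks to the cluster through the API server).

Before running the initialization, there are three points to note:

1. Choose a network add-on and check whether it requires any arguments when the master is initialized; for example, depending on the add-on, we may need to set --pod-network-cidr. See: Installing a pod network add-on

2. kubeadm uses the network interface associated with the default gateway (typically eth0, with an internal IP) as the master's advertise address. To use a different interface, set --apiserver-advertise-address=<ip-address>. When using IPv6, an IPv6 address must be given, for example: --apiserver-advertise-address=fd00::101

3. Use kubeadm config images pull to pull the images needed for initialization in advance, which also checks that the Kubernetes registries are reachable.

The default Kubernetes registry is k8s.gcr.io. Since gcr.io is not reachable from mainland China, installing with kubeadm versions before v1.13 was very cumbersome. Version 1.13 finally addresses this pain point by adding an --image-repository flag, whose default value is k8s.gcr.io. We set it to a mirror in China, registry.aliyuncs.com/google_containers, and everything else can then follow the official documentation.

Next, we also need to set --kubernetes-version, because its default value is stable-1, which makes kubeadm download the latest version number from https://dl.k8s.io/release/stable-1.txt. Pinning it to a fixed version (the latest at the time of writing: v1.13.1) skips that network request.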
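The same three settings can alternatively be kept in a kubeadm configuration file, which both the image pre-pull and the init step can read. The sketch below assumes the kubeadm.k8s.io/v1beta1 config API that ships with 1.13; `write_kubeadm_config` and the file name are hypothetical.

```shell
# Hypothetical helper (an assumption for illustration): write a kubeadm config
# file that mirrors --kubernetes-version, --image-repository and
# --pod-network-cidr as used in this walkthrough.
write_kubeadm_config() {
  cat > "$1" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.1
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 192.168.0.0/16
EOF
}

write_kubeadm_config ./kubeadm-config.yaml
cat ./kubeadm-config.yaml
```

With such a file in place, `kubeadm config images pull --config ./kubeadm-config.yaml` would pre-pull the images from the mirror, and `sudo kubeadm init --config ./kubeadm-config.yaml` would stand in for passing the three flags on the command line.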

Now let's try it:

# Use the Calico network: --pod-network-cidr=192.168.0.0/16
sudo kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.13.1 --pod-network-cidr=192.168.0.0/16

# Output
[init] Using Kubernetes version: v1.13.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.17.20.210]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [172.17.20.210 127.0.0.1 ::1]
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [172.17.20.210 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 42.003645 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master" as an annotation
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 6pkrlg.8glf2fqpuf3i489m
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 172.17.20.210:6443 --token 6pkrlg.8glf2fqpuf3i489m --discovery-token-ca-cert-hash sha256:eebfe256113bee397b218ba832f412273ae734bd4686241fb910885d26efd222

This time the deployment went through without a hitch. To operate kubectl as a non-root user, run the following commands, which are also part of the kubeadm init output:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

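Installing a Network Add-on

For Pods to communicate with each other, we must install a network add-on, and it must be installed before any application is deployed. CoreDNS also starts only after the network add-on is installed.

For the full list of network add-ons, see Networking and Network Policy.

Before installing, let's check the current state of the Pods: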

kubectl get pods --all-namespaces

# Output
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-78d4cf999f-6pgfr         0/1     Pending   0          87s
kube-system   coredns-78d4cf999f-m9kgs         0/1     Pending   0          87s
kube-system   etcd-master                      1/1     Running   0          47s
kube-system   kube-apiserver-master            1/1     Running   0          38s
kube-system   kube-controller-manager-master   1/1     Running   0          55s
kube-system   kube-proxy-mkg24                 1/1     Running   0          87s
kube-system   kube-scheduler-master            1/1     Running   0          41s

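As shown above, CoreDNS is Pending because we have not yet installed a network add-on.

Calico is a pure layer-3 virtual networking solution. It assigns an IP to each container and treats every host as a router that connects containers across hosts. Unlike VXLAN, Calico in its native routed mode adds no extra encapsulation to packets and needs no NAT or port mapping, so it scales and performs well.

By default, the Calico add-on uses the 192.168.0.0/16 range, which we already matched at init time with --pod-network-cidr=192.168.0.0/16. You can also edit calico.yaml to specify a different range.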
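Before settling on a pod CIDR, it is worth confirming that the hosts' own addresses do not fall inside it, since an overlap with the node network is likely to cause routing problems. The following is a rough sketch for IPv4 only; `ip_to_int` and `ip_in_cidr` are hypothetical helpers, checked here against the two VM addresses from this walkthrough.

```shell
# Hypothetical helpers (assumptions for illustration), IPv4 only.

# Fold a dotted quad such as 192.168.1.2 into a single 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (for example 192.168.0.0/16).
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

for host in 172.17.20.210 172.17.20.211; do
  if ip_in_cidr "$host" 192.168.0.0/16; then
    echo "$host overlaps the pod CIDR"
  else
    echo "$host is outside the pod CIDR"
  fi
done
```

If a host address did fall inside the range, a different --pod-network-cidr (and a matching edit to calico.yaml) would be the safer choice.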

Use the following commands to install the Calico add-on:

kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

# The calico.yaml above pulls images from quay.io; if they cannot be pulled, use the mirror below instead
kubectl apply -f http://mirror.faasx.com/k8s/calico/v3.3.2/rbac-kdd.yaml
kubectl apply -f http://mirror.faasx.com/k8s/calico/v3.3.2/calico.yaml

For more information about Calico, see the official Calico documentation: kubeadm quickstart

After a short wait, run kubectl get pods --all-namespaces again to check on the add-on installation:

kubectl get pods --all-namespaces

# Output
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   calico-node-x96gn                2/2     Running   0          47s
kube-system   coredns-78d4cf999f-6pgfr         1/1     Running   0          54m
kube-system   coredns-78d4cf999f-m9kgs         1/1     Running   0          54m
kube-system   etcd-master                      1/1     Running   3          53m
kube-system   kube-apiserver-master            1/1     Running   3          53m
kube-system   kube-controller-manager-master   1/1     Running   3          53m
kube-system   kube-proxy-mkg24                 1/1     Running   2          54m
kube-system   kube-scheduler-master            1/1     Running   3          53m

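As shown above, every STATUS is now Running, which means the installation succeeded. Next we can join other nodes and deploy applications.

Master Isolation

By default, for security reasons, the cluster does not schedule pods on the master node. In a development environment, however, we may have only a single master node, in which case the following command lifts the restriction: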

kubectl taint nodes --all node-role.kubernetes.io/master-

## Output
node/master untainted

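Joining Worker Nodes

To add worker nodes to the cluster, do the following on each machine:

  • SSH to the machine
  • Become root (for example: sudo su -)
  • Run the command printed by kubeadm init above: kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>

If we have forgotten the master's join token, the following command lists it: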

kubeadm token list

# Output
TOKEN                     TTL       EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
6pkrlg.8glf2fqpuf3i489m   22h       2018-12-07T13:46:33Z   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token

By default, a token is valid for 24 hours. If our token has expired, a new one can be generated with:

kubeadm token create

# Output
u2mt59.tyqpo0v5wf05lx2q

If we also do not have the --discovery-token-ca-cert-hash value, it can be generated with:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

# Output
eebfe256113bee397b218ba832f412273ae734bd4686241fb910885d26efd222
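With the token and the hash in hand, the join command can be assembled as follows. This is a small sketch: `build_join_command` is a hypothetical helper that sanity-checks the two values (a token of the form 6 characters, a dot, then 16 characters, and a 64-digit hex hash) before printing the command.

```shell
# Hypothetical helper (an assumption for illustration): validate the token and
# CA cert hash formats, then print the corresponding kubeadm join command.
build_join_command() {
  endpoint=$1
  token=$2
  hash=$3
  printf '%s\n' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' \
    || { echo "unexpected token format: $token" >&2; return 1; }
  printf '%s\n' "$hash" | grep -Eq '^[0-9a-f]{64}$' \
    || { echo "unexpected hash format: $hash" >&2; return 1; }
  echo "kubeadm join $endpoint --token $token --discovery-token-ca-cert-hash sha256:$hash"
}

# Values taken from the outputs shown in this walkthrough.
build_join_command 172.17.20.210:6443 6pkrlg.8glf2fqpuf3i489m \
  eebfe256113bee397b218ba832f412273ae734bd4686241fb910885d26efd222
```

kubeadm token create also accepts a --print-join-command flag, which creates a new token and prints a complete join command in one step.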

Now log in to the worker node and run the following command to join the cluster (it is also part of the init output above):

sudo kubeadm join 172.17.20.210:6443 --token 6pkrlg.8glf2fqpuf3i489m --discovery-token-ca-cert-hash sha256:eebfe256113bee397b218ba832f412273ae734bd4686241fb910885d26efd222

# Output
[sudo] password for raining: 
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "172.17.20.210:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.17.20.210:6443"
[discovery] Requesting info from "https://172.17.20.210:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "172.17.20.210:6443"
[discovery] Successfully established connection with API Server "172.17.20.210:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node1" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

After a short wait, run kubectl get nodes on the master to check the node status:

kubectl get nodes

# Output
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   17m   v1.13.1
node1    Ready    <none>   15m   v1.13.1

All nodes are Ready, so the setup is done. We can now run a few commands to test that the cluster works properly.
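The readiness check can also be scripted instead of read by eye. The sketch below parses the tabular output of kubectl get nodes; `all_nodes_ready` is a hypothetical helper, and the sample input is copied from the output above.

```shell
# Hypothetical helper (an assumption for illustration): read `kubectl get nodes`
# output on stdin and succeed only if every node's STATUS is exactly "Ready".
# A status such as "Ready,SchedulingDisabled" counts as not ready here.
all_nodes_ready() {
  awk 'NR > 1 && $2 != "Ready" { bad = 1 } END { exit bad + 0 }'
}

# Sample taken from the output above; on a real cluster, pipe
# `kubectl get nodes` into the function instead.
if all_nodes_ready <<'EOF'
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   17m   v1.13.1
node1    Ready    <none>   15m   v1.13.1
EOF
then
  echo "all nodes Ready"
fi
```

On the master this would be used as `kubectl get nodes | all_nodes_ready && echo "all nodes Ready"`.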

Testing

First, verify that kube-apiserver, kube-controller-manager, kube-scheduler, and the pod network work properly:

# Deploy an Nginx Deployment with two Pods
# https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
kubectl create deployment nginx --image=nginx:alpine
kubectl scale deployment nginx --replicas=2

# Verify that the Nginx Pods are running and have been assigned Pod IPs starting with 192.168.
kubectl get pods -l app=nginx -o wide

# Output:
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
nginx-54458cd494-p8jzs   1/1     Running   0          31s   192.168.1.2   node1   <none>           <none>
nginx-54458cd494-v2m4b   1/1     Running   0          24s   192.168.1.3   node1   <none>           <none>

Next, verify that kube-proxy works properly:

# Expose the service externally as a NodePort https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/
kubectl expose deployment nginx --port=80 --type=NodePort

# Check the port that is reachable from outside the cluster
kubectl get services nginx

# Output
NAME    TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
nginx   NodePort   10.110.49.49   <none>        80:31899/TCP   4s

# The service is reachable from outside the cluster at any NodeIP:Port; the two nodes in this example have the IPs 172.17.20.210 and 172.17.20.211
curl http://172.17.20.210:31899
curl http://172.17.20.211:31899
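The NodePort in the curl commands above comes from the PORT(S) column, where 80:31899/TCP means service port 80 is mapped to node port 31899. A minimal sketch of extracting it; `node_port_of` is a hypothetical helper:

```shell
# Hypothetical helper (an assumption for illustration): extract the node port
# from a PORT(S) value such as "80:31899/TCP".
node_port_of() {
  rest=${1#*:}          # "80:31899/TCP" -> "31899/TCP"
  echo "${rest%%/*}"    # "31899/TCP"    -> "31899"
}

port=$(node_port_of "80:31899/TCP")
echo "curl http://172.17.20.210:$port"
```

On a live cluster the port can also be read directly, without parsing the table, using `kubectl get service nginx -o jsonpath='{.spec.ports[0].nodePort}'`.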

Finally, verify that DNS and the pod network work properly:

# Run Busybox and enter interactive mode
kubectl run -it curl --image=radial/busyboxplus:curl

# Run `nslookup nginx` to check that the name resolves to the in-cluster IP, verifying that DNS works
$ nslookup nginx

# Output
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      nginx
Address 1: 10.110.49.49 nginx.default.svc.cluster.local

# Access the service by name to verify that kube-proxy works
$ curl http://nginx/

# Output:
# <!DOCTYPE html> --- (rest omitted)

# Access the internal IP of each of the two Pods to verify cross-node network communication
$ curl http://192.168.1.2/
$ curl http://192.168.1.3/

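Verification passed and the cluster is up. From here we can follow the official documentation to deploy other services.

Tearing Down the Cluster

To undo what kubeadm has done, first drain the node and make sure it is empty before shutting it down.

On the master node, run: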

kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>

Then, on the node being removed, reset the state installed by kubeadm:

sudo kubeadm reset

To reconfigure the cluster, simply run kubeadm init or kubeadm join again with the new arguments.
