
Configuring a highly available k8s cluster on CentOS 7 with kubeadm

 


CountingStars_  2018.08.12 09:06

Introduction

Use kubeadm to configure multiple master nodes for high availability.

Installation

Lab environment

Lab architecture
lab1: etcd master haproxy keepalived 11.11.11.111
lab2: etcd master haproxy keepalived 11.11.11.112
lab3: etcd master haproxy keepalived 11.11.11.113
lab4: node 11.11.11.114
lab5: node 11.11.11.115
lab6: node 11.11.11.116

vip (load balancer ip): 11.11.11.110

Vagrantfile used for the lab
# -*- mode: ruby -*-
# vi: set ft=ruby :

ENV["LC_ALL"] = "en_US.UTF-8"

Vagrant.configure("2") do |config|
  (1..6).each do |i|
    config.vm.define "lab#{i}" do |node|
      node.vm.box = "centos-7.4-docker-17"
      node.ssh.insert_key = false
      node.vm.hostname = "lab#{i}"
      node.vm.network "private_network", ip: "11.11.11.11#{i}"
      node.vm.provision "shell",
        inline: "echo hello from node #{i}"
      node.vm.provider "virtualbox" do |v|
        v.cpus = 2
        v.customize ["modifyvm", :id, "--name", "lab#{i}", "--memory", "2048"]
      end
    end
  end
end

Install kubeadm on all machines

See the earlier article "Installing kubeadm on CentOS 7".

Configure the kubelet on all nodes

# Configure the kubelet to use an image mirror reachable from mainland China
# Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# and add the following setting
Environment="KUBELET_EXTRA_ARGS=--pod-infra-container-image=registry.cn-shanghai.aliyuncs.com/gcr-k8s/pause-amd64:3.0"

# Or do the same with this command
sed -i '/ExecStart=$/i Environment="KUBELET_EXTRA_ARGS=--pod-infra-container-image=registry.cn-shanghai.aliyuncs.com/gcr-k8s/pause-amd64:3.0"' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

# Reload the configuration
systemctl daemon-reload

Configure hosts on all nodes

cat >>/etc/hosts<<EOF
11.11.11.111 lab1
11.11.11.112 lab2
11.11.11.113 lab3
11.11.11.114 lab4
11.11.11.115 lab5
11.11.11.116 lab6
EOF
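
The heredoc above appends the six entries every time it runs, so rerunning it leaves duplicates in /etc/hosts. A minimal sketch of an idempotent variant follows. The helper names add_host_entry and add_lab_hosts, and the optional file argument, are hypothetical and introduced only for illustration.

```shell
# add_host_entry IP NAME [FILE]
# Appends "IP NAME" to FILE (default /etc/hosts) unless that exact line
# is already present. Hypothetical helper, not part of the original article.
add_host_entry() {
  local ip="$1" name="$2" file="${3:-/etc/hosts}"
  local line="$ip $name"
  # -x matches the whole line, -F treats the pattern as a fixed string
  grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >>"$file"
}

# add_lab_hosts [FILE]
# Registers lab1..lab6 (11.11.11.111 to 11.11.11.116) in FILE.
add_lab_hosts() {
  local file="${1:-/etc/hosts}" i
  for i in 1 2 3 4 5 6; do
    add_host_entry "11.11.11.11$i" "lab$i" "$file"
  done
}
```

Run add_lab_hosts as root on each node; calling it a second time should leave the file unchanged.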

Start the etcd cluster

Start the etcd cluster on lab1, lab2, and lab3

# lab1
docker stop etcd && docker rm etcd
rm -rf /data/etcd
mkdir -p /data/etcd
docker run -d \
--restart always \
-v /etc/etcd/ssl/certs:/etc/ssl/certs \
-v /data/etcd:/var/lib/etcd \
-p 2380:2380 \
-p 2379:2379 \
--name etcd \
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd-amd64:3.1.12 \
etcd --name=etcd0 \
--advertise-client-urls=http://11.11.11.111:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-advertise-peer-urls=http://11.11.11.111:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--initial-cluster-token=9477af68bbee1b9ae037d6fd9e7efefd \
--initial-cluster=etcd0=http://11.11.11.111:2380,etcd1=http://11.11.11.112:2380,etcd2=http://11.11.11.113:2380 \
--initial-cluster-state=new \
--auto-tls \
--peer-auto-tls \
--data-dir=/var/lib/etcd

# lab2
docker stop etcd && docker rm etcd
rm -rf /data/etcd
mkdir -p /data/etcd
docker run -d \
--restart always \
-v /etc/etcd/ssl/certs:/etc/ssl/certs \
-v /data/etcd:/var/lib/etcd \
-p 2380:2380 \
-p 2379:2379 \
--name etcd \
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd-amd64:3.1.12 \
etcd --name=etcd1 \
--advertise-client-urls=http://11.11.11.112:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-advertise-peer-urls=http://11.11.11.112:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--initial-cluster-token=9477af68bbee1b9ae037d6fd9e7efefd \
--initial-cluster=etcd0=http://11.11.11.111:2380,etcd1=http://11.11.11.112:2380,etcd2=http://11.11.11.113:2380 \
--initial-cluster-state=new \
--auto-tls \
--peer-auto-tls \
--data-dir=/var/lib/etcd

# lab3
docker stop etcd && docker rm etcd
rm -rf /data/etcd
mkdir -p /data/etcd
docker run -d \
--restart always \
-v /etc/etcd/ssl/certs:/etc/ssl/certs \
-v /data/etcd:/var/lib/etcd \
-p 2380:2380 \
-p 2379:2379 \
--name etcd \
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd-amd64:3.1.12 \
etcd --name=etcd2 \
--advertise-client-urls=http://11.11.11.113:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-advertise-peer-urls=http://11.11.11.113:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--initial-cluster-token=9477af68bbee1b9ae037d6fd9e7efefd \
--initial-cluster=etcd0=http://11.11.11.111:2380,etcd1=http://11.11.11.112:2380,etcd2=http://11.11.11.113:2380 \
--initial-cluster-state=new \
--auto-tls \
--peer-auto-tls \
--data-dir=/var/lib/etcd

# Verify the cluster (the etcdctl commands run inside the container shell)
docker exec -ti etcd ash
etcdctl member list
etcdctl cluster-health
exit
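
The three docker run commands differ only in the member name and the node IP, while the --initial-cluster value has to be identical on every member. As a sketch, that shared value can be generated once from the member list. The helper name build_initial_cluster and its name=ip argument format are assumptions made for illustration; the http scheme and peer port 2380 are taken from the commands above.

```shell
# build_initial_cluster NAME=IP...
# Prints the value for etcd's --initial-cluster flag, using plain http
# and peer port 2380. Hypothetical helper, not part of the original article.
build_initial_cluster() {
  local out="" member name ip
  for member in "$@"; do
    name="${member%%=*}"   # text before the first '='
    ip="${member#*=}"      # text after the first '='
    # join entries with commas, without a leading comma on the first one
    out="${out:+$out,}${name}=http://${ip}:2380"
  done
  printf '%s\n' "$out"
}
```

For example, build_initial_cluster etcd0=11.11.11.111 etcd1=11.11.11.112 etcd2=11.11.11.113 yields the string that can be passed to --initial-cluster= on all three nodes.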

Initialize the first master node

# Generate a token
# Keep the token, it is needed again later
token=$(kubeadm token generate)
echo $token

# Generate the config file
cat >kubeadm-master.config<<EOF
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
kubernetesVersion: v1.10.1
#imageRepository: registry.cn-shanghai.aliyuncs.com/gcr-k8s
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers

api:
  advertiseAddress: 11.11.11.111

apiServerExtraArgs:
  endpoint-reconciler-type: lease

controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s

networking:
  podSubnet: 192.168.0.0/16

etcd:
  endpoints:
  - "http://11.11.11.111:2379"
  - "http://11.11.11.112:2379"
  - "http://11.11.11.113:2379"

apiServerCertSANs:
- "lab1"
- "lab2"
- "lab3"
- "11.11.11.111"
- "11.11.11.112"
- "11.11.11.113"
- "11.11.11.110"
- "127.0.0.1"

token: $token
tokenTTL: "0"

featureGates:
  CoreDNS: true
EOF

# Initialize
kubeadm init --config kubeadm-master.config
systemctl enable kubelet

# Save the join command printed when initialization finishes
# If you lose it, the token can be retrieved with "kubeadm token list"
# kubeadm join 11.11.11.111:6443 --token nevmjk.iuh214fc8i0k3iue --discovery-token-ca-cert-hash sha256:0e4f738348be836ff810bce754e059054845f44f01619a37b817eba83282d80f

# Set up kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install the network plugin
# Download the manifest
mkdir flannel && cd flannel
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Edit the manifest
# The network here must match the podSubnet in the kubeadm config above
  net-conf.json: |
    {
      "Network": "192.168.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }

# Change the image
image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64

# Apply
kubectl apply -f kube-flannel.yml

# If a node has more than one NIC, see issue 39701,
# https://github.com/kubernetes/kubernetes/issues/39701
# For now, kube-flannel.yml has to pass --iface with the name of the NIC on the cluster's internal network,
# otherwise DNS may fail to resolve and containers may be unable to communicate. Download kube-flannel.yml locally
# and add --iface=<iface-name> to the flanneld arguments
    containers:
      - name: kube-flannel
        image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=eth1

# Check
kubectl get pods --namespace kube-system
kubectl get svc --namespace kube-system

# Allow application pods on the master so it takes part in the workload;
# other system components such as dashboard, heapster, and efk can now be deployed
kubectl taint nodes --all node-role.kubernetes.io/master-
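
The same token is reused later for the other masters and for the worker join command, so a mistyped copy only shows up as a failed join. A small sanity check, as a sketch: kubeadm bootstrap tokens have the form of 6 lowercase alphanumeric characters, a dot, and 16 more. The function name is_bootstrap_token is hypothetical.

```shell
# is_bootstrap_token TOKEN
# Succeeds when TOKEN has the kubeadm bootstrap token shape:
# [a-z0-9]{6}.[a-z0-9]{16}. Hypothetical helper for illustration only;
# it checks the format, not whether the token exists in the cluster.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}
```

It could be called as is_bootstrap_token "$token" || echo "unexpected token format" >&2 before writing the token into kubeadm-master.config.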

Start the other master nodes

# Pack the /etc/kubernetes/pki directory created when the first master was initialized
cd /etc/kubernetes && tar czvf /root/pki.tgz pki/ && cd ~

# Upload it to the other masters and unpack it under /etc/kubernetes
tar xf pki.tgz -C /etc/kubernetes/

# Delete apiserver.crt and apiserver.key from the pki directory
rm -rf /etc/kubernetes/pki/{apiserver.crt,apiserver.key}

# Generate the config file
# Use the same config file as on the first master
# Keep the token the same
cat >kubeadm-master.config<<EOF
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
kubernetesVersion: v1.10.1
#imageRepository: registry.cn-shanghai.aliyuncs.com/gcr-k8s
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers

# Remember to change the IP
api:
  advertiseAddress: 11.11.11.112

apiServerExtraArgs:
  endpoint-reconciler-type: lease

controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s

networking:
  podSubnet: 192.168.0.0/16

etcd:
  endpoints:
  - "http://11.11.11.111:2379"
  - "http://11.11.11.112:2379"
  - "http://11.11.11.113:2379"

apiServerCertSANs:
- lab1
- lab2
- lab3
- "11.11.11.111"
- "11.11.11.112"
- "11.11.11.113"
- "11.11.11.110"
- "127.0.0.1"

token: nevmjk.iuh214fc8i0k3iue
tokenTTL: "0"

featureGates:
  CoreDNS: true
EOF

# Initialize
kubeadm init --config kubeadm-master.config
systemctl enable kubelet

# Check status
kubectl get pod --all-namespaces -o wide | grep lab1
kubectl get pod --all-namespaces -o wide | grep lab2
kubectl get pod --all-namespaces -o wide | grep lab3
kubectl get nodes -o wide

Configure the haproxy proxy and keepalived

Start haproxy and keepalived on lab1, lab2, and lab3

# Pull the haproxy image
docker pull haproxy:1.7.8-alpine
mkdir -p /etc/haproxy
cat >/etc/haproxy/haproxy.cfg<<EOF
global
  log 127.0.0.1 local0 err
  maxconn 50000
  uid 99
  gid 99
  #daemon
  nbproc 1
  pidfile haproxy.pid

defaults
  mode http
  log 127.0.0.1 local0 err
  maxconn 50000
  retries 3
  timeout connect 5s
  timeout client 30s
  timeout server 30s
  timeout check 2s

listen admin_stats
  mode http
  bind 0.0.0.0:1080
  log 127.0.0.1 local0 err
  stats refresh 30s
  stats uri /haproxy-status
  stats realm Haproxy\ Statistics
  stats auth will:will
  stats hide-version
  stats admin if TRUE

frontend k8s-https
  bind 0.0.0.0:8443
  mode tcp
  #maxconn 50000
  default_backend k8s-https

backend k8s-https
  mode tcp
  balance roundrobin
  server lab1 11.11.11.111:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
  server lab2 11.11.11.112:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
  server lab3 11.11.11.113:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
EOF

# Start haproxy
docker run -d --name my-haproxy \
-v /etc/haproxy:/usr/local/etc/haproxy:ro \
-p 8443:8443 \
-p 1080:1080 \
--restart always \
haproxy:1.7.8-alpine

# Check the logs
docker logs my-haproxy

# Check the status in a browser
http://11.11.11.111:1080/haproxy-status
http://11.11.11.112:1080/haproxy-status

# Pull the keepalived image
docker pull osixia/keepalived:1.4.4

# Start
# Load the required kernel module
lsmod | grep ip_vs
modprobe ip_vs

# Start keepalived
# eth1 is the NIC on the 11.11.11.0/24 network used in this lab
docker run --net=host --cap-add=NET_ADMIN \
-e KEEPALIVED_INTERFACE=eth1 \
-e KEEPALIVED_VIRTUAL_IPS="#PYTHON2BASH:['11.11.11.110']" \
-e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['11.11.11.111','11.11.11.112','11.11.11.113']" \
-e KEEPALIVED_PASSWORD=hello \
--name k8s-keepalived \
--restart always \
-d osixia/keepalived:1.4.4

# Check the logs
# Two instances become BACKUP and one becomes MASTER
docker logs k8s-keepalived

# At this point 11.11.11.110 is assigned to one of the machines
# Ping test
ping -c4 11.11.11.110

# If it fails, clean up and try again
docker rm -f k8s-keepalived
ip a del 11.11.11.110/32 dev eth1

# Edit the IP and port in ~/.kube/config, then test with kubectl
rm -rf ~/.kube/cache ~/.kube/http-cache
kubectl get pods -n kube-system -o wide
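
The keepalived logs show which instance became MASTER, but it can also be useful to check directly which machine currently holds 11.11.11.110. The following is a sketch only: holds_vip is a hypothetical helper that reads the one-line-per-address output of ip -o -4 addr show on stdin, and eth1 is the lab interface assumed throughout this article.

```shell
# holds_vip VIP
# Reads `ip -o -4 addr show` output on stdin and succeeds when one of the
# listed IPv4 addresses equals VIP, whatever its prefix length.
# Hypothetical helper, not part of the original article.
holds_vip() {
  awk -v vip="$1" '
    {
      # the address follows the "inet" keyword, e.g. "inet 11.11.11.110/32"
      for (i = 1; i < NF; i++)
        if ($i == "inet") {
          split($(i + 1), parts, "/")
          if (parts[1] == vip) found = 1
        }
    }
    END { exit !found }'
}
```

On each of lab1, lab2, and lab3, a call such as ip -o -4 addr show dev eth1 | holds_vip 11.11.11.110 && echo "this node holds the VIP" should print the message on exactly one machine.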

Point the master node components at the VIP

# lab1 lab2 lab3
sed -i 's@server: https://11.11.11.*:6443@server: https://11.11.11.110:8443@g' /etc/kubernetes/{admin.conf,kubelet.conf,scheduler.conf,controller-manager.conf}

# Restart the kubelet
systemctl daemon-reload
systemctl restart kubelet docker

# Check the status of all nodes
kubectl get nodes -o wide
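
Before editing the files under /etc/kubernetes in place, the rewrite can be tried on a scratch copy. The sketch below assumes the intent of this step is to replace a direct apiserver address on port 6443 with the VIP 11.11.11.110 on the haproxy port 8443; point_at_vip is a hypothetical helper name, and it avoids sed -i so that it also works with non-GNU sed.

```shell
# point_at_vip FILE...
# Rewrites "server: https://11.11.11.<n>:6443" to the VIP and haproxy port
# (https://11.11.11.110:8443) in each FILE, keeping the file's ownership
# and mode. Hypothetical helper, not part of the original article.
point_at_vip() {
  local f tmp
  for f in "$@"; do
    tmp=$(mktemp) || return 1
    sed 's@server: https://11\.11\.11\.[0-9]*:6443@server: https://11.11.11.110:8443@g' "$f" >"$tmp" \
      && cat "$tmp" >"$f"
    rm -f "$tmp"
  done
}
```

For example, copy /etc/kubernetes/admin.conf to /tmp, run point_at_vip on the copy, and diff the two files; only the server: line should differ.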

Modify the kube-proxy configuration

# Point the kube-proxy configuration at the VIP
# After running the command, change the value to server: https://11.11.11.110:8443
kubectl edit -n kube-system configmap/kube-proxy

# Check the setting
kubectl get -n kube-system configmap/kube-proxy -o yaml

# Delete the kube-proxy pods so they are recreated
kubectl get pods --all-namespaces -o wide | grep proxy
all_proxy_pods=$(kubectl get pods --all-namespaces -o wide | grep proxy | awk '{print $2}' | xargs)
echo $all_proxy_pods
kubectl delete pods $all_proxy_pods -n kube-system
kubectl get pods --all-namespaces -o wide | grep proxy

Start the worker nodes

# Join the cluster
# This is the command printed when master initialization finished, pointed at the VIP
kubeadm join 11.11.11.110:8443 --token nevmjk.iuh214fc8i0k3iue --discovery-token-ca-cert-hash sha256:0e4f738348be836ff810bce754e059054845f44f01619a37b817eba83282d80f
systemctl enable kubelet

Modify the kubelet configuration on the worker nodes and restart

# Modify the configuration
sed -i 's@server: https://11.11.11.*:6443@server: https://11.11.11.110:8443@g' /etc/kubernetes/kubelet.conf

# Restart the kubelet
systemctl daemon-reload
systemctl restart kubelet docker

# Check the status of all nodes
kubectl get nodes -o wide

Prevent application pods from being scheduled on the master nodes

Set the masters to not accept workloads

# Check status
kubectl get nodes

# Apply the taint
# kubectl patch node lab1 -p '{"spec":{"unschedulable":true}}'
kubectl taint nodes lab1 lab2 lab3 node-role.kubernetes.io/master=true:NoSchedule

# Check status
kubectl get nodes

Testing

Recreate multiple coredns replicas

# Delete the coredns pods
kubectl get pods -n kube-system -o wide | grep coredns
all_coredns_pods=$(kubectl get pods -n kube-system -o wide | grep coredns | awk '{print $1}' | xargs)
echo $all_coredns_pods
kubectl delete pods $all_coredns_pods -n kube-system

# Change the replica count
# replicas: 3
# This can be set to the number of worker nodes
kubectl edit deploy coredns -n kube-system

# Check status
kubectl get pods -n kube-system -o wide | grep coredns

Basic tests

1. Start

# Test directly with commands
kubectl run nginx --replicas=2 --image=nginx:alpine --port=80
kubectl expose deployment nginx --type=NodePort --name=example-service-nodeport
kubectl expose deployment nginx --name=example-service

# Test with a config file
cat >example-nginx.yml<<EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      restartPolicy: Always
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 3
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 3
---
kind: Service
apiVersion: v1
metadata:
  name: example-service
spec:
    selector:
      app: nginx
    ports:
      - name: http
        port: 80
        targetPort: 80

---
kind: Service
apiVersion: v1
metadata:
  name: example-service-nodeport
spec:
    selector:
      app: nginx
    type: NodePort
    ports:
      - name: http-nodeport
        port: 80
        nodePort: 32223
EOF
kubectl apply -f example-nginx.yml

2. Check status

kubectl get deploy
kubectl get pods
kubectl get svc
kubectl describe svc example-service

3. DNS resolution

kubectl run curl --image=radial/busyboxplus:curl -i --tty
nslookup kubernetes
nslookup example-service
curl example-service

# If you wait too long an error is returned; you can get back into the pod like this
curlPod=$(kubectl get pod | grep curl | awk '{print $1}')
kubectl exec -ti $curlPod -- sh

4. Access test

# 10.96.59.56 is the cluster IP shown by "kubectl get svc"
curl "10.96.59.56:80"

# 32223 is the node port shown by "kubectl get svc"
http://11.11.11.114:32223/
http://11.11.11.115:32223/

5. Clean up

kubectl delete svc example-service example-service-nodeport
kubectl delete deploy nginx curl

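High availability test

Shut down a master node, then check whether the cluster still works by rerunning the basic tests from the previous step and looking at the information below. Do not shut down lab1 and lab2 at the same time, because the haproxy and keepalived services run on them.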

kubectl get pod --all-namespaces -o wide
kubectl get pod --all-namespaces -o wide | grep lab1
kubectl get pod --all-namespaces -o wide | grep lab2
kubectl get pod --all-namespaces -o wide | grep lab3
kubectl get nodes -o wide
kubectl get deploy
kubectl get pods
kubectl get svc
kubectl describe svc example-service

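Notes

  • When a node is shut down directly, the pods on it are detected as failed and moved to other nodes only after 5 minutes have passed

To move them quickly, run kubectl delete node

You can also change the controller-manager's pod-eviction-timeout parameter, default 5m

and the node-monitor-grace-period parameter, default 40s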
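
As a rough worked example of how the two parameters combine: the controller manager first waits node-monitor-grace-period without node status updates before treating the node as unhealthy, and then waits pod-eviction-timeout before evicting its pods, so the delay before pods are rescheduled is approximately the sum of the two. This is an approximation that ignores the controller's own polling intervals, and eviction_delay is a hypothetical helper used only to show the arithmetic.

```shell
# eviction_delay GRACE_SECONDS EVICTION_SECONDS
# Prints the approximate number of seconds between a node going silent
# and its pods being evicted: grace period plus eviction timeout.
# Hypothetical helper, not part of the original article.
eviction_delay() {
  echo $(( $1 + $2 ))
}

# Defaults: 40s grace period + 5m (300s) eviction timeout
eviction_delay 40 300

# Values set in the kubeadm config in this article: 10s + 10s
eviction_delay 10 10
```

With the defaults this gives about 340 seconds, a little over the 5 minutes mentioned above; with the 10s values from the kubeadm config it gives about 20 seconds.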