
Deploying a Highly Available Kubernetes Cluster with Kubespray

0 Environment

Hosts:

Hostname         IP
k8s-node01       172.16.120.151
k8s-node02       172.16.120.152
k8s-node03       172.16.120.153
ansible-client

To pin the VMware Fusion VM IPs on macOS, edit the vmnet8 DHCP configuration:

sudo vi /Library/Preferences/VMware\ Fusion/vmnet8/dhcpd.conf

Append the following at the end of the file:

host CentOS01 {
    hardware ethernet 00:0C:29:15:5C:F1;
    fixed-address 172.16.120.151;
}
host CentOS02 {
    hardware ethernet 00:0C:29:D1:C4:9A;
    fixed-address 172.16.120.152;
}
host CentOS03 {
    hardware ethernet 00:0C:29:C2:A6:93;
    fixed-address 172.16.120.153;
}
  • CentOS01 is the name of the VM that gets the fixed IP
  • hardware ethernet is the VM's MAC address
  • fixed-address is the fixed IP to assign

The fixed IP addresses must fall within the subnet range defined in dhcpd.conf. Restart VMware after the change.
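
The MAC address for each host entry can be read from inside the guest, and VMware Fusion's virtual networks can be restarted from the command line instead of restarting the whole application (the vmnet-cli path below is the Fusion default; treat this as a sketch):

# inside the guest: print the MAC address of the first NIC
cat /sys/class/net/eth0/address

# on the macOS host: restart VMware Fusion's virtual networking
sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --stop
sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --start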

Set the hostname on each node:

hostnamectl --static set-hostname k8s-node01   # on 172.16.120.151
hostnamectl --static set-hostname k8s-node02   # on 172.16.120.152
hostnamectl --static set-hostname k8s-node03   # on 172.16.120.153

Disable the firewall and SELinux:

systemctl disable firewalld
systemctl stop firewalld
# disable SELinux permanently (takes effect after reboot)
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
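
The SELinux change only takes effect after a reboot; once the node is back up, a quick check confirms both are off:

systemctl is-enabled firewalld   # expect: disabled
getenforce                       # expect: Disabled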

Use the Aliyun yum mirror so Docker installs quickly:

# Docker yum repo
cat >> /etc/yum.repos.d/docker.repo <<EOF
[docker-repo]
name=Docker Repository
baseurl=http://mirrors.aliyun.com/docker-engine/yum/repo/main/centos/7
enabled=1
gpgcheck=0
EOF

Also configure the Aliyun registry accelerator:

mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://5md0553g.mirror.aliyuncs.com"]
}
EOF

Install Docker manually:

# list the available docker versions
yum list docker-engine --showduplicates
# install docker
yum install -y docker-engine-1.13.1-1.el7.centos.x86_64
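
The image sync in section 4 needs a running Docker daemon, so enable and start it after the install, and confirm the Aliyun mirror configured above was picked up:

systemctl enable docker
systemctl start docker
# 'Registry Mirrors' should list the Aliyun accelerator configured above
docker info | grep -A1 'Registry Mirrors'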

1 Install Ansible

Install Ansible on the control machine, ansible-client.

# install python and the EPEL repo
yum install -y epel-release python-pip python34 python34-pip
# install ansible (the EPEL repo must be installed before ansible)
yum install -y ansible

The managed nodes need Python newer than 2.5.

Ansible does not support Windows as a control machine; installation instructions for other systems are on the official Ansible website.
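
Verify the installation before continuing:

ansible --version
python3 --version   # the inventory builder in section 7 runs under python3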

2 Set Up Passwordless SSH

On ansible-client, run ssh-keygen -t rsa to generate a key pair:

ssh-keygen -t rsa -P ''

Copy ~/.ssh/id_rsa.pub to all the other nodes so that ansible-client can log in to each of them without a password:

IP=(172.16.120.151 172.16.120.152 172.16.120.153)
for x in ${IP[*]}; do ssh-copy-id -i ~/.ssh/id_rsa.pub $x; done

Run the commands above as root.
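
A quick loop confirms the passwordless login works (reusing the IP array above):

# BatchMode makes ssh fail instead of prompting if key auth is broken
for x in ${IP[*]}; do ssh -o BatchMode=yes root@$x hostname; done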

3 Download the Kubespray Source

You can clone the master branch:

git clone https://github.com/kubernetes-incubator/kubespray.git

Or download a release:

wget https://github.com/kubernetes-incubator/kubespray/archive/v2.1.2.tar.gz

This article installs from the v2.1.2 release.
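
The release tarball unpacks into a version-suffixed directory, while the scripts later in this article assume a plain ./kubespray path, so rename it after extraction:

tar -zxvf v2.1.2.tar.gz
mv kubespray-2.1.2 kubespray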

4 Push the gcr.io and quay.io Images to Aliyun

The images Kubespray uses:

quay.io/coreos/hyperkube:v1.7.3_coreos.0
quay.io/coreos/etcd:v3.2.4
quay.io/calico/ctl:v1.4.0
quay.io/calico/node:v2.4.1
quay.io/calico/cni:v1.10.0
quay.io/calico/kube-policy-controller:v0.7.0
quay.io/calico/routereflector:v0.3.0
quay.io/coreos/flannel:v0.8.0
quay.io/coreos/flannel-cni:v0.2.0
quay.io/l23network/k8s-netchecker-agent:v1.0
quay.io/l23network/k8s-netchecker-server:v1.0
weaveworks/weave-kube:2.0.1
weaveworks/weave-npc:2.0.1
gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.3
gcr.io/google_containers/cluster-proportional-autoscaler-amd64:1.1.1
gcr.io/google_containers/fluentd-elasticsearch:1.22
gcr.io/google_containers/kibana:v4.6.1
gcr.io/google_containers/elasticsearch:v2.4.1
gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.2
gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.2
gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.2
gcr.io/google_containers/pause-amd64:3.0
gcr.io/kubernetes-helm/tiller:v2.2.2
gcr.io/google_containers/heapster-grafana-amd64:v4.4.1
gcr.io/google_containers/heapster-amd64:v1.4.0
gcr.io/google_containers/heapster-influxdb-amd64:v1.1.1
gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.11
gcr.io/google_containers/defaultbackend:1.3
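
The sync scripts below push to registry.cn-hangzhou.aliyuncs.com, so log in first. The szss_k8s and szss_quay_io namespaces belong to the author's Aliyun account; substitute namespaces you own:

docker login registry.cn-hangzhou.aliyuncs.com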

Sync the gcr.io images to Aliyun:

#!/bin/bash
set -o errexit
set -o nounset
set -o pipefail

kubednsautoscaler_version=1.1.1
dns_version=1.14.2
kube_pause_version=3.0
dashboard_version=v1.6.3
fluentd_es_version=1.22
kibana_version=v4.6.1
elasticsearch_version=v2.4.1
heapster_version=v1.4.0
heapster_grafana_version=v4.4.1
heapster_influxdb_version=v1.1.1
nginx_ingress_version=0.9.0-beta.11
defaultbackend_version=1.3

GCR_URL=gcr.io/google_containers
ALIYUN_URL=registry.cn-hangzhou.aliyuncs.com/szss_k8s

images=(
cluster-proportional-autoscaler-amd64:${kubednsautoscaler_version}
k8s-dns-sidecar-amd64:${dns_version}
k8s-dns-kube-dns-amd64:${dns_version}
k8s-dns-dnsmasq-nanny-amd64:${dns_version}
pause-amd64:${kube_pause_version}
kubernetes-dashboard-amd64:${dashboard_version}
fluentd-elasticsearch:${fluentd_es_version}
kibana:${kibana_version}
elasticsearch:${elasticsearch_version}
heapster-amd64:${heapster_version}
heapster-grafana-amd64:${heapster_grafana_version}
heapster-influxdb-amd64:${heapster_influxdb_version}
nginx-ingress-controller:${nginx_ingress_version}
defaultbackend:${defaultbackend_version}
)

for imageName in ${images[@]} ; do
  docker pull $GCR_URL/$imageName
  docker tag $GCR_URL/$imageName $ALIYUN_URL/$imageName
  docker push $ALIYUN_URL/$imageName
  docker rmi $ALIYUN_URL/$imageName
done

Sync the quay.io images to Aliyun:

QUAY_URL=quay.io
ALIYUN_URL=registry.cn-hangzhou.aliyuncs.com/szss_quay_io

# master branch
#images=(
#coreos/hyperkube:v1.7.3_coreos.0
#coreos/etcd:v3.2.4
#coreos/flannel:v0.8.0
#coreos/flannel-cni:v0.2.0
#calico/kube-policy-controller:v0.7.0
#calico/ctl:v1.4.0
#calico/node:v2.4.1
#calico/cni:v1.10.0
#calico/routereflector:v0.3.0
#l23network/k8s-netchecker-agent:v1.0
#l23network/k8s-netchecker-server:v1.0
#)
# kubespray v2.1.2
images=(
coreos/hyperkube:v1.6.7_coreos.0
coreos/etcd:v3.2.4
coreos/flannel:v0.8.0
coreos/flannel-cni:v0.2.0
calico/kube-policy-controller:v0.5.4
calico/ctl:v1.1.3
calico/node:v1.1.3
calico/cni:v1.8.0
calico/routereflector:v0.3.0
l23network/k8s-netchecker-agent:v1.0
l23network/k8s-netchecker-server:v1.0
)

for imageName in ${images[@]} ; do
  docker pull $QUAY_URL/$imageName
  docker tag $QUAY_URL/$imageName $ALIYUN_URL/${imageName/\//-}
  docker push $ALIYUN_URL/${imageName/\//-}
  docker rmi $ALIYUN_URL/${imageName/\//-}
done
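
To spot-check a push, pull one of the renamed images back from Aliyun, for example:

docker pull registry.cn-hangzhou.aliyuncs.com/szss_quay_io/calico-node:v1.1.3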

5 Replace the Image References

Search the Kubespray source for the files that reference the gcr.io/google_containers and quay.io images, and replace those references with the images pushed to Aliyun above. The replacement script is as follows:

grc_image_files=(
./kubespray/extra_playbooks/roles/dnsmasq/templates/dnsmasq-autoscaler.yml
./kubespray/extra_playbooks/roles/download/defaults/main.yml
./kubespray/extra_playbooks/roles/kubernetes-apps/ansible/defaults/main.yml
./kubespray/roles/download/defaults/main.yml
./kubespray/roles/dnsmasq/templates/dnsmasq-autoscaler.yml
./kubespray/roles/kubernetes-apps/ansible/defaults/main.yml
)

for file in ${grc_image_files[@]} ; do
    sed -i 's/gcr.io\/google_containers/registry.cn-hangzhou.aliyuncs.com\/szss_k8s/g' $file
done

quay_image_files=(
./kubespray/extra_playbooks/roles/download/defaults/main.yml
./kubespray/roles/download/defaults/main.yml
)

for file in ${quay_image_files[@]} ; do
    sed -i 's/quay.io\/coreos\//registry.cn-hangzhou.aliyuncs.com\/szss_quay_io\/coreos-/g' $file
    sed -i 's/quay.io\/calico\//registry.cn-hangzhou.aliyuncs.com\/szss_quay_io\/calico-/g' $file
    sed -i 's/quay.io\/l23network\//registry.cn-hangzhou.aliyuncs.com\/szss_quay_io\/l23network-/g' $file
done

If you run the script on macOS, sed -i requires an explicit (empty) backup-suffix argument, e.g. sed -i '' 's/a/b/g' file.
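
After the replacement, a quick grep confirms no original registry references remain in the edited files:

grep -r 'gcr.io/google_containers' ./kubespray/roles ./kubespray/extra_playbooks || echo 'gcr.io references replaced'
grep -r 'quay.io/' ./kubespray/roles/download ./kubespray/extra_playbooks/roles/download || echo 'quay.io references replaced'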

6 Configuration File Contents

You can change the basic_auth password, swap the default calico network plugin for weave or flannel, and choose whether to install Helm and EFK.

The configuration below is from Kubespray 2.1.2.

$ vi ~/kubespray/inventory/group_vars/k8s-cluster.yml

# Kubernetes configuration dirs and system namespace.
# Those are where all the additional config stuff goes
# the kubernetes normally puts in /srv/kubernetes.
# This puts them in a sane location and namespace.
# Editing those values will almost surely break something.
kube_config_dir: /etc/kubernetes
kube_script_dir: "{{ bin_dir }}/kubernetes-scripts"
kube_manifest_dir: "{{ kube_config_dir }}/manifests"
system_namespace: kube-system

# Logging directory (sysvinit systems)
kube_log_dir: "/var/log/kubernetes"

# This is where all the cert scripts and certs will be located
kube_cert_dir: "{{ kube_config_dir }}/ssl"

# This is where all of the bearer tokens will be stored
kube_token_dir: "{{ kube_config_dir }}/tokens"

# This is where to save basic auth file
kube_users_dir: "{{ kube_config_dir }}/users"

kube_api_anonymous_auth: false

## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.6.7

# Where the binaries will be downloaded.
# Note: ensure that you've enough disk space (about 1G)
local_release_dir: "/tmp/releases"
# Random shifts for retrying failed ops like pushing/downloading
retry_stagger: 5

# This is the group that the cert creation scripts chgrp the
# cert files to. Not really changeable...
kube_cert_group: kube-cert

# Cluster Loglevel configuration
kube_log_level: 2

# Users to create for basic auth in Kubernetes API via HTTP
# Optionally add groups for user
kube_api_pwd: "changeme"
kube_users:
  kube:
    pass: "{{kube_api_pwd}}"
    role: admin
  root:
    pass: "{{kube_api_pwd}}"
    role: admin
    # groups:
    #   - system:masters



## It is possible to activate / deactivate selected authentication methods (basic auth, static token auth)
#kube_oidc_auth: false
#kube_basic_auth: false
#kube_token_auth: false


## Variables for OpenID Connect Configuration https://kubernetes.io/docs/admin/authentication/
## To use OpenID you have to deploy additional an OpenID Provider (e.g Dex, Keycloak, ...)

# kube_oidc_url: https:// ...
# kube_oidc_client_id: kubernetes
## Optional settings for OIDC
# kube_oidc_ca_file: {{ kube_cert_dir }}/ca.pem
# kube_oidc_username_claim: sub
# kube_oidc_groups_claim: groups


# Choose network plugin (calico, weave or flannel)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: calico

# weave's network password for encryption
# if null then no network encryption
# you can use --extra-vars to pass the password in command line
weave_password: EnterPasswordHere

# Weave uses consensus mode by default
# Enabling seed mode allow to dynamically add or remove hosts
# https://www.weave.works/docs/net/latest/ipam/
weave_mode_seed: false

# These two variables are automatically changed by the weave role; do not change them manually
# To reset values :
# weave_seed: uninitialized
# weave_peers: uninitialized
weave_seed: uninitialized
weave_peers: uninitialized

# Enable kubernetes network policies
enable_network_policy: false

# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18

# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18

# internal network node size allocation (optional). This is the size allocated
# to each node on your network.  With these defaults you should have
# room for 4096 nodes with 254 pods per node.
kube_network_node_prefix: 24

# The port the API Server will be listening on.
kube_apiserver_ip: "{{ kube_service_addresses|ipaddr('net')|ipaddr(1)|ipaddr('address') }}"
kube_apiserver_port: 6443 # (https)
kube_apiserver_insecure_port: 8080 # (http)

# DNS configuration.
# Kubernetes cluster name, also will be used as DNS domain
cluster_name: cluster.local
# Subdomains of DNS domain to be resolved via /etc/resolv.conf for hostnet pods
ndots: 2
# Can be dnsmasq_kubedns, kubedns or none
dns_mode: kubedns
# Can be docker_dns, host_resolvconf or none
resolvconf_mode: docker_dns
# Deploy netchecker app to verify DNS resolve as an HTTP service
deploy_netchecker: false
# Ip address of the kubernetes skydns service
skydns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(3)|ipaddr('address') }}"
dns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(2)|ipaddr('address') }}"
dns_domain: "{{ cluster_name }}"

# Path used to store Docker data
docker_daemon_graph: "/var/lib/docker"

## A string of extra options to pass to the docker daemon.
## This string should be exactly as you wish it to appear.
## An obvious use case is allowing insecure-registry access
## to self hosted registries like so:

docker_options: "--insecure-registry={{ kube_service_addresses }} --graph={{ docker_daemon_graph }}  {{ docker_log_opts }}"
docker_bin_dir: "/usr/bin"

# Settings for containerized control plane (etcd/kubelet/secrets)
etcd_deployment_type: docker
kubelet_deployment_type: docker
cert_management: script
vault_deployment_type: docker

# K8s image pull policy (imagePullPolicy)
k8s_image_pull_policy: IfNotPresent

# Monitoring apps for k8s
efk_enabled: false

# Helm deployment
helm_enabled: false

# dnsmasq
# dnsmasq_upstream_dns_servers:
#  - /resolvethiszone.with/10.0.4.250
#  - 8.8.8.8

#  Enable creation of QoS cgroup hierarchy, if true top level QoS and pod cgroups are created. (default true)
# kubelet_cgroups_per_qos: true

# A comma separated list of levels of node allocatable enforcement to be enforced by kubelet.
# Acceptable options are 'pods', 'system-reserved', 'kube-reserved' and ''. Default is "".
# kubelet_enforce_node_allocatable: pods

7 Generate the Cluster Configuration

yum install -y python-pip python34 python34-pip
# define the cluster IPs
IP=(
172.16.120.151
172.16.120.152
172.16.120.153
)
# generate the inventory with kubespray's bundled python script
CONFIG_FILE=./kubespray/inventory/inventory.cfg python3 ./kubespray/contrib/inventory_builder/inventory.py ${IP[*]}

The generated cluster configuration:

$ cat ./kubespray/inventory/inventory.cfg
[all]
node1    ansible_host=172.16.120.151 ip=172.16.120.151
node2    ansible_host=172.16.120.152 ip=172.16.120.152
node3    ansible_host=172.16.120.153 ip=172.16.120.153

[kube-master]
node1    
node2    

[kube-node]
node1    
node2    
node3    

[etcd]
node1    
node2    
node3    

[k8s-cluster:children]
kube-node    
kube-master      

[calico-rr]

[vault]
node1    
node2    
node3

8 Install the Cluster

cd kubespray
ansible-playbook -i inventory/inventory.cfg cluster.yml -b -v --private-key=~/.ssh/id_rsa
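
Once the playbook finishes, verify the cluster from one of the master nodes:

kubectl get nodes
kubectl get pods -n kube-system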

9 Troubleshooting

Error 1: the installation fails with:

fatal: [node1]: FAILED! => {"failed": true, "msg": "The ipaddr filter requires python-netaddr be installed on the ansible controller"}

This requires python-netaddr on the Ansible controller; install it with pip install netaddr.

Error 2: the installation fails with:

{"failed": true, "msg": "The conditional check '{%- set certs = {'sync': False} -%}\n{% if gen_node_certs[inventory_hostname] or\n  (not etcdcert_node.results[0].stat.exists|default(False)) or\n    (not etcdcert_node.results[1].stat.exists|default(False)) or\n      (etcdcert_node.results[1].stat.checksum|default('') != etcdcert_master.files|selectattr(\"path\", \"equalto\", etcdcert_node.results[1].stat.path)|map(attribute=\"checksum\")|first|default('')) -%}\n        {%- set _ = certs.update({'sync': True}) -%}\n{% endif %}\n{{ certs.sync }}' failed. The error was: no test named 'equalto'\n\nThe error appears to have been in '/root/kubespray/roles/etcd/tasks/check_certs.yml': line 57, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: \"Check_certs | Set 'sync_certs' to true\"\n  ^ here\n"}

10 Cleaning Up After a Failed Install

# remove kubernetes files and state
rm -rf /etc/kubernetes/
rm -rf /var/lib/kubelet
rm -rf /var/lib/etcd
rm -rf /usr/local/bin/kubectl
rm -rf /etc/systemd/system/calico-node.service
rm -rf /etc/systemd/system/kubelet.service
# stop and disable the cluster services
systemctl stop etcd.service
systemctl disable etcd.service
systemctl stop calico-node.service
systemctl disable calico-node.service
# remove all containers and restart docker
docker stop $(docker ps -q)
docker rm $(docker ps -a -q)
service docker restart
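
Kubespray also ships a reset.yml playbook that automates most of this teardown; if it is present in your checkout, running it is usually cleaner than the manual steps above (a sketch, against the same inventory):

cd kubespray
ansible-playbook -i inventory/inventory.cfg reset.yml -b -v --private-key=~/.ssh/id_rsa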

11 References