【Kubernetes】Best Practices 3: Service Deployment and Elastic Scaling
Author: 彭靖田 (Peng Jingtian)
In the Kubernetes world, every service runs in containers, and the simplest container group is the Pod. Building on concrete real-world workloads, Kubernetes abstracts higher-level container groups such as ReplicaSet, Deployment, and Job. For long-running web-style services, two requirements matter most: high availability and scalability. In other words, Kubernetes aims to free service developers and operators from tending each service's software and hardware configuration like a pet, and instead lets them manage services like cattle, with Kubernetes guaranteeing that the services keep running stably.
To this end, Kubernetes provides ReplicaSet and Deployment.
ReplicaSet is the next generation of the ReplicationController. In early Kubernetes versions, elastic scaling of identical service replicas was handled by the ReplicationController. In recent Kubernetes versions, the only difference between the two lies in label selector support: ReplicaSet supports set-based label selectors, while ReplicationController supports only equality-based label selectors. The difference between the two styles is covered in the labels guide: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
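The distinction between the two selector styles can be illustrated with two small fragments; the label keys and values here are illustrative only:

```yaml
# equality-based selector (the only style ReplicationController understands):
# matches Pods whose "tier" label equals "frontend"
selector:
  tier: frontend

# set-based selector (supported by ReplicaSet):
# matches Pods whose "tier" label is any of the listed values
selector:
  matchExpressions:
  - {key: tier, operator: In, values: [frontend, staging-frontend]}
```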
When the replicas are identical in nature and you simply need a fixed number of them, ReplicaSet is a good fit, for example an nginx or redis service.
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: frontend
  # these labels can be applied automatically
  # from the labels in the pod template if not set
  # labels:
  #   app: guestbook
  #   tier: frontend
spec:
  # this replicas value is default
  # modify it according to your case
  replicas: 3
  # selector can be applied automatically
  # from the labels in the pod template if not set,
  # but we are specifying the selector here to
  # demonstrate its usage.
  selector:
    matchLabels:
      tier: frontend
    matchExpressions:
      - {key: tier, operator: In, values: [frontend]}
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google_samples/gb-frontend:v3
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # If your cluster config does not include a dns service, then to
          # instead access environment variables to find service host
          # info, comment out the 'value: dns' line above, and uncomment the
          # line below.
          # value: env
        ports:
        - containerPort: 80
Suppose this job manifest is saved as frontend.yaml. Create 3 identical replicas of the frontend service in the Kubernetes cluster:
$ kubectl create -f frontend.yaml
replicaset "frontend" created
$ kubectl describe rs/frontend
Name: frontend
Namespace: default
Image(s): gcr.io/google_samples/gb-frontend:v3
Selector: tier=frontend,tier in (frontend)
Labels: app=guestbook,tier=frontend
Replicas: 3 current / 3 desired
Pods Status: 3 Running / 0 Waiting / 0 Succeeded / 0 Failed
No volumes.
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {replicaset-controller } Normal SuccessfulCreate Created pod: frontend-qhloh
1m 1m 1 {replicaset-controller } Normal SuccessfulCreate Created pod: frontend-dnjpy
1m 1m 1 {replicaset-controller } Normal SuccessfulCreate Created pod: frontend-9si5l
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
frontend-9si5l 1/1 Running 0 1m
frontend-dnjpy 1/1 Running 0 1m
frontend-qhloh 1/1 Running 0 1m
With that, the 3 frontend replicas can serve traffic. But for more convenient service management, such as version upgrades, rollbacks, and elastic scaling, we need to bring out another powerful tool: Deployment.
Deployment provides a declarative syntax for managing Pods and ReplicaSets: users describe only the state they want their service to reach, and Kubernetes works its "black magic" to handle all the scheduling behind the scenes.
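The upgrade and rollback workflow just mentioned is typically driven through kubectl's rollout commands. A sketch against the inference Deployment defined next (the 0.5.2 image tag is hypothetical):

```shell
# roll out a new image version (the 0.5.2 tag is hypothetical)
kubectl set image deploy/mnistx-65728444-inference tf-serving=xx.xx.xx.xx:xxxx/mind/tf-serving:0.5.2
# watch the rolling update progress
kubectl rollout status deploy/mnistx-65728444-inference
# roll back to the previous revision if something goes wrong
kubectl rollout undo deploy/mnistx-65728444-inference
```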
Take the inference service of our MIND deep learning platform as an example, and suppose its job manifest is inference-deploy.yaml:
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: mnistx-65728444-inference
spec:
  replicas: 2
  template:
    metadata:
      labels:
        name: mnistx-65728444
        type: inference
    spec:
      containers:
      - name: tf-serving
        image: xx.xx.xx.xx:xxxx/mind/tf-serving:0.5.1
        ports:
        - containerPort: xxxx
        command:
        - "./tensorflow_model_server"
        args:
        - "--model_name=mnist"
        - "--model_base_path=/mnt/nfs/dlks/mnistx-60496603/service"
        - "--port=xxxx"
        volumeMounts:
        - name: mynfs
          mountPath: /mnt/nfs/dlks
        securityContext:
          privileged: true
      volumes:
      - name: mynfs
        nfs:
          path: /
          server: xx.xx.xx.xx
      restartPolicy: Always
We defined 2 replicas of the MNIST inference service; create the service in the Kubernetes cluster:
$ kubectl create -f inference-deploy.yaml
deployment "mnistx-65728444-inference" created
$ kubectl get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
mnistx-65728444-inference 2 2 2 2 33s
$ kubectl get rs
NAME DESIRED CURRENT READY AGE
mnistx-65728444-inference-409334488 2 2 2 2m
$ kubectl get po
NAME READY STATUS RESTARTS AGE
mnistx-65728444-inference-409334488-08t2l 1/1 Running 0 2m
mnistx-65728444-inference-409334488-jc9mp 1/1 Running 0 2m
It is easy to see that Deployment and ReplicaSet both describe a service's deployment status from a higher level of abstraction, in terms such as desired containers (DESIRED), current containers (CURRENT), up-to-date containers (UP-TO-DATE), and available containers (AVAILABLE/READY), while a Pod focuses on the running state and restart count of the containers themselves.
To emphasize again: all of the higher-level container abstractions are ultimately realized by Pods; what makes them "higher-level" is that their semantics match more concrete service needs.
To let the MNIST inference service accept requests from outside the Kubernetes cluster, we also need to create a reverse-proxy Service resource; suppose its manifest is inference-service.yaml.
kind: Service
apiVersion: v1
metadata:
  name: mnistx-65728444-inference
spec:
  selector:
    name: mnistx-65728444
  type: NodePort
  ports:
  - protocol: TCP
    port: xxx
    targetPort: xxxx
    nodePort: xxxxx
Create the Service and inspect the endpoints it proxies:
$ kubectl create -f inference-service.yaml
service "mnistx-65728444-inference" created
$ kubectl describe svc
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Selector: <none>
Type: ClusterIP
IP: xx.xx.xx.xx
Port: https 443/TCP
Endpoints: xx.xx.xx.xx:xxxx
Session Affinity: ClientIP
No events.
Name: mnistx-65728444-inference
Namespace: default
Labels: <none>
Selector: name=mnistx-65728444
Type: NodePort
IP: xx.xx.xx.xx
Port: <unset> xxxx/TCP
NodePort: <unset> xxxx/TCP
Endpoints: xx.xx.xx.xx:xxxx, xx.xx.xx.xx:xxx
Session Affinity: None
No events.
$ kubectl describe po | grep IP
IP: xx.xx.xx.xx
IP: xx.xx.xx.xx
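With the NodePort Service in place, external clients can reach the inference service via any cluster node's IP on the allocated node port; the placeholders below stand in for the values elided above:

```shell
# <node-ip> is any cluster node's address, <node-port> the nodePort from the manifest
curl http://<node-ip>:<node-port>/
```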
Now suppose traffic to the MNIST inference service suddenly grows tenfold and we need to scale it out to 20 replicas. In the Kubernetes world, this takes a single command:
$ kubectl scale --replicas=20 deploy/mnistx-65728444-inference
deployment "mnistx-65728444-inference" scaled
$ kubectl get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
mnistx-65728444-inference 20 20 20 3 13m
The Service instantly picks up the newly added endpoints:
$ kubectl describe svc
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Selector: <none>
Type: ClusterIP
IP: xx.xx.xx.xx
Port: https 443/TCP
Endpoints: xx.xx.xx.xx:xxxx
Session Affinity: ClientIP
No events.
Name: mnistx-65728444-inference
Namespace: default
Labels: <none>
Selector: name=mnistx-65728444
Type: NodePort
IP: xx.xx.xx.xx
Port: <unset> xxxx/TCP
NodePort: <unset> xxxx/TCP
Endpoints: xx.xx.xx.xx:xxxx, xx.xx.xx.xx:xxxx, xx.xx.xx.xx:xxxx + 17 more...
Session Affinity: None
No events.
Five seconds later, every newly added replica is Ready; Kubernetes schedules remarkably quickly.
$ kubectl get rs
NAME DESIRED CURRENT READY AGE
mnistx-65728444-inference-409334488 20 20 20 14m
$ kubectl get po
NAME READY STATUS RESTARTS AGE
mnistx-65728444-inference-409334488-08t2l 1/1 Running 0 14m
mnistx-65728444-inference-409334488-3hrbn 1/1 Running 0 36s
mnistx-65728444-inference-409334488-4p7h4 1/1 Running 0 36s
mnistx-65728444-inference-409334488-775r3 1/1 Running 0 36s
mnistx-65728444-inference-409334488-91lx4 1/1 Running 1 36s
mnistx-65728444-inference-409334488-bj1mh 1/1 Running 0 36s
mnistx-65728444-inference-409334488-d16qn 1/1 Running 0 36s
mnistx-65728444-inference-409334488-fv7g6 1/1 Running 0 36s
mnistx-65728444-inference-409334488-hss1g 1/1 Running 0 36s
mnistx-65728444-inference-409334488-hvjbl 1/1 Running 0 36s
mnistx-65728444-inference-409334488-jc9mp 1/1 Running 0 14m
mnistx-65728444-inference-409334488-q8hq1 1/1 Running 0 36s
mnistx-65728444-inference-409334488-qcpkv 1/1 Running 0 36s
mnistx-65728444-inference-409334488-qdqmb 1/1 Running 0 36s
mnistx-65728444-inference-409334488-qt7wn 1/1 Running 0 36s
mnistx-65728444-inference-409334488-r5scs 1/1 Running 0 36s
mnistx-65728444-inference-409334488-sv7zf 1/1 Running 0 36s
mnistx-65728444-inference-409334488-wr8wv 1/1 Running 0 36s
mnistx-65728444-inference-409334488-ztp48 1/1 Running 0 36s
mnistx-65728444-inference-409334488-zw933 1/1 Running 0 36s
If request volume drops, shrinking the replica count is likewise a one-line affair:
$ kubectl scale --replicas=1 deploy/mnistx-65728444-inference
deployment "mnistx-65728444-inference" scaled
$ kubectl get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
mnistx-65728444-inference 1 1 1 1 17m
$ kubectl get rs
NAME DESIRED CURRENT READY AGE
mnistx-65728444-inference-409334488 1 1 1 17m
$ kubectl get po
NAME READY STATUS RESTARTS AGE
mnistx-65728444-inference-409334488-08t2l 1/1 Running 0 17m
mnistx-65728444-inference-409334488-3hrbn 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-4p7h4 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-775r3 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-91lx4 1/1 Terminating 1 4m
mnistx-65728444-inference-409334488-bj1mh 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-d16qn 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-fv7g6 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-hss1g 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-hvjbl 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-jc9mp 1/1 Terminating 0 17m
mnistx-65728444-inference-409334488-q8hq1 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-qcpkv 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-qdqmb 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-qt7wn 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-r5scs 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-sv7zf 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-wr8wv 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-ztp48 1/1 Terminating 0 4m
mnistx-65728444-inference-409334488-zw933 1/1 Terminating 0 4m
Throughout the elastic scaling process, the service remains in the Running state, greatly reducing the burden on operators.
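Manual kubectl scale can also be complemented with a HorizontalPodAutoscaler, which adjusts the replica count based on observed CPU usage. A minimal sketch, assuming cluster metrics are available; the min/max and CPU thresholds are illustrative:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: mnistx-65728444-inference
spec:
  # the Deployment whose replica count the autoscaler manages
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: mnistx-65728444-inference
  minReplicas: 2
  maxReplicas: 20
  # scale out when average CPU utilization exceeds 80% (illustrative)
  targetCPUUtilizationPercentage: 80
```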
In summary: Kubernetes handles elastic scaling of homogeneous services extremely well, and its version upgrades, rollbacks, canary releases, and service discovery are all at the top of the industry. It is therefore no surprise that Kubernetes has become the de facto standard for container orchestration. Thanks to the Google Kubernetes team members and the many contributors who made this possible.