Four Simple Steps to Custom Metrics Monitoring with Prometheus Operator

> This article is from [Rancher Labs](https://mp.weixin.qq.com/s/ZZPMHI73GuYaT2MyqHfpxg "Rancher Labs")

In past articles, we have spent a good deal of space on monitoring. That is because when you manage Kubernetes clusters, everything changes extremely fast, so a tool that monitors cluster health and resource metrics is essential.

In Rancher 2.5, we introduced a new version of monitoring based on Prometheus Operator, which provides Kubernetes-native deployment and management of Prometheus and its related monitoring components. Prometheus Operator lets you monitor the state and processes of cluster nodes, Kubernetes components, and application workloads. It also lets you define alerts on the metrics Prometheus collects and build custom dashboards, so the collected metrics are easy to visualize in Grafana. For more details on the new monitoring components, see:

https://rancher.com/docs/rancher/v2.x/en/monitoring-alerting/v2.5/

The new monitoring also ships prometheus-adapter, which developers can use to scale their workloads based on custom metrics and HPA.

In this article, we will explore how to use Prometheus Operator to scrape custom metrics and use those metrics for advanced workload management.

## Installing Prometheus

Installing Prometheus in Rancher 2.5 is very simple. Go to Cluster Explorer -> Apps and install rancher-monitoring.

![](https://oscimg.oschina.net/oscnet/up-de4a5bf406bf78a6bb597d3d96738f49662.png)

You should be aware of the following default settings:

- `prometheus-adapter` is enabled as part of the chart installation
- `ServiceMonitorNamespaceSelector` is left empty, which allows Prometheus to collect ServiceMonitors in all namespaces

![](https://oscimg.oschina.net/oscnet/up-0411ad557c0b60d98f16ed8e1e1eb4b8d99.JPEG)

Once the installation completes, we can access the monitoring components from Cluster Explorer.

![](https://oscimg.oschina.net/oscnet/up-af1190e3a207a7ccb949105371eca0056f7.JPEG)

## Deploying the Workload

Now let's deploy a sample workload that exposes custom metrics from the application layer. The workload is a simple application that has been instrumented with the Prometheus client_golang library and serves some custom metrics on the `/metrics` endpoint.

It has two metrics:

- http_requests_total
- http_request_duration_seconds

The following manifest deploys the workload, its Service, and an Ingress for reaching the workload:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: prometheus-example-app
  name: prometheus-example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: prometheus-example-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: prometheus-example-app
    spec:
      containers:
      - name: prometheus-example-app
        image: gmehta3/demo-app:metrics
        ports:
        - name: web
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-example-app
  labels:
    app.kubernetes.io/name: prometheus-example-app
spec:
  selector:
    app.kubernetes.io/name: prometheus-example-app
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
    name: web   # the ServiceMonitor refers to this port by name
---
# networking.k8s.io/v1beta1 was removed in Kubernetes 1.22;
# newer clusters need networking.k8s.io/v1 and its backend syntax.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: prometheus-example-app
spec:
  rules:
  - host: hpa.demo
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-example-app
          servicePort: 8080
```

## Deploying the ServiceMonitor

ServiceMonitor is a custom resource definition (CRD) that lets us declaratively define how a dynamic set of services should be monitored.

You can find the complete ServiceMonitor specification here:

https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#servicemonitor

Now let's deploy the ServiceMonitor, which Prometheus uses to discover and scrape the pods that make up the prometheus-example-app Kubernetes Service.

```
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-app
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: prometheus-example-app
  endpoints:
  - port: web   # name of the Service port to scrape
```

As you can see, users can now browse the ServiceMonitor in Rancher monitoring.

![](https://oscimg.oschina.net/oscnet/up-09ba22b67531f884eb1ed1cf084b11aff26.JPEG)

Shortly afterward, the new ServiceMonitor and the pods associated with the Service should show up in Prometheus service discovery.

![](https://oscimg.oschina.net/oscnet/up-fbe33bcc2ee419fbd26061630232caa898a.JPEG)

We can also see the metrics in Prometheus.

![](https://oscimg.oschina.net/oscnet/up-3df5881540e2389b432e27af4083c587ab5.JPEG)

## Deploying the Grafana Dashboard

In Rancher 2.5, monitoring lets users store Grafana dashboards as ConfigMaps in the `cattle-dashboards` namespace.

Users or cluster administrators can now add more dashboards to this namespace to extend Grafana with custom dashboards.

Dashboard ConfigMap example:

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-example-app-dashboard
  namespace: cattle-dashboards
  labels:
    grafana_dashboard: "1"
data:
  prometheus-example-app.json: |
    {
      "annotations": {
        "list": [
          {
            "builtIn": 1,
            "datasource": "-- Grafana --",
            "enable": true,
            "hide": true,
            "iconColor": "rgba(0, 211, 255, 1)",
            "name": "Annotations & Alerts",
            "type": "dashboard"
          }
        ]
      },
      "editable": true,
      "gnetId": null,
      "graphTooltip": 0,
      "links": [],
      "panels": [
        {
          "aliasColors": {},
          "bars": false,
          "dashLength": 10,
          "dashes": false,
          "datasource": null,
          "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] },
          "fill": 1,
          "fillGradient": 0,
          "gridPos": { "h": 9, "w": 12, "x": 0, "y": 0 },
          "hiddenSeries": false,
          "id": 2,
          "legend": {
            "avg": false,
            "current": false,
            "max": false,
            "min": false,
            "show": true,
            "total": false,
            "values": false
          },
          "lines": true,
          "linewidth": 1,
          "nullPointMode": "null",
          "percentage": false,
          "pluginVersion": "7.1.5",
          "pointradius": 2,
          "points": false,
          "renderer": "flot",
          "seriesOverrides": [],
          "spaceLength": 10,
          "stack": false,
          "steppedLine": false,
          "targets": [
            {
              "expr": "rate(http_requests_total{code=\"200\",service=\"prometheus-example-app\"}[5m])",
              "instant": false,
              "interval": "",
              "legendFormat": "",
              "refId": "A"
            }
          ],
          "thresholds": [],
          "timeFrom": null,
          "timeRegions": [],
          "timeShift": null,
          "title": "http_requests_total_200",
          "tooltip": { "shared": true, "sort": 0, "value_type": "individual" },
          "type": "graph",
          "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] },
          "yaxes": [
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true },
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }
          ],
          "yaxis": { "align": false, "alignLevel": null }
        },
        {
          "aliasColors": {},
          "bars": false,
          "dashLength": 10,
          "dashes": false,
          "datasource": null,
          "description": "",
          "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] },
          "fill": 1,
          "fillGradient": 0,
          "gridPos": { "h": 8, "w": 12, "x": 0, "y": 9 },
          "hiddenSeries": false,
          "id": 4,
          "legend": {
            "avg": false,
            "current": false,
            "max": false,
            "min": false,
            "show": true,
            "total": false,
            "values": false
          },
          "lines": true,
          "linewidth": 1,
          "nullPointMode": "null",
          "percentage": false,
          "pluginVersion": "7.1.5",
          "pointradius": 2,
          "points": false,
          "renderer": "flot",
          "seriesOverrides": [],
          "spaceLength": 10,
          "stack": false,
          "steppedLine": false,
          "targets": [
            {
              "expr": "rate(http_requests_total{code!=\"200\",service=\"prometheus-example-app\"}[5m])",
              "interval": "",
              "legendFormat": "",
              "refId": "A"
            }
          ],
          "thresholds": [],
          "timeFrom": null,
          "timeRegions": [],
          "timeShift": null,
          "title": "http_requests_total_not_200",
          "tooltip": { "shared": true, "sort": 0, "value_type": "individual" },
          "type": "graph",
          "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] },
          "yaxes": [
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true },
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }
          ],
          "yaxis": { "align": false, "alignLevel": null }
        }
      ],
      "schemaVersion": 26,
      "style": "dark",
      "tags": [],
      "templating": { "list": [] },
      "time": { "from": "now-15m", "to": "now" },
      "timepicker": {
        "refresh_intervals": [ "5s", "10s", "30s", "1m", "5m", "15m", "30m", "1h", "2h", "1d" ]
      },
      "timezone": "",
      "title": "prometheus example app",
      "version": 1
    }
```

Users should now be able to open the prometheus example app dashboard in Grafana.

![](https://oscimg.oschina.net/oscnet/up-8e4137b2ff1b9de893f4a309df8b7ca8b42.JPEG)

### HPA with Custom Metrics

This section assumes that `prometheus-adapter` is already installed as part of monitoring. In fact, the monitoring installer installs prometheus-adapter by default.

Users can now create an HPA spec like the following:

```
# autoscaling/v2beta2 was removed in Kubernetes 1.26;
# newer clusters need autoscaling/v2.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: prometheus-example-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: prometheus-example-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Object
    object:
      describedObject:
        kind: Service
        name: prometheus-example-app
      metric:
        name: http_requests
      target:
        averageValue: "5"
        type: AverageValue
```

For more information about HPA, see:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

We will use the custom http_requests_total metric to autoscale the pods. The HPA spec refers to it as `http_requests`, the name under which prometheus-adapter's default rules typically expose this counter as a per-second rate.

![](https://oscimg.oschina.net/oscnet/up-0e746b5bb322d69a39cc42b8a8819db943f.JPEG)

Now we can generate some sample load to watch the HPA in action. I use `hey` for this.

```
hey -c 10 -n 5000 http://hpa.demo
```

![](https://oscimg.oschina.net/oscnet/up-5069a496ee916d0ef5127945cba19afea9f.gif)

## Summary

In this article, we explored the flexibility of the new monitoring in Rancher 2.5. Developers and cluster administrators can use this stack to monitor their workloads, deploy visualizations, and take advantage of the advanced workload management available within Kubernetes.
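
As a rough aid for reading the HPA spec in the custom metrics section, the Python sketch below approximates how the autoscaler turns an Object metric with an `AverageValue` target into a replica count: the metric value is divided by the number of pods, compared with the target, and the result is clamped to `minReplicas` and `maxReplicas`. The function name `desired_replicas` and its parameters are hypothetical and chosen only for illustration. The 0.1 tolerance is the documented Kubernetes default, and the real controller also applies stabilization windows and pod-readiness handling that are omitted here.

```python
import math


def desired_replicas(metric_value, target_average, current_replicas,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Simplified HPA calculation for an Object metric with an AverageValue target.

    This is an illustrative sketch, not the controller's actual code.
    """
    # Per-pod value relative to the target (1.0 means exactly on target).
    ratio = metric_value / (target_average * current_replicas)

    # Skip scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Equivalent to ceil(current_replicas * ratio), written without the
    # intermediate multiplication to avoid floating-point noise.
    desired = math.ceil(metric_value / target_average)

    # Respect the minReplicas / maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Target of 5 requests per second per pod, as in the HPA spec.
    print(desired_replicas(metric_value=12, target_average=5, current_replicas=1))
    print(desired_replicas(metric_value=100, target_average=5, current_replicas=2))
```

With the values from the spec (`averageValue: "5"`, `minReplicas: 1`, `maxReplicas: 5`), a sustained rate well above 5 requests per second per pod pushes the count toward the maximum, which matches the scale-up seen when running `hey` against the Ingress.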