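Prometheus is a CNCF monitoring project that combines metrics collection, monitoring, and alerting. This article describes how to use ArgoCD to deploy Prometheus on a K8S cluster in order to monitor the resources of the whole cluster.
I. Environment
- An existing K8S cluster
- ArgoCD installed in the cluster
II. Deployment
Search online for "deploying Prometheus on K8S" and you will probably be confused by three names: Prometheus Operator, kube-prometheus, and the community helm chart.
The GitHub README helps to tell them apart, so start with the definitions.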
Prometheus Operator vs. kube-prometheus vs. community helm chart
Prometheus Operator
The Prometheus Operator uses Kubernetes custom resources to simplify the deployment and configuration of Prometheus, Alertmanager, and related monitoring components.
kube-prometheus
kube-prometheus provides example configurations for a complete cluster monitoring stack based on Prometheus and the Prometheus Operator. This includes deployment of multiple Prometheus and Alertmanager instances, metrics exporters such as the node_exporter for gathering node metrics, scrape target configuration linking Prometheus to various metrics endpoints, and example alerting rules for notification of potential issues in the cluster.
helm chart
The prometheus-community/kube-prometheus-stack helm chart provides a similar feature set to kube-prometheus. This chart is maintained by the Prometheus community. For more information, please see the chart's readme
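- Prometheus Operator uses K8S custom resources to simplify deploying and configuring Prometheus, Alertmanager, and related monitoring components.
- kube-prometheus builds on Prometheus Operator to deploy a complete Prometheus stack that monitors a K8S cluster.
- The community helm chart delivers a feature set similar to kube-prometheus, packaged as a helm chart.
I want to install Prometheus to monitor a K8S cluster, which means deploying the kube-prometheus stack. I deploy with ArgoCD, and ArgoCD supports helm charts as a source, so the "prometheus-community/kube-prometheus-stack" chart can be used directly.
Download the helm chart
Go to the prometheus-community/kube-prometheus-stack code repository.
It turns out that kube-prometheus-stack is just one of the charts in the "prometheus-community/helm-charts" repository.
Look at helm-charts/charts/kube-prometheus-stack/Chart.yaml: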
dependencies:
  - name: kube-state-metrics
    version: "4.4.*"
    repository: https://prometheus-community.github.io/helm-charts
    condition: kubeStateMetrics.enabled
  - name: prometheus-node-exporter
    version: "2.5.*"
    repository: https://prometheus-community.github.io/helm-charts
    condition: nodeExporter.enabled
  - name: grafana
    version: "6.21.*"
    repository: https://grafana.github.io/helm-charts
    condition: grafana.enabled
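This shows that kube-prometheus-stack depends on three other charts: kube-state-metrics, prometheus-node-exporter, and grafana. The first two live in the same repository, while grafana lives in a separate repository on GitHub.
Because the dependencies point to GitHub-hosted repositories, the actual installation fails when GitHub cannot be reached. The error is: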
helm dependency build failed exit status 1: Error: no repository definition for prometheus-community.github.io/helm-charts, prometheus-community.github.io/helm-charts, grafana.github.io/helm-charts. Please add the missing repos via 'helm repo add'
To work around this, I downloaded the code and rearranged it as helm subcharts.
- Download the code locally
git clone https://github.com/prometheus-community/helm-charts.git
git clone https://github.com/grafana/helm-charts.git grafana-helm-charts
- Regroup the charts as subcharts of kube-prometheus-stack
- Put the resulting code under a path in the config repository managed by ArgoCD
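The regrouping step can be sketched as a small shell function. This is only a sketch under stated assumptions: the function name `vendor_subcharts` and all three paths are hypothetical, and it assumes the grafana chart sits under `charts/grafana` in its Git checkout. It relies on the Helm convention that chart directories placed under a parent chart's `charts/` directory are loaded as subcharts.

```shell
#!/usr/bin/env bash
# Copy kube-prometheus-stack and its three dependencies into one directory tree.
# Arguments (all hypothetical paths):
#   $1  checkout of prometheus-community/helm-charts
#   $2  checkout of the grafana charts repository
#   $3  destination directory inside the ArgoCD config repository
vendor_subcharts() {
  local community="$1" grafana="$2" dest="$3"
  local stack="$dest/kube-prometheus-stack"

  mkdir -p "$dest" || return 1
  # Copy the parent chart first.
  cp -R "$community/charts/kube-prometheus-stack" "$stack" || return 1
  # Helm treats chart directories under charts/ as subcharts.
  mkdir -p "$stack/charts" || return 1
  local name
  for name in kube-state-metrics prometheus-node-exporter; do
    cp -R "$community/charts/$name" "$stack/charts/$name" || return 1
  done
  cp -R "$grafana/charts/grafana" "$stack/charts/grafana" || return 1
}

# Example (paths are illustrative):
#   vendor_subcharts ./helm-charts ./grafana-helm-charts ./prometheus
```

With a destination of `prometheus`, the parent chart ends up at `prometheus/kube-prometheus-stack`, which is the kind of path an ArgoCD Application can then point at.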
Define the Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus
  namespace: argocd
spec:
  destination:
    namespace: foundation
    server: https://kubernetes.default.svc
  project: foundation
  source:
    path: prometheus/kube-prometheus-stack
    repoURL: https://10.112.21.246/adops/operation-devops.git
    targetRevision: HEAD
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
    automated:
      selfHeal: true
      prune: true
Sync with ArgoCD
III. Troubleshooting
helm dependency build failed
helm dependency build failed exit status 1: Error: no repository definition for prometheus-community.github.io/helm-charts, prometheus-community.github.io/helm-charts, grafana.github.io/helm-charts. Please add the missing repos via 'helm repo add'
Fix: add the dependency charts to your own repository as local subcharts.
the lock file (Chart.lock) is out of sync with the dependencies file
rpc error: code = Unknown desc = helm dependency build failed exit status 1: Error: the lock file (Chart.lock) is out of sync with the dependencies file (Chart.yaml). Please update the dependencies
This error appears because I modified the chart files.
Deleting the Chart.lock file under kube-prometheus-stack resolves it.
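A minimal sketch of that cleanup, assuming the chart was vendored into a local directory of the config repository. The function name `remove_chart_lock` and the example path are illustrative, not taken from the original setup.

```shell
# Remove a stale Chart.lock so it no longer conflicts with the edited Chart.yaml.
# $1 is the chart directory (hypothetical path).
remove_chart_lock() {
  local chart_dir="$1"
  if [ -f "$chart_dir/Chart.lock" ]; then
    rm -- "$chart_dir/Chart.lock" || return 1
    echo "removed $chart_dir/Chart.lock"
  else
    echo "no Chart.lock in $chart_dir"
  fi
}

# Example (path is illustrative):
#   remove_chart_lock prometheus/kube-prometheus-stack
```

ArgoCD reads the chart from Git, so the deletion only takes effect once it is committed and pushed to the config repository.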
Image pull problems
- kube-state-metrics
# version 2.3.0
registry.cn-wulanchabu.aliyuncs.com/moge1/kube-state-metrics:v2.3.0
# version 2.3.1
registry.cn-wulanchabu.aliyuncs.com/moge1/kube-state-metrics:v2.3.1
- kube-webhook-certgen
Mirror of the official k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0 image:
docker pull koala2020/ingress-nginx-kube-webhook-certgen:v1
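The article does not say how the mirrored images were wired into the chart. One common approach, sketched here purely as an assumption, is to pull the mirror and retag it under the name the chart expects. The helper name `pull_and_retag` and the `DRY_RUN` switch are hypothetical. Retagging only helps on nodes whose container runtime is Docker and when the image pull policy allows a locally present image to be used.

```shell
# Pull an image from a mirror and retag it under the official name.
# Set DRY_RUN=1 to print the docker commands instead of running them.
pull_and_retag() {
  local mirror="$1" official="$2"
  local run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi
  $run docker pull "$mirror" || return 1
  $run docker tag "$mirror" "$official" || return 1
}

# Example, using the image names listed above:
#   pull_and_retag koala2020/ingress-nginx-kube-webhook-certgen:v1 \
#     k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.0
```

An alternative that avoids touching each node is to override the image repository in the chart's values so that pods pull from the mirror directly.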
ArgoCD sync shows red
CustomResourceDefinition.apiextensions.k8s.io "prometheuses.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
A Google search turns up the same problem: www.fuscin.com/prometheus-…
Installing with helm from the CLI does not hit this problem; it only appears with ArgoCD.
Got same issue while deploying kube-prometheus-stack via Helm by ArgoCD
CustomResourceDefinition.apiextensions.k8s.io "prometheuses.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
Works fine when applying via Helm from CLI
It has no practical impact, but the red status is annoying to look at.
Still marking the ArgoCD App with `Sync Failed`
It currently is not a real issue but really annoying in the eyes to see a red application.
Could this be fixed?
Adding an annotation solves it:
For ArgoCD you can add the annotation `argocd.argoproj.io/sync-options: Replace=true` to the CRD
I can confirm that in this way it works.
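Since ArgoCD renders the CRD from Git, the annotation has to land in the CRD manifest stored in the config repository. The following is a rough sketch of one way to add it: the function name `add_replace_annotation` and the example file path are assumptions, and it only handles manifests that already have a two-space-indented `annotations:` key under `metadata:`.

```shell
# Add the ArgoCD Replace=true sync option to a CRD manifest in place.
# Assumes the file already contains a two-space-indented "annotations:" key
# under "metadata:"; files without one are left unchanged.
add_replace_annotation() {
  local crd_file="$1"
  local key="argocd.argoproj.io/sync-options"
  # Skip files that already carry the annotation.
  if grep -q "$key" "$crd_file"; then
    return 0
  fi
  # Print every line, and insert the annotation right after the first
  # "  annotations:" line.
  awk -v line="    $key: Replace=true" '
    { print }
    !done && /^  annotations:[[:space:]]*$/ { print line; done = 1 }
  ' "$crd_file" > "$crd_file.tmp" && mv "$crd_file.tmp" "$crd_file"
}

# Example (file path is an assumption about the vendored chart layout):
#   add_replace_annotation prometheus/kube-prometheus-stack/crds/crd-prometheuses.yaml
```

The function is idempotent, so it can be rerun after refreshing the vendored chart from upstream.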