This article describes how to deploy and use Prometheus on Kubernetes (k8s) with the kube-prometheus-stack Helm chart.
Deploying with Helm
- Add the repo
root@yong:~/tmp# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" already exists with the same configuration, skipping
[root@xy-5-server14 kubelet]# helm repo update prometheus-community
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
- List the charts in the repo:
[root@xy-5-server14 ~]# helm search repo prometheus-community
NAME CHART VERSION APP VERSION DESCRIPTION
prometheus-community/alertmanager 0.26.1 v0.25.0 The Alertmanager handles alerts sent by client ...
prometheus-community/alertmanager-snmp-notifier 0.1.0 v1.4.0 The SNMP Notifier handles alerts coming from Pr...
prometheus-community/jiralert 1.2.0 v1.3.0 A Helm chart for Kubernetes to install jiralert
prometheus-community/kube-prometheus-stack 45.7.1 v0.63.0 kube-prometheus-stack collects Kubernetes manif...
prometheus-community/kube-state-metrics 5.0.1 2.8.2 Install kube-state-metrics to generate and expo...
prometheus-community/prom-label-proxy 0.2.0 v0.6.0 A proxy that enforces a given label in a given ...
prometheus-community/prometheus 19.7.2 v2.41.0 Prometheus is a monitoring system and time seri...
prometheus-community/prometheus-adapter 4.1.1 v0.10.0 A Helm chart for k8s prometheus adapter
prometheus-community/prometheus-blackbox-exporter 7.6.1 0.23.0 Prometheus Blackbox Exporter
prometheus-community/prometheus-cloudwatch-expo... 0.24.0 0.15.1 A Helm chart for prometheus cloudwatch-exporter
prometheus-community/prometheus-conntrack-stats... 0.5.5 v0.4.11 A Helm chart for conntrack-stats-exporter
prometheus-community/prometheus-consul-exporter 1.0.0 0.4.0 A Helm chart for the Prometheus Consul Exporter
prometheus-community/prometheus-couchdb-exporter 0.2.1 1.0 A Helm chart to export the metrics from couchdb...
prometheus-community/prometheus-druid-exporter 1.0.0 v0.11.0 Druid exporter to monitor druid metrics with Pr...
prometheus-community/prometheus-elasticsearch-e... 5.0.0 1.5.0 Elasticsearch stats exporter for Prometheus
prometheus-community/prometheus-fastly-exporter 0.1.1 7.2.4 A Helm chart for the Prometheus Fastly Exporter
prometheus-community/prometheus-json-exporter 0.6.1 v0.5.0 Install prometheus-json-exporter
prometheus-community/prometheus-kafka-exporter 1.8.0 v1.6.0 A Helm chart to export the metrics from Kafka i...
prometheus-community/prometheus-mongodb-exporter 3.1.2 0.31.0 A Prometheus exporter for MongoDB metrics
prometheus-community/prometheus-mysql-exporter 1.13.0 v0.14.0 A Helm chart for prometheus mysql exporter with...
prometheus-community/prometheus-nats-exporter 2.11.0 0.10.1 A Helm chart for prometheus-nats-exporter
prometheus-community/prometheus-nginx-exporter 0.1.0 0.11.0 A Helm chart for the Prometheus NGINX Exporter
prometheus-community/prometheus-node-exporter 4.14.0 1.5.0 A Helm chart for prometheus node-exporter
prometheus-community/prometheus-operator 9.3.2 0.38.1 DEPRECATED - This chart will be renamed. See ht...
prometheus-community/prometheus-operator-crds 2.0.0 0.63.0 A Helm chart that collects custom resource defi...
prometheus-community/prometheus-pgbouncer-exporter 0.1.0 1.18.0 A Helm chart for prometheus pgbouncer-exporter
prometheus-community/prometheus-pingdom-exporter 2.4.1 20190610-1 A Helm chart for Prometheus Pingdom Exporter
prometheus-community/prometheus-postgres-exporter 4.4.0 0.11.1 A Helm chart for prometheus postgres-exporter
prometheus-community/prometheus-pushgateway 2.1.3 v1.5.1 A Helm chart for prometheus pushgateway
prometheus-community/prometheus-rabbitmq-exporter 1.4.0 v0.29.0 Rabbitmq metrics exporter for prometheus
prometheus-community/prometheus-redis-exporter 5.3.0 v1.44.0 Prometheus exporter for Redis metrics
prometheus-community/prometheus-smartctl-exporter 0.3.1 v0.8.0 A Helm chart for Kubernetes
prometheus-community/prometheus-snmp-exporter 1.4.0 v0.21.0 Prometheus SNMP Exporter
prometheus-community/prometheus-stackdriver-exp... 4.2.0 0.13.0 Stackdriver exporter for Prometheus
prometheus-community/prometheus-statsd-exporter 0.7.0 v0.22.8 A Helm chart for prometheus stats-exporter
prometheus-community/prometheus-to-sd 0.4.2 0.5.2 Scrape metrics stored in prometheus format and ...
- Install
root@yong:~/tmp# helm pull prometheus-community/kube-prometheus-stack --version 45.7.1
root@yong:/tmp# ls
kube-prometheus-stack-45.7.1.tgz
root@yong:/tmp# tar -xvf kube-prometheus-stack-45.7.1.tgz
root@yong:/tmp# cd kube-prometheus-stack/
root@yong:/tmp/kube-prometheus-stack# ls
Chart.lock charts Chart.yaml CONTRIBUTING.md crds README.md templates values.yaml
root@yong:/tmp# helm show values prometheus-community/prometheus > prometheus.yaml-default
[root@xy-5-server14 kube-prometheus-stack]# helm upgrade --install prometheus-stack . \
> -f values.yaml \
> -n kube-monitor \
> --create-namespace \
> --version 45.7.1 --debug
Release "prometheus-stack" has been upgraded. Happy Helming!
NAME: prometheus-stack
LAST DEPLOYED: Thu Mar 23 17:53:48 2023
NAMESPACE: kube-monitor
STATUS: deployed
REVISION: 4
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace kube-monitor get pods -l "release=prometheus-stack"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
- Uninstall
root@yong:/tmp# helm -n kube-server uninstall prometheus
# Grafana password
[root@xy-5-server14 prometheus]# kubectl -n kube-monitor get secrets |grep Opaque|grep grafana|awk '{print $1}'|xargs kubectl -n kube-monitor get secrets -o yaml|grep admin-password|awk '{print $2}'|base64 -d && echo ''
prom-operator
[root@xy-5-server14 prometheus]#
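The pipeline above works because every value under a Secret's data field is base64-encoded. A minimal Python sketch of just the decoding step follows; the secret_data dict is a hypothetical stand-in for the Grafana Secret (the key names admin-user and admin-password are assumptions about the chart's Secret), and decode_secret_value is an illustrative helper, not part of any Kubernetes client.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode one base64-encoded value taken from a Secret's `data` map."""
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical `data` section, shaped like `kubectl get secret -o yaml` output.
# The key names are assumptions for illustration.
secret_data = {
    "admin-user": base64.b64encode(b"admin").decode("ascii"),
    "admin-password": base64.b64encode(b"prom-operator").decode("ascii"),
}

print(decode_secret_value(secret_data["admin-password"]))
```

With the value shown in the transcript, this prints prom-operator, the same result the base64 -d step produces.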
Adding scrape configurations
Add the following to your values file:
# prom-custom-values.yaml
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: kubernetes-service-endpoints
        kubernetes_sd_configs:
          - role: service
        relabel_configs:
          # annotation 'prometheus.io/scrape' must be set to 'true'
          - action: keep
            regex: true
            source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          # service cannot be in kube-system or prom namespaces
          - action: drop
            regex: (kube-system|prom)
            source_labels: [__meta_kubernetes_namespace]
          # service port name must end with word 'metrics'
          - action: keep
            regex: .*metrics
            source_labels: [__meta_kubernetes_service_port_name]
          # allow override of http scheme
          - action: replace
            regex: (https?)
            source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            target_label: __scheme__
          # allow override of default /metrics path
          - action: replace
            regex: (.+)
            source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            target_label: __metrics_path__
          # allow override of default port
          - action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            target_label: __address__
          - {action: labelmap, regex: __meta_kubernetes_service_label_(.+)}
          - action: replace
            source_labels: [__meta_kubernetes_namespace]
            target_label: kubernetes_namespace
          - action: replace
            source_labels: [__meta_kubernetes_service_name]
            target_label: kubernetes_name
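The "allow override of default port" rule is the least obvious one in this config, so here is a rough Python sketch of how a single replace step behaves. It assumes Prometheus's documented relabeling semantics: source label values are joined with ";", the regex must match the whole joined string, and a non-matching rule leaves the labels untouched. relabel_replace is a hypothetical helper written for illustration, not a Prometheus API, and the service address is made up.

```python
import re


def relabel_replace(labels, source_labels, regex, replacement, target_label, separator=";"):
    """Mimic one Prometheus `action: replace` relabel step (illustrative only)."""
    # Missing source labels are treated as empty strings.
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)  # the regex is anchored at both ends
    if match is None:
        return dict(labels)  # no match: labels stay as they were

    # Expand $1 / ${1} style references with the captured groups.
    def expand(ref):
        return match.group(int(ref.group(1))) or ""

    result = dict(labels)
    result[target_label] = re.sub(r"\$\{?(\d+)\}?", expand, replacement)
    return result


# Hypothetical target: the service listens on 8080, the annotation asks for 9404.
target = {
    "__address__": "my-svc.tests.svc:8080",
    "__meta_kubernetes_service_annotation_prometheus_io_port": "9404",
}
rewritten = relabel_replace(
    target,
    ["__address__", "__meta_kubernetes_service_annotation_prometheus_io_port"],
    r"([^:]+)(?::\d+)?;(\d+)",
    "$1:$2",
    "__address__",
)
print(rewritten["__address__"])
```

The joined value is "my-svc.tests.svc:8080;9404", so the first group captures the host, the optional ":8080" is discarded, and the address becomes my-svc.tests.svc:9404. When the annotation is absent the joined value ends in ";" with no digits after it, the regex does not match, and the original address is kept.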
Then re-apply the new settings to the kube-prometheus-stack release with "helm upgrade":
# will give you namespace, release name of kube-prometheus-stack
helm list -A | grep prometheus
# tailor this to your namespace and release name
helm upgrade \
--namespace prom \
-f prom-custom-values.yaml \
prom-stack prometheus-community/kube-prometheus-stack
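P.S. Another way to add extra scrape configurations:
Run curl -X POST "http://10.122.148.81:30090/-/reload" to check whether the newly added job has taken effect. An empty response means the reload succeeded; otherwise the response contains the error message.
Note: from observation and experiments, Prometheus's monitoring of the Kubernetes components is configured entirely through ServiceMonitor resources, and each scrape job is generated automatically from a ServiceMonitor.
ServiceMonitor API reference
For example, a custom ServiceMonitor: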
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus-stack
    meta.helm.sh/release-namespace: kube-monitor
  labels:
    app: jmx-exporter-0
    app.kubernetes.io/instance: prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 45.7.1
    chart: kube-prometheus-stack-45.7.1
    heritage: Helm
    release: prometheus-stack
  name: test-jmx-c
  namespace: kube-monitor
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    port: http-metrics
  jobLabel: job-l-caoyong
  namespaceSelector:
    matchNames:
    - tests
  selector:
    matchLabels:
      app: kube-prometheus-stack-jmx
      release: prometheus-stack
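To make the selection logic of this ServiceMonitor concrete, here is a small Python sketch of which Services it would pick up: the Service must live in one of the namespaces listed under namespaceSelector.matchNames and carry every label listed under selector.matchLabels. service_is_selected and the sample Services are hypothetical, written only to illustrate the matching rule; the real matching is done by Prometheus through the generated relabel rules.

```python
def service_is_selected(namespace, service_labels, match_names, match_labels):
    """Return True if a Service satisfies the namespace and label selectors."""
    if namespace not in match_names:
        return False
    # Every selector label must be present on the Service with the same value;
    # extra labels on the Service are ignored.
    return all(service_labels.get(key) == value for key, value in match_labels.items())


match_names = ["tests"]
match_labels = {"app": "kube-prometheus-stack-jmx", "release": "prometheus-stack"}

# Hypothetical Services for illustration.
selected = service_is_selected(
    "tests",
    {"app": "kube-prometheus-stack-jmx", "release": "prometheus-stack", "team": "a"},
    match_names,
    match_labels,
)
wrong_namespace = service_is_selected(
    "default",
    {"app": "kube-prometheus-stack-jmx", "release": "prometheus-stack"},
    match_names,
    match_labels,
)
missing_label = service_is_selected(
    "tests", {"app": "kube-prometheus-stack-jmx"}, match_names, match_labels
)
print(selected, wrong_namespace, missing_label)
```

Only the first Service is selected; the second is in the wrong namespace and the third lacks the release label.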
The scrape job generated from this ServiceMonitor looks like this:
- job_name: serviceMonitor/kube-monitor/test-jmx-c/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
    separator: ;
    regex: (kube-prometheus-stack-jmx);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
    separator: ;
    regex: (prometheus-stack);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_job_l_caoyong]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - tests
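The last two relabel rules in the generated job (hashmod into __tmp_hash, then keep only "0") look odd at first. As far as I can tell they are the operator's sharding hook: each target's address is hashed, reduced modulo the number of shards, and a Prometheus instance keeps only the targets whose result equals its own shard index. The Python sketch below assumes Prometheus computes hashmod as the last 8 bytes of the MD5 digest, read as a big-endian integer, modulo the configured modulus; the hashmod function and the target addresses are illustrative, not taken from the cluster.

```python
import hashlib


def hashmod(value: str, modulus: int) -> int:
    """Sketch of the `hashmod` relabel action: MD5 the value, take the last
    8 bytes as a big-endian unsigned integer, and reduce it modulo `modulus`."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % modulus


# Hypothetical target addresses.
addresses = ["10.0.0.11:9404", "10.0.0.12:9404", "10.0.0.13:9404"]

# With modulus 1 (a single shard) every target lands in bucket 0,
# so the following `keep` rule with regex "0" keeps all of them.
kept = [addr for addr in addresses if str(hashmod(addr, 1)) == "0"]
print(kept == addresses)
```

With modulus: 1, as in the generated job, the pair of rules is effectively a no-op; it only starts dropping targets once more than one shard is configured.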
Specifying storage
Introduction to relabel_configs
www.cnblogs.com/zhrx/p/1597…
blog.51cto.com/u_12227788/…
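# Relabeling
relabel_configs:
  # This block matches the source label values (joined with ";") against "(.*)some-[regex]"
  # and sets the job label to foo-${1}, where ${1} is the text captured by the first group
  - source_labels: [job, __meta_dns_name]
    regex: (.*)some-[regex]
    target_label: job
    replacement: foo-${1}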
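Custom configuration file (this setup did not work in my tests)
If you want to update the Prometheus configuration from a custom ConfigMap, see: https://help.aliyun.com/document_detail/94622.html
The approach is to mount the ConfigMap into the pod under /etc/prometheus/configmaps/. To do that, enter the name of your custom ConfigMap in the configmaps field of prometheus or alertmanager.
For example, suppose you define a ConfigMap named special-config that contains the Prometheus config file. If you want it passed as the --config.file argument when the Prometheus pod starts, add it to the configmaps field of prometheus, and it will be mounted into the pod under /etc/prometheus/configmaps/.
The YAML definition of special-config is as follows: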
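Grafana configuration
- Mounting dashboards externally
If you want to mount dashboard files into the Grafana pod as a ConfigMap, open the ack-prometheus-operator page, click one-click deploy, find extraConfigmapMounts on the parameter configuration wizard page, and configure the mount in its fields.
If you install from a locally unpacked chart and want the chart's own values.yaml to take effect, run:
helm install apisix ./ -f values.yaml -n apisix --create-namespace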
Alerting
zhuanlan.zhihu.com/p/321526960
Troubleshooting:
Problem 1
Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(Prometheus.spec): unknown field "hostNetwork" in com.coreos.monitoring.v1.Prometheus.spec
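This error is probably caused by old CRDs left over from a previous installation.
See: www.cnblogs.com/liruilong/p…
Deleting the old CRDs fixes it. Note that deleting a CRD also deletes every custom resource of that kind (ServiceMonitors, PrometheusRules, and so on), so back them up first if you still need them: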
kubectl delete crd alertmanagerconfigs.monitoring.coreos.com alertmanagers.monitoring.coreos.com podmonitors.monitoring.coreos.com probes.monitoring.coreos.com prometheuses.monitoring.coreos.com prometheusrules.monitoring.coreos.com servicemonitors.monitoring.coreos.com thanosrulers.monitoring.coreos.com
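Problem 2
The kube-proxy metrics endpoint listens on 127.0.0.1, so Prometheus cannot scrape it. Change metricsBindAddress to 0.0.0.0:10249 in the kube-proxy ConfigMap, then delete the kube-proxy pods so they restart with the new setting: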
$ kubectl edit cm/kube-proxy -n kube-system
...
kind: KubeProxyConfiguration
metricsBindAddress: 0.0.0.0:10249
...
$ kubectl delete pod -l k8s-app=kube-proxy -n kube-system