Prometheus (Part 1): Installation

This article describes how to deploy Prometheus on Kubernetes with the kube-prometheus-stack Helm chart, and how to use it afterwards.

Deploying with Helm

Reference documentation

  1. Add the repo

root@yong:~/tmp# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" already exists with the same configuration, skipping
[root@xy-5-server14 kubelet]# helm repo update prometheus-community
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
  1. List the charts in the repo:

[root@xy-5-server14 ~]# helm search repo prometheus-community
NAME                                                    CHART VERSION   APP VERSION     DESCRIPTION                                       
prometheus-community/alertmanager                       0.26.1          v0.25.0         The Alertmanager handles alerts sent by client ...
prometheus-community/alertmanager-snmp-notifier         0.1.0           v1.4.0          The SNMP Notifier handles alerts coming from Pr...
prometheus-community/jiralert                           1.2.0           v1.3.0          A Helm chart for Kubernetes to install jiralert   
prometheus-community/kube-prometheus-stack              45.7.1          v0.63.0         kube-prometheus-stack collects Kubernetes manif...
prometheus-community/kube-state-metrics                 5.0.1           2.8.2           Install kube-state-metrics to generate and expo...
prometheus-community/prom-label-proxy                   0.2.0           v0.6.0          A proxy that enforces a given label in a given ...
prometheus-community/prometheus                         19.7.2          v2.41.0         Prometheus is a monitoring system and time seri...
prometheus-community/prometheus-adapter                 4.1.1           v0.10.0         A Helm chart for k8s prometheus adapter           
prometheus-community/prometheus-blackbox-exporter       7.6.1           0.23.0          Prometheus Blackbox Exporter                      
prometheus-community/prometheus-cloudwatch-expo...      0.24.0          0.15.1          A Helm chart for prometheus cloudwatch-exporter   
prometheus-community/prometheus-conntrack-stats...      0.5.5           v0.4.11         A Helm chart for conntrack-stats-exporter         
prometheus-community/prometheus-consul-exporter         1.0.0           0.4.0           A Helm chart for the Prometheus Consul Exporter   
prometheus-community/prometheus-couchdb-exporter        0.2.1           1.0             A Helm chart to export the metrics from couchdb...
prometheus-community/prometheus-druid-exporter          1.0.0           v0.11.0         Druid exporter to monitor druid metrics with Pr...
prometheus-community/prometheus-elasticsearch-e...      5.0.0           1.5.0           Elasticsearch stats exporter for Prometheus       
prometheus-community/prometheus-fastly-exporter         0.1.1           7.2.4           A Helm chart for the Prometheus Fastly Exporter   
prometheus-community/prometheus-json-exporter           0.6.1           v0.5.0          Install prometheus-json-exporter                  
prometheus-community/prometheus-kafka-exporter          1.8.0           v1.6.0          A Helm chart to export the metrics from Kafka i...
prometheus-community/prometheus-mongodb-exporter        3.1.2           0.31.0          A Prometheus exporter for MongoDB metrics         
prometheus-community/prometheus-mysql-exporter          1.13.0          v0.14.0         A Helm chart for prometheus mysql exporter with...
prometheus-community/prometheus-nats-exporter           2.11.0          0.10.1          A Helm chart for prometheus-nats-exporter         
prometheus-community/prometheus-nginx-exporter          0.1.0           0.11.0          A Helm chart for the Prometheus NGINX Exporter    
prometheus-community/prometheus-node-exporter           4.14.0          1.5.0           A Helm chart for prometheus node-exporter         
prometheus-community/prometheus-operator                9.3.2           0.38.1          DEPRECATED - This chart will be renamed. See ht...
prometheus-community/prometheus-operator-crds           2.0.0           0.63.0          A Helm chart that collects custom resource defi...
prometheus-community/prometheus-pgbouncer-exporter      0.1.0           1.18.0          A Helm chart for prometheus pgbouncer-exporter    
prometheus-community/prometheus-pingdom-exporter        2.4.1           20190610-1      A Helm chart for Prometheus Pingdom Exporter      
prometheus-community/prometheus-postgres-exporter       4.4.0           0.11.1          A Helm chart for prometheus postgres-exporter     
prometheus-community/prometheus-pushgateway             2.1.3           v1.5.1          A Helm chart for prometheus pushgateway           
prometheus-community/prometheus-rabbitmq-exporter       1.4.0           v0.29.0         Rabbitmq metrics exporter for prometheus          
prometheus-community/prometheus-redis-exporter          5.3.0           v1.44.0         Prometheus exporter for Redis metrics             
prometheus-community/prometheus-smartctl-exporter       0.3.1           v0.8.0          A Helm chart for Kubernetes                       
prometheus-community/prometheus-snmp-exporter           1.4.0           v0.21.0         Prometheus SNMP Exporter                          
prometheus-community/prometheus-stackdriver-exp...      4.2.0           0.13.0          Stackdriver exporter for Prometheus               
prometheus-community/prometheus-statsd-exporter         0.7.0           v0.22.8         A Helm chart for prometheus stats-exporter        
prometheus-community/prometheus-to-sd                   0.4.2           0.5.2           Scrape metrics stored in prometheus format and ...
  1. Install

root@yong:~/tmp# helm pull prometheus-community/kube-prometheus-stack --version 45.7.1
root@yong:/tmp# ls
 kube-prometheus-stack-45.7.1.tgz
root@yong:/tmp# tar -xvf kube-prometheus-stack-45.7.1.tgz
root@yong:/tmp# cd kube-prometheus-stack/
root@yong:/tmp/kube-prometheus-stack# ls
Chart.lock  charts  Chart.yaml  CONTRIBUTING.md  crds  README.md  templates  values.yaml
root@yong:/tmp# helm show values prometheus-community/kube-prometheus-stack --version 45.7.1 > values.yaml-default

[root@xy-5-server14 kube-prometheus-stack]# helm upgrade --install  prometheus-stack  . \
>     -f values.yaml \
>     -n kube-monitor \
>     --create-namespace     \
>     --version 45.7.1  --debug

Release "prometheus-stack" has been upgraded. Happy Helming!
NAME: prometheus-stack
LAST DEPLOYED: Thu Mar 23 17:53:48 2023
NAMESPACE: kube-monitor
STATUS: deployed
REVISION: 4
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace kube-monitor get pods -l "release=prometheus-stack"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
  1. Uninstall

root@yong:/tmp# helm -n kube-monitor uninstall prometheus-stack
# Grafana password
[root@xy-5-server14 prometheus]# kubectl  -n kube-monitor get secrets |grep Opaque|grep grafana|awk '{print $1}'|xargs kubectl -n kube-monitor get secrets -o yaml|grep admin-password|awk '{print $2}'|base64 -d && echo ''
prom-operator
[root@xy-5-server14 prometheus]# 
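
The pipeline above can be shortened by reading the field straight from the Secret. The following is a minimal sketch, not the chart's official procedure: it assumes the Grafana Secret is named "<release>-grafana" (so prometheus-stack-grafana for the release installed above) and that the password is stored under the admin-password key. Both function names are made up for illustration.

```shell
# Decode a single base64-encoded value copied from a Secret manifest.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Print the Grafana admin password for a release.
# $1 = namespace, $2 = Helm release name.
# Assumption: the Secret is called "<release>-grafana".
grafana_admin_password() {
  kubectl -n "$1" get secret "$2-grafana" \
    -o jsonpath='{.data.admin-password}' | base64 -d
  echo
}
```

With the release from this article the call would be grafana_admin_password kube-monitor prometheus-stack, which should print the same value as the longer pipeline.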

Adding scrape configuration

See fabianlee.org/2022/07/08/…

Add the following to the values file:

# prom-custom-values.yaml
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: kubernetes-service-endpoints
        kubernetes_sd_configs:
        - role: service
        relabel_configs:

        # annotation 'prometheus.io/scrape' must be set to 'true'
        - action: keep
          regex: true
          source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]

        # service cannot be in kube-system or prom namespaces
        - action: drop
          regex: (kube-system|prom)
          source_labels: [__meta_kubernetes_namespace]

        # service port name must end with word 'metrics'
        - action: keep
          regex: .*metrics
          source_labels: [__meta_kubernetes_service_port_name]

        # allow override of http scheme
        - action: replace
          regex: (https?)
          source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          target_label: __scheme__

        # allow override of default /metrics path
        - action: replace
          regex: (.+)
          source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          target_label: __metrics_path__

        # allow override of default port
        - action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          target_label: __address__
        - {action: labelmap, regex: __meta_kubernetes_service_label_(.+)}
        - action: replace
          source_labels: [__meta_kubernetes_namespace]
          target_label: kubernetes_namespace
        - action: replace
          source_labels: [__meta_kubernetes_service_name]
          target_label: kubernetes_name 
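
For a Service to be picked up by the job above it has to carry the prometheus.io/scrape annotation, and its port name has to end in metrics. Here is a small sketch of adding the annotations with kubectl; the namespace, Service name and port passed in are hypothetical examples, and the function name is invented for this article.

```shell
# Annotate a Service so that the kubernetes-service-endpoints job keeps it.
# $1 = namespace, $2 = Service name, $3 = port that serves the metrics.
# The Service port name must still end in "metrics" to pass the keep rule.
annotate_for_scrape() {
  kubectl -n "$1" annotate service "$2" --overwrite \
    "prometheus.io/scrape=true" \
    "prometheus.io/port=$3"
}
```

For example, annotate_for_scrape demo my-app 8080 would mark a Service named my-app in a namespace named demo. Services in the namespaces matched by the drop rule are ignored regardless of their annotations.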

Then run "helm upgrade" to apply the new settings to the kube-prometheus-stack release.

# will give you namespace, release name of kube-prometheus-stack
helm list -A | grep prometheus

# tailor this to your namespace and release name
helm upgrade \
  --namespace prom \
  -f prom-custom-values.yaml \
  prom-stack prometheus-community/kube-prometheus-stack

PS: another way to add extra scrape jobs

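Run curl -X POST "http://10.122.148.81:30090/-/reload" to check whether the newly added job is accepted: an empty response means the reload succeeded, otherwise the response body contains the error message.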
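
A successful reload only shows that the configuration parses. To see whether the new job actually produced targets, the targets API can be queried as well. The sketch below is a rough helper rather than a robust parser: it assumes the API returns compact JSON (no spaces around the colon) and pulls out the job names with grep; jq would be the sturdier choice where it is available. The function names are made up here.

```shell
# Reduce the JSON from /api/v1/targets (read on stdin) to unique job names.
# Assumption: compact JSON such as "job":"name"; this is a fragile text match.
extract_jobs() {
  grep -o '"job":"[^"]*"' | sed -E 's/^"job":"(.*)"$/\1/' | sort -u
}

# List the job names of all active targets.
# $1 = Prometheus base URL, e.g. http://10.122.148.81:30090
list_active_jobs() {
  curl -sS "$1/api/v1/targets?state=active" | extract_jobs
}
```

If kubernetes-service-endpoints shows up in the output of list_active_jobs, at least one Service matched the relabel rules.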

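Note: from observation and experiments, kube-prometheus-stack monitors the Kubernetes components through ServiceMonitor resources, and each scrape job is generated automatically from a ServiceMonitor. See the ServiceMonitor API reference.
For example, a custom ServiceMonitor: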

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus-stack
    meta.helm.sh/release-namespace: kube-monitor
  labels:
    app: jmx-exporter-0
    app.kubernetes.io/instance: prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 45.7.1
    chart: kube-prometheus-stack-45.7.1
    heritage: Helm
    release: prometheus-stack
  name: test-jmx-c
  namespace: kube-monitor
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    port: http-metrics
  jobLabel: job-l-caoyong
  namespaceSelector:
    matchNames:
    - tests
  selector:
    matchLabels:
      app: kube-prometheus-stack-jmx
      release: prometheus-stack

The job generated from it looks like this:

- job_name: serviceMonitor/kube-monitor/test-jmx-c/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  follow_redirects: true
  enable_http2: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_app, __meta_kubernetes_service_labelpresent_app]
    separator: ;
    regex: (kube-prometheus-stack-jmx);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release, __meta_kubernetes_service_labelpresent_release]
    separator: ;
    regex: (prometheus-stack);true
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_phase]
    separator: ;
    regex: (Failed|Succeeded)
    replacement: $1
    action: drop
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_job_l_caoyong]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    enable_http2: true
    namespaces:
      own_namespace: false
      names:
      - tests

Specifying storage

Introduction to relabel_configs

www.cnblogs.com/zhrx/p/1597…
blog.51cto.com/u_12227788/…

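  # Relabeling
  relabel_configs:
  # The values of the source labels are joined and matched against "(.*)some-[regex]"; on a match the job label is set to foo-${1}, where ${1} is the content of the first capture group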
  - source_labels: [job, __meta_dns_name]
    regex:         (.*)some-[regex]
    target_label:  job
    replacement:   foo-${1}
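
To make the rule above concrete, here is a small emulation with sed. It is only a sketch of the semantics, not how Prometheus implements relabeling: the source label values are joined with ";" (the default separator), the regex has to match the whole joined string, and when it does not match the target label keeps its old value. The label values used in the example call are hypothetical.

```shell
# Emulate the relabel rule: join job and __meta_dns_name with ";",
# match the fully anchored regex, and build the new job from group 1.
# $1 = current job label, $2 = value of __meta_dns_name.
relabel_job() {
  job="$1"
  dns_name="$2"
  new_job=$(printf '%s;%s\n' "$job" "$dns_name" |
    sed -nE 's/^(.*)some-[regex]$/foo-\1/p')
  if [ -n "$new_job" ]; then
    printf '%s\n' "$new_job"
  else
    # No match: the job label is left unchanged.
    printf '%s\n' "$job"
  fi
}
```

Calling relabel_job api db.some-r joins the values to api;db.some-r, which matches, so the new job becomes foo-api;db. (the separator and the text before some- are both part of the capture group). Note that [regex] is a character class that matches a single one of the letters r, e, g or x.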

Custom configuration file (this configuration did not work in my tests)

To update the Prometheus configuration from a custom ConfigMap, see: https://help.aliyun.com/document_detail/94622.html

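The approach is to mount the ConfigMap into the pod under /etc/prometheus/configmaps/. Both prometheus and alertmanager have a configmaps field for this:

image.png

Enter the name of your custom ConfigMap in that field:

image.png

For example, suppose you define a ConfigMap named special-config that contains the Prometheus config file, and you want it passed as the --config.file argument when the Prometheus pod starts. Adding it to the configmaps field of prometheus mounts it into the pod under /etc/prometheus/configmaps/. The YAML definition of special-config is:

image.png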

Grafana configuration

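  • Mounting dashboards from outside
    To mount dashboard files into the Grafana pod as a ConfigMap, open the ack-prometheus-operator page, click one-click deploy, find extraConfigmapMounts on the parameter configuration wizard page, and configure the mount in the fields shown in the figure below.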

image.png

To install from a local chart directory so that the values.yaml inside the chart takes effect, run the install from that directory, for example:
helm install apisix ./ -f values.yaml -n apisix --create-namespace

Alerting

zhuanlan.zhihu.com/p/321526960

Troubleshooting:

Problem 1

Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(Prometheus.spec): unknown field "hostNetwork" in com.coreos.monitoring.v1.Prometheus.spec
This error is probably caused by old CRDs from a previous installation that were never removed.
See: www.cnblogs.com/liruilong/p…
Deleting the old CRDs fixes it:

kubectl delete crd alertmanagerconfigs.monitoring.coreos.com alertmanagers.monitoring.coreos.com podmonitors.monitoring.coreos.com probes.monitoring.coreos.com prometheuses.monitoring.coreos.com prometheusrules.monitoring.coreos.com servicemonitors.monitoring.coreos.com thanosrulers.monitoring.coreos.com
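
Deleting a CRD also deletes every custom resource of that kind, so it is worth checking what is installed first. The sketch below lists the Prometheus Operator CRDs and, as a possible alternative that keeps existing resources, applies the CRDs shipped in the crds directory of the extracted chart. Helm does not upgrade CRDs from that directory by itself; whether a server-side apply is enough in a particular cluster is an assumption to verify, and both function names are invented for this article.

```shell
# List the CRDs that belong to the Prometheus Operator API group.
list_operator_crds() {
  kubectl get crd -o name | grep 'monitoring\.coreos\.com$'
}

# Alternative to deleting: update the CRDs from the extracted chart.
# $1 = path to the chart directory, e.g. /tmp/kube-prometheus-stack
apply_chart_crds() {
  kubectl apply --server-side --force-conflicts -f "$1/crds/"
}
```

After either approach, re-running the helm upgrade --install command from the install step should no longer hit the validation error if stale CRDs were the cause.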

Problem 2

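kube-proxy serves its metrics on 127.0.0.1 only, so Prometheus cannot reach them. Set metricsBindAddress to 0.0.0.0:10249 in the kube-proxy ConfigMap, then restart the kube-proxy pods: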

$ kubectl edit cm/kube-proxy -n kube-system

...
kind: KubeProxyConfiguration
metricsBindAddress: 0.0.0.0:10249
...

$ kubectl delete pod -l k8s-app=kube-proxy -n kube-system

Reference: stackoverflow.com/questions/6…