Background
Our data processing platform uses Flink on Kubernetes for data ETL: it pulls data from the big-data ODS (Operational Data Store) layer, such as HBase, Hive, and relational databases, and stores it in the DMP platform, where data scientists, data engineers, and machine learning engineers use it for model training, data testing, and other data applications. This addresses problems such as scattered data storage, duplicated features, complex extraction, and difficulty of use.
Monitoring the Kubernetes cluster and alerting on cluster and job anomalies is table stakes. Since the services run on Kubernetes, Prometheus, the second project to graduate from the CNCF, is the natural choice for Kubernetes monitoring and alerting.
This article covers several topics: it first introduces Prometheus, then describes how the project uses Prometheus to monitor and alert on the Kubernetes cluster, Flink job execution, and ElasticSearch.
Introduction to Prometheus
Prometheus overview
Prometheus is an open-source monitoring tool written in Go, originally developed at SoundCloud and inspired by Google's Borgmon. It joined the Cloud Native Computing Foundation (CNCF) in 2016 as the second hosted project after Kubernetes. As a new-generation monitoring framework, Prometheus has the following characteristics:
- A multi-dimensional data model: time series identified by a metric name and key/value label pairs
- Time series collected over HTTP using a pull model
- PromQL: a flexible query language that exploits the multi-dimensional data for complex queries
- No dependence on distributed storage; a single server node works on its own and can ingest on the order of millions of samples per second while scraping thousands of targets
- Targets discovered through service discovery or static configuration
- A rich ecosystem (client libraries in many languages, a wide range of exporters) and an active open-source community with roughly 36k GitHub stars
Aside: time-series database trends
Prometheus can be roughly divided into two parts: a monitoring and alerting system, and a built-in time-series database (TSDB). As the TSDB trend chart above shows, Prometheus can also be used as a time-series database in its own right.
Prometheus architecture
Its components are as follows:
- Prometheus Server: collects and stores time-series data and is the core of Prometheus. The Retrieval module periodically pulls metrics from monitored targets, the Storage module (the TSDB) persists the data, and PromQL parses queries into a syntax tree and reads the monitoring data back from Storage. Besides the built-in web UI, the data can also be queried through Grafana or the HTTP API.
- Push Gateway: mainly used for short-lived jobs. Because such jobs may disappear before Prometheus gets a chance to pull from them, they push their metrics to the Push Gateway, from which Prometheus later scrapes them. This is intended for service-level metrics; for machine-level metrics, use the node exporter.
- Exporters: expose metrics from existing third-party services to Prometheus.
- Alertmanager: after receiving alerts from the Prometheus server, it deduplicates and groups them, routes them to the matching receiver, and sends out notifications. Common receivers include email, WeCom (WeChat Work), and webhooks.
- At the top of the diagram, Prometheus provides service discovery, which automatically detects monitoring targets so they can be picked up dynamically.
Metric data and types
Metric format
Looking at the raw scraped data, as illustrated below, the timestamp is the timestamp of the scrape:
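As an illustration, a scrape payload in the Prometheus text exposition format might look like the following sketch; the metric name, labels, values, and millisecond timestamps are all made up for this example:
# HELP http_requests_total Total number of HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="GET",code="200"} 1027 1623398400000
http_requests_total{method="POST",code="500"} 3 1623398400000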
Each metric name represents a class of metrics and can carry different labels; each combination of metric name and label set identifies one time series, and each sample in that series consists of three parts:
- Metric: the metric name plus the label set describing the sample;
- Timestamp: a millisecond-precision timestamp;
- Value: a float64 value representing the sample.
Metric types
- Counter: a cumulative count, such as total duration or total number of occurrences. Its defining property is that it only ever increases, for example the total number of HTTP requests;
- Gauge: reflects the current state of the system; its value can go up or down, for example current memory usage, which fluctuates over time.
Besides Counter and Gauge, Prometheus also defines the Histogram and Summary metric types, which are mainly used to analyze the distribution of samples.
- Histogram: samples observations over a time window (typically request durations or response sizes) and counts them in configurable buckets. Samples can later be filtered by bucket boundary or aggregated into totals, and the result is usually displayed as a histogram;
- Summary: similar to Histogram, it also samples observations over time, but it stores the quantiles directly (computed on the client and exposed as-is) rather than deriving them from buckets.
Because Summary quantiles are computed on the client, Summary performs better when quantiles are queried with PromQL, while Histogram consumes more resources at query time; conversely, on the client side Histogram is cheaper.
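To make the difference concrete, here is a minimal sketch written as a Prometheus recording-rule group; the group name, rule name, and metric names are assumptions for illustration, not taken from the platform's configuration:
groups:
  - name: latency-quantiles-example               # hypothetical group name
    rules:
      # Histogram: the 95th percentile is derived at query time from the
      # *_bucket series, which costs CPU on the Prometheus server.
      - record: job:http_request_duration_seconds:p95
        expr: histogram_quantile(0.95, sum by (job, le) (rate(http_request_duration_seconds_bucket[5m])))
      # Summary: the client has already computed the quantile, so reading it is a
      # plain selector such as http_request_duration_seconds{quantile="0.95"}.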
Beyond the above, there are further topics such as:
- PromQL: Prometheus's own query language
- Alerting: Alertmanager configuration and receivers for email, SMS, webhooks, and so on
- Exporters, the cAdvisor container exporter, and scrape target configuration
- Visualization: the basic Prometheus UI and external Grafana dashboards
- Prometheus service discovery
- Clustering and high availability: federation, Alertmanager HA, storage HA
These are covered in other material and in the references of this article, so they are not detailed here.
Prometheus limitations
- It is metric-based monitoring and is not suited to logs, events, or tracing. Prometheus targets performance and availability monitoring only; it cannot solve every monitoring problem. Log monitoring still requires log collection (e.g. Fluentd) combined with log storage (e.g. ElasticSearch). In practice, our feature platform uses EFK (ElasticSearch, Fluentd, Kibana) to collect Kubernetes container logs; that will be covered in a separate article.
- Prometheus defaults to a pull model and depends on the scrape interval, so it is not strongly real-time. Once the number of monitored targets grows, plan your network carefully and avoid unnecessary forwarding.
- Data retention: usually only recent monitoring data needs to be queried. Prometheus local storage is designed to hold short-term data (about a month), not large volumes of history. If historical data is needed for reporting, use Prometheus remote storage such as OpenTSDB or M3DB (a remote_write sketch follows below).
For more detail on these issues, see the references.
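For the retention limitation, the usual mechanism is remote write. A minimal sketch of a prometheus.yml fragment is shown below; the endpoint URL is a placeholder for whichever remote backend (an OpenTSDB adapter, M3DB, etc.) is actually deployed:
# prometheus.yml fragment (sketch): forward samples to long-term storage
remote_write:
  - url: "http://remote-storage.example.com/api/v1/write"   # placeholder endpoint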
Flink Monitoring with Prometheus in Practice
In the data processing platform, Flink, ElasticSearch, log collection, and the other components all run on Kubernetes, and monitoring and alerting are built on Prometheus. The overall Kubernetes monitoring and alerting architecture is shown below:
Prometheus collects host metrics (Node Exporter), container metrics (cAdvisor), Kubernetes cluster state (kube-state-metrics), and, via a separately deployed ElasticSearch Exporter, ElasticSearch metrics. Prometheus is configured with alerting rules on these metrics and forwards firing alerts to Alertmanager, where a Feishu webhook is configured so that alerts are posted by a Feishu bot to the ops group. The Prometheus alert rules cover, for example, failed jobs, ElasticSearch nodes going down, CPU and memory usage thresholds, and unhealthy compute nodes.
Hands-on: Prometheus Monitoring in the Data Processing Platform
Deployment notes
We deploy Prometheus the conventional way, i.e. with Kubernetes YAML manifests, installing, in order, Prometheus, kube-state-metrics, node-exporter, Alertmanager, and Grafana.
The call relationships between the deployed components are shown in the figure below:
Prometheus, Alertmanager, Grafana, kube-state-metrics (which reports Kubernetes cluster state), and elastic-exporter (which monitors our self-managed ElasticSearch cluster) are deployed as Kubernetes Deployments. Node exporter is deployed as a Kubernetes DaemonSet with a taint toleration so that it also runs on master nodes and reports metrics for each physical or virtual machine. All of this data converges on Prometheus for processing and storage and is then visualized with Grafana.
Deploying the monitoring stack on Kubernetes
First create the namespace Prometheus lives in, then create the RBAC rules Prometheus uses, create a ConfigMap to hold its configuration file, create a Service for a stable cluster IP, and create a Deployment that runs the Prometheus pod.
- Create the prometheus namespace; all subsequent objects go into it
Create the namespace with a file named ns-promethes.yaml:
---
apiVersion: v1
kind: Namespace
metadata:
name: prometheus
Run: kubectl apply -f ns-promethes.yaml
- Create the RBAC rules, consisting of ServiceAccount, ClusterRole, and ClusterRoleBinding manifests, which grant the prometheus ServiceAccount the permissions it needs to access the Kubernetes API server
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: prometheus
rules:
- apiGroups: [""]
resources:
- nodes
- nodes/proxy
- services
- endpoints
- pods
verbs: ["get", "list", "watch"]
- apiGroups:
- extensions
resources:
- ingresses
verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: prometheus
namespace: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: prometheus
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus
subjects:
- kind: ServiceAccount
name: prometheus
namespace: prometheus
- Create the Prometheus application configuration as a ConfigMap
The configuration file specifies the scrape interval, the Alertmanager endpoint, the location of the alert rule file, and every target to scrape. Note that most Kubernetes metrics are collected through Kubernetes service discovery; an example Service that opts in via annotations is sketched right after this ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: prometheus
data:
prometheus.yml: |
global:
scrape_interval: 15s #抓取周期15s
evaluation_interval: 15s
alerting:
alertmanagers:
- static_configs:
- targets: ["alertmanager-svc:9093"] #告警svc配置
rule_files:
- "/etc/prometheus/rules/rule.yml" #告警规则文件目录
scrape_configs:
- job_name: 'kubernetes-apiservers'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https
- job_name: 'kubernetes-nodes'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics
- job_name: 'kubernetes-cadvisor'
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
- job_name: 'kubernetes-service-endpoints'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name
- job_name: 'kubernetes-services'
kubernetes_sd_configs:
- role: service
metrics_path: /probe
params:
module: [http_2xx]
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter.example.com:9115
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
target_label: kubernetes_name
- job_name: 'kubernetes-ingresses'
kubernetes_sd_configs:
- role: ingress
relabel_configs:
- source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
action: keep
regex: true
- source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
regex: (.+);(.+);(.+)
replacement: ${1}://${2}${3}
target_label: __param_target
- target_label: __address__
replacement: blackbox-exporter.example.com:9115
- source_labels: [__param_target]
target_label: instance
- action: labelmap
regex: __meta_kubernetes_ingress_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_ingress_name]
target_label: kubernetes_name
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
target_label: __address__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: kubernetes_pod_name
- job_name: 'node-exporter'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_endpoints_name]
regex: 'node-exporter'
action: keep
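The kubernetes-service-endpoints and kubernetes-pods jobs above only keep targets that opt in through annotations. A sketch of a Service that this configuration would scrape looks like the following; the Service name, namespace, and port are assumptions for illustration:
apiVersion: v1
kind: Service
metadata:
  name: my-app                        # hypothetical workload
  namespace: default
  annotations:
    prometheus.io/scrape: "true"      # matched by the keep rule on ..._prometheus_io_scrape
    prometheus.io/port: "8080"        # rewrites __address__ to this port
    prometheus.io/path: "/metrics"    # rewrites __metrics_path__
spec:
  selector:
    app: my-app
  ports:
    - port: 8080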
- Create the Prometheus alert rules as a ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-server-rule-config
namespace: prometheus
data:
rule.yml: |
groups:
- name: kubernetes
rules:
- alert: PodDown
expr: kube_pod_status_phase{phase="Unknown"} == 1 or kube_pod_status_phase{phase="Failed"} == 1
for: 1m
labels:
severity: error
annotations:
summary: Pod Down
description: " pod: {{ $labels.pod }} namespace :{{ $labels.namespace }}"
- alert: PodRestart
expr: changes(kube_pod_container_status_restarts_total{pod !~ "analyzer.*"}[10m]) > 0
for: 1m
labels:
severity: error
annotations:
summary: Pod Restart
description: "pod:{{ $labels.pod }} namespace : {{ $labels.namespace }} restart"
- alert: NodeUnschedulable
expr: kube_node_spec_unschedulable == 1
for: 5m
labels:
severity: error
annotations:
summary: Node Unschedulable
description: "node: {{ $labels.node }} Unschedulable "
- alert: NodeStatusError
expr: kube_node_status_condition{condition="Ready", status!="true"} == 1
for: 5m
labels:
severity: error
annotations:
summary: Node Status Error
description: "node: {{ $labels.node }} Status Error "
- alert: DaemonsetUnavailable
expr: kube_daemonset_status_number_unavailable > 0
for: 5m
labels:
severity: error
annotations:
summary: "Daemonset Unavailable"
description: "Daemonset {{ $labels.daemonset }} with namespace {{ $labels.namespace }} Unavailable"
- alert: JobFailed
expr: kube_job_status_failed == 1
for: 5m
labels:
severity: error
annotations:
summary: "Job Failed"
description: "实例job : {{ $labels.job_name }} namespace :{{ $labels.namespace }} 运行失败"
- name: elasticsearch
rules:
- record: elasticsearch_filesystem_data_used_percent
expr: 100 * (elasticsearch_filesystem_data_size_bytes - elasticsearch_filesystem_data_free_bytes)
/ elasticsearch_filesystem_data_size_bytes
- record: elasticsearch_filesystem_data_free_percent
expr: 100 - elasticsearch_filesystem_data_used_percent
- alert: ElasticsearchTooFewNodesRunning
expr: elasticsearch_cluster_health_number_of_nodes < 3
for: 5m
labels:
severity: critical
annotations:
description: "There are only {{$value}} < 3 ElasticSearch nodes running"
summary: ElasticSearch running on less than 3 nodes
- alert: ElasticsearchHeapTooHigh
expr: elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"}
> 0.9
for: 15m
labels:
severity: critical
annotations:
description: The heap usage is over 90% for 15m
summary: ElasticSearch node {{$labels.node}} heap usage is high
- alert: 机器宕机
expr: up{component="node-exporter"} != 1
for: 1m
labels:
severity: "warning"
instance: "{{ $labels.instance }}"
annotations:
summary: "机器 {{ $labels.instance }} 处于down的状态"
description: "{{ $labels.instance }} of job {{ $labels.job }} 已经处于down状态超过1分钟,请及时处理"
- alert: cpu 剩余量过低
expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
for: 1m
labels:
severity: "warning"
instance: "{{ $labels.instance }}"
annotations:
summary: "机器 {{ $labels.instance }} cpu 已用超过设定值"
          description: "{{ $labels.instance }} CPU 用量已超过 85% (current value is: {{ $value }}),请及时处理。"
The alert rules here are deliberately simple: pod down, pod restart, node unschedulable or unhealthy, DaemonSet unavailable, ElasticSearch health, and so on. More rules can be added as needed, for example the memory-usage sketch below.
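As an example of such an addition, here is a sketch of a node memory alert that could be appended to the rule file; the threshold and severity are illustrative and should be tuned to the environment:
  - alert: NodeMemoryUsageHigh
    expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 85
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Node memory usage high"
      description: "{{ $labels.instance }} memory usage is above 85% (current value: {{ $value }})"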
- Create the Prometheus instance as a Deployment
A few settings deserve attention: the data volume can be mounted as a hostPath so data is not lost if the pod dies (the manifest below uses an emptyDir; a hostPath alternative is sketched after it), and the alert rules and Prometheus configuration are mounted from the ConfigMaps created above.
apiVersion: apps/v1
kind: Deployment
metadata:
name: prometheus-dep
namespace: prometheus
spec:
replicas: 1
selector:
matchLabels:
app: prometheus-dep
template:
metadata:
labels:
app: prometheus-dep
spec:
containers:
- image: prom/prometheus
name: prometheus
command:
- "/bin/prometheus"
args:
- "--config.file=/etc/prometheus/prometheus.yml"
- "--storage.tsdb.path=/prometheus"
- "--storage.tsdb.retention=3d"
        # - "--web.external-url=http://192.168.106.41:30090/"  # set an externally reachable URL here if alerts should link back to the Prometheus UI
ports:
- containerPort: 9090
protocol: TCP
volumeMounts:
- mountPath: "/prometheus"
name: data
- mountPath: "/etc/prometheus"
name: config-volume
- mountPath: "/etc/prometheus/rules"
name: rule-config-volume
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 500m
memory: 2500Mi
serviceAccountName: prometheus
volumes:
- name: data
emptyDir: {}
- name: config-volume
configMap:
name: prometheus-config
- name: rule-config-volume
configMap:
name: prometheus-server-rule-config
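If the hostPath option mentioned above is preferred over the emptyDir used in this manifest, the data volume could be swapped roughly as sketched below; the path on the node is an assumption, and pinning the pod to that node (or using a PersistentVolumeClaim) would then be advisable:
      volumes:
      - name: data
        hostPath:
          path: /data/prometheus      # illustrative path on the node
          type: DirectoryOrCreate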
- Deploy the Prometheus Service as type NodePort, exposing external port 30090
This maps the in-cluster Prometheus IP and port onto the cluster nodes, so it can be reached from outside through any node IP plus the node port.
kind: Service
apiVersion: v1
metadata:
name: prometheus-svc
namespace: prometheus
spec:
type: NodePort
ports:
- port: 9090
targetPort: 9090
nodePort: 30090
selector:
app: prometheus-dep
- Deploy Alertmanager as a Kubernetes Deployment
This has three parts: the Alertmanager configuration handled as a ConfigMap, a Service for external access, and the Deployment of Alertmanager itself.
### alertmanager-config
---
apiVersion: v1
kind: ConfigMap
metadata:
name: alertmanager-config
namespace: prometheus
data:
config.yml: |
global:
resolve_timeout: 5m
route:
receiver: feishuhok
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
group_by: ['alertname', 'k8scluster', 'node', 'container', 'exported_job', 'daemonset']
routes:
- receiver: feishuhok
group_wait: 10s
match:
severity: error
receivers:
- name: feishuhok
webhook_configs:
- url: 'https://www.feishu.cn/flow/api/trigger-webhook/8e0f266df11-----'
send_resolved: true
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: alertmanager-dep
namespace: prometheus
spec:
replicas: 1
selector:
matchLabels:
app: alertmanager-dep
template:
metadata:
labels:
app: alertmanager-dep
spec:
containers:
- image: prom/alertmanager
name: alertmanager
args:
- "--config.file=/etc/alertmanager/config.yml"
- "--storage.path=/alertmanager"
- "--data.retention=72h"
volumeMounts:
- mountPath: "/alertmanager"
name: data
- mountPath: "/etc/alertmanager"
name: config-volume
resources:
requests:
cpu: 100m
memory: 100Mi
limits:
cpu: 500m
memory: 2500Mi
volumes:
- name: data
emptyDir: {}
- name: config-volume
configMap:
name: alertmanager-config
---
kind: Service
apiVersion: v1
metadata:
name: alertmanager-svc
namespace: prometheus
spec:
type: NodePort
ports:
- name: http
port: 9093
nodePort: 31090
selector:
app: alertmanager-dep
- Deploy Grafana
Grafana is used to visualize the Prometheus monitoring data.
It has two parts: a Service for external access and the Grafana Deployment. A data-source provisioning sketch follows the manifests.
### grafana deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: grafana-core
namespace: prometheus
labels:
app: grafana
component: core
spec:
replicas: 1
selector:
matchLabels:
app: grafana
component: core
template:
metadata:
labels:
app: grafana
component: core
spec:
securityContext:
runAsUser: 472
fsGroup: 472
containers:
- image: grafana/grafana
name: grafana-core
resources:
limits:
cpu: 100m
memory: 100Mi
requests:
cpu: 100m
memory: 100Mi
readinessProbe:
httpGet:
path: /login
port: 3000
volumeMounts:
- name: grafana-persistent-storage
mountPath: /var/lib/grafana
serviceAccountName: prometheus
volumes:
- name: grafana-persistent-storage
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
name: grafana
namespace: prometheus
labels:
app: grafana
component: core
spec:
type: NodePort
ports:
- port: 3000
nodePort: 31000
selector:
app: grafana
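To point Grafana at Prometheus without clicking through the UI, a data-source provisioning file could additionally be mounted into the container (for example under /etc/grafana/provisioning/datasources/). The sketch below is an assumption, not part of the deployment above; it reuses the prometheus-svc Service defined earlier:
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-svc.prometheus.svc:9090   # Service name and namespace from this article
    isDefault: true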
- Collect Kubernetes cluster state (kube-state-metrics)
This consists of RBAC rules, a Deployment, and a Service, and is used to collect the state of cluster objects.
## kube-state-metrics: cluster state collection
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: kube-state-metrics
namespace: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
namespace: prometheus
name: kube-state-metrics-resizer
rules:
- apiGroups: [""]
resources:
- pods
verbs: ["get"]
- apiGroups: ["extensions"]
resources:
- deployments
resourceNames: ["kube-state-metrics"]
verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: kube-state-metrics
rules:
- apiGroups: [""]
resources:
- configmaps
- secrets
- nodes
- pods
- services
- resourcequotas
- replicationcontrollers
- limitranges
- persistentvolumeclaims
- persistentvolumes
- namespaces
- endpoints
verbs: ["list", "watch"]
- apiGroups: ["extensions"]
resources:
- daemonsets
- deployments
- replicasets
verbs: ["list", "watch"]
- apiGroups: ["apps"]
resources:
- statefulsets
- daemonsets
- deployments
- replicasets
verbs: ["list", "watch"]
- apiGroups: ["batch"]
resources:
- cronjobs
- jobs
verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
resources:
- horizontalpodautoscalers
verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
name: kube-state-metrics
namespace: prometheus
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kube-state-metrics-resizer
subjects:
- kind: ServiceAccount
name: kube-state-metrics
namespace: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
# kubernetes versions before 1.8.0 should use rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kube-state-metrics
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kube-state-metrics
subjects:
- kind: ServiceAccount
name: kube-state-metrics
namespace: prometheus
---
apiVersion: apps/v1
# Kubernetes versions after 1.9.0 should use apps/v1
# Kubernetes versions before 1.8.0 should use apps/v1beta1 or extensions/v1beta1
# addon-resizer docs: https://github.com/kubernetes/autoscaler/tree/master/addon-resizer
kind: Deployment
metadata:
name: kube-state-metrics
namespace: prometheus
spec:
selector:
matchLabels:
k8s-app: kube-state-metrics
replicas: 1
template:
metadata:
labels:
k8s-app: kube-state-metrics
spec:
serviceAccountName: kube-state-metrics
containers:
- name: kube-state-metrics
image: quay.io/coreos/kube-state-metrics:v2.0.0-alpha.1
ports:
- name: http-metrics
containerPort: 8080
- name: telemetry
containerPort: 8081
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
timeoutSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
name: kube-state-metrics
namespace: prometheus
labels:
k8s-app: kube-state-metrics
annotations:
prometheus.io/scrape: 'true'
spec:
ports:
- name: http-metrics
port: 8080
targetPort: http-metrics
protocol: TCP
- name: telemetry
port: 8081
targetPort: telemetry
protocol: TCP
selector:
k8s-app: kube-state-metrics
- Deploy Node Exporter as a Kubernetes DaemonSet to collect node metrics
The tolerations below allow the exporter to be scheduled on master nodes as well, so metrics from the master nodes are also collected.
## node exporter
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-exporter
namespace: kube-system
labels:
k8s-app: node-exporter
spec:
selector:
matchLabels:
k8s-app: node-exporter
template:
metadata:
labels:
k8s-app: node-exporter
spec:
containers:
- image: prom/node-exporter
name: node-exporter
ports:
- containerPort: 9100
protocol: TCP
name: http
hostPID: true
tolerations:
- key: "node-role.kubernetes.io/master"
operator: "Exists"
effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: node-exporter
name: node-exporter
namespace: kube-system
annotations:
prometheus.io/scrape: 'true'
spec:
clusterIP: None
ports:
- name: http
port: 9100
protocol: TCP
type: ClusterIP
selector:
k8s-app: node-exporter
- Deploy the elasticsearch exporter to scrape the ES cluster
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: elastic-exporter
namespace: prometheus
spec:
replicas: 1
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
type: RollingUpdate
selector:
matchLabels:
app: elastic-exporter
template:
metadata:
labels:
app: elastic-exporter
spec:
containers:
- command:
- /bin/elasticsearch_exporter
- --es.uri=http://es-svc:9200
- --es.all
image: justwatch/elasticsearch_exporter:1.1.0
securityContext:
capabilities:
drop:
- SETPCAP
- MKNOD
- AUDIT_WRITE
- CHOWN
- NET_RAW
- DAC_OVERRIDE
- FOWNER
- FSETID
- KILL
- SETGID
- SETUID
- NET_BIND_SERVICE
- SYS_CHROOT
- SETFCAP
readOnlyRootFilesystem: true
livenessProbe:
httpGet:
path: /healthz
port: 9114
initialDelaySeconds: 30
timeoutSeconds: 10
name: elastic-exporter
ports:
- containerPort: 9114
name: http
readinessProbe:
httpGet:
path: /healthz
port: 9114
initialDelaySeconds: 10
timeoutSeconds: 10
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 25m
memory: 64Mi
restartPolicy: Always
securityContext:
runAsNonRoot: true
runAsGroup: 10000
runAsUser: 10000
fsGroup: 10000
---
apiVersion: v1
kind: Service
metadata:
annotations:
prometheus.io/scrape: 'true'
labels:
app: elastic-exporter
name: elastic-exporter
namespace: prometheus
spec:
ports:
- name: http
port: 9114
nodePort: 31200
protocol: TCP
type: NodePort
selector:
app: elastic-exporter
That is the whole deployment flow. For anyone familiar with Kubernetes objects there is nothing complicated here: configure the ConfigMaps, write the RBAC rules, and create the Deployment, DaemonSet, and Service objects. Finally, all of these manifests can be concatenated into a single YAML file and applied with one command: kubectl apply -f prometheus-all.yaml.
Results
- Alert rules
- Scrape target status
- Alert notifications in the Feishu group
- Grafana monitoring dashboards
Summary
This article introduced Prometheus, walked through a hands-on deployment, and described how the project uses Prometheus to build the full monitoring chain, from metric collection and metric querying to alert notifications in the Feishu ops group. Our use of Prometheus is still fairly basic and the Kubernetes cluster is small, so load, storage, and query volume are not yet a challenge. Much work remains, such as configuring Prometheus HA and richer monitoring rules for more complex scenarios with limited resources and more complex networks.
When designing the data processing platform we chose to run Flink on Kubernetes rather than YARN without hesitation, mainly because the community's embrace of Kubernetes is already a clear trend. We built the Kubernetes cluster ourselves, made it highly available, set up cluster log collection, and added monitoring and alerting for the cluster and its applications. Going through environment setup, monitoring and alerting, and production rollout ourselves gave us a much deeper understanding of the Kubernetes ecosystem and improved our development efficiency.
References