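Prometheus components
- prometheus server: the main service; accepts external HTTP requests and collects, stores, and queries data
- prometheus targets: statically configured targets whose metrics are scraped
- service discovery: discovers targets dynamically
- prometheus alerting: alert notification
- push gateway: a proxy server for data collection (similar to zabbix proxy)
- data visualization and export: data visualization and data export (client access)
Deploying Prometheus with the Operator
Clone the Operator repository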
git clone https://github.com/prometheus-operator/kube-prometheus.git
Enter the directory and apply the YAML files
# Enter the project directory
[root@node1 ~]# cd kube-prometheus/
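Check which images the project pulls from registry.k8s.io
kube-state-metrics and prometheus-adapter both pull from registry.k8s.io, which cannot be reached from mainland China, so these two images are replaced in the next step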
[root@node1 kube-prometheus]# grep image: ./* -R | grep registry.k8s.io
./manifests/kubeStateMetrics-deployment.yaml: image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.9.2
./manifests/prometheusAdapter-deployment.yaml: image: registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.11.1
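To see every image the manifests reference, not only those on registry.k8s.io, the same grep can be generalized. This is a minimal sketch; the helper name `list_images` is hypothetical, and it assumes the image reference is the last field on each `image:` line.

```shell
# list_images DIR: print every distinct container image referenced under DIR.
# Assumes the image reference is the last whitespace-separated field on the line.
list_images() {
  grep -rh 'image:' "$1" | awk '{print $NF}' | sort -u
}
```

From the project directory it would be called as `list_images manifests/`.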
Replace the registry.k8s.io images with Docker Hub images
This step requires you to first push the two images listed above to your own Docker Hub repository
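A minimal sketch of that mirroring step, assuming Docker is available on a machine that can reach registry.k8s.io and is already logged in to Docker Hub. The helper name `mirror_cmds` is hypothetical; it only prints the commands so they can be reviewed before being run.

```shell
# mirror_cmds SRC_IMAGE HUB_USER: print (not run) the docker commands that copy
# SRC_IMAGE to HUB_USER/<name>:<tag> on Docker Hub.
mirror_cmds() {
  src=$1
  dst="$2/${src##*/}"   # keep only the last path component, e.g. kube-state-metrics:v2.9.2
  echo "docker pull $src"
  echo "docker tag $src $dst"
  echo "docker push $dst"
}

mirror_cmds registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.9.2 postkarte
mirror_cmds registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.11.1 postkarte
```

Replace `postkarte` with your own Docker Hub account; piping the output to `sh` would execute the commands.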
sed -i "s#registry.k8s.io/kube-state-metrics/kube-state-metrics.*#postkarte/kube-state-metrics:v2.9.2#g" manifests/kubeStateMetrics-deployment.yaml
sed -i "s#registry.k8s.io/prometheus-adapter.*#postkarte/prometheus-adapter:v0.11.1#g" manifests/prometheusAdapter-deployment.yaml
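Before editing the real manifests, the substitution can be tried on a scratch file. The sketch below assumes the image line looks like the one in the grep output above; it writes to a second temporary file instead of using `sed -i`.

```shell
# Try the substitution on a throwaway file that mimics the image line.
tmp=$(mktemp)
printf '        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.9.2\n' > "$tmp"
sed "s#registry.k8s.io/kube-state-metrics/kube-state-metrics.*#postkarte/kube-state-metrics:v2.9.2#g" "$tmp" > "$tmp.new"
cat "$tmp.new"   # the line should now point at postkarte/kube-state-metrics:v2.9.2
```

After running the real commands, `grep image: manifests/kubeStateMetrics-deployment.yaml manifests/prometheusAdapter-deployment.yaml` shows whether both files now point at Docker Hub.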
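First apply the YAML files in the setup directory
Note that --server-side is required here; it enables Server-side Apply
This directory contains the initialization manifests, such as the one that creates the namespace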
[root@node1 kube-prometheus]# kubectl apply --server-side -f manifests/setup/
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheusagents.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/scrapeconfigs.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com serverside-applied
namespace/monitoring serverside-applied
Wait for the resources from the previous step to be created
kubectl wait \
--for condition=Established \
--all CustomResourceDefinition \
--namespace=monitoring
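What this command means:
- kubectl wait: block until a condition is met, then return
- --for condition=Established: the condition to wait for is "Established"
- --all CustomResourceDefinition: wait on every resource of type CustomResourceDefinition
- --namespace=monitoring: CRDs are cluster-scoped, so this flag does not narrow which CRDs are waited on
The command returns once every CustomResourceDefinition in the cluster has reached the "Established" state.
A CustomResourceDefinition (CRD) takes a short time after creation to become Established; until then, resources of that type cannot be created.
The command therefore waits until all CRDs are established and usable before the following steps run.
Using kubectl wait avoids running later steps while the CRDs are not yet ready, which would make those steps fail. It is a necessary synchronization point for any resource that depends on a CRD.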
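In a script it can help to give the wait an explicit upper bound. This is a minimal sketch: the function name `wait_for_crds` and the 120s default are assumptions, while `--timeout` is a standard `kubectl wait` flag.

```shell
# wait_for_crds [TIMEOUT]: block until every CRD reports Established,
# or fail once TIMEOUT (default 120s, an arbitrary choice) has elapsed.
# CRDs are cluster-scoped, so no namespace flag is passed.
wait_for_crds() {
  kubectl wait --for condition=Established --all CustomResourceDefinition \
    --timeout="${1:-120s}"
}
```

A call such as `wait_for_crds 60s && kubectl apply -f manifests/` would continue only if the CRDs became ready in time.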
After the previous step succeeds, apply the Prometheus and Grafana YAML files
Change the Prometheus Service type to NodePort
Edit manifests/prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.47.0
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort # Change the Service type to NodePort
  ports:
  - name: web
    port: 9090
    nodePort: 30099 # Expose port 30099 externally
    targetPort: web
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
  selector:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
  sessionAffinity: ClientIP
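A nodePort value must fall inside the cluster's NodePort range, otherwise the API server rejects the Service. The sketch below checks the two ports used in this walkthrough against the Kubernetes default range of 30000-32767; the helper name `valid_nodeport` is hypothetical, and a cluster configured with a different range would need different bounds.

```shell
# valid_nodeport PORT: succeed if PORT is inside the default NodePort range.
# 30000-32767 is the Kubernetes default; a cluster may be configured differently.
valid_nodeport() {
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

for p in 30099 30100; do
  if valid_nodeport "$p"; then echo "$p ok"; else echo "$p out of range"; fi
done
```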
Change the Grafana Service type to NodePort
Edit manifests/grafana-service.yaml so that it reads as follows
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 9.5.3
  name: grafana
  namespace: monitoring
spec:
  type: NodePort # Change the Service type to NodePort
  ports:
  - name: http
    port: 3000
    nodePort: 30100 # Expose port 30100 externally
    targetPort: http
  selector:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
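Before applying, it can be worth confirming that both edited files carry the intended nodePort values. A minimal sketch, assuming each `nodePort:` key sits on its own line as in the manifests above; the helper name `nodeports` is hypothetical.

```shell
# nodeports FILE...: print "file port" for every nodePort set in the given manifests.
nodeports() {
  awk '$1 == "nodePort:" {print FILENAME, $2}' "$@"
}
```

Running `nodeports manifests/prometheus-service.yaml manifests/grafana-service.yaml` should list 30099 and 30100 if both edits were saved.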
Apply the YAML
[root@node1 kube-prometheus]# kubectl apply -f manifests/
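The Pods take a while to pull images and start. Rather than polling by hand, `kubectl wait` can block until they are Ready. This is a sketch: the function name `wait_for_monitoring_pods` and the 300s default are assumptions.

```shell
# wait_for_monitoring_pods [TIMEOUT]: block until every Pod in the monitoring
# namespace is Ready, or fail after TIMEOUT (default 300s, an arbitrary choice).
wait_for_monitoring_pods() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace=monitoring --timeout="${1:-300s}"
}
```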
Check the deployed resources
Check the Services
[root@node1 kube-prometheus]# kubectl get service -n monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-main ClusterIP 10.104.94.183 <none> 9093/TCP,8080/TCP 39s
blackbox-exporter ClusterIP 10.103.146.184 <none> 9115/TCP,19115/TCP 38s
grafana NodePort 10.97.17.194 <none> 3000:30100/TCP 38s
kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 37s
node-exporter ClusterIP None <none> 9100/TCP 37s
prometheus-adapter ClusterIP 10.98.202.192 <none> 443/TCP 36s
prometheus-k8s NodePort 10.107.150.96 <none> 9090:30099/TCP,8080:32682/TCP 37s
prometheus-operator ClusterIP None <none> 8443/TCP 36s
Check the Pods
[root@node1 ~]# kubectl get pods -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 7h32m
alertmanager-main-1 2/2 Running 0 7h32m
alertmanager-main-2 2/2 Running 0 7h32m
blackbox-exporter-d9597b5ff-qzzrn 3/3 Running 0 7h33m
grafana-748964b847-868mr 1/1 Running 0 7h33m
kube-state-metrics-64b967c498-q57d6 3/3 Running 0 7h30m
node-exporter-7vxcn 2/2 Running 0 7h33m
node-exporter-dgw6q 2/2 Running 0 7h33m
node-exporter-p5kx2 2/2 Running 0 7h33m
node-exporter-sl9ns 2/2 Running 0 7h33m
node-exporter-wvvwg 2/2 Running 0 7h33m
prometheus-adapter-65f6855fcd-dh2rk 1/1 Running 0 7h33m
prometheus-adapter-65f6855fcd-sqv7m 1/1 Running 0 7h33m
prometheus-k8s-0 2/2 Running 0 7h32m
prometheus-k8s-1 2/2 Running 0 7h32m
prometheus-operator-749b97889c-77v6p 2/2 Running 0 7h33m
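With this many Pods, a small filter makes problems easier to spot. The sketch below reads `kubectl get pods` output and prints only the Pods whose STATUS is not Running; the helper name `not_running` is hypothetical, and it assumes the default column layout shown above.

```shell
# not_running: read `kubectl get pods` output on stdin and print the name and
# status of every Pod whose STATUS column (third field) is not "Running".
not_running() {
  awk 'NR > 1 && $3 != "Running" {print $1, $3}'
}
```

Usage would be `kubectl get pods -n monitoring | not_running`; empty output means every Pod is Running.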
Accessing Prometheus
Open http://192.168.0.184:30099/
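The same address can be checked from a terminal through Prometheus's readiness endpoint, `/-/ready`. A minimal sketch, assuming curl is installed and the node IP and port are the ones used above; the helper name `prometheus_ready` is hypothetical.

```shell
# prometheus_ready HOST PORT: succeed if Prometheus answers its readiness endpoint.
# -f makes curl return a non-zero exit status on an HTTP error response.
prometheus_ready() {
  curl -fsS "http://$1:$2/-/ready"
}
```

For this cluster the call would be `prometheus_ready 192.168.0.184 30099`.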
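Accessing Grafana
Open http://192.168.0.184:30100/ (NodePort 30100, as configured in the Grafana Service above). The default user is admin and the default password is admin
Viewing dashboards
Click the Dashboards tab
Many dashboards are available out of the box, and you can also create or import your own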
Errors encountered
Applying the YAML files in manifests/setup/ without --server-side fails with the following error; the fix is to add --server-side
Error from server (Invalid): error when creating "manifests/setup/0prometheusCustomResourceDefinition.yaml": CustomResourceDefinition.apiextensions.k8s.io "prometheuses.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
Error from server (Invalid): error when creating "manifests/setup/0prometheusagentCustomResourceDefinition.yaml": CustomResourceDefinition.apiextensions.k8s.io "prometheusagents.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
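The likely cause is that client-side `kubectl apply` stores the whole manifest in the `kubectl.kubernetes.io/last-applied-configuration` annotation, and these CRDs are too large to fit within the 262144-byte annotation limit. Server-side Apply does not write that annotation. The sketch below is only a rough heuristic for spotting such files: the helper name `too_big_for_client_apply` is hypothetical, and it compares the on-disk YAML size with the limit, whereas the annotation actually holds a JSON rendering of the object.

```shell
# too_big_for_client_apply FILE: succeed if FILE alone is larger than the
# 262144-byte annotation limit quoted in the error message.
too_big_for_client_apply() {
  size=$(wc -c < "$1")
  [ "$((size))" -gt 262144 ]
}
```

A loop such as `for f in manifests/setup/*.yaml; do too_big_for_client_apply "$f" && echo "$f"; done` would list the manifests most likely to trigger the error.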