Adding custom service monitoring with Prometheus Operator
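Overall flow
- Install the Java agent so that the service exposes a metrics endpoint
- Add a label and a port to the Deployment's template, e.g. metrics: jmx-metrics
- Create a Service for the application that selects metrics: jmx-metrics
- Create a ServiceMonitor that selects jmx-metrics
- Grant the Prometheus user permission to access the resources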
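The first step can be done without touching the image's start command by passing the agent flag through the JVM's JAVA_TOOL_OPTIONS environment variable. The fragment below is only a sketch: the paths under /opt/jmx_exporter are hypothetical and have to match wherever the agent jar and its config file actually live in the image. Port 6060 is the metrics port used throughout this article.

```yaml
# Sketch only: env fragment for the application container in the Deployment.
# The /opt/jmx_exporter paths are assumptions, not part of the original setup.
env:
  - name: JAVA_TOOL_OPTIONS
    # Agent argument format: -javaagent:<jar>=<port>:<config file>
    value: "-javaagent:/opt/jmx_exporter/jmx_prometheus_javaagent.jar=6060:/opt/jmx_exporter/config.yaml"
---
# Hypothetical minimal /opt/jmx_exporter/config.yaml: export every MBean attribute.
rules:
  - pattern: ".*"
```

Once the pod is running, the agent should answer on port 6060 at /metrics, which is the endpoint the ServiceMonitor scrapes by default.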
Configuration
For reference, see prometheus-operator.
xupd-openapi-platform.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xupd-openapi-platform
  namespace: xupd
  labels:
    app: xupd-openapi-platform
    metrics: jmx-metrics
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xupd-openapi-platform
  template:
    metadata:
      labels:
        app: xupd-openapi-platform
        metrics: jmx-metrics
    spec:
      containers:
        - name: xupd-openapi-platform
          image: 172.16.12.43/xupd/xupd-openapi-platform/xupd-openapi-platform:93
          imagePullPolicy: Always
          env:
            - name: LC_ALL
              value: "C.UTF-8"
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MY_NODE_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
          ports:
            - containerPort: 7145
            - containerPort: 6060
              name: http-metrics
          readinessProbe:
            tcpSocket:
              port: 7145
          volumeMounts:
            - name: app-logs
              mountPath: /home/project/xupd-openapi-platform/log/
        - name: xupd-openapi-platform-log
          image: 172.16.12.43/vv-base/beats/filebeat:6.5.4
          args: [
            "-c", "/etc/filebeat/filebeat.yml",
            "-e",
            ]
          volumeMounts:
            - name: app-logs
              mountPath: /home/project/xupd-openapi-platform/log
            - name: xupd-openapi-platform-config
              mountPath: /etc/filebeat/
      volumes:
        - name: app-logs
          emptyDir: {}
        - name: xupd-openapi-platform-config
          configMap:
            name: xupd-openapi-platform-config
Additions to the Deployment (the label and the metrics port):
  metrics: jmx-metrics
  - containerPort: 6060
    name: http-metrics
service_plat.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    metrics: jmx-metrics        # key label the ServiceMonitor uses for auto-discovery
  name: jmx-metrics
  namespace: xupd
spec:
  ports:
    - name: http-metrics        # matches spec.endpoints.port in the ServiceMonitor
      port: 6060                # service port exposed by jmx-exporter
      targetPort: http-metrics  # port name exposed in the pod YAML
  selector:
    metrics: jmx-metrics        # the Service's own label selector
servicemonitor_plat.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor            # CRD defined by prometheus-operator
metadata:
  name: jmx-metrics
  namespace: monitoring
  labels:
    k8s-apps: jmx-metrics
spec:
  jobLabel: metrics             # the job label takes the value of the metrics label, i.e. job=jmx-metrics
  selector:
    matchLabels:
      metrics: jmx-metrics      # auto-discover Services labeled metrics: jmx-metrics
  namespaceSelector:
    matchNames:                 # namespaces to discover in; several may be listed
      - xupd
  endpoints:
    - port: http-metrics        # port to scrape; this is the Service port name, i.e. spec.ports.name in the Service YAML
      interval: 15s             # scrape interval

ClusterRole that authorizes access to the resources
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - services
      - endpoints
      - pods
      - nodes/proxy
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
      - nodes/metrics
    verbs:
      - get
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
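A ClusterRole grants nothing until it is bound to a subject. The sketch below shows what such a binding could look like for the service account that appears in the troubleshooting logs (system:serviceaccount:monitoring:prometheus-k8s). The binding name is an assumption, and an existing install may already ship an equivalent binding, in which case updating the ClusterRole alone is enough.

```yaml
# Sketch only: binds the prometheus-k8s ClusterRole to the Prometheus service account.
# The binding name is an assumption; check whether one already exists before applying.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-k8s
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-k8s          # the ClusterRole defined above
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s        # service account name taken from the error log
    namespace: monitoring
```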
Troubleshooting
kubectl delete pod prometheus-k8s-1 -n monitoring
kubectl logs prometheus-k8s-1 -n monitoring -c prometheus
Permission errors:
Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"xupd\""
level=error ts=2021-02-03T07:21:01.417Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:265: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"xupd\""
level=error ts=2021-02-03T07:21:01.423Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:264: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"xupd\""
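All three errors name the same service account and the same namespace, and each is missing the list verb on one resource (endpoints, pods, services). If cluster-wide rights are broader than wanted, one alternative is a Role and RoleBinding limited to the xupd namespace. The following is a sketch under that assumption; the Role and RoleBinding names are hypothetical.

```yaml
# Sketch only: namespace-scoped permissions for service discovery in xupd.
# Object names are assumptions; the subject is taken from the error messages.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: xupd
rules:
  - apiGroups: [""]
    resources: ["services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: xupd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: monitoring       # the service account lives in monitoring, not xupd
```

With this variant, every additional namespace listed under namespaceSelector.matchNames needs its own Role and RoleBinding.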
References:
chanjarster.github.io/post/prom-g… (java agent)