Service Monitoring and Alerting Toolkit
Prometheus
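Overview: Prometheus is a service monitoring system. It collects metrics from configured targets at a given interval, evaluates rule expressions, displays the results, and triggers alerts when specified conditions are observed.
Installing Prometheus with Docker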
docker pull prom/prometheus
docker run -d -p 9090:9090 --restart=always --name prometheus -v /home/work/prometheus.yml:/etc/prometheus/prometheus.yml -v /home/work/rules:/usr/local/prometheus/rules prom/prometheus
Scrape configuration
Add micrometer-registry-prometheus to the Spring Boot project, enable Actuator, and expose the monitoring endpoints with management.endpoints.web.exposure.include: "*"
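As a minimal sketch, the Actuator settings described above might look like this in application.yml. The port 8081 and the /api context path are assumptions inferred from the target and metrics_path used in the Prometheus scrape configuration; they are not stated for the application itself.

```yaml
# application.yml (sketch; adjust to your project)
server:
  # Assumption: matches the 'ip:8081' target of the 'book' job
  port: 8081
  servlet:
    # Assumption: the app is served under /api,
    # so metrics end up at /api/actuator/prometheus
    context-path: /api
management:
  endpoints:
    web:
      exposure:
        # Expose all Actuator endpoints, as described in the text
        include: "*"
```

Outside a test environment it is usually safer to narrow include to the endpoints that are really needed, for example "health,prometheus".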
# Global settings
global:
  # How often to scrape targets
  scrape_interval: 15s
  # Timeout for each scrape
  scrape_timeout: 10s
# Alert notification
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - ip:9093
# Alerting rule files
rule_files:
  - "/usr/local/prometheus/rules/*.rules"
# Scrape jobs
scrape_configs:
  # Web service
  - job_name: 'book'
    # Path to scrape metrics from
    metrics_path: '/api/actuator/prometheus'
    # Address of the service to scrape
    static_configs:
      - targets: ['ip:8081']
  # MySQL
  - job_name: 'mysql'
    static_configs:
      - targets: ['ip:9104']
  # Redis
  - job_name: 'redis'
    static_configs:
      - targets: ['ip:9121']
  # Nginx
  - job_name: 'nginx'
    static_configs:
      - targets: ['ip:9913']
  # Linux
  - job_name: 'linux'
    static_configs:
      - targets: ['ip:9100','ip:9100']
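The mysql, redis, nginx, and linux jobs above scrape exporters rather than the services themselves, and the article does not show how those exporters are started. The following is only a sketch for two of them, assuming Docker Compose, the prom/node-exporter and oliver006/redis_exporter images, and a Redis instance at ip:6379 without a password; none of these choices come from the original setup.

```yaml
# docker-compose.yml (sketch): exporters for two of the jobs above
services:
  node-exporter:
    image: prom/node-exporter
    restart: always
    ports:
      - "9100:9100"   # scraped by the 'linux' job
  redis-exporter:
    image: oliver006/redis_exporter
    restart: always
    environment:
      # Assumption: Redis listens on ip:6379 and needs no password
      REDIS_ADDR: "redis://ip:6379"
    ports:
      - "9121:9121"   # scraped by the 'redis' job
```

A containerized node_exporter generally needs extra host mounts to report accurate host metrics, so running it directly on the host may be the simpler option.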
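Alerting rule configuration
groups:
  - name: Warning
    rules:
      - alert: book-status
        # up == 0 for the job named book means the service is down
        expr: up{job="book"} == 0
        # How long the condition must hold before the alert changes state:
        # Inactive -> Pending -> Firing
        # A notification is sent once the alert reaches Firing
        for: 10s
        labels:
          status: Warning
          severity: 1
        annotations:
          summary: "{{$labels.instance}}: service down"
          description: "{{$labels.instance}}: service has been down for more than 10s"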
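The same pattern can be extended to every scrape job instead of only book. The rule below is a sketch under assumptions: the group name, alert name, and the 1m duration are illustrative choices, not part of the original setup.

```yaml
groups:
  - name: Availability
    rules:
      - alert: target-down
        # up == 0 means the most recent scrape of that target failed,
        # whichever job it belongs to
        expr: up == 0
        # Assumption: wait 1m before firing to ride out short restarts
        for: 1m
        labels:
          severity: "1"
        annotations:
          summary: "{{ $labels.job }} / {{ $labels.instance }}: target down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 1m"
```

With the rule_files glob used in prometheus.yml, the file name has to end in .rules to be loaded. Running promtool check rules against the file before restarting Prometheus should catch syntax errors.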
Alertmanager
Installing Alertmanager with Docker
docker pull prom/alertmanager
docker run -d -p 9093:9093 --restart=always --name alertmanager -v /home/work/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml -v /home/work/alertmanager:/alertmanager prom/alertmanager
DingTalk alerts
Installing prometheus-webhook-dingtalk with Docker
# v0.3.0
docker pull timonwong/prometheus-webhook-dingtalk:v0.3.0
docker run -d -p 8060:8060 -v /home/work/alertmanager/template:/usr/share/prometheus-webhook-dingtalk/template --name webhook-dingding timonwong/prometheus-webhook-dingtalk:v0.3.0 --template.file=/usr/share/prometheus-webhook-dingtalk/template/default.tmpl --ding.profile="webhook1=https://oapi.dingtalk.com/robot/send?access_token=xx"
# latest
# Requires a config.yml
docker run -d -p 8060:8060 -v /home/work/alertmanager/template:/usr/share/prometheus-webhook-dingtalk/template -v /home/work/webhook:/usr/share/prometheus-webhook-dingtalk --restart=always --name webhook-dingding timonwong/prometheus-webhook-dingtalk --config.file=/usr/share/prometheus-webhook-dingtalk/config.yml
DingTalk config.yml
# Custom template file
templates:
  - /usr/share/prometheus-webhook-dingtalk/template/default.tmpl
targets:
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=xx
    mention:
      all: true
      # mobiles: ['123']
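The Alertmanager container mounts /home/work/alertmanager/alertmanager.yml, but its contents are not shown. A minimal sketch that forwards every alert to the webhook1 target might look like the following. The receiver name and the grouping and timing values are assumptions; the URL path follows the /dingtalk/<target>/send form that prometheus-webhook-dingtalk serves for each configured target.

```yaml
# alertmanager.yml (sketch)
route:
  # Send everything to the single DingTalk receiver
  receiver: dingtalk
  group_by: ['alertname']
  # Assumed timings; tune to how noisy the alerts are
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 1h
receivers:
  - name: dingtalk
    webhook_configs:
      # 8060 is the port published by the webhook-dingding container
      - url: http://ip:8060/dingtalk/webhook1/send
        # Also notify when the alert is resolved
        send_resolved: true
```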
Alert result
Advanced Nginx log monitoring with Nginx-Loki
- Install with Docker
# Download loki-config.yaml
wget https://raw.githubusercontent.com/grafana/loki/v2.1.0/cmd/loki/loki-local-config.yaml -O loki-config.yaml
docker run -d -v $(pwd):/mnt/config -p 3100:3100 --name loki grafana/loki:2.1.0 -config.file=/mnt/config/loki-config.yaml
# Download promtail-config.yaml
wget https://raw.githubusercontent.com/grafana/loki/v2.1.0/cmd/promtail/promtail-docker-config.yaml -O promtail-config.yaml
docker run -d -v $(pwd):/mnt/config -v /usr/local/nginx/logs:/usr/local/nginx/logs --name promtail grafana/promtail:2.1.0 -config.file=/mnt/config/promtail-config.yaml
- Edit loki-config.yaml and promtail-config.yaml
- Add the following configuration to the http block of nginx: **link to the nginx json_analytics configuration**
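The edits to promtail-config.yaml are not spelled out, so here is a rough sketch of what they could look like. The log file name access.log and the label values are assumptions; the log directory matches the volume mounted into the promtail container, and ip:3100 stands for the host running Loki, since localhost inside the promtail container would not reach it.

```yaml
# promtail-config.yaml (sketch)
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  # Where promtail remembers how far it has read each file
  filename: /tmp/positions.yaml
clients:
  # Loki push endpoint
  - url: http://ip:3100/loki/api/v1/push
scrape_configs:
  - job_name: nginx
    static_configs:
      - targets:
          - localhost
        labels:
          # Assumed label values; use whatever your dashboard queries expect
          job: nginx_access_log
          # Assumption: the JSON access log is written to access.log
          __path__: /usr/local/nginx/logs/access.log
```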
Connecting the data to Grafana
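Data sources can be added through the Grafana UI. As an alternative sketch, they can also be declared in a provisioning file, assuming Grafana reads /etc/grafana/provisioning/datasources/ and that Prometheus and Loki are reachable at the placeholder addresses used elsewhere in this article. The data source names are illustrative.

```yaml
# datasources.yml (sketch)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    # Grafana's backend proxies the queries
    access: proxy
    url: http://ip:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://ip:3100
```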
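Grafana
Data visualization result
Overview: Grafana is an open-source analytics and monitoring solution for your data.
The process for creating your own monitoring dashboard is as follows
Some dashboard templates that I find useful (Grafana dashboard IDs in parentheses)
- Linux server monitoring (10795, 8919)
- JVM (10280, 4701)
- Nginx and Nginx-Loki (2949, 12559)
- Redis (11835)
- MySQL (7362)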