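How to choose a suitable monitoring platform
- 1. Does it support many monitoring types (databases, operating systems, middleware, and in-house application monitoring for Java, PHP, C#, etc.)?
- 2. Does it offer rich built-in monitoring reports, custom reports, and alert notifications?
- 3. Does it support extensible, user-defined monitoring plugins?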
Building the monitoring platform
Using Prometheus
Prometheus: accepts many kinds of data and collects and stores them as time series.
Architecture
Installation and configuration
- 1. Download
wget https://github.com/prometheus/prometheus/releases/download/v2.13.0/prometheus-2.13.0.linux-amd64.tar.gz
- 2. Configure
tar -zxvf prometheus-2.13.0.linux-amd64.tar.gz
cd prometheus-2.13.0.linux-amd64
./prometheus --help   # show the available startup options
./prometheus --config.file='prometheus.yml'   # start with the given configuration file
Access URL: http://localhost:9090
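- 2.1 Configuration file reference
prometheus.yml:
# Global configuration
global:
# Alerting (notification) configuration
alerting:
# Rule file configuration
rule_files:
# Scrape job configuration (data collection)
scrape_configs:
  # Job name
  - job_name:
    # How often to scrape
    scrape_interval:
    # Scrape timeout
    scrape_timeout:
    # Path to scrape metrics from (default: /metrics)
    metrics_path:
    # How label conflicts are resolved
    honor_labels:
    # Protocol scheme used for requests (default: http)
    scheme:
    # HTTP URL query parameters
    params:
    # HTTP Authorization header (basic auth)
    basic_auth:
      username:
      password:
      password_file:
    # Authentication based on a bearer token
    bearer_token:
    # Static target configuration for the job
    static_configs:
      # Target hosts
      - targets: []
        # Labels attached to these targets
        labels: {}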
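Two of the job-level fields above interact: Prometheus refuses a scrape job whose scrape_timeout is longer than its scrape_interval. The following is a minimal Python sketch of that check, assuming simple single-unit durations such as 15s or 1m. The helper names parse_duration and check_job are hypothetical and not part of Prometheus; the fallback values are the documented global defaults (1m and 10s).

```python
import re

# Seconds per unit for simple durations (assumption: a single unit only).
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a duration such as '15s' or '1m' to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def check_job(job):
    """Return a list of problems found in one scrape job (a plain dict)."""
    problems = []
    if not job.get("job_name"):
        problems.append("job_name is required")
    # Fall back to the global defaults when a field is not set on the job.
    interval = parse_duration(job.get("scrape_interval", "1m"))
    timeout = parse_duration(job.get("scrape_timeout", "10s"))
    if timeout > interval:
        problems.append("scrape_timeout is greater than scrape_interval")
    return problems


job = {"job_name": "node", "scrape_interval": "5s", "scrape_timeout": "10s"}
print(check_job(job))
```

For checking a real configuration file, the promtool binary shipped in the Prometheus tarball is the better choice; the sketch only illustrates the rule.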
Using Grafana
Grafana: an open-source analytics and monitoring system
Architecture
Usage (Linux)
- 1. Install
wget https://dl.grafana.com/oss/release/grafana-6.4.2.linux-amd64.tar.gz
tar -zxvf grafana-6.4.2.linux-amd64.tar.gz
- 2. Run
cd grafana-6.4.2.linux-amd64/bin
./grafana-server
Access URL: http://localhost:3000
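- 2.1 Configuration notes
defaults.ini: default configuration
custom.ini or grafana.ini: custom configuration
-----------------------------------
paths: storage paths
server: server settings
database: storage database settings
remote_cache: cache settings
security: security settings
users: user settings
dashboards: dashboard settings
smtp: email (SMTP) settings
log: logging settings
alerting: alerting settings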
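The relationship between the two files can be sketched in Python: values in the custom file override the defaults section by section, and anything not overridden keeps its default. This is only an illustration using Python's configparser, not Grafana's own loader. The two INI snippets are hand-written samples, and the function name effective_settings is hypothetical.

```python
import configparser

# Hand-written sample standing in for defaults.ini.
DEFAULTS = """
[server]
protocol = http
http_port = 3000

[smtp]
enabled = false
"""

# Hand-written sample standing in for custom.ini.
CUSTOM = """
[server]
http_port = 8080
"""


def effective_settings(defaults_text, custom_text):
    """Layer custom settings over the defaults, section by section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(defaults_text)
    parser.read_string(custom_text)  # later reads override earlier values
    return {section: dict(parser[section]) for section in parser.sections()}


settings = effective_settings(DEFAULTS, CUSTOM)
print(settings["server"]["http_port"])  # overridden by the custom file
print(settings["smtp"]["enabled"])      # untouched default
```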
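Building the monitoring platform
Using Prometheus alerting
Alerting reports on the state of the monitored servers, applications, and data, which is the end goal of monitoring.
Reference:
AlertManager
Alert notification configuration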
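- 1. alertmanager.yml parameter reference
# Global parameters (email service, Slack, WeChat, etc.)
global:
# Notification template files
templates:
# Alert routing policy: a tree of routes, matched depth-first from left to right
# In practice this decides when, how, and to whom a notification is sent
route:
# Receiver (notification endpoint) configuration
receivers:
-------------------------------------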
Example:
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.exmail.qq.com:465' # SMTP server (smarthost)
  smtp_from: 'xxxxx' # sender address
  smtp_auth_username: 'xxxx' # mailbox account name
  smtp_auth_password: 'xxxx' # mailbox password or authorization code
  smtp_require_tls: false # do not require STARTTLS
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'forEmail'
receivers:
  # Email receiver
  - name: 'forEmail'
    email_configs:
      - to: 'xxxx'
        send_resolved: true
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
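The inhibit_rules entry in this example mutes a warning alert while a critical alert with the same alertname, dev, and instance labels is firing. A simplified Python model of that decision is sketched below. It is not Alertmanager's implementation: alerts are plain dicts of labels, the function name is_inhibited is hypothetical, and the sample alerts are made up. A label missing from an alert is compared as if it were empty, which is how the Alertmanager documentation describes the equal check.

```python
def is_inhibited(target, firing_alerts, rule):
    """Return True if `target` is muted by `rule` given the firing alerts.

    Alerts are plain dicts mapping label name to label value.
    """
    def matches(labels, wanted):
        return all(labels.get(key) == value for key, value in wanted.items())

    if not matches(target, rule["target_match"]):
        return False
    for source in firing_alerts:
        if source is target or not matches(source, rule["source_match"]):
            continue
        # A missing label is treated like an empty one when comparing.
        if all(source.get(name, "") == target.get(name, "") for name in rule["equal"]):
            return True
    return False


rule = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["alertname", "dev", "instance"],
}
critical = {"alertname": "node-up", "instance": "192.168.0.106:9100", "severity": "critical"}
warning = {"alertname": "node-up", "instance": "192.168.0.106:9100", "severity": "warning"}
other = {"alertname": "node-up", "instance": "192.168.0.107:9100", "severity": "warning"}

firing = [critical, warning, other]
print(is_inhibited(warning, firing, rule))  # same instance as the critical alert
print(is_inhibited(other, firing, rule))    # different instance
```

Only the warning on the same instance as the critical alert is muted; the warning on the other instance is still delivered.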
Alert collection configuration
1. Enable alerting in Prometheus
# Where Prometheus sends alerts (Alertmanager)
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]
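2. Alert rule configuration (rule files)
# In prometheus.yml: where rule files are loaded from
rule_files:
  - "/data/monitor/prometheus-2.13.0.linux-amd64/rule/*.rules"
# In each rule file:
groups:
  - name: # group name
    rules:
      - alert: # alert (rule) name
        expr: # rule condition, written as a PromQL expression
        # How long the condition must hold before the alert fires
        for:
        labels:
          severity:
          team:
        # Descriptive information attached to every alert raised by this rule
        annotations:
          summary:
--------------------------------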
Example:
Server-down monitoring
groups:
  - name: node-up
    rules:
      - alert: node-up
        expr: up{job="node"} == 0
        for: 15s
        labels:
          severity: "1"
          team: node
        annotations:
          summary: "{{ $labels.instance }} has been down for more than 15s!"
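The effect of the for field can be sketched as a small state machine: while the expression is true the alert is pending, and it switches to firing once the expression has stayed true for at least the for duration. The Python below is a simplified model, not Prometheus code. The function name alert_states, the 5-second evaluation interval, and the sample timeline are assumptions made for illustration.

```python
def alert_states(samples, for_seconds):
    """Return a list of (time, state), one entry per rule evaluation.

    `samples` is a list of (time_in_seconds, condition_is_true) pairs,
    one per evaluation, in time order.
    """
    active_since = None
    states = []
    for now, condition in samples:
        if not condition:
            # Any evaluation where the condition is false resets the timer.
            active_since = None
            states.append((now, "inactive"))
            continue
        if active_since is None:
            active_since = now
        held = now - active_since
        states.append((now, "firing" if held >= for_seconds else "pending"))
    return states


# up{job="node"} == 0 evaluated every 5 seconds; the target goes down at t=10.
samples = [(0, False), (5, False), (10, True), (15, True), (20, True), (25, True)]
for time, state in alert_states(samples, for_seconds=15):
    print(time, state)
```

With for: 15s the alert first becomes true at t=10, stays pending through t=20, and fires at t=25. A single evaluation where the target is up again would reset it to inactive.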
Using common Prometheus exporters
Java applications (Spring Boot)
- 1. Add the dependencies and configuration to the Spring Boot application
<!-- pom.xml -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.1.3</version>
</dependency>
// In a @Configuration class: tag every metric with the application name
@Bean
MeterRegistryCustomizer<MeterRegistry> configurer(
        @Value("${spring.application.name}") String applicationName) {
    return (registry) -> registry.config().commonTags("application", applicationName);
}
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: '*'
  metrics:
    tags:
      application: ${spring.application.name}
- 2. Prometheus scrape configuration
- job_name: 'springboot_prometheus'
  scrape_interval: 5s
  metrics_path: '/actuator/prometheus' # scrape endpoint
  static_configs:
    - targets: ['192.168.0.106:8080','192.168.0.106:8081'] # multiple application instances
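The scrape endpoint returns plain text in the Prometheus exposition format, one sample per line, and the application tag configured above shows up as a label on every sample. A minimal Python sketch of reading that text follows. The sample text is hand-written, the function name parse_metrics is hypothetical, and the parser is deliberately simplified: it does not unescape label values and ignores timestamps.

```python
import re

# name, optional {labels}, then the value
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hand-written sample in the shape the endpoint returns.
example = '''
# HELP jvm_threads_live_threads The current number of live threads
# TYPE jvm_threads_live_threads gauge
jvm_threads_live_threads{application="demo",} 24.0
process_uptime_seconds{application="demo",} 12.5
'''
for name, labels, value in parse_metrics(example):
    print(name, labels["application"], value)
```

Fetching http://192.168.0.106:8080/actuator/prometheus and passing the body to such a parser is a quick way to confirm the endpoint is exposed before adding the job to Prometheus.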
Server monitoring (Linux/Windows)
1. Download
wget https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-386.tar.gz
2. Start
tar -zxvf node_exporter-0.18.1.linux-386.tar.gz
cd node_exporter-0.18.1.linux-386
./node_exporter
3. Scrape configuration
- job_name: 'node'
  static_configs:
    - targets: ['192.168.0.106:9100']
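Once the node job is being scraped, its up series can be read back through the Prometheus HTTP API (GET /api/v1/query). The Python sketch below builds the query URL and extracts the instances that are down from a response. The helper names query_url, down_instances, and fetch are hypothetical, and the response shown is hand-written in the shape the API returns rather than captured from a live server.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, promql):
    """Build the instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def down_instances(response):
    """Return the instances whose `up` sample is 0 in a decoded API response."""
    if response.get("status") != "success":
        raise RuntimeError(f"query failed: {response}")
    return [
        series["metric"].get("instance", "")
        for series in response["data"]["result"]
        if float(series["value"][1]) == 0
    ]


def fetch(base_url, promql, timeout=5):
    """Run the query against a live server (requires network access)."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=timeout) as reply:
        return json.load(reply)


# Hand-written response used here instead of a live call.
sample = {
    "status": "success",
    "data": {"resultType": "vector", "result": [
        {"metric": {"__name__": "up", "job": "node", "instance": "192.168.0.106:9100"},
         "value": [1570000000.0, "0"]},
    ]},
}
print(query_url("http://localhost:9090", 'up{job="node"}'))
print(down_instances(sample))
```

Against a running server, down_instances(fetch("http://localhost:9090", 'up{job="node"}')) would return the same kind of list.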
Databases
MongoDB monitoring configuration [TBD]
https://github.com/percona/mongodb_exporter/releases
https://devconnected.com/mongodb-monitoring-with-grafana-prometheus/#b_Installing_the_MongoDB_exporter
1. Download
wget https://github.com/percona/mongodb_exporter/releases/download/v0.10.0/mongodb_exporter-0.10.0.linux-amd64.tar.gz
2. Monitoring configuration
3. Scrape configuration
- job_name: 'mongoDb'
  static_configs:
    - targets: ['192.168.0.106:9001']
Redis
1. Download and configure
https://github.com/oliver006/redis_exporter
2. Monitoring configuration (Grafana dashboard)
https://grafana.com/grafana/dashboards/763
3. Scrape configuration
- job_name: 'Redis'
  static_configs:
    - targets: ['192.168.0.106:9121']
MySQL
1. Download the exporter
https://github.com/prometheus/mysqld_exporter
1.1 Create a user for the exporter in the MySQL database and grant it privileges:
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'exporter' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
1.2 Exporter configuration (create a my.conf file)
[client]
host=192.168.0.102 # MySQL server IP address
port=3306
user=exporter # the user created in step 1.1
password=exporter
1.3 Start the service
./mysqld_exporter --config.my-cnf=/path/to/my.conf
2. Dashboard configuration
https://grafana.com/grafana/dashboards/7362
https://grafana.com/grafana/dashboards/6239
3. Scrape configuration
- job_name: 'Mysql'
  static_configs:
    - targets: ['192.168.0.106:9104']
PostgreSQL
# 1. Download the PostgreSQL exporter
wget https://github.com/prometheus-community/postgres_exporter/releases/download/v0.9.0/postgres_exporter-0.9.0.linux-amd64.tar.gz
# 2. PostgreSQL database setup
CREATE USER postgres_exporter PASSWORD '123456';
ALTER USER postgres_exporter SET SEARCH_PATH TO postgres_exporter,pg_catalog;
CREATE SCHEMA postgres_exporter AUTHORIZATION postgres_exporter;
CREATE FUNCTION postgres_exporter.f_select_pg_stat_activity()
RETURNS setof pg_catalog.pg_stat_activity
LANGUAGE sql
SECURITY DEFINER
AS $$
SELECT * from pg_catalog.pg_stat_activity;
$$;
CREATE FUNCTION postgres_exporter.f_select_pg_stat_replication()
RETURNS setof pg_catalog.pg_stat_replication
LANGUAGE sql
SECURITY DEFINER
AS $$
SELECT * from pg_catalog.pg_stat_replication;
$$;
CREATE VIEW postgres_exporter.pg_stat_replication
AS
SELECT * FROM postgres_exporter.f_select_pg_stat_replication();
CREATE VIEW postgres_exporter.pg_stat_activity
AS
SELECT * FROM postgres_exporter.f_select_pg_stat_activity();
GRANT SELECT ON postgres_exporter.pg_stat_replication TO postgres_exporter;
GRANT SELECT ON postgres_exporter.pg_stat_activity TO postgres_exporter;
# 3. Start the exporter
vim pg_export.sh # create the startup script
#!/bin/sh
export DATA_SOURCE_NAME="user=postgres_exporter host=127.0.0.1 password=123456 port=5432 dbname=test_monitor sslmode=disable"
./postgres_exporter
# 4. Prometheus job configuration
- job_name: 'postgres'
  static_configs:
    - targets: ['192.168.3.32:9187']
http://192.168.3.32:9187/metrics # view the collected metrics
# 5. Custom queries for postgres_exporter
Notes:
1. Modify the postgres_exporter source code to add the query statements
2. Compile and package the source, upload it, and update the queries.yaml file
Message middleware
- RabbitMq
Prerequisite: RabbitMq is installed (in a test environment, Docker simplifies this)
https://www.cnblogs.com/yufeng218/p/9452621.html
1. Download and configure
https://github.com/kbudde/rabbitmq_exporter
1.1 Via a configuration file
1.2 Via environment variables (see the documentation for more parameters)
Example: RABBIT_USER=admin RABBIT_PASSWORD=admin RABBIT_URL=http://192.168.0.105:15672 SKIP_QUEUES="RPC_.*" MAX_QUEUES=5000 ./rabbitmq_exporter
2. Dashboard template
https://grafana.com/grafana/dashboards/4279
3. Scrape configuration
- job_name: 'RabbitMq'
  static_configs:
    - targets: ['192.168.0.106:9419']
Other middleware
- docker
Application servers
- Nginx
1. Install nginx-module-vts to expose Nginx statistics (Nginx must be built from source) [prerequisite for monitoring]
https://github.com/vozlt/nginx-module-vts
Configure in nginx.conf:
http {
    vhost_traffic_status_zone;
    server {
        location /status {
            vhost_traffic_status_display;
            vhost_traffic_status_display_format html;
        }
    }
}
Check whether /status/format/json returns data; if it does, the configuration works.
2. Download nginx-vts-exporter and configure monitoring
Reference: https://hnlq715.github.io/nginx-vts-exporter/
./nginx-vts-exporter --help   # show all options
./nginx-vts-exporter -nginx.scrape_uri=http://localhost/status/format/json
nginx.scrape_uri: the URL the exporter fetches metrics from (must match the path configured in nginx.conf)
3. Scrape configuration
- job_name: 'nginx'
  static_configs:
    - targets: ['192.168.0.106:9913']