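Spring Boot Actuator Monitoring in Detail
1. Overview
Spring Boot Actuator is Spring Boot's monitoring and management module. It provides production-ready features for inspecting a running application, including health checks, metrics, environment and configuration details, logger levels, and thread dumps, which makes it a key tool for operations and troubleshooting.
Core Actuator features:
- Health checks: Health endpoint
- Metrics: Metrics endpoint
- Application information: Info endpoint
- Environment properties: Environment endpoint
- Log level management: Loggers endpoint
- Thread information: Thread Dump endpoint
Knowing how to configure and use Actuator is an essential skill for building production-grade Spring Boot applications.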
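2. Key Concepts in Detail
2.1 Actuator Endpoints
Built-in endpoints
| Endpoint | Description | Exposed over HTTP by default |
|---|---|---|
| /actuator | Discovery page listing the exposed endpoints | ✓ |
| /actuator/health | Application health status | ✓ |
| /actuator/info | Application information | ✓ in Spring Boot 2.4 and earlier, ✗ from 2.5 |
| /actuator/beans | List of beans | ✗ |
| /actuator/conditions | Auto-configuration condition report | ✗ |
| /actuator/configprops | Configuration properties | ✗ |
| /actuator/env | Environment properties | ✗ |
| /actuator/loggers | Logger configuration | ✗ |
| /actuator/metrics | Metrics | ✗ |
| /actuator/mappings | Request mappings | ✗ |
| /actuator/threaddump | Thread dump | ✗ |
| /actuator/heapdump | Heap dump | ✗ |
| /actuator/shutdown | Shuts the application down (also disabled by default) | ✗ |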
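2.2 Exposing Endpoints
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics   # expose only the listed endpoints
        # include: "*"                 # expose all endpoints
  endpoint:
    health:
      show-details: always             # show health details
    shutdown:
      enabled: true                    # enable the shutdown endpoint (it must also be listed in include to be reachable over HTTP)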
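The way the exposure lists combine can be modeled in a few lines. The sketch below is a dependency-free approximation, not Spring Boot's actual implementation. It assumes the documented behavior that `management.endpoints.web.exposure.exclude` (a property the configuration above does not use) takes precedence over `include`, and that `*` matches every endpoint. The class and method names are hypothetical.

```java
import java.util.Set;

/** Simplified model of include/exclude resolution; names are illustrative only. */
final class ExposureSketch {
    private ExposureSketch() {}

    static boolean isExposed(Set<String> include, Set<String> exclude, String endpointId) {
        // An exclusion always wins over an inclusion.
        if (exclude.contains("*") || exclude.contains(endpointId)) {
            return false;
        }
        // "*" in the include list exposes every endpoint that is not excluded.
        return include.contains("*") || include.contains(endpointId);
    }

    public static void main(String[] args) {
        Set<String> include = Set.of("health", "info", "metrics");
        System.out.println("metrics exposed: " + isExposed(include, Set.of(), "metrics"));
        System.out.println("env exposed: " + isExposed(include, Set.of(), "env"));
    }
}
```

Note that exposure is only one of two gates: an endpoint such as shutdown must be both enabled and exposed before it can be called over HTTP.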
2.3 Health Checks
Built-in health indicators
- DataSourceHealthIndicator
- RedisHealthIndicator
- MongoHealthIndicator
- DiskSpaceHealthIndicator
- ElasticsearchHealthIndicator
Custom health indicator
@Component
public class MyHealthIndicator implements HealthIndicator {
    @Override
    public Health health() {
        // Custom health check logic goes here
        return Health.up().withDetail("custom", "OK").build();
    }
}
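2.4 Metrics
Spring Boot 2.x and later use Micrometer as the metrics facade. Its main meter types are:
- Counter: a monotonically increasing count
- Gauge: an instantaneous value, sampled when it is read
- Timer: the count and total duration of short-lived events
- DistributionSummary: the distribution of recorded values, such as payload sizes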
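To make the difference between these meter types concrete, here is a dependency-free sketch of what a counter, a gauge, and a timer each keep track of. These classes are hypothetical stand-ins written for illustration and are not Micrometer's API; a distribution summary behaves like the timer below but records arbitrary values rather than durations.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.DoubleSupplier;

/** Illustrative stand-ins for the meter types; not Micrometer classes. */
final class MeterSemanticsSketch {
    private MeterSemanticsSketch() {}

    /** Counter: only ever goes up. */
    static final class SketchCounter {
        private final LongAdder total = new LongAdder();
        void increment() { total.increment(); }
        long count() { return total.sum(); }
    }

    /** Gauge: stores nothing itself and samples its source on every read. */
    static final class SketchGauge {
        private final DoubleSupplier source;
        SketchGauge(DoubleSupplier source) { this.source = source; }
        double value() { return source.getAsDouble(); }
    }

    /** Timer: tracks how many events occurred, their total duration, and the maximum. */
    static final class SketchTimer {
        private long count;
        private long totalNanos;
        private long maxNanos;

        synchronized void record(long nanos) {
            count++;
            totalNanos += nanos;
            maxNanos = Math.max(maxNanos, nanos);
        }
        synchronized long count() { return count; }
        synchronized double meanMillis() {
            return count == 0 ? 0.0 : (totalNanos / (double) count) / 1_000_000.0;
        }
        synchronized double maxMillis() { return maxNanos / 1_000_000.0; }
    }

    public static void main(String[] args) {
        AtomicInteger connections = new AtomicInteger(2);
        SketchGauge gauge = new SketchGauge(connections::get);
        connections.incrementAndGet();
        System.out.println("gauge now reads " + gauge.value());
    }
}
```

The practical consequence is that a counter suits events you want to rate over time, while a gauge suits values that can go down as well as up, such as pool sizes.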
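2.5 Security
Actuator endpoints can reveal sensitive information, so access to them must be controlled:
- Protect the endpoints with Spring Security
- Serve them on a separate management port
- Enable HTTPS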
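From the client side, a protected endpoint on a separate management port might be called as sketched below, using only the JDK's HTTP client (Java 11+). The base URL, user name, and password are placeholders, and the sketch assumes the server accepts HTTP Basic authentication; the class and method names are hypothetical. The request is built but not sent, so the snippet runs without a server.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Builds an authenticated request for an Actuator endpoint; all values are placeholders. */
final class ActuatorClientSketch {
    private ActuatorClientSketch() {}

    // HTTP Basic credentials are "user:password", Base64-encoded.
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    static HttpRequest metricsRequest(String baseUrl, String user, String password) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/metrics"))
                .header("Authorization", basicAuthHeader(user, password))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = metricsRequest("https://localhost:8081", "monitor", "change-me");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Basic credentials are only encoded, not encrypted, which is why the HTTPS item in the list above matters for any endpoint protected this way.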
3. Code Examples
3.1 Basic Configuration
# application.yml
management:
  # Endpoint exposure
  endpoints:
    web:
      exposure:
        include: health,info,metrics,env,loggers,mappings,beans
      base-path: /actuator          # default base path
  # Per-endpoint settings
  endpoint:
    health:
      show-details: always          # always / never / when-authorized
      show-components: always
      probes:
        enabled: true               # enable the Kubernetes liveness and readiness probes
    env:
      show-values: always           # show property values unmasked (Spring Boot 3.0+); avoid in production
    loggers:
      enabled: true
    shutdown:
      enabled: false                # disabled by default
  # Health indicator settings
  health:
    db:
      enabled: true
    redis:
      enabled: true
    diskspace:
      enabled: true
      threshold: 10MB               # minimum free disk space
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
  # Metrics settings
  metrics:
    tags:
      application: ${spring.application.name}
    distribution:
      percentiles-histogram:
        http.server.requests: true
      percentiles:
        http.server.requests: 0.5,0.95,0.99
  # Prometheus export (Spring Boot 3.x key; in 2.x use management.metrics.export.prometheus.enabled)
  prometheus:
    metrics:
      export:
        enabled: true
  # Management server settings
  server:
    port: 8081                      # management port, separate from the application port
    # address: 127.0.0.1            # accept local connections only
  # Info contributor settings
  info:
    env:
      enabled: true
    java:
      enabled: true
    os:
      enabled: true
    build:
      enabled: true
    git:
      mode: full
# Application information (published by the env info contributor enabled above)
info:
  app:
    name: My Application
    version: 1.0.0
    description: A sample application
    author: Developer Team
3.2 Health Check Examples
import org.springframework.boot.actuate.health.*;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;
import java.util.List;
// Custom health indicator
// (DatabaseService is an application-specific class that is not shown here)
@Component
public class DatabaseHealthIndicator implements HealthIndicator {
    private final DatabaseService databaseService;
    public DatabaseHealthIndicator(DatabaseService databaseService) {
        this.databaseService = databaseService;
    }
    @Override
    public Health health() {
        try {
            // Check the database connection
            if (databaseService.isConnectionValid()) {
                return Health.up()
                    .withDetail("database", "MySQL")
                    .withDetail("connection", "active")
                    .withDetail("responseTime", databaseService.getLastResponseTime() + "ms")
                    .build();
            } else {
                return Health.down()
                    .withDetail("database", "MySQL")
                    .withDetail("error", "Connection failed")
                    .build();
            }
        } catch (Exception e) {
            // Health.down(e) records the exception under the "error" detail
            return Health.down(e)
                .withDetail("database", "MySQL")
                .build();
        }
    }
}
// Health check for an external service
// (assumes a RestTemplate bean is defined; Spring Boot auto-configures only RestTemplateBuilder)
@Component
public class ExternalServiceHealthIndicator implements HealthIndicator {
    private final RestTemplate restTemplate;
    public ExternalServiceHealthIndicator(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }
    @Override
    public Health health() {
        try {
            // Call the external service's health URL
            ResponseEntity<String> response = restTemplate.getForEntity(
                "https://api.example.com/health",
                String.class
            );
            if (response.getStatusCode().is2xxSuccessful()) {
                return Health.up()
                    .withDetail("service", "External API")
                    .withDetail("status", "reachable")
                    .build();
            } else {
                return Health.down()
                    .withDetail("service", "External API")
                    .withDetail("status", "error")
                    .build();
            }
        } catch (Exception e) {
            // Connection failures and 4xx/5xx responses end up here
            return Health.down()
                .withDetail("service", "External API")
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}
// Composite health indicator
// Note: Actuator already aggregates every HealthIndicator bean; this class only illustrates the idea
@Component
public class CompositeHealthIndicator implements HealthIndicator {
    private final List<HealthIndicator> indicators;
    public CompositeHealthIndicator(List<HealthIndicator> indicators) {
        this.indicators = indicators;
    }
    @Override
    public Health health() {
        Health.Builder builder = Health.up();
        for (HealthIndicator indicator : indicators) {
            Health health = indicator.health();
            builder.withDetail(
                indicator.getClass().getSimpleName(),
                health.getStatus()
            );
            // Status is a value class, not an enum, so compare with equals()
            if (!Status.UP.equals(health.getStatus())) {
                builder.status(health.getStatus());
            }
        }
        return builder.build();
    }
}
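// Health status aggregation
// A StatusAggregator bean replaces Actuator's default aggregator
@Component
public class CustomHealthAggregator implements StatusAggregator {
    @Override
    public Status getAggregateStatus(java.util.Set<Status> statuses) {
        // Custom aggregation logic: the most severe status wins
        if (statuses.contains(Status.DOWN)) {
            return Status.DOWN;
        }
        if (statuses.contains(Status.OUT_OF_SERVICE)) {
            return Status.OUT_OF_SERVICE;
        }
        return Status.UP;
    }
}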
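3.3 Custom Endpoint Examples
import org.springframework.boot.actuate.endpoint.annotation.*;
import org.springframework.stereotype.Component;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
// Custom endpoint
@Component
@Endpoint(id = "custom")
public class CustomEndpoint {
    // ConcurrentHashMap, because operations can be invoked from several threads at once
    private final Map<String, Object> info = new ConcurrentHashMap<>();
    // Kept in its own field so a write operation cannot overwrite it
    private final long startTime = System.currentTimeMillis();
    public CustomEndpoint() {
        info.put("version", "1.0.0");
        info.put("startTime", startTime);
    }
    // Read operation: GET /actuator/custom
    @ReadOperation
    public Map<String, Object> info() {
        Map<String, Object> result = new HashMap<>(info);
        result.put("uptime", System.currentTimeMillis() - startTime);
        return result;
    }
    // Read operation with a selector: GET /actuator/custom/{key}
    @ReadOperation
    public Object infoByKey(@Selector String key) {
        return info.get(key);
    }
    // Write operation: POST /actuator/custom/{key} with a JSON body containing "value"
    @WriteOperation
    public void updateInfo(@Selector String key, String value) {
        info.put(key, value);
    }
    // Delete operation: DELETE /actuator/custom/{key}
    @DeleteOperation
    public void deleteInfo(@Selector String key) {
        info.remove(key);
    }
}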
// Web-only endpoint (exposed over HTTP but not over JMX)
// Requires: import org.springframework.boot.actuate.endpoint.web.annotation.WebEndpoint;
@Component
@WebEndpoint(id = "customweb")
public class CustomWebEndpoint {
    @ReadOperation
    public CustomData getData() {
        return new CustomData("test", 100, System.currentTimeMillis());
    }
    @ReadOperation
    public CustomData getDataById(@Selector String id) {
        return new CustomData(id, 200, System.currentTimeMillis());
    }
}
public class CustomData {
    private final String id;
    private final int value;
    private final long timestamp;
    public CustomData(String id, int value, long timestamp) {
        this.id = id;
        this.value = value;
        this.timestamp = timestamp;
    }
    // Getters (used for JSON serialization)
    public String getId() { return id; }
    public int getValue() { return value; }
    public long getTimestamp() { return timestamp; }
}
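// JMX-only endpoint
// Requires: import org.springframework.boot.actuate.endpoint.jmx.annotation.JmxEndpoint;
@Component
@JmxEndpoint(id = "customjmx")
public class CustomJmxEndpoint {
    // volatile, so a value written by one JMX call is visible to later reads on other threads
    private volatile String status = "running";
    @ReadOperation
    public String getStatus() {
        return status;
    }
    @WriteOperation
    public void setStatus(String status) {
        this.status = status;
    }
}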
3.4 Metrics Examples
import io.micrometer.core.instrument.*;
import org.springframework.stereotype.Service;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
@Service
public class MetricsService {
    private final MeterRegistry meterRegistry;
    private final AtomicInteger activeCount = new AtomicInteger(0);
    public MetricsService(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        // Gauge: samples activeCount whenever the registry is read
        Gauge.builder("app.connections.active", activeCount, AtomicInteger::get)
            .description("Number of active connections")
            .register(meterRegistry);
    }
    // Count a request. register() returns the existing meter when the same name
    // and tags were registered before. Keep the tag keys identical for every meter
    // with the same name; some registries, such as Prometheus, require it.
    public void recordRequest(String endpoint) {
        Counter.builder("app.requests.total")
            .description("Total number of requests")
            .tag("endpoint", endpoint)
            .register(meterRegistry)
            .increment();
    }
    // Record a response time. The exported unit is chosen by the registry
    // (seconds for Prometheus), whatever unit is passed to record().
    public void recordResponseTime(String endpoint, long durationMs) {
        Timer.builder("app.response.time")
            .description("Response time")
            .tag("endpoint", endpoint)
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry)
            .record(durationMs, TimeUnit.MILLISECONDS);
    }
    // Time a block of code with a Timer
    public <T> T timedCall(String operation, Callable<T> callable) throws Exception {
        return Timer.builder("app.operation.time")
            .tag("operation", operation)
            .register(meterRegistry)
            .recordCallable(callable);
    }
    // Increase and decrease the active connection count
    public void incrementConnection() {
        activeCount.incrementAndGet();
    }
    public void decrementConnection() {
        activeCount.decrementAndGet();
    }
    // Record a value in a DistributionSummary
    public void recordSummary(String name, double value) {
        DistributionSummary.builder(name)
            .description("Distribution summary")
            .baseUnit("bytes")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry)
            .record(value);
    }
    // Time a long-running task with a LongTaskTimer
    public LongTaskTimer.Sample startLongTask(String taskName) {
        return LongTaskTimer.builder("app.long.task")
            .tag("task", taskName)
            .register(meterRegistry)
            .start();
    }
    public void stopLongTask(LongTaskTimer.Sample sample) {
        sample.stop();
    }
}
// Controller instrumentation example
// (Spring MVC already records http.server.requests automatically; this shows manual instrumentation)
// Requires: import org.springframework.web.bind.annotation.GetMapping;
//           import org.springframework.web.bind.annotation.RestController;
@RestController
public class MonitoredController {
    private final MetricsService metricsService;
    public MonitoredController(MetricsService metricsService) {
        this.metricsService = metricsService;
    }
    @GetMapping("/api/data")
    public String getData() {
        long startTime = System.currentTimeMillis();
        metricsService.recordRequest("/api/data");
        metricsService.incrementConnection();
        try {
            // Business logic
            return processData();
        } finally {
            metricsService.recordResponseTime("/api/data",
                System.currentTimeMillis() - startTime);
            metricsService.decrementConnection();
        }
    }
    private String processData() {
        return "data";
    }
}
3.5 Info Endpoint Examples
import org.springframework.boot.actuate.info.*;
import org.springframework.stereotype.Component;
import java.util.*;
// Custom info contributor
@Component
public class CustomInfoContributor implements InfoContributor {
    @Override
    public void contribute(Info.Builder builder) {
        builder.withDetail("app", createAppInfo())
            .withDetail("build", createBuildInfo())
            .withDetail("runtime", createRuntimeInfo());
    }
    private Map<String, Object> createAppInfo() {
        Map<String, Object> info = new HashMap<>();
        info.put("name", "My Application");
        info.put("version", "1.0.0");
        info.put("description", "A sample Spring Boot application");
        return info;
    }
    // Placeholder values: real build metadata normally comes from
    // META-INF/build-info.properties through the built-in build contributor,
    // which also publishes under the "build" key
    private Map<String, Object> createBuildInfo() {
        Map<String, Object> info = new HashMap<>();
        info.put("time", new Date());
        info.put("artifact", "my-app");
        info.put("group", "com.example");
        return info;
    }
    private Map<String, Object> createRuntimeInfo() {
        Runtime runtime = Runtime.getRuntime();
        Map<String, Object> info = new HashMap<>();
        info.put("javaVersion", System.getProperty("java.version"));
        info.put("osName", System.getProperty("os.name"));
        info.put("availableProcessors", runtime.availableProcessors());
        info.put("maxMemory", runtime.maxMemory());
        info.put("totalMemory", runtime.totalMemory());
        info.put("freeMemory", runtime.freeMemory());
        return info;
    }
}
// Git info contributor
// (hard-coded placeholder values; the built-in Git contributor reads git.properties instead)
@Component
public class GitInfoContributor implements InfoContributor {
    @Override
    public void contribute(Info.Builder builder) {
        Map<String, Object> git = new HashMap<>();
        git.put("branch", "main");
        git.put("commit", Map.of(
            "id", "abc123",
            "time", new Date(),
            "message", "Initial commit"
        ));
        builder.withDetail("git", git);
    }
}
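// Info contributor that publishes the health status
// Requires: import org.springframework.boot.actuate.health.*;
@Component
public class HealthInfoContributor implements InfoContributor {
    private final HealthEndpoint healthEndpoint;
    public HealthInfoContributor(HealthEndpoint healthEndpoint) {
        this.healthEndpoint = healthEndpoint;
    }
    @Override
    public void contribute(Info.Builder builder) {
        HealthComponent health = healthEndpoint.health();
        Map<String, Object> details = new LinkedHashMap<>();
        details.put("status", health.getStatus().getCode());
        // HealthComponent exposes only the status; component names are available on CompositeHealth
        if (health instanceof CompositeHealth) {
            Map<String, HealthComponent> components = ((CompositeHealth) health).getComponents();
            if (components != null) {
                details.put("components", components.keySet());
            }
        }
        builder.withDetail("health", details);
    }
}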
3.6 Security Configuration Examples
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;
import org.springframework.boot.actuate.autoconfigure.security.servlet.EndpointRequest;
@Configuration
public class ActuatorSecurityConfig {
    @Bean
    public SecurityFilterChain actuatorSecurityFilterChain(HttpSecurity http) throws Exception {
        http
            // This chain applies to Actuator endpoints only
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> auth
                .requestMatchers(EndpointRequest.to("health", "info")).permitAll()
                .requestMatchers(EndpointRequest.toAnyEndpoint()).hasRole("ACTUATOR")
            )
            .httpBasic(basic -> {});
        return http.build();
    }
}
// Finer-grained security configuration
// (an alternative to the class above; register only one of the two, because both chains match the same requests)
@Configuration
public class ActuatorSecurityConfig2 {
    @Bean
    public SecurityFilterChain actuatorSecurityFilterChain2(HttpSecurity http) throws Exception {
        http
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> {
                // Public endpoints
                auth.requestMatchers(EndpointRequest.to("health", "info", "prometheus"))
                    .permitAll();
                // Read-only endpoints
                auth.requestMatchers(EndpointRequest.to(
                    "metrics", "env", "loggers", "mappings", "beans"
                )).hasRole("MONITOR");
                // Sensitive endpoints
                auth.requestMatchers(EndpointRequest.to(
                    "shutdown", "heapdump", "threaddump"
                )).hasRole("ADMIN");
                // Every other endpoint
                auth.anyRequest().authenticated();
            })
            .httpBasic(basic -> {});
        return http.build();
    }
}
3.7 Prometheus Integration Example
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  metrics:
    tags:
      application: ${spring.application.name}
      environment: ${spring.profiles.active:default}   # falls back to "default" when no profile is active
    distribution:
      percentiles-histogram:
        http.server.requests: true
      percentiles:
        http.server.requests: 0.5,0.95,0.99
      slo:
        http.server.requests: 100ms,200ms,500ms,1s
  # Spring Boot 3.x key; in 2.x use management.metrics.export.prometheus.enabled
  prometheus:
    metrics:
      export:
        enabled: true
import io.micrometer.core.instrument.*;
// With Micrometer 1.13+ (Spring Boot 3.3+) this class lives in io.micrometer.prometheusmetrics
import io.micrometer.prometheus.PrometheusMeterRegistry;
import org.springframework.stereotype.Service;
@Service
public class PrometheusMetricsService {
    private final PrometheusMeterRegistry meterRegistry;
    // Inject the Prometheus registry directly: the primary MeterRegistry bean may be a
    // CompositeMeterRegistry, in which case a cast would fail
    public PrometheusMetricsService(PrometheusMeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }
    // Return all metrics in the Prometheus text format
    public String scrape() {
        return meterRegistry.scrape();
    }
    // Register custom metrics
    public void customPrometheusMetrics() {
        // Counter
        Counter counter = Counter.builder("app_requests_total")
            .description("Total requests")
            .tag("method", "GET")
            .tag("status", "200")
            .register(meterRegistry);
        counter.increment();
        // Gauge: used heap memory is total memory minus free memory
        Gauge.builder("app_memory_used_bytes", Runtime.getRuntime(),
                runtime -> runtime.totalMemory() - runtime.freeMemory())
            .description("Used memory in bytes")
            .register(meterRegistry);
        // Distribution summary with client-side percentiles
        // (use publishPercentileHistogram() instead to export histogram buckets)
        DistributionSummary summary = DistributionSummary.builder("app_response_size_bytes")
            .description("Response size distribution")
            .baseUnit("bytes")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry);
        summary.record(1024);
    }
}
4. Practical Scenarios
4.1 Kubernetes Health Checks
# application.yml
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
# kubernetes-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  # selector and pod labels are required by the Deployment API; the label value is illustrative
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080          # use the management port here if management.server.port is set
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
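The probes above work because the health endpoint reports its status through the HTTP response code. The sketch below models that contract without any dependencies. It assumes Actuator's default mapping (DOWN and OUT_OF_SERVICE return 503, every other status returns 200) and the Kubernetes rule that an httpGet probe succeeds on any code from 200 to 399. The class and method names are hypothetical.

```java
/** Models how a health status becomes a probe result; names are illustrative only. */
final class ProbeStatusSketch {
    private ProbeStatusSketch() {}

    // Default Actuator mapping: only DOWN and OUT_OF_SERVICE produce an error response.
    static int httpStatusFor(String healthStatus) {
        switch (healthStatus) {
            case "DOWN":
            case "OUT_OF_SERVICE":
                return 503;
            default:
                return 200;
        }
    }

    // Kubernetes treats 2xx and 3xx responses as a successful httpGet probe.
    static boolean probeSucceeds(int httpStatus) {
        return httpStatus >= 200 && httpStatus < 400;
    }

    public static void main(String[] args) {
        for (String status : new String[] {"UP", "UNKNOWN", "OUT_OF_SERVICE", "DOWN"}) {
            System.out.println(status + " -> probe ok: " + probeSucceeds(httpStatusFor(status)));
        }
    }
}
```

A failed readiness probe only removes the pod from service endpoints, while a failed liveness probe restarts the container, so the two paths should not report the same checks.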
4.2 Custom Health State
import org.springframework.boot.actuate.health.*;
import org.springframework.stereotype.Component;
@Component
public class ApplicationHealthIndicator implements HealthIndicator {
    // volatile, because the flags are written by lifecycle callbacks and read by request threads
    private volatile boolean ready = false;
    private volatile boolean live = true;
    @Override
    public Health health() {
        if (!live) {
            return Health.down()
                .withDetail("reason", "Application is not live")
                .build();
        }
        if (!ready) {
            // Use the predefined OUT_OF_SERVICE status rather than constructing a new Status
            return Health.outOfService()
                .withDetail("reason", "Application is initializing")
                .build();
        }
        return Health.up()
            .withDetail("ready", true)
            .withDetail("live", true)
            .build();
    }
    public void setReady(boolean ready) {
        this.ready = ready;
    }
    public void setLive(boolean live) {
        this.live = live;
    }
}
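// Application lifecycle management
// Requires: import org.springframework.context.event.EventListener;
//           import org.springframework.context.event.ContextRefreshedEvent;
//           import org.springframework.context.event.ContextClosedEvent;
@Component
public class ApplicationLifecycleManager {
    private final ApplicationHealthIndicator healthIndicator;
    public ApplicationLifecycleManager(ApplicationHealthIndicator healthIndicator) {
        this.healthIndicator = healthIndicator;
    }
    // Mark the application ready once the context has been refreshed
    @EventListener(ContextRefreshedEvent.class)
    public void onApplicationReady() {
        healthIndicator.setReady(true);
    }
    // Stop reporting ready and live as soon as shutdown begins
    @EventListener(ContextClosedEvent.class)
    public void onApplicationShutdown() {
        healthIndicator.setReady(false);
        healthIndicator.setLive(false);
    }
}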
4.3 Monitoring and Alerting
import org.springframework.beans.factory.DisposableBean;
import org.springframework.boot.actuate.health.*;
import org.springframework.stereotype.Component;
import java.util.concurrent.*;
@Component
public class HealthMonitor implements DisposableBean {
    private final HealthEndpoint healthEndpoint;
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    private volatile Status lastStatus = Status.UNKNOWN;
    public HealthMonitor(HealthEndpoint healthEndpoint) {
        this.healthEndpoint = healthEndpoint;
        startMonitoring();
    }
    private void startMonitoring() {
        // Check the health status every 30 seconds
        scheduler.scheduleAtFixedRate(this::checkHealth, 0, 30, TimeUnit.SECONDS);
    }
    // Stop the scheduler thread when the bean is destroyed
    @Override
    public void destroy() {
        scheduler.shutdownNow();
    }
    private void checkHealth() {
        try {
            HealthComponent health = healthEndpoint.health();
            Status currentStatus = health.getStatus();
            // Status is a value class, so compare with equals()
            if (!currentStatus.equals(lastStatus)) {
                onStatusChange(lastStatus, currentStatus);
                lastStatus = currentStatus;
            }
            // Check the status of each component
            if (health instanceof CompositeHealth) {
                CompositeHealth composite = (CompositeHealth) health;
                composite.getComponents().forEach(this::checkComponent);
            }
        } catch (Exception e) {
            // An uncaught exception would cancel all further scheduled runs
            System.err.println("Health check failed: " + e.getMessage());
        }
    }
    private void onStatusChange(Status oldStatus, Status newStatus) {
        String message = String.format("Health status changed from %s to %s",
            oldStatus, newStatus);
        // Send an alert
        sendAlert(message);
        System.out.println(message);
    }
    // Note: this alerts on every run while a component stays unhealthy
    private void checkComponent(String name, HealthComponent component) {
        if (!Status.UP.equals(component.getStatus())) {
            String message = String.format("Component %s is %s", name, component.getStatus());
            sendAlert(message);
        }
    }
    private void sendAlert(String message) {
        // Placeholder: forward the alert to a real notification channel here
        System.out.println("[ALERT] " + message);
    }
}
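5. Summary and Best Practices
Endpoint security
- Minimize exposure: expose only the endpoints you need
- Separate port: keep the management port apart from the application port
- Access control: protect endpoints with Spring Security
- HTTPS: enable HTTPS in production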
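Using metrics
- Consistent naming: follow a naming convention, such as Micrometer's lowercase dot-separated names
- Meaningful tags: add tags that help you slice the data
- Avoid high cardinality: keep the set of distinct tag values small and bounded
- Regular cleanup: remove meters that are no longer used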
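High cardinality usually creeps in when raw request paths are used as tag values, because every order or user ID creates a new time series. One common remedy, sketched below with the JDK only, is to collapse variable path segments into placeholders before tagging. The class name, the `{id}` and `{uuid}` placeholders, and the choice of which segments count as variable are assumptions made for this illustration.

```java
import java.util.regex.Pattern;

/** Collapses variable path segments so a tag has a bounded set of values; illustrative only. */
final class TagNormalizerSketch {
    // A path segment made up entirely of digits, such as /12345
    private static final Pattern NUMERIC_SEGMENT = Pattern.compile("/\\d+(?=/|$)");
    // A path segment that is a UUID in the usual 8-4-4-4-12 hex form
    private static final Pattern UUID_SEGMENT = Pattern.compile(
            "/[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}(?=/|$)");

    private TagNormalizerSketch() {}

    static String normalizePath(String path) {
        // Query strings are unbounded as well, so drop them first.
        int queryStart = path.indexOf('?');
        String withoutQuery = queryStart >= 0 ? path.substring(0, queryStart) : path;
        String withoutUuids = UUID_SEGMENT.matcher(withoutQuery).replaceAll("/{uuid}");
        return NUMERIC_SEGMENT.matcher(withoutUuids).replaceAll("/{id}");
    }

    public static void main(String[] args) {
        System.out.println(normalizePath("/api/orders/12345/items/7?expand=true"));
    }
}
```

The normalized value can then be passed as the tag, for example to a method like `recordRequest` in section 3.4, instead of the raw path.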
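Health checks
- Fast responses: health checks should return quickly
- Dependency checks: cover the dependencies the application cannot run without
- Distinct states: separate liveness from readiness
- Error handling: handle exceptions raised during a health check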
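The first and last points can be combined by giving every check a time budget and treating a timeout or an exception as a failure. The sketch below shows one way to do that with the JDK's `CompletableFuture`. The class name, the string statuses, and the use of the common thread pool are assumptions for this illustration; inside a real `HealthIndicator` the result would be turned into a `Health` object.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Runs a dependency check with a time budget; illustrative only. */
final class BoundedHealthCheckSketch {
    private BoundedHealthCheckSketch() {}

    static String check(Supplier<Boolean> probe, Duration timeout) {
        try {
            // Run the probe on another thread and wait no longer than the timeout.
            Boolean healthy = CompletableFuture.supplyAsync(probe)
                    .get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return Boolean.TRUE.equals(healthy) ? "UP" : "DOWN";
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for the caller.
            Thread.currentThread().interrupt();
            return "DOWN";
        } catch (Exception e) {
            // A timeout or a failure inside the probe both count as unhealthy.
            return "DOWN";
        }
    }

    public static void main(String[] args) {
        System.out.println(check(() -> true, Duration.ofMillis(500)));
    }
}
```

A timed-out probe keeps running in the background until it finishes, so the dependency call itself should also carry its own connect and read timeouts.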
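Spring Boot Actuator is an important tool for monitoring applications in production. Configured and used with care, it helps developers detect and resolve problems early and keeps applications running reliably.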