21-Spring Boot Actuator Monitoring in Detail


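1. Overview

Spring Boot Actuator is Spring Boot's monitoring and management module. It provides production-ready features for inspecting a running application, including its status, health checks, metrics, and thread information, which makes it a key tool for operations and troubleshooting.

Core Actuator features:

  • Health checks: Health endpoint
  • Metrics: Metrics endpoint
  • Application information: Info endpoint
  • Environment information: Environment endpoint
  • Log management: Loggers endpoint
  • Thread information: Thread Dump

Knowing how to use and configure Actuator is an essential skill for building production-grade Spring Boot applications.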

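2. Key Concepts in Detail

2.1 Actuator Endpoints

Built-in endpoints (the paths below /actuator are relative to that base path)
Endpoint       Description                       Exposed over HTTP by default
/actuator      Index of exposed endpoints        Yes
/health        Application health status         Yes
/info          Application information           No (yes before Spring Boot 2.5)
/beans         List of beans                     No
/conditions    Auto-configuration conditions     No
/configprops   Configuration properties          No
/env           Environment properties            No
/loggers       Logger configuration              No
/metrics       Metrics                           No
/mappings      URL mappings                      No
/threaddump    Thread dump                       No
/heapdump      Heap dump                         No
/shutdown      Shuts down the application        No (also disabled by default)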

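2.2 Configuring Endpoint Exposure

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics  # expose only the listed endpoints
        # include: "*"  # expose every endpoint

  endpoint:
    health:
      show-details: always  # show full health details
    shutdown:
      enabled: true  # enable the shutdown endpoint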
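
The include list above has a counterpart, management.endpoints.web.exposure.exclude, and an excluded endpoint stays hidden even when it is also included. As a rough illustration of that rule, the following stdlib-only sketch mimics the resolution. ExposureSketch and isExposed are hypothetical names chosen for this example, and the code is not Spring Boot's actual implementation.

```java
import java.util.Set;

/** Hypothetical helper that mimics include/exclude resolution; not Spring Boot code. */
class ExposureSketch {

    /** An endpoint is exposed when it is included (directly or via "*") and not excluded. */
    static boolean isExposed(String id, Set<String> include, Set<String> exclude) {
        if (exclude.contains("*") || exclude.contains(id)) {
            return false; // exclude always wins over include
        }
        return include.contains("*") || include.contains(id);
    }

    public static void main(String[] args) {
        Set<String> include = Set.of("health", "info", "metrics");
        System.out.println(isExposed("health", include, Set.of()));       // true
        System.out.println(isExposed("env", include, Set.of()));          // false
        System.out.println(isExposed("env", Set.of("*"), Set.of("env"))); // false
    }
}
```

In a real application the endpoint must also be enabled, which this sketch leaves out.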

2.3 Health Checks

Built-in health indicators
  • DataSourceHealthIndicator
  • RedisHealthIndicator
  • MongoHealthIndicator
  • DiskSpaceHealthIndicator
  • ElasticsearchHealthIndicator

Custom health indicator
@Component
public class MyHealthIndicator implements HealthIndicator {
    @Override
    public Health health() {
        // Custom health check logic goes here
        return Health.up().withDetail("custom", "OK").build();
    }
}

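2.4 Metrics

Since Spring Boot 2.x, Actuator uses Micrometer as its metrics facade. The main meter types are:

  • Counter: a value that only ever increases
  • Gauge: the current value, sampled when it is read
  • Timer: the count and total duration of timed events
  • Summary (DistributionSummary): the distribution of recorded values, such as payload sizes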
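
To make the difference between these meter types concrete, here is a minimal stdlib-only sketch of their semantics. It deliberately does not use Micrometer: MeterSemanticsSketch, SimpleCounter, SimpleGauge, and SimpleTimer are hypothetical classes written only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.DoubleSupplier;

/** Stdlib-only illustration of meter semantics; this is not Micrometer's API. */
class MeterSemanticsSketch {

    /** Counter: accumulates and never goes down. */
    static final class SimpleCounter {
        private final LongAdder total = new LongAdder();
        void increment() { total.increment(); }
        long count() { return total.sum(); }
    }

    /** Gauge: stores nothing itself and samples its source whenever it is read. */
    static final class SimpleGauge {
        private final DoubleSupplier source;
        SimpleGauge(DoubleSupplier source) { this.source = source; }
        double value() { return source.getAsDouble(); }
    }

    /** Timer: tracks how many events were timed, their total duration, and the maximum. */
    static final class SimpleTimer {
        private final LongAdder count = new LongAdder();
        private final LongAdder totalMillis = new LongAdder();
        private final AtomicLong maxMillis = new AtomicLong();
        void record(long millis) {
            count.increment();
            totalMillis.add(millis);
            maxMillis.accumulateAndGet(millis, Math::max);
        }
        long count() { return count.sum(); }
        long totalMillis() { return totalMillis.sum(); }
        long maxMillis() { return maxMillis.get(); }
    }

    public static void main(String[] args) {
        SimpleCounter requests = new SimpleCounter();
        requests.increment();
        requests.increment();
        System.out.println(requests.count()); // 2

        AtomicInteger connections = new AtomicInteger(5);
        SimpleGauge active = new SimpleGauge(connections::get);
        connections.decrementAndGet();
        System.out.println(active.value()); // 4.0, the value at read time

        SimpleTimer timer = new SimpleTimer();
        timer.record(10);
        timer.record(30);
        System.out.println(timer.count() + " calls, " + timer.totalMillis() + " ms total");
    }
}
```

A distribution summary follows the same shape as the timer sketch, except that the recorded values are arbitrary amounts (for example bytes) rather than durations.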

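2.5 Security

Actuator endpoints can reveal sensitive information, so access to them must be controlled:

  • Protect the endpoints with Spring Security
  • Serve them on a separate management port
  • Enable HTTPS

3. Code Examples

3.1 Basic Configuration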

# application.yml
management:
  # Endpoint exposure
  endpoints:
    web:
      exposure:
        include: health,info,metrics,env,loggers,mappings,beans
      base-path: /actuator  # default base path

  # Per-endpoint settings
  endpoint:
    health:
      show-details: always  # always/never/when-authorized
      show-components: always
      probes:
        enabled: true  # enable the Kubernetes liveness and readiness probes
    env:
      show-values: always  # show property values (Spring Boot 3.x masks them by default)
    loggers:
      enabled: true
    shutdown:
      enabled: false  # disabled by default

  # Health indicator settings
  health:
    db:
      enabled: true
    redis:
      enabled: true
    diskspace:
      enabled: true
      threshold: 10MB  # minimum free disk space
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true
      
  # Metrics settings
  metrics:
    tags:
      application: ${spring.application.name}
    export:
      prometheus:
        enabled: true  # Prometheus export (Spring Boot 2.x key; 3.x uses management.prometheus.metrics.export.enabled)
    distribution:
      percentiles-histogram:
        http.server.requests: true
      percentiles:
        http.server.requests: 0.5,0.95,0.99
        
  # Management server settings
  server:
    port: 8081  # management port, separate from the application port
    # address: 127.0.0.1  # accept local connections only

  # Info endpoint settings
  info:
    env:
      enabled: true
    java:
      enabled: true
    os:
      enabled: true
    build:
      enabled: true
    git:
      mode: full

# Application information
info:
  app:
    name: My Application
    version: 1.0.0
    description: A sample application
  author: Developer Team

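3.2 Health Check Examples

import org.springframework.boot.actuate.health.*;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;
import java.util.List;
import java.util.Set;

// Custom health indicator (DatabaseService is an application class that is not shown here)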
@Component
public class DatabaseHealthIndicator implements HealthIndicator {
    
    private final DatabaseService databaseService;
    
    public DatabaseHealthIndicator(DatabaseService databaseService) {
        this.databaseService = databaseService;
    }
    
    @Override
    public Health health() {
        try {
            // Check the database connection
            if (databaseService.isConnectionValid()) {
                return Health.up()
                    .withDetail("database", "MySQL")
                    .withDetail("connection", "active")
                    .withDetail("responseTime", databaseService.getLastResponseTime() + "ms")
                    .build();
            } else {
                return Health.down()
                    .withDetail("database", "MySQL")
                    .withDetail("error", "Connection failed")
                    .build();
            }
        } catch (Exception e) {
            return Health.down()
                .withDetail("database", "MySQL")
                .withDetail("error", e.getMessage())
                .withException(e)
                .build();
        }
    }
}

// Health check for an external service
@Component
public class ExternalServiceHealthIndicator implements HealthIndicator {
    
    private final RestTemplate restTemplate;
    
    public ExternalServiceHealthIndicator(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }
    
    @Override
    public Health health() {
        try {
            // Call the external service's health URL
            ResponseEntity<String> response = restTemplate.getForEntity(
                "https://api.example.com/health", 
                String.class
            );
            
            if (response.getStatusCode().is2xxSuccessful()) {
                return Health.up()
                    .withDetail("service", "External API")
                    .withDetail("status", "reachable")
                    .build();
            } else {
                return Health.down()
                    .withDetail("service", "External API")
                    .withDetail("status", "error")
                    .build();
            }
        } catch (Exception e) {
            return Health.down()
                .withDetail("service", "External API")
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}

// Composite health indicator that rolls up the other indicators
@Component
public class CompositeHealthIndicator implements HealthIndicator {
    
    private final List<HealthIndicator> indicators;
    
    public CompositeHealthIndicator(List<HealthIndicator> indicators) {
        this.indicators = indicators;
    }
    
    @Override
    public Health health() {
        Health.Builder builder = Health.up();
        
        for (HealthIndicator indicator : indicators) {
            Health health = indicator.health();
            builder.withDetail(
                indicator.getClass().getSimpleName(), 
                health.getStatus()
            );
            
            // Compare with equals(): Status is a value class, not an enum
            if (!Status.UP.equals(health.getStatus())) {
                builder.status(health.getStatus());
            }
        }
        
        return builder.build();
    }
}

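// Health status aggregation. Implementing Spring Boot's StatusAggregator lets the
// health endpoint use this logic (needs the java.util.Set import).
@Component
public class CustomHealthAggregator implements StatusAggregator {
    
    @Override
    public Status getAggregateStatus(Set<Status> statuses) {
        // Custom aggregation: DOWN wins, then OUT_OF_SERVICE, otherwise UP
        if (statuses.contains(Status.DOWN)) {
            return Status.DOWN;
        }
        if (statuses.contains(Status.OUT_OF_SERVICE)) {
            return Status.OUT_OF_SERVICE;
        }
        return Status.UP;
    }
}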
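
The aggregator above hard-codes its precedence in a chain of if statements. An alternative is to drive the decision from an ordered severity list, similar in spirit to the default order that Spring Boot documents (DOWN, OUT_OF_SERVICE, UP, UNKNOWN). The sketch below is stdlib-only and works on plain status codes. StatusOrderSketch, ORDER, and aggregate are hypothetical names for this illustration, not Spring Boot API.

```java
import java.util.Collection;
import java.util.List;

/** Hypothetical, stdlib-only sketch of order-based status aggregation. */
class StatusOrderSketch {

    // Most severe first. Assumed order for this example.
    static final List<String> ORDER = List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe known code, or UNKNOWN when no known code is present. */
    static String aggregate(Collection<String> codes) {
        String worst = "UNKNOWN";
        int worstRank = Integer.MAX_VALUE;
        for (String code : codes) {
            int rank = ORDER.indexOf(code);
            if (rank < 0) {
                continue; // codes outside the order are ignored
            }
            if (rank < worstRank) {
                worstRank = rank;
                worst = code;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(aggregate(List.of("UP", "DOWN", "UP")));    // DOWN
        System.out.println(aggregate(List.of("UP", "OUT_OF_SERVICE"))); // OUT_OF_SERVICE
        System.out.println(aggregate(List.of()));                       // UNKNOWN
    }
}
```

Keeping the precedence in one list means a new status only needs a new list entry instead of another branch.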

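3.3 Custom Endpoint Examples

import org.springframework.boot.actuate.endpoint.annotation.*;
import org.springframework.boot.actuate.endpoint.jmx.annotation.JmxEndpoint;
import org.springframework.boot.actuate.endpoint.web.annotation.WebEndpoint;
import org.springframework.stereotype.Component;
import java.util.*;

// Custom endpoint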
@Component
@Endpoint(id = "custom")
public class CustomEndpoint {
    
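    // Concurrent map: read and write operations can run on different request threads
    private final Map<String, Object> info = new java.util.concurrent.ConcurrentHashMap<>();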
    
    public CustomEndpoint() {
        info.put("version", "1.0.0");
        info.put("startTime", System.currentTimeMillis());
    }
    
    // Read operation (HTTP GET)
    @ReadOperation
    public Map<String, Object> info() {
        Map<String, Object> result = new HashMap<>(info);
        result.put("uptime", System.currentTimeMillis() - (Long) info.get("startTime"));
        return result;
    }
    
    // Read operation with a path parameter
    @ReadOperation
    public Object infoByKey(@Selector String key) {
        return info.get(key);
    }
    
    // Write operation (HTTP POST)
    @WriteOperation
    public void updateInfo(@Selector String key, String value) {
        info.put(key, value);
    }
    
    // Delete operation (HTTP DELETE)
    @DeleteOperation
    public void deleteInfo(@Selector String key) {
        info.remove(key);
    }
}

// Web-only endpoint (exposed over HTTP, not JMX)
@Component
@WebEndpoint(id = "customweb")
public class CustomWebEndpoint {
    
    @ReadOperation
    public CustomData getData() {
        return new CustomData("test", 100, System.currentTimeMillis());
    }
    
    @ReadOperation
    public CustomData getDataById(@Selector String id) {
        return new CustomData(id, 200, System.currentTimeMillis());
    }
}

public class CustomData {
    private String id;
    private int value;
    private long timestamp;
    
    public CustomData(String id, int value, long timestamp) {
        this.id = id;
        this.value = value;
        this.timestamp = timestamp;
    }
    
    // Getters
    public String getId() { return id; }
    public int getValue() { return value; }
    public long getTimestamp() { return timestamp; }
}

// JMX-only endpoint
@Component
@JmxEndpoint(id = "customjmx")
public class CustomJmxEndpoint {
    
    private volatile String status = "running";  // volatile: read and written from different threads
    
    @ReadOperation
    public String getStatus() {
        return status;
    }
    
    @WriteOperation
    public void setStatus(String status) {
        this.status = status;
    }
}

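3.4 Metrics Examples

import io.micrometer.core.instrument.*;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;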
@Service
public class MetricsService {
    
    private final MeterRegistry meterRegistry;
    private final Counter requestCounter;
    private final Timer responseTimer;
    private final Gauge activeConnections;
    private final AtomicInteger activeCount = new AtomicInteger(0);
    
    public MetricsService(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        
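        // Create a counter. Meters that share a name should use the same tag keys
        // (the Prometheus registry requires this), so it uses "endpoint" like recordRequest().
        this.requestCounter = Counter.builder("app.requests.total")
            .description("Total number of requests")
            .tag("endpoint", "all")
            .register(meterRegistry);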
        
        // Create a timer (the exported time unit depends on the registry)
        this.responseTimer = Timer.builder("app.response.time")
            .description("Response time")
            .tag("endpoint", "api")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry);
        
        // Create a gauge that reads the current value of activeCount
        this.activeConnections = Gauge.builder("app.connections.active", activeCount, AtomicInteger::get)
            .description("Number of active connections")
            .register(meterRegistry);
    }
    
    // Count a request
    public void recordRequest(String endpoint) {
        Counter.builder("app.requests.total")
            .tag("endpoint", endpoint)
            .register(meterRegistry)
            .increment();
    }
    
    // Record a response time
    public void recordResponseTime(String endpoint, long durationMs) {
        Timer.builder("app.response.time")
            .tag("endpoint", endpoint)
            .register(meterRegistry)
            .record(durationMs, TimeUnit.MILLISECONDS);
    }
    
    // Time a block of code with a Timer
    public <T> T timedCall(String operation, Callable<T> callable) throws Exception {
        return Timer.builder("app.operation.time")
            .tag("operation", operation)
            .register(meterRegistry)
            .recordCallable(callable);
    }
    
    // Increment and decrement the connection count
    public void incrementConnection() {
        activeCount.incrementAndGet();
    }
    
    public void decrementConnection() {
        activeCount.decrementAndGet();
    }
    
    // Record a value in a DistributionSummary
    public void recordSummary(String name, double value) {
        DistributionSummary.builder(name)
            .description("Distribution summary")
            .baseUnit("bytes")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry)
            .record(value);
    }
    
    // Time a long-running task with a LongTaskTimer
    public LongTaskTimer.Sample startLongTask(String taskName) {
        return LongTaskTimer.builder("app.long.task")
            .tag("task", taskName)
            .register(meterRegistry)
            .start();
    }
    
    public void stopLongTask(LongTaskTimer.Sample sample) {
        sample.stop();
    }
}

// Controller instrumentation example
@RestController
public class MonitoredController {
    
    private final MetricsService metricsService;
    
    public MonitoredController(MetricsService metricsService) {
        this.metricsService = metricsService;
    }
    
    @GetMapping("/api/data")
    public String getData() {
        long startTime = System.currentTimeMillis();
        metricsService.recordRequest("/api/data");
        // Increment before the try block so that finally only undoes a completed increment
        metricsService.incrementConnection();
        
        try {
            // Business logic
            return processData();
        } finally {
            metricsService.recordResponseTime("/api/data", 
                System.currentTimeMillis() - startTime);
            metricsService.decrementConnection();
        }
    }
    
    private String processData() {
        return "data";
    }
}

3.5 Info Endpoint Examples

import org.springframework.boot.actuate.health.*;
import org.springframework.boot.actuate.info.*;
import org.springframework.stereotype.Component;
import java.util.*;

// Custom info contributor
@Component
public class CustomInfoContributor implements InfoContributor {
    
    @Override
    public void contribute(Info.Builder builder) {
        builder.withDetail("app", createAppInfo())
               .withDetail("build", createBuildInfo())
               .withDetail("runtime", createRuntimeInfo());
    }
    
    private Map<String, Object> createAppInfo() {
        Map<String, Object> info = new HashMap<>();
        info.put("name", "My Application");
        info.put("version", "1.0.0");
        info.put("description", "A sample Spring Boot application");
        return info;
    }
    
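    private Map<String, Object> createBuildInfo() {
        Map<String, Object> info = new HashMap<>();
        // Placeholder values: new Date() is the time of the request, not the real build time.
        // Real build metadata comes from META-INF/build-info.properties via BuildInfoContributor.
        info.put("time", new Date());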
        info.put("artifact", "my-app");
        info.put("group", "com.example");
        return info;
    }
    
    private Map<String, Object> createRuntimeInfo() {
        Map<String, Object> info = new HashMap<>();
        info.put("javaVersion", System.getProperty("java.version"));
        info.put("osName", System.getProperty("os.name"));
        info.put("availableProcessors", Runtime.getRuntime().availableProcessors());
        info.put("maxMemory", Runtime.getRuntime().maxMemory());
        info.put("totalMemory", Runtime.getRuntime().totalMemory());
        info.put("freeMemory", Runtime.getRuntime().freeMemory());
        return info;
    }
}

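// Git info contributor with hard-coded sample values. It is named StaticGitInfoContributor
// so that it does not clash with Spring Boot's own GitInfoContributor.
@Component
public class StaticGitInfoContributor implements InfoContributor {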
    
    @Override
    public void contribute(Info.Builder builder) {
        Map<String, Object> git = new HashMap<>();
        git.put("branch", "main");
        git.put("commit", Map.of(
            "id", "abc123",
            "time", new Date(),
            "message", "Initial commit"
        ));
        
        builder.withDetail("git", git);
    }
}

// Info contributor that reports the health status
@Component
public class HealthInfoContributor implements InfoContributor {
    
    private final HealthEndpoint healthEndpoint;
    
    public HealthInfoContributor(HealthEndpoint healthEndpoint) {
        this.healthEndpoint = healthEndpoint;
    }
    
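    @Override
    public void contribute(Info.Builder builder) {
        HealthComponent health = healthEndpoint.health();
        Map<String, Object> summary = new LinkedHashMap<>();
        summary.put("status", health.getStatus().getCode());
        // HealthComponent has no getDetails(); list the component names when they are available
        if (health instanceof CompositeHealth) {
            summary.put("components", ((CompositeHealth) health).getComponents().keySet());
        }
        builder.withDetail("health", summary);
    }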
}

3.6 Security Configuration Examples

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;
import org.springframework.boot.actuate.autoconfigure.security.servlet.EndpointRequest;

@Configuration
public class ActuatorSecurityConfig {
    
    @Bean
    public SecurityFilterChain actuatorSecurityFilterChain(HttpSecurity http) throws Exception {
        http
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> auth
                .requestMatchers(EndpointRequest.to("health", "info")).permitAll()
                .requestMatchers(EndpointRequest.toAnyEndpoint()).hasRole("ACTUATOR")
            )
            .httpBasic(basic -> {});
        
        return http.build();
    }
}

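// Finer-grained alternative to ActuatorSecurityConfig. Both chains match the same
// requests, so register only one of the two.
@Configuration
public class ActuatorSecurityConfig2 {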
    
    @Bean
    public SecurityFilterChain actuatorSecurityFilterChain2(HttpSecurity http) throws Exception {
        http
            .securityMatcher(EndpointRequest.toAnyEndpoint())
            .authorizeHttpRequests(auth -> {
                // Public endpoints
                auth.requestMatchers(EndpointRequest.to("health", "info", "prometheus"))
                    .permitAll();
                
                // Read-only endpoints
                auth.requestMatchers(EndpointRequest.to(
                    "metrics", "env", "loggers", "mappings", "beans"
                )).hasRole("MONITOR");
                
                // Sensitive endpoints
                auth.requestMatchers(EndpointRequest.to(
                    "shutdown", "heapdump", "threaddump"
                )).hasRole("ADMIN");
                
                // All other endpoints
                auth.anyRequest().authenticated();
            })
            .httpBasic(basic -> {});
        
        return http.build();
    }
}

3.7 Prometheus Integration Example

# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
        
  metrics:
    tags:
      application: ${spring.application.name}
      environment: ${spring.profiles.active:default}  # falls back to "default" when no profile is active
      
    export:
      prometheus:
        enabled: true
        
    distribution:
      percentiles-histogram:
        http.server.requests: true
      percentiles:
        http.server.requests: 0.5,0.95,0.99
      slo:
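        http.server.requests: 100ms,200ms,500ms,1s

import io.micrometer.core.instrument.*;
// Micrometer 1.13+ (Spring Boot 3.3+) moved this class to the io.micrometer.prometheusmetrics package
import io.micrometer.prometheus.PrometheusMeterRegistry;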
import org.springframework.stereotype.Service;

@Service
public class PrometheusMetricsService {
    
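    private final PrometheusMeterRegistry meterRegistry;
    
    // Inject PrometheusMeterRegistry directly: casting a MeterRegistry fails
    // when the injected bean is a composite registry
    public PrometheusMetricsService(PrometheusMeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }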
    
    // Return the metrics in Prometheus text format
    public String scrape() {
        return meterRegistry.scrape();
    }
    
    // Register and update custom metrics
    public void customPrometheusMetrics() {
        // Counter
        Counter counter = Counter.builder("app_requests_total")
            .description("Total requests")
            .tag("method", "GET")
            .tag("status", "200")
            .register(meterRegistry);
        counter.increment();
        
        // Gauge: used memory = total - free (totalMemory() alone is the allocated heap size)
        Gauge.builder("app_memory_used_bytes", Runtime.getRuntime(),
                rt -> rt.totalMemory() - rt.freeMemory())
            .description("Used memory in bytes")
            .register(meterRegistry);
        
        // Distribution summary with client-side percentiles (call publishPercentileHistogram() for histogram buckets)
        DistributionSummary summary = DistributionSummary.builder("app_response_size_bytes")
            .description("Response size distribution")
            .baseUnit("bytes")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry);
        summary.record(1024);
    }
}

4. Practical Scenarios

4.1 Kubernetes Health Checks

# application.yml
management:
  endpoint:
    health:
      probes:
        enabled: true
  health:
    livenessstate:
      enabled: true
    readinessstate:
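      enabled: true

# kubernetes-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:  # apps/v1 requires a selector that matches the pod template labels
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest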
        livenessProbe:
          httpGet:
            path: /actuator/health/liveness
            port: 8080  # use the management port for both probes if management.server.port is set
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

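4.2 Custom Health State

import org.springframework.boot.actuate.health.*;
import org.springframework.context.event.ContextClosedEvent;
import org.springframework.context.event.ContextRefreshedEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;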

@Component
public class ApplicationHealthIndicator implements HealthIndicator {
    
    private volatile boolean ready = false;
    private volatile boolean live = true;
    
    @Override
    public Health health() {
        if (!live) {
            return Health.down()
                .withDetail("reason", "Application is not live")
                .build();
        }
        if (!ready) {
            return Health.outOfService()
                .withDetail("reason", "Application is initializing")
                .build();
        }
        return Health.up()
            .withDetail("ready", true)
            .withDetail("live", true)
            .build();
    }
    
    public void setReady(boolean ready) {
        this.ready = ready;
    }
    
    public void setLive(boolean live) {
        this.live = live;
    }
}

// Application lifecycle management
@Component
public class ApplicationLifecycleManager {
    
    private final ApplicationHealthIndicator healthIndicator;
    
    public ApplicationLifecycleManager(ApplicationHealthIndicator healthIndicator) {
        this.healthIndicator = healthIndicator;
    }
    
    @EventListener(ContextRefreshedEvent.class)
    public void onApplicationReady() {
        healthIndicator.setReady(true);
    }
    
    @EventListener(ContextClosedEvent.class)
    public void onApplicationShutdown() {
        healthIndicator.setReady(false);
        healthIndicator.setLive(false);
    }
}

4.3 Monitoring and Alerting

import org.springframework.boot.actuate.health.*;
import org.springframework.stereotype.Component;
import java.util.concurrent.*;

@Component
public class HealthMonitor {
    
    private final HealthEndpoint healthEndpoint;
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    private volatile Status lastStatus = Status.UNKNOWN;
    
    public HealthMonitor(HealthEndpoint healthEndpoint) {
        this.healthEndpoint = healthEndpoint;
        startMonitoring();
    }
    
    private void startMonitoring() {
        scheduler.scheduleAtFixedRate(this::checkHealth, 0, 30, TimeUnit.SECONDS);
    }
    
    private void checkHealth() {
        try {
            HealthComponent health = healthEndpoint.health();
            Status currentStatus = health.getStatus();
            
            if (!currentStatus.equals(lastStatus)) {
                onStatusChange(lastStatus, currentStatus);
                lastStatus = currentStatus;
            }
            
            // Check the health of each component
            if (health instanceof CompositeHealth) {
                CompositeHealth composite = (CompositeHealth) health;
                composite.getComponents().forEach(this::checkComponent);
            }
        } catch (Exception e) {
            System.err.println("Health check failed: " + e.getMessage());
        }
    }
    
    private void onStatusChange(Status oldStatus, Status newStatus) {
        String message = String.format("Health status changed from %s to %s", 
            oldStatus, newStatus);
        
        // Send the alert
        sendAlert(message);
        
        System.out.println(message);
    }
    
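    private void checkComponent(String name, HealthComponent component) {
        if (!Status.UP.equals(component.getStatus())) {
            String message = String.format("Component %s is %s", name, component.getStatus());
            sendAlert(message);
        }
    }
    
    private void sendAlert(String message) {
        // Send the alert notification (printed here as a placeholder)
        System.out.println("[ALERT] " + message);
    }
    
    // Stop the scheduler on shutdown so that its thread does not outlive the application context
    // (use javax.annotation.PreDestroy on Spring Boot 2.x)
    @jakarta.annotation.PreDestroy
    public void stop() {
        scheduler.shutdownNow();
    }
}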
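
The monitor above raises a component alert on every scheduled run for as long as a component stays unhealthy. One way to reduce that noise is to alert only when a component's status changes. The following stdlib-only sketch shows the idea with plain status codes. StatusTransitionTracker and observe are hypothetical names for this illustration and are not part of Spring Boot.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: emits a message only when a component's status changes. */
class StatusTransitionTracker {

    private final Map<String, String> lastSeen = new ConcurrentHashMap<>();

    /** Records the status and returns an alert message if it differs from the previous one. */
    Optional<String> observe(String component, String status) {
        String previous = lastSeen.put(component, status);
        if (status.equals(previous)) {
            return Optional.empty(); // unchanged, so stay quiet
        }
        if (previous == null && "UP".equals(status)) {
            return Optional.empty(); // first observation and healthy, nothing to report
        }
        String from = previous == null ? "UNKNOWN" : previous;
        return Optional.of(String.format("Component %s changed from %s to %s", component, from, status));
    }

    public static void main(String[] args) {
        StatusTransitionTracker tracker = new StatusTransitionTracker();
        System.out.println(tracker.observe("db", "UP").isPresent());   // false
        System.out.println(tracker.observe("db", "DOWN").orElse(""));  // Component db changed from UP to DOWN
        System.out.println(tracker.observe("db", "DOWN").isPresent()); // false
    }
}
```

With this shape, a recovery (DOWN back to UP) also produces a message, which is often as useful as the failure alert itself.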

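5. Summary and Best Practices

Endpoint security

  1. Minimal exposure: expose only the endpoints you need
  2. Separate port: keep the management port apart from the application port
  3. Access control: protect the endpoints with Spring Security
  4. HTTPS: enable HTTPS in production

Metrics

  1. Sensible names: follow a consistent naming convention
  2. Tags: add tags that carry meaning
  3. Avoid high cardinality: keep the number of distinct tag values small
  4. Regular cleanup: remove metrics that are no longer used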
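
High cardinality typically creeps in when a raw request path is used as a tag value, as in a hand-written call such as recordRequest(endpoint), because every distinct ID then creates a new time series. A minimal stdlib-only sketch of one mitigation follows: collapsing variable path segments into placeholders before tagging. UriTagNormalizer is a hypothetical helper, and the two patterns (numeric and UUID segments) are assumptions that would need adjusting to an application's real URL scheme.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: collapses variable path segments so a "uri" tag stays low-cardinality. */
class UriTagNormalizer {

    // A path segment that is a UUID, followed by "/" or the end of the path
    private static final Pattern UUID_SEGMENT = Pattern.compile(
        "/[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}(?=/|$)");
    // A path segment made only of digits, followed by "/" or the end of the path
    private static final Pattern NUMERIC_SEGMENT = Pattern.compile("/\\d+(?=/|$)");

    static String normalize(String rawPath) {
        // Drop the query string, which is another source of unbounded values
        int query = rawPath.indexOf('?');
        String path = query >= 0 ? rawPath.substring(0, query) : rawPath;
        path = UUID_SEGMENT.matcher(path).replaceAll("/{uuid}");
        return NUMERIC_SEGMENT.matcher(path).replaceAll("/{id}");
    }

    public static void main(String[] args) {
        System.out.println(normalize("/api/users/42"));                 // /api/users/{id}
        System.out.println(normalize("/api/users/42/orders/7?page=2")); // /api/users/{id}/orders/{id}
        System.out.println(normalize("/api/v2/items"));                 // /api/v2/items
    }
}
```

The normalized value can then be passed as the tag, so the number of series grows with the number of routes rather than the number of users or orders.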

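Health checks

  1. Fast response: health checks must return quickly
  2. Dependency checks: check the critical dependencies
  3. Distinct states: distinguish liveness from readiness
  4. Error handling: handle exceptions raised by health checks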
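
The first and last points can be combined by running each dependency probe under a time limit and mapping every failure mode to a status. The sketch below uses only the JDK. BoundedHealthCheck and check are hypothetical names, and treating a timeout as DOWN is an assumption made for this example rather than a rule.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: runs a probe with a time limit and never lets an exception escape. */
class BoundedHealthCheck {

    static String check(BooleanSupplier probe, long timeoutMillis) {
        CompletableFuture<Boolean> result = CompletableFuture.supplyAsync(probe::getAsBoolean);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS) ? "UP" : "DOWN";
        } catch (TimeoutException e) {
            // Marks the future as cancelled; it does not interrupt the probe thread
            result.cancel(true);
            return "DOWN";
        } catch (ExecutionException e) {
            return "DOWN"; // the probe itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for the caller
            return "DOWN";
        }
    }

    public static void main(String[] args) {
        System.out.println(check(() -> true, 200)); // UP
        System.out.println(check(() -> { throw new IllegalStateException("boom"); }, 200)); // DOWN
    }
}
```

Inside a HealthIndicator, the returned code would be translated into the matching Health builder call, with the failure reason added as a detail.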

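Spring Boot Actuator is an important tool for production monitoring. Configured and used sensibly, it helps developers find and fix problems early and keeps the application running reliably.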