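In production, monitoring is essential for analyzing and troubleshooting application problems.
This article shows how to use Java Agent technology to monitor a SpringBoot application non-intrusively, so that developers can collect key runtime metrics without modifying the application's source code.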
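Introduction to Java Agent
Java Agent is a feature introduced in JDK 1.5 through the java.lang.instrument package. It lets us modify class bytecode as classes are loaded at JVM startup; since JDK 6, an agent can also be attached to a running JVM and retransform classes that are already loaded. Either way, the behavior of the application can be enhanced or monitored.
The core advantage of a Java Agent is that it extends an application's functionality without any change to the source code.
A Java Agent can be used in two main ways:
- Load at startup (premain)
- Load at runtime (agentmain)
This article focuses on loading at startup.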
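To make the two entry points concrete, here is a minimal sketch that uses only the JDK's java.lang.instrument API. The class name EntryPointsAgent, the isTarget helper, and the "Controller" suffix filter are illustrative assumptions, not part of the article's agent. The transformer only counts matching classes and leaves the bytecode untouched.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical agent class; the name and the "Controller" filter are illustrative.
class EntryPointsAgent {

    static final AtomicLong matchedClasses = new AtomicLong();

    // Called by the JVM before main() when started with -javaagent
    // (the jar manifest must declare Premain-Class).
    public static void premain(String args, Instrumentation inst) {
        install(inst);
    }

    // Called when the agent is attached to an already running JVM
    // (the jar manifest must declare Agent-Class).
    public static void agentmain(String args, Instrumentation inst) {
        install(inst);
    }

    static void install(Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }

    // Class names reach a transformer in internal form, e.g. "com/example/UserController".
    static boolean isTarget(String internalName) {
        return internalName != null && internalName.endsWith("Controller");
    }

    static class CountingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (isTarget(className)) {
                matchedClasses.incrementAndGet();
            }
            return null; // null tells the JVM to keep the original bytecode
        }
    }
}
```

Libraries such as Byte Buddy sit on top of this same hook: they register a transformer and return rewritten bytes instead of null.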
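How it works
A Java Agent works through bytecode enhancement: it modifies a class's bytecode while the class is being loaded in order to add behavior.
For SpringBoot monitoring, we can use a Java Agent to intercept calls to key methods and collect metrics such as execution time and resource usage.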
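The "intercept the call, time it, then delegate" idea can be illustrated without any bytecode library. The sketch below deliberately swaps in a JDK dynamic proxy as a stand-in for bytecode enhancement, so it only works for interfaces; the GreetingController interface and the timed helper are hypothetical names used for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class TimingProxyDemo {

    // Hypothetical interface standing in for a controller.
    interface GreetingController {
        String greet(String name);
    }

    static final Map<String, LongAdder> callCounts = new ConcurrentHashMap<>();
    static final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    // Wraps target so that every interface call is counted and timed.
    @SuppressWarnings("unchecked")
    static <T> T timed(Class<T> type, T target) {
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type},
                (proxy, method, args) -> {
                    long start = System.nanoTime();
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the original exception unchanged
                    } finally {
                        String key = type.getSimpleName() + "." + method.getName();
                        callCounts.computeIfAbsent(key, k -> new LongAdder()).increment();
                        totalNanos.computeIfAbsent(key, k -> new LongAdder()).add(System.nanoTime() - start);
                    }
                });
    }

    public static void main(String[] args) {
        GreetingController controller = timed(GreetingController.class, name -> "Hello, " + name);
        System.out.println(controller.greet("agent"));
        System.out.println(callCounts);
    }
}
```

A Java Agent achieves the same around-the-call structure by rewriting the target class itself, which is why it also works for concrete classes that the application instantiates on its own.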
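Main technology stack:
- Java Agent: the entry point for bytecode modification
- Byte Buddy/ASM/Javassist: bytecode manipulation libraries
- SpringBoot: the target application framework
- Micrometer: metric collection and exposure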
Implementation steps
1. Create the Agent project
First, create a separate Maven project for developing the Java Agent:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>demo</groupId>
        <artifactId>springboot-agent</artifactId>
        <version>0.0.1-SNAPSHOT</version>
    </parent>
    <artifactId>agent</artifactId>
    <dependencies>
        <dependency>
            <groupId>net.bytebuddy</groupId>
            <artifactId>byte-buddy</artifactId>
            <version>1.14.5</version>
        </dependency>
        <dependency>
            <groupId>net.bytebuddy</groupId>
            <artifactId>byte-buddy-agent</artifactId>
            <version>1.14.5</version>
        </dependency>
        <dependency>
            <groupId>io.micrometer</groupId>
            <artifactId>micrometer-registry-prometheus</artifactId>
            <version>1.10.0</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>21</source>
                    <target>21</target>
                    <encoding>utf-8</encoding>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>3.2.0</version>
                <configuration>
                    <archive>
                        <manifestEntries>
                            <Premain-Class>com.example.agent.MonitorAgent</Premain-Class>
                            <Can-Redefine-Classes>true</Can-Redefine-Classes>
                            <Can-Retransform-Classes>true</Can-Retransform-Classes>
                        </manifestEntries>
                    </archive>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.4</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
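Because the JVM refuses to load an agent whose jar manifest lacks Premain-Class, it can be worth checking the packaged jar before attaching it. The following is a small sketch using only java.util.jar; the class name AgentManifestCheck is an assumption made for illustration, and the jar path is passed as the first command-line argument.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

class AgentManifestCheck {

    // Returns the Premain-Class value, or null if the manifest does not declare one.
    static String premainClass(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Premain-Class");
    }

    // True only when Can-Retransform-Classes is present and set to "true".
    static boolean canRetransform(Manifest manifest) {
        return Boolean.parseBoolean(manifest.getMainAttributes().getValue("Can-Retransform-Classes"));
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the built agent jar
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                System.out.println("The jar has no manifest");
                return;
            }
            System.out.println("Premain-Class: " + premainClass(manifest));
            System.out.println("Can-Retransform-Classes: " + canRetransform(manifest));
        }
    }
}
```

Running it against the shaded jar should print the com.example.agent.MonitorAgent entry configured in the POM above; a null value suggests the manifest entries were lost during packaging.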
2. Implement the Agent main class
Create the MonitorAgent class and implement the premain method:
package com.example.agent;

import net.bytebuddy.agent.builder.AgentBuilder;
import net.bytebuddy.implementation.MethodDelegation;
import net.bytebuddy.matcher.ElementMatchers;

import java.lang.instrument.Instrumentation;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MonitorAgent {

    // Daemon thread, so the agent's reporter never keeps the JVM alive on shutdown
    private static final ScheduledExecutorService executorService =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "monitor-agent-reporter");
                thread.setDaemon(true);
                return thread;
            });

    public static void premain(String arguments, Instrumentation instrumentation) {
        System.out.println("SpringBoot monitoring Agent started...");
        log();
        // Use Byte Buddy to intercept the SpringBoot Controller methods
        new AgentBuilder.Default()
                .type(ElementMatchers.nameEndsWith("Controller"))
                .transform((builder, typeDescription, classLoader, module, protectionDomain) ->
                        builder.method(ElementMatchers.isAnnotatedWith(
                                        ElementMatchers.named("org.springframework.web.bind.annotation.RequestMapping")
                                                .or(ElementMatchers.named("org.springframework.web.bind.annotation.GetMapping"))
                                                .or(ElementMatchers.named("org.springframework.web.bind.annotation.PostMapping"))
                                                .or(ElementMatchers.named("org.springframework.web.bind.annotation.PutMapping"))
                                                .or(ElementMatchers.named("org.springframework.web.bind.annotation.DeleteMapping"))
                                ))
                                .intercept(MethodDelegation.to(ControllerInterceptor.class))
                )
                .installOn(instrumentation);
    }

    private static void log() {
        executorService.scheduleAtFixedRate(() -> {
            try {
                // Collect and print the performance metrics
                String text = MetricsCollector.scrape();
                System.out.println("===============");
                System.out.println(text);
            } catch (Throwable t) {
                // An uncaught exception would silently cancel all further runs
                System.err.println("Metrics report failed: " + t);
            }
        }, 0, 5, TimeUnit.SECONDS);
    }
}
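The premain method above receives an arguments string but never uses it. The JVM fills it with whatever follows the equals sign in -javaagent:agent.jar=..., so it is a natural place for settings such as the report interval. Below is a sketch of a parser; the key=value,key=value format and the option names interval, suffix, and verbose are assumptions chosen for illustration, not something the article defines.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AgentOptions {

    // Parses "key1=value1,key2=value2"; a bare key such as "verbose" becomes "true".
    static Map<String, String> parse(String arguments) {
        Map<String, String> options = new LinkedHashMap<>();
        if (arguments == null || arguments.isBlank()) {
            return options;
        }
        for (String pair : arguments.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                options.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            } else if (eq < 0 && !pair.isBlank()) {
                options.put(pair.trim(), "true");
            }
            // Entries that start with '=' have no key and are ignored.
        }
        return options;
    }

    // Report interval in seconds; falls back to 5 when missing or not a number.
    static long intervalSeconds(Map<String, String> options) {
        try {
            return Long.parseLong(options.getOrDefault("interval", "5"));
        } catch (NumberFormatException e) {
            return 5;
        }
    }
}
```

With this in place, a call such as AgentOptions.intervalSeconds(AgentOptions.parse(arguments)) could replace the hard-coded 5 in the scheduler.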
3. Implement the interceptor
Create the Controller interceptor:
package com.example.agent;

import net.bytebuddy.implementation.bind.annotation.*;

import java.lang.reflect.Method;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

public class ControllerInterceptor {

    @RuntimeType
    public static Object intercept(
            @Origin Method method,
            @SuperCall Callable<?> callable,
            @AllArguments Object[] args) throws Exception {
        // nanoTime is monotonic, so the measurement is not affected by clock adjustments
        long startNanos = System.nanoTime();
        String className = method.getDeclaringClass().getName();
        String methodName = method.getName();
        try {
            // Invoke the original method
            return callable.call();
        } catch (Exception e) {
            // Record the exception
            MetricsCollector.recordException(className, methodName, e);
            throw e;
        } finally {
            long executionTime = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
            // Record the execution time in milliseconds
            MetricsCollector.recordExecutionTime(className, methodName, executionTime);
        }
    }
}
4. Implement metric collection
Create the MetricsCollector class to collect and expose the monitoring metrics:
package com.example.agent;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class MetricsCollector {

    private static final Map<String, AtomicLong> executionTimeMap = new ConcurrentHashMap<>();
    private static final Map<String, AtomicLong> invocationCountMap = new ConcurrentHashMap<>();
    private static final Map<String, AtomicLong> exceptionCountMap = new ConcurrentHashMap<>();

    public static void recordExecutionTime(String className, String methodName, long executionTime) {
        String key = className + "." + methodName;
        executionTimeMap.computeIfAbsent(key, k -> new AtomicLong(0)).addAndGet(executionTime);
        invocationCountMap.computeIfAbsent(key, k -> new AtomicLong(0)).incrementAndGet();
        // Print a log line; a real project would probably send this to a monitoring system
        System.out.printf("Controller call: %s, took: %d ms%n", key, executionTime);
    }

    public static void recordException(String className, String methodName, Exception e) {
        String key = className + "." + methodName;
        exceptionCountMap.computeIfAbsent(key, k -> new AtomicLong(0)).incrementAndGet();
        System.out.printf("Controller exception: %s, type: %s, message: %s%n",
                key, e.getClass().getName(), e.getMessage());
    }

    public static void recordSqlExecutionTime(String className, String methodName, long executionTime) {
        String key = className + "." + methodName;
        executionTimeMap.computeIfAbsent(key, k -> new AtomicLong(0)).addAndGet(executionTime);
        invocationCountMap.computeIfAbsent(key, k -> new AtomicLong(0)).incrementAndGet();
        System.out.printf("SQL call: %s, took: %d ms%n", key, executionTime);
    }

    public static void recordSqlException(String className, String methodName, Exception e) {
        String key = className + "." + methodName;
        exceptionCountMap.computeIfAbsent(key, k -> new AtomicLong(0)).incrementAndGet();
        System.out.printf("SQL exception: %s, type: %s, message: %s%n",
                key, e.getClass().getName(), e.getMessage());
    }

    // Plain-text summary; MonitorAgent.log() calls this every 5 seconds
    public static String scrape() {
        StringBuilder sb = new StringBuilder();
        invocationCountMap.forEach((key, count) -> sb.append(String.format(
                "%s count=%d totalMs=%d exceptions=%d%n", key, count.get(),
                executionTimeMap.getOrDefault(key, new AtomicLong()).get(),
                exceptionCountMap.getOrDefault(key, new AtomicLong()).get())));
        return sb.toString();
    }

    // Accessors for the raw metrics, which a monitoring system can call
    public static Map<String, AtomicLong> getExecutionTimeMap() {
        return executionTimeMap;
    }

    public static Map<String, AtomicLong> getInvocationCountMap() {
        return invocationCountMap;
    }

    public static Map<String, AtomicLong> getExceptionCountMap() {
        return exceptionCountMap;
    }
}
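The collector above stores total time and invocation count separately, so a per-method average has to be derived from the two maps. A possible helper is sketched below; the MetricsSummary class and its method name are hypothetical, and it takes the maps as parameters so that it can be tried in isolation.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

class MetricsSummary {

    // Average execution time in milliseconds per key.
    // Keys without any recorded invocation or without a total are skipped.
    static Map<String, Double> averageMillis(Map<String, AtomicLong> totalTimes,
                                             Map<String, AtomicLong> counts) {
        Map<String, Double> averages = new TreeMap<>();
        counts.forEach((key, count) -> {
            long n = count.get();
            AtomicLong total = totalTimes.get(key);
            if (n > 0 && total != null) {
                averages.put(key, (double) total.get() / n);
            }
        });
        return averages;
    }
}
```

In the agent it could be called as MetricsSummary.averageMillis(MetricsCollector.getExecutionTimeMap(), MetricsCollector.getInvocationCountMap()). The two maps are read at slightly different moments, so under concurrent load the averages are approximate.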
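5. Integrate Prometheus and Grafana (optional)
To visualize the monitoring data, we can expose the collected metrics to Prometheus and display them in Grafana. First, make sure the Micrometer Prometheus registry dependency is present (it is already declared in the POM from step 1):
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.10.0</version>
</dependency>
Then modify the MetricsCollector class so that the collected metrics are registered with Micrometer: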
package com.example.agent;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public class MetricsCollector {

    private static final PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
    private static final Map<String, Timer> timers = new ConcurrentHashMap<>();
    private static final Map<String, Counter> exceptionCounters = new ConcurrentHashMap<>();

    public static void recordExecutionTime(String className, String methodName, long executionTime) {
        String key = className + "." + methodName;
        getOrCreateTimer(key, "controller").record(executionTime, TimeUnit.MILLISECONDS);
        System.out.printf("Controller call: %s, took: %d ms%n", key, executionTime);
    }

    public static void recordException(String className, String methodName, Exception e) {
        String key = className + "." + methodName;
        getOrCreateExceptionCounter(key, "controller", e.getClass().getSimpleName()).increment();
        System.out.printf("Controller exception: %s, type: %s, message: %s%n",
                key, e.getClass().getName(), e.getMessage());
    }

    public static void recordSqlExecutionTime(String className, String methodName, long executionTime) {
        String key = className + "." + methodName;
        getOrCreateTimer(key, "sql").record(executionTime, TimeUnit.MILLISECONDS);
        System.out.printf("SQL call: %s, took: %d ms%n", key, executionTime);
    }

    public static void recordSqlException(String className, String methodName, Exception e) {
        String key = className + "." + methodName;
        getOrCreateExceptionCounter(key, "sql", e.getClass().getSimpleName()).increment();
        System.out.printf("SQL exception: %s, type: %s, message: %s%n",
                key, e.getClass().getName(), e.getMessage());
    }

    private static Timer getOrCreateTimer(String name, String type) {
        // The cache key includes the type so that "controller" and "sql" timers never collide
        return timers.computeIfAbsent(type + ":" + name, k ->
                Timer.builder("app.execution.time")
                        .tag("name", name)
                        .tag("type", type)
                        .register(registry)
        );
    }

    private static Counter getOrCreateExceptionCounter(String name, String type, String exceptionType) {
        String key = type + ":" + name + "." + exceptionType;
        return exceptionCounters.computeIfAbsent(key, k ->
                Counter.builder("app.exception.count")
                        .tag("name", name)
                        .tag("type", type)
                        .tag("exception", exceptionType)
                        .register(registry)
        );
    }

    // Returns the metrics in the Prometheus text format
    public static String scrape() {
        return registry.scrape();
    }

    // Returns the registry so that other components can use it
    public static MeterRegistry getRegistry() {
        return registry;
    }
}
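The agent so far only prints the scrape output to the console, while Prometheus collects metrics by pulling them over HTTP. One way to bridge that gap is sketched below with the JDK's built-in com.sun.net.httpserver server. The class name MetricsHttpEndpoint, the /metrics path, and the Supplier parameter are assumptions for illustration; in the agent the supplier would be MetricsCollector::scrape.

```java
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class MetricsHttpEndpoint {

    // Starts an HTTP server that answers GET /metrics with the scraper's current output.
    // Passing port 0 lets the operating system pick a free port.
    static HttpServer start(int port, Supplier<String> scraper) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = scraper.get().getBytes(StandardCharsets.UTF_8);
            // Content type of the Prometheus text exposition format
            exchange.getResponseHeaders().set("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        // The server keeps running until stop() is called on the returned instance.
        server.start();
        return server;
    }
}
```

Calling MetricsHttpEndpoint.start(9400, MetricsCollector::scrape) from premain would then give Prometheus a scrape target; the port number here is an arbitrary example.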
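6. Start the Agent and attach it to the SpringBoot application
After compiling and packaging the Agent project, attach the Agent to the SpringBoot application with the -javaagent JVM option: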
java -javaagent:/path/to/springboot-monitor-agent.jar -jar your-springboot-app.jar
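Advanced extensions
Beyond the basic monitoring features, the Agent can be extended in the following ways:
1. JVM metrics monitoring
Monitor the JVM's memory usage, GC activity, thread count, and similar metrics: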
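import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.binder.jvm.JvmGcMetrics;
import io.micrometer.core.instrument.binder.jvm.JvmMemoryMetrics;
import io.micrometer.core.instrument.binder.jvm.JvmThreadMetrics;

private static void monitorJvmMetrics(MeterRegistry registry) {
    // Register JVM memory metrics
    new JvmMemoryMetrics().bindTo(registry);
    // Register GC metrics
    new JvmGcMetrics().bindTo(registry);
    // Register thread metrics
    new JvmThreadMetrics().bindTo(registry);
}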
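For a quick look at the same kind of data without Micrometer, the JDK's platform MXBeans can be read directly. The sketch below is an alternative built on java.lang.management, not the Micrometer binders themselves; the JvmSnapshot class and the metric key names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

class JvmSnapshot {

    // Captures heap usage, live thread count, and accumulated GC activity at this instant.
    static Map<String, Long> capture() {
        Map<String, Long> snapshot = new LinkedHashMap<>();

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        snapshot.put("heap.used.bytes", heap.getUsed());
        snapshot.put("heap.committed.bytes", heap.getCommitted());

        snapshot.put("threads.live", (long) ManagementFactory.getThreadMXBean().getThreadCount());

        long gcCount = 0;
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value
            gcCount += Math.max(0, gc.getCollectionCount());
            gcMillis += Math.max(0, gc.getCollectionTime());
        }
        snapshot.put("gc.count", gcCount);
        snapshot.put("gc.time.millis", gcMillis);
        return snapshot;
    }

    public static void main(String[] args) {
        capture().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

A snapshot like this could be printed from the agent's periodic reporter alongside the controller metrics.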
2. HTTP client monitoring
Monitor the HTTP requests that the application sends:
// HttpClientInterceptor is not shown here; it would follow the same pattern as ControllerInterceptor
new AgentBuilder.Default()
        .type(ElementMatchers.nameContains("RestTemplate")
                .or(ElementMatchers.nameContains("HttpClient")))
        .transform((builder, typeDescription, classLoader, module, protectionDomain) ->
                builder.method(ElementMatchers.named("execute")
                                .or(ElementMatchers.named("doExecute"))
                                .or(ElementMatchers.named("exchange")))
                        .intercept(MethodDelegation.to(HttpClientInterceptor.class))
        )
        .installOn(instrumentation);
3. Distributed tracing integration
Integrate with a distributed tracing system such as Zipkin or Jaeger to trace requests end to end:
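import org.slf4j.MDC; // assuming SLF4J is the application's logging facade

public static void recordTraceInfo(String className, String methodName, String traceId, String spanId) {
    // Put the trace identifiers into the logging context
    MDC.put("traceId", traceId);
    MDC.put("spanId", spanId);
    // Processing logic...
}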
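The recordTraceInfo method takes a traceId and a spanId but does not show where they come from. One common source is an incoming request header. The sketch below assumes the W3C Trace Context traceparent header (version 00) as the propagation format; Zipkin setups often use B3 headers instead, which this sketch does not cover. The TraceParent class is a hypothetical helper.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TraceParent {

    // Version 00 layout: 00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
    private static final Pattern FORMAT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;
    final String spanId;

    private TraceParent(String traceId, String spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }

    // Returns the parsed IDs, or empty when the header is missing or malformed.
    static Optional<TraceParent> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String traceId = m.group(1);
        String spanId = m.group(2);
        // IDs consisting only of zeros are defined as invalid
        if (traceId.chars().allMatch(c -> c == '0') || spanId.chars().allMatch(c -> c == '0')) {
            return Optional.empty();
        }
        return Optional.of(new TraceParent(traceId, spanId));
    }
}
```

The parsed traceId and spanId fields could then be handed to recordTraceInfo; when parsing returns empty, the interceptor would need to generate fresh identifiers or skip tracing for that request.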
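Advantages and caveats
Advantages
- Non-intrusive: no changes to the application's source code are needed
- Flexible: the classes and methods to monitor can be chosen dynamically
- General: works with any SpringBoot-based application
- Runtime monitoring: runtime data is collected in real time
Caveats
- Performance impact: bytecode enhancement adds some overhead, so choose the monitoring points carefully
- Compatibility: make sure the Agent is compatible with the application's JDK version
- Stability: exceptions inside the Agent must not affect the application's main flow
- Security: the collected data may contain sensitive information, so pay attention to data security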
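The stability caveat can be made concrete with a small guard around every metrics call, so that a failure inside the Agent is counted rather than thrown into application code. This is a minimal sketch; the SafeRecorder class is hypothetical, and catching Throwable is a deliberate trade-off that also hides serious errors, which may not suit every deployment.

```java
import java.util.concurrent.atomic.AtomicLong;

class SafeRecorder {

    // Number of metric updates that failed and were dropped
    static final AtomicLong droppedRecords = new AtomicLong();

    // Runs a metrics action; any failure is counted and swallowed
    // so that it never reaches the intercepted application method.
    static void record(Runnable action) {
        try {
            action.run();
        } catch (Throwable t) {
            droppedRecords.incrementAndGet();
        }
    }
}
```

In the interceptor, a call such as SafeRecorder.record(() -> MetricsCollector.recordExecutionTime(className, methodName, executionTime)) would keep a broken collector from turning a successful request into a failed one.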
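Summary
In practice, the Agent can be customized to specific needs in order to achieve more fine-grained monitoring.
It can also be integrated with an existing monitoring system to build a complete application performance monitoring setup.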