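Overview
This walkthrough follows the source code of the latest releases only. Older code is not analyzed because it is no longer being developed. Versions covered: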
- openfeign 3.1.1
- loadbalancer 3.1.1
- spring-cloud-starter-alibaba-nacos-discovery 2.2.2
- spring-cloud-starter-alibaba-sentinel 2021.2
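The core flow differs little between versions. The goal is to explain how the whole call chain works, so that abnormal call behavior anywhere in the Spring Cloud stack is easier to diagnose and the code is easier to extend at a deeper level.
How to use
Annotation
Add @EnableFeignClients to the Spring Boot application class.
Configuration
Feign configuration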
feign:
  sentinel:
    enabled: true
  client:
    config:
      default: # global; "default" applies to all services
        connectTimeout: 3000 # connect timeout in milliseconds (Feign's built-in default is 10000)
        readTimeout: 3000 # read timeout in milliseconds (Feign's built-in default is 60000)
        loggerLevel: FULL
        # requestInterceptors:
        #   - com.okcoin.commons.feign.interceptor.CustomeFeignInterceptor
      feign-provider: # per-service settings, keyed by the name of the target service
        connectTimeout: 3000
        readTimeout: 2000
        loggerLevel: FULL
        # requestInterceptors:
        #   - com.okcoin.commons.feign.interceptor.CustomeFeignInterceptor
        decode404: true
        # retryer: feign.Retryer.Default
  # HTTP client selection
  httpclient:
    enabled: false # Apache HttpClient
  okhttp:
    enabled: false # OkHttp
  asynchttpclient:
    enabled: true # async HTTP client
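Loadbalancer configuration
spring:
  application:
    name: feign-consumer # application name
  cloud:
    # Nacos service registration and discovery
    nacos:
      discovery:
        server-addr: office-nacos.okg.com # Nacos server address
    # load balancing
    loadbalancer:
      enabled: true
      # retry
      retry:
        enabled: false # turns the loadbalancer retry mechanism on or off; default true
        maxRetriesOnSameServiceInstance: 2 # retries on the current instance, not counting the first request; default 0
        maxRetriesOnNextServiceInstance: 2 # retries on a newly chosen instance, not counting the current one; default 1
        retryOnAllOperations: true # retry every kind of operation, not only GET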
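Usage
- Annotate the calling interface with @FeignClient. Its main attributes:
  - name: the serviceId registered on Nacos.
  - contextId: Feign places the Spring beans generated for an interface in a Spring child context; this attribute is the id of that child context.
  - configuration: all configuration for the interface (encoder and decoder, retry strategy, log level, and so on) applies to this Feign interface only. The beans are stored in the child context whose id is contextId, which isolates them from other Feign interfaces.
  - url: the remote target address. Once url is set, name is no longer used to resolve the target: Feign does not ask loadbalancer for an ip:port mapping and no load balancing takes place.
  - Others: omitted.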
@FeignClient(name = "feign-provider", contextId = "feign-provider01"/*,url = "localhost:9999"*/, fallbackFactory = TestRestClientFallbackFactory.class, configuration = CustomFeignConfiguration.class
)
public interface TestRestClient {
@GetMapping(value = "/test")
public String test(@RequestParam(name = "size") int size);
public static void main(String[] args) {
ByteBuffer target = ByteBuffer.wrap(new byte[10000]);
target.put(new byte[1024]);
target.flip();
}
}
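Spring Cloud RPC core flow
Registration, loading, proxying, circuit breaking and fallback (OpenFeign + Sentinel)
Registration and scanning
The @EnableFeignClients annotation imports FeignClientsRegistrar, the scanner for Feign annotations.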
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@Documented
@Import(FeignClientsRegistrar.class)
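public @interface EnableFeignClients {
// attributes omitted
}
FeignClientsRegistrar implements the registerBeanDefinitions method of ImportBeanDefinitionRegistrar. This lets it hook into Spring startup and register additional bean definitions before the beans themselves are created.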
@Override
public void registerBeanDefinitions(AnnotationMetadata metadata, BeanDefinitionRegistry registry) {
registerDefaultConfiguration(metadata, registry);
registerFeignClients(metadata, registry);
}
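- registerDefaultConfiguration registers the default Feign configuration.
  - It is not covered in depth here. It takes the defaultConfiguration attribute of @EnableFeignClients and registers it so that it applies to every Feign child context. The configuration attribute of each @FeignClient is registered separately, inside registerFeignClients.
- [Key point] registerFeignClients registers one bean definition per @FeignClient interface in the Spring container. The dynamic proxy itself is created later from that definition.
  - The bean is created through FeignClientFactoryBean.
  - The contextId is resolved so that the client's collaborators are created in their own Spring child context.
  - Interface information is injected into the FeignClientFactoryBean:
    - name: the name configured on the @FeignClient annotation
    - type: the interface class annotated with @FeignClient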
private void registerFeignClient(BeanDefinitionRegistry registry, AnnotationMetadata annotationMetadata,
Map<String, Object> attributes) {
String className = annotationMetadata.getClassName();
Class clazz = ClassUtils.resolveClassName(className, null);
ConfigurableBeanFactory beanFactory = registry instanceof ConfigurableBeanFactory
? (ConfigurableBeanFactory) registry : null;
String contextId = getContextId(beanFactory, attributes);
String name = getName(attributes);
FeignClientFactoryBean factoryBean = new FeignClientFactoryBean();
factoryBean.setBeanFactory(beanFactory);
// name here is the serviceId on Nacos
factoryBean.setName(name);
factoryBean.setContextId(contextId);
factoryBean.setType(clazz);
factoryBean.setRefreshableClient(isClientRefreshEnabled());
// part of the code is omitted here
AbstractBeanDefinition beanDefinition = definition.getBeanDefinition();
beanDefinition.setAttribute("feignClientsRegistrarFactoryBean", factoryBean);
BeanDefinitionHolder holder = new BeanDefinitionHolder(beanDefinition, className, qualifiers);
BeanDefinitionReaderUtils.registerBeanDefinition(holder, registry);
registerOptionsBeanDefinition(registry, contextId);
}
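Creating the bean through a FactoryBean
FeignClientFactoryBean implements FactoryBean, so when Spring creates the bean it calls getObject to obtain the actual object.
public class FeignClientFactoryBean implements FactoryBean<Object>, InitializingBean, ApplicationContextAware, BeanFactoryAware
The main steps in getTarget:
- Obtain the FeignContext, which FeignAutoConfiguration registers in the Spring container.
- Build the Feign.Builder. Different circuit-breaking strategies plug in through this object, and it is also the entry point for Sentinel.
- When the @FeignClient annotation has no url attribute, set up load balancing.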
@Override
public Object getObject() {
return getTarget();
}
<T> T getTarget() {
FeignContext context = beanFactory != null ? beanFactory.getBean(FeignContext.class)
: applicationContext.getBean(FeignContext.class);
Feign.Builder builder = feign(context);
if (!StringUtils.hasText(url)) {
if (LOG.isInfoEnabled()) {
LOG.info("For '" + name + "' URL not provided. Will try picking an instance via load-balancing.");
}
if (!name.startsWith("http")) {
url = "http://" + name;
}
else {
url = name;
}
url += cleanPath();
return (T) loadBalance(builder, context, new HardCodedTarget<>(type, name, url));
}
// part of the code is omitted here
}
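The FactoryBean idea used above can be pictured with a toy container. This is only a sketch under stated assumptions: ToyFactoryBean, ToyContainer, and FactoryBeanDemo are hypothetical names invented for illustration and are not Spring types. The point it shows is that registration stores a factory, while the caller later receives the factory's product (in the Feign case, the proxy) rather than the factory itself.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: ToyFactoryBean and ToyContainer are hypothetical names, not Spring types.
interface ToyFactoryBean<T> {
    T getObject();
}

class ToyContainer {
    private final Map<String, ToyFactoryBean<?>> factories = new HashMap<>();
    private final Map<String, Object> singletons = new HashMap<>();

    // Registration stores only the factory; nothing is created yet.
    void register(String beanName, ToyFactoryBean<?> factory) {
        factories.put(beanName, factory);
    }

    // Lookup returns the factory's product, created on first access and then cached.
    Object getBean(String beanName) {
        Object bean = singletons.get(beanName);
        if (bean == null) {
            bean = factories.get(beanName).getObject();
            singletons.put(beanName, bean);
        }
        return bean;
    }
}

class FactoryBeanDemo {
    static String run() {
        int[] created = {0};
        ToyContainer container = new ToyContainer();
        container.register("testRestClient", () -> {
            created[0]++;
            return "proxy-for-TestRestClient";
        });
        Object first = container.getBean("testRestClient");
        Object second = container.getBean("testRestClient");
        // product : same instance on both lookups : number of getObject calls
        return first + ":" + (first == second) + ":" + created[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model the factory runs once and both lookups return the same product, which mirrors how a singleton Feign client proxy is handed to every injection point.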
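Building the Feign.Builder
The feign method does the following:
- It fetches the logger factory, encoder, decoder, and contract from the Spring container and sets them on the Feign.Builder for later use.
- The get method looks beans up in the Spring container, so this is an extension point: a strategy can be replaced by declaring your own bean where the default one is guarded by @ConditionalOnMissingBean.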
protected Feign.Builder feign(FeignContext context) {
FeignLoggerFactory loggerFactory = get(context, FeignLoggerFactory.class);
Logger logger = loggerFactory.create(type);
// @formatter:off
Feign.Builder builder = get(context, Feign.Builder.class)
// required values
.logger(logger)
.encoder(get(context, Encoder.class))
.decoder(get(context, Decoder.class))
.contract(get(context, Contract.class));
// @formatter:on
configureFeign(context, builder);
return builder;
}
protected <T> T get(FeignContext context, Class<T> type) {
T instance = context.getInstance(contextId, type);
if (instance == null) {
throw new IllegalStateException("No bean found of type " + type + " for " + contextId);
}
return instance;
}
When feign.sentinel.enabled=true is set in the configuration file, SentinelFeign.Builder is registered in the Spring context automatically, so the get call above returns a SentinelFeign.Builder. SentinelFeign.Builder extends Feign.Builder:
public static final class Builder extends Feign.Builder implements ApplicationContextAware {
Auto-configuration of the Feign.Builder bean
@Configuration(proxyBeanMethods = false)
@ConditionalOnClass({ SphU.class, Feign.class })
public class SentinelFeignAutoConfiguration {
@Bean
@Scope("prototype")
@ConditionalOnMissingBean
@ConditionalOnProperty(name = "feign.sentinel.enabled")
public Feign.Builder feignSentinelBuilder() {
return SentinelFeign.builder();
}
}
This shows that Sentinel integrates into the OpenFeign flow through the extension point that Feign.Builder provides.
Next we look at the loadBalance method.
protected <T> T loadBalance(Feign.Builder builder, FeignContext context, HardCodedTarget<T> target) {
// The Client resolved here depends on the configured HTTP client and on whether retry is enabled
Client client = getOptional(context, Client.class);
if (client != null) {
builder.client(client);
applyBuildCustomizers(context, builder);
Targeter targeter = get(context, Targeter.class);
return targeter.target(this, builder, context, target);
}
throw new IllegalStateException(
"No Feign Client for loadBalancing defined. Did you forget to include spring-cloud-starter-loadbalancer?");
}
Obtaining the RPC client
- getOptional fetches the bean of type Client from the Spring context.
- Which Client bean exists depends on configuration: the chosen HTTP client and whether retry is enabled determine which load-balancing Client is created. The main configuration classes are:
  - DefaultFeignLoadBalancerConfiguration: OpenFeign's default Client configuration.
  - OkHttpFeignLoadBalancerConfiguration: loaded when OkHttp is used, guarded by @ConditionalOnProperty("feign.okhttp.enabled").
  - OkFeignAsyncHttpAutoConfiguration: auto-configuration for OK's in-house asynchttpclient HTTP client, guarded by @ConditionalOnProperty("feign.asynchttpclient.enabled").
- The resolved Client bean (the client that issues the remote request) is set on the Feign.Builder for later use.
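Circuit breaking, fallback, and the dynamic-proxy trigger
- A Targeter implementation is fetched from the Spring container. It selects the circuit-breaking and fallback strategy and builds the dynamic proxy for RPC calls.
  - DefaultTargeter: the OpenFeign default, with no circuit breaking or fallback.
  - FeignCircuitBreakerTargeter: the circuit-breaker strategy provided by OpenFeign. FeignAutoConfiguration loads it when feign.circuitbreaker.enabled is set to true.
  - HystrixTargeter: requires spring-cloud-starter-netflix-hystrix (relevant to older releases that still ship Hystrix support).
Building the Feign client
- Feign#target builds the Feign client and creates the RPC dynamic proxy.
public <T> T target(Target<T> target) {
return build().newInstance(target);
}
Overriding the InvocationHandlerFactory
Because SentinelFeign.Builder overrides the parent's build method, we go straight to the Sentinel code. The method does essentially one thing: it replaces the invocationHandlerFactory of the parent Feign.Builder with a new object, which amounts to overriding how the parent creates dynamic proxies. As the source shows, build still ends by calling super.build().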
public Feign build() {
super.invocationHandlerFactory(new InvocationHandlerFactory() {
@Override
public InvocationHandler create(Target target,
Map<Method, MethodHandler> dispatch) {
GenericApplicationContext gctx = (GenericApplicationContext) Builder.this.applicationContext;
BeanDefinition def = gctx.getBeanDefinition(target.type().getName());
/**
* Due to the change of the initialization sequence, BeanFactory.getBean will cause a circular dependency.
* So FeignClientFactoryBean can only be obtained from BeanDefinition
*/
FeignClientFactoryBean feignClientFactoryBean = (FeignClientFactoryBean) def.getAttribute("feignClientsRegistrarFactoryBean");
Class fallback = feignClientFactoryBean.getFallback();
Class fallbackFactory = feignClientFactoryBean.getFallbackFactory();
String beanName = feignClientFactoryBean.getContextId();
return new SentinelInvocationHandler(target, dispatch);
}
private Object getFromContext(String name, String type,
Class fallbackType, Class targetType) {
Object fallbackInstance = feignContext.getInstance(name,
fallbackType);
if (fallbackInstance == null) {
throw new IllegalStateException(String.format(
"No %s instance of type %s found for feign client %s",
type, fallbackType, name));
}
if (!targetType.isAssignableFrom(fallbackType)) {
throw new IllegalStateException(String.format(
"Incompatible %s instance. Fallback/fallbackFactory of type %s is not assignable to %s for feign client %s",
type, fallbackType, targetType, name));
}
return fallbackInstance;
}
});
super.contract(new SentinelContractHolder(contract));
return super.build();
}
Building ReflectiveFeign
Feign#build constructs a ReflectiveFeign, the concrete Feign subclass, and passes a SynchronousMethodHandler.Factory into ParseHandlersByName.
public Feign build() {
Client client = Capability.enrich(this.client, capabilities);
Retryer retryer = Capability.enrich(this.retryer, capabilities);
List<RequestInterceptor> requestInterceptors = this.requestInterceptors.stream()
.map(ri -> Capability.enrich(ri, capabilities))
.collect(Collectors.toList());
Logger logger = Capability.enrich(this.logger, capabilities);
Contract contract = Capability.enrich(this.contract, capabilities);
Options options = Capability.enrich(this.options, capabilities);
Encoder encoder = Capability.enrich(this.encoder, capabilities);
Decoder decoder = Capability.enrich(this.decoder, capabilities);
InvocationHandlerFactory invocationHandlerFactory =
Capability.enrich(this.invocationHandlerFactory, capabilities);
QueryMapEncoder queryMapEncoder = Capability.enrich(this.queryMapEncoder, capabilities);
SynchronousMethodHandler.Factory synchronousMethodHandlerFactory =
new SynchronousMethodHandler.Factory(client, retryer, requestInterceptors, logger,
logLevel, decode404, closeAfterDecode, propagationPolicy, forceDecoding);
ParseHandlersByName handlersByName =
new ParseHandlersByName(contract, options, encoder, decoder, queryMapEncoder,
errorDecoder, synchronousMethodHandlerFactory);
return new ReflectiveFeign(handlersByName, invocationHandlerFactory, queryMapEncoder);
}
}
Next, the newInstance method of ReflectiveFeign.
The targetToHandlersByName field is the ParseHandlersByName object.
public <T> T newInstance(Target<T> target) {
// 1. Resolve one MethodHandler per interface method, keyed by config key
Map<String, MethodHandler> nameToHandler = targetToHandlersByName.apply(target);
Map<Method, MethodHandler> methodToHandler = new LinkedHashMap<Method, MethodHandler>();
List<DefaultMethodHandler> defaultMethodHandlers = new LinkedList<DefaultMethodHandler>();
for (Method method : target.type().getMethods()) {
if (method.getDeclaringClass() == Object.class) {
continue;
} else if (Util.isDefault(method)) {
DefaultMethodHandler handler = new DefaultMethodHandler(method);
defaultMethodHandlers.add(handler);
methodToHandler.put(method, handler);
} else {
methodToHandler.put(method, nameToHandler.get(Feign.configKey(target.type(), method)));
}
}
InvocationHandler handler = factory.create(target, methodToHandler);
T proxy = (T) Proxy.newProxyInstance(target.type().getClassLoader(),
new Class<?>[] {target.type()}, handler);
for (DefaultMethodHandler defaultMethodHandler : defaultMethodHandlers) {
defaultMethodHandler.bindTo(proxy);
}
return proxy;
}
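The dispatch idea in newInstance can be reproduced with nothing but the JDK. The following is a minimal sketch, not Feign's implementation: GreetingClient and Handler are hypothetical stand-ins for a Feign client interface and for MethodHandler, and handlers are keyed by plain method name instead of Feign's config key.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.LinkedHashMap;
import java.util.Map;

class ProxyDispatchDemo {
    // Hypothetical stand-in for an interface annotated with @FeignClient.
    interface GreetingClient {
        String greet(String name);
        int size(int n);
    }

    // Hypothetical stand-in for Feign's MethodHandler.
    interface Handler {
        Object invoke(Object[] args) throws Throwable;
    }

    @SuppressWarnings("unchecked")
    static <T> T newInstance(Class<T> type, Map<String, Handler> byName) {
        // Convert the name-keyed map into a Method-keyed dispatch table.
        Map<Method, Handler> dispatch = new LinkedHashMap<>();
        for (Method m : type.getMethods()) {
            dispatch.put(m, byName.get(m.getName()));
        }
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // equals/hashCode/toString are answered by the dispatch map in this sketch.
                return method.invoke(dispatch, args);
            }
            // Every interface call is routed to the handler registered for that method.
            return dispatch.get(method).invoke(args);
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    static String run() {
        Map<String, Handler> byName = new LinkedHashMap<>();
        byName.put("greet", args -> "GET /greet?name=" + args[0]);
        byName.put("size", args -> ((Integer) args[0]) * 2);
        GreetingClient client = newInstance(GreetingClient.class, byName);
        return client.greet("feign") + "|" + client.size(21);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The caller only ever sees the interface. Which handler runs is decided entirely by the Method-keyed table, which is why swapping the InvocationHandlerFactory is enough for Sentinel to wrap every call.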
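ParseHandlersByName#apply
- The contract parses the interface annotated with @FeignClient into a list of method metadata.
- For each metadata entry, SynchronousMethodHandler.Factory#create builds a SynchronousMethodHandler, which is stored in a Map<String, MethodHandler> keyed by config key. newInstance converts this into a Map<Method, MethodHandler> and passes it to the InvocationHandlerFactory to create the dynamic proxy. SynchronousMethodHandler can be seen as the delegate behind the proxy: the JDK dynamic proxy ends up calling its invoke method to perform the remote call.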
public Map<String, MethodHandler> apply(Target target) {
List<MethodMetadata> metadata = contract.parseAndValidateMetadata(target.type());
Map<String, MethodHandler> result = new LinkedHashMap<String, MethodHandler>();
for (MethodMetadata md : metadata) {
BuildTemplateByResolvingArgs buildTemplate;
if (!md.formParams().isEmpty() && md.template().bodyTemplate() == null) {
buildTemplate =
new BuildFormEncodedTemplateFromArgs(md, encoder, queryMapEncoder, target);
} else if (md.bodyIndex() != null || md.alwaysEncodeBody()) {
buildTemplate = new BuildEncodedTemplateFromArgs(md, encoder, queryMapEncoder, target);
} else {
buildTemplate = new BuildTemplateByResolvingArgs(md, queryMapEncoder, target);
}
if (md.isIgnored()) {
result.put(md.configKey(), args -> {
throw new IllegalStateException(md.configKey() + " is not a method handled by feign");
});
} else {
result.put(md.configKey(),
factory.create(target, md, buildTemplate, options, decoder, errorDecoder));
}
}
return result;
}
}
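Creating the dynamic proxy with Sentinel
As mentioned above, SentinelFeign.Builder#build replaces the invocationHandlerFactory in Feign. What does the replacement do? The source shows that it creates a SentinelInvocationHandler and passes it the SynchronousMethodHandler objects. SentinelInvocationHandler is where Sentinel's circuit breaking and fallback are integrated.
The key logic in the code below:
- Before the real RPC call, SphU.entry is invoked to apply circuit breaking and fallback rules.
- When Sentinel blocks the call or the call itself throws, the catch block looks up the fallback Method in fallbackMethodMap (populated ahead of time) and invokes it reflectively.
- resourceName is the resource key Sentinel uses for flow control and circuit breaking. A point to note: developers often assume that setting feign.sentinel.enabled=true is enough to get circuit breaking and flow control, but how are the rules themselves configured? The question is raised here and answered later.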
public Object invoke(final Object proxy, final Method method, final Object[] args)
throws Throwable {
// part of the code is omitted here
Object result;
MethodHandler methodHandler = this.dispatch.get(method);
// only handle by HardCodedTarget
if (target instanceof Target.HardCodedTarget) {
Target.HardCodedTarget hardCodedTarget = (Target.HardCodedTarget) target;
MethodMetadata methodMetadata = SentinelContractHolder.METADATA_MAP
.get(hardCodedTarget.type().getName()
+ Feign.configKey(hardCodedTarget.type(), method));
// resource default is HttpMethod:protocol://url
if (methodMetadata == null) {
result = methodHandler.invoke(args);
}
else {
String resourceName = methodMetadata.template().method().toUpperCase()
+ ":" + hardCodedTarget.url() + methodMetadata.template().path();
Entry entry = null;
try {
ContextUtil.enter(resourceName);
entry = SphU.entry(resourceName, EntryType.OUT, 1, args);
result = methodHandler.invoke(args);
}
catch (Throwable ex) {
// fallback handle
if (!BlockException.isBlockException(ex)) {
Tracer.trace(ex);
}
if (fallbackFactory != null) {
try {
Object fallbackResult = fallbackMethodMap.get(method)
.invoke(fallbackFactory.create(ex), args);
return fallbackResult;
}
catch (IllegalAccessException e) {
// shouldn't happen as method is public due to being an
// interface
throw new AssertionError(e);
}
catch (InvocationTargetException e) {
throw new AssertionError(e.getCause());
}
}
else {
// throw exception if fallbackFactory is null
throw ex;
}
}
finally {
if (entry != null) {
entry.exit(1, args);
}
ContextUtil.exit();
}
}
}
else {
// other target type using default strategy
result = methodHandler.invoke(args);
}
return result;
}
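The guard-then-fallback shape of this handler can be sketched without Sentinel on the classpath. Everything here is an assumption made for illustration: Guard and BlockedException are hypothetical stand-ins for Sentinel's entry check and BlockException, and the remote call is a plain Supplier. Only the resource-name format (HTTP method, colon, target URL, path) is taken from the code above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Supplier;

class GuardedCallDemo {
    // Hypothetical stand-in for Sentinel's BlockException.
    static class BlockedException extends RuntimeException {
        BlockedException(String resource) {
            super("blocked: " + resource);
        }
    }

    // Hypothetical guard: rejects every call to a resource that has been marked as blocked.
    static class Guard {
        private final Set<String> blocked = new HashSet<>();

        void block(String resource) {
            blocked.add(resource);
        }

        void enter(String resource) {
            if (blocked.contains(resource)) {
                throw new BlockedException(resource);
            }
        }
    }

    // Same format as the handler above: METHOD:url + path.
    static String resourceName(String httpMethod, String url, String path) {
        return httpMethod.toUpperCase() + ":" + url + path;
    }

    // Check the guard, run the call, and fall back on any failure if a fallback factory exists.
    static <T> T call(Guard guard, String resource, Supplier<T> remote,
            Function<Throwable, T> fallbackFactory) {
        try {
            guard.enter(resource);
            return remote.get();
        } catch (RuntimeException ex) {
            if (fallbackFactory != null) {
                return fallbackFactory.apply(ex);
            }
            throw ex;
        }
    }

    static String run() {
        Guard guard = new Guard();
        String resource = resourceName("get", "http://feign-provider", "/test");
        String open = call(guard, resource, () -> "remote", ex -> "fallback");
        guard.block(resource);
        String blocked = call(guard, resource, () -> "remote", ex -> "fallback");
        return resource + " -> " + open + "," + blocked;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The sketch also makes the resource key concrete: for the TestRestClient example earlier it would be GET:http://feign-provider/test, which is the name a rule has to reference.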
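At this point you have seen how OpenFeign handles registration, loading, proxying, and circuit breaking with fallback.
Retry and load balancing (OpenFeign retry, loadbalancer retry, loadbalancer load balancing)
As mentioned above, the object that actually performs the remote call is SynchronousMethodHandler. It is the entry point for OpenFeign, loadbalancer, and Nacos service discovery.
- RequestTemplate.Factory turns the call arguments into a RequestTemplate.
- The Retryer is cloned. A retryer counts attempts and is therefore stateful, so each invocation needs its own copy.
- executeAndDecode performs the remote RPC call.
- The catch block delegates retry handling to the retryer.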
@Override
public Object invoke(Object[] argv) throws Throwable {
RequestTemplate template = buildTemplateFromArgs.create(argv);
Options options = findOptions(argv);
Retryer retryer = this.retryer.clone();
while (true) {
try {
return executeAndDecode(template, options);
} catch (RetryableException e) {
try {
retryer.continueOrPropagate(e);
} catch (RetryableException th) {
Throwable cause = th.getCause();
if (propagationPolicy == UNWRAP && cause != null) {
throw cause;
} else {
throw th;
}
}
if (logLevel != Logger.Level.NONE) {
logger.logRetry(metadata.configKey(), logLevel);
}
continue;
}
}
}
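Why the retryer has to be cloned is easiest to see in a small model. This is a sketch, not Feign's Retryer: CountingRetryer and RetryableFailure are hypothetical names, the back-off sleep is left out, and the attempt check is modeled as "propagate once the attempt counter reaches the maximum".

```java
import java.util.function.Supplier;

class RetryLoopDemo {
    // Hypothetical stand-in for a retryable exception.
    static class RetryableFailure extends RuntimeException {
        RetryableFailure(String message) {
            super(message);
        }
    }

    // Hypothetical, simplified retryer that allows maxAttempts tries in total.
    static class CountingRetryer {
        private final int maxAttempts;
        private int attempt = 1;

        CountingRetryer(int maxAttempts) {
            this.maxAttempts = maxAttempts;
        }

        // Either returns (the caller loops again) or rethrows when attempts are used up.
        void continueOrPropagate(RetryableFailure e) {
            if (attempt++ >= maxAttempts) {
                throw e;
            }
        }

        // A fresh copy with the same limit and a reset counter.
        CountingRetryer copy() {
            return new CountingRetryer(maxAttempts);
        }
    }

    // Same loop shape as the invoke method above: clone, try, ask the retryer on failure.
    static String invoke(CountingRetryer template, Supplier<String> call) {
        CountingRetryer retryer = template.copy();
        while (true) {
            try {
                return call.get();
            } catch (RetryableFailure e) {
                retryer.continueOrPropagate(e);
            }
        }
    }

    // Two invocations share one retryer template; each call fails twice, then succeeds.
    static int[] run() {
        CountingRetryer shared = new CountingRetryer(3);
        int[] calls = new int[2];
        for (int i = 0; i < 2; i++) {
            final int idx = i;
            invoke(shared, () -> {
                if (++calls[idx] < 3) {
                    throw new RetryableFailure("attempt " + calls[idx]);
                }
                return "ok";
            });
        }
        return calls;
    }

    public static void main(String[] args) {
        int[] calls = run();
        System.out.println(calls[0] + "," + calls[1]);
    }
}
```

If invoke used the shared instance directly, the second invocation would start with an exhausted counter and fail on its first error. The per-invocation copy is what gives every call its full retry budget.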
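SynchronousMethodHandler#executeAndDecode
- targetRequest: the configured interceptors run here, so business logic can be hooked in before the RPC call. The RequestTemplate is then converted into a Feign Request for the HTTP call.
- client.execute performs the actual HTTP request. Which HTTP client sits behind client depends on configuration.
- The response is decoded by a decoder.
- If the developer has not configured a custom decoder, OpenFeign's built-in decoding logic is used.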
Object executeAndDecode(RequestTemplate template, Options options) throws Throwable {
Request request = targetRequest(template);
if (logLevel != Logger.Level.NONE) {
logger.logRequest(metadata.configKey(), logLevel, request);
}
Response response;
long start = System.nanoTime();
try {
response = client.execute(request, options);
// ensure the request is set. TODO: remove in Feign 12
response = response.toBuilder()
.request(request)
.requestTemplate(template)
.build();
} catch (IOException e) {
if (logLevel != Logger.Level.NONE) {
logger.logIOException(metadata.configKey(), logLevel, e, elapsedTime(start));
}
throw errorExecuting(request, e);
}
long elapsedTime = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
if (decoder != null)
return decoder.decode(response, metadata.returnType());
CompletableFuture<Object> resultFuture = new CompletableFuture<>();
asyncResponseHandler.handleResponse(resultFuture, metadata.configKey(), response,
metadata.returnType(),
elapsedTime);
try {
if (!resultFuture.isDone())
throw new IllegalStateException("Response handling not done");
return resultFuture.join();
} catch (CompletionException e) {
Throwable cause = e.getCause();
if (cause != null)
throw cause;
throw e;
}
}
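How Spring Retry works
Before going further, a few words on the basic logic of Spring Retry.
Retrying continues only while both of the following conditions hold. Otherwise the retry loop exits.
- canRetry must be true
- isExhaustedOnly must be false
The main steps inside the loop:
- doWithRetry runs the logic being retried, which the caller implements.
- registerThrowable is the callback invoked after an attempt throws.
- doOnErrorInterceptors lets configured listeners apply custom handling to the exception.
- backOffPolicy.backOff(backOffContext) decides whether to wait for a while before the next attempt.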
while (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
try {
if (this.logger.isDebugEnabled()) {
this.logger.debug("Retry: count=" + context.getRetryCount());
}
lastException = null;
return retryCallback.doWithRetry(context);
}
catch (Throwable e) {
lastException = e;
try {
registerThrowable(retryPolicy, state, context, e);
}
catch (Exception ex) {
throw new TerminatedRetryException("Could not register throwable",
ex);
}
finally {
doOnErrorInterceptors(retryCallback, context, e);
}
if (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
try {
backOffPolicy.backOff(backOffContext);
}
catch (BackOffInterruptedException ex) {
lastException = e;
// back off was prevented by another thread - fail the retry
if (this.logger.isDebugEnabled()) {
this.logger
.debug("Abort retry because interrupted: count="
+ context.getRetryCount());
}
throw ex;
}
}
if (this.logger.isDebugEnabled()) {
this.logger.debug(
"Checking for rethrow: count=" + context.getRetryCount());
}
if (shouldRethrow(retryPolicy, context, state)) {
if (this.logger.isDebugEnabled()) {
this.logger.debug("Rethrow in retry for policy: count="
+ context.getRetryCount());
}
throw RetryTemplate.<E>wrapIfNecessary(e);
}
}
if (state != null && context.hasAttribute(GLOBAL_STATE)) {
break;
}
}
loadbalancer retry
Now let's see how loadbalancer retries and how the configuration drives it. Two parameters matter first: maxRetriesOnSameServiceInstance and maxRetriesOnNextServiceInstance.
- maxRetriesOnSameServiceInstance: the maximum number of retries on the current instance.
- maxRetriesOnNextServiceInstance: the maximum number of retries on a newly chosen instance. The new instance may turn out to be the one that just failed. That depends on the choose method of the load-balancing algorithm. The algorithms loadbalancer currently ships do not remember which instance failed, so the failing instance can be selected again. This is analyzed below.
client.execute is where the remote call is really issued, and also where the concrete target server is resolved. As described above, with retry enabled the client is RetryableFeignBlockingLoadBalancerClient, which drives the loadbalancer retry flow.
- Create the loadbalancer retry policy, BlockingLoadBalancedRetryPolicy.
- Build a RetryTemplate from that policy:
  - Back-off: none by default, so a failed attempt is retried immediately without waiting.
  - Policy delegate: the policy created above is passed to InterceptorRetryPolicy, which delegates retry decisions to BlockingLoadBalancedRetryPolicy. InterceptorRetryPolicy implements RetryPolicy and overrides canRetry and registerThrowable.
private RetryTemplate buildRetryTemplate(String serviceId, Request request, LoadBalancedRetryPolicy retryPolicy) {
RetryTemplate retryTemplate = new RetryTemplate();
// a. Build the back-off policy; no back-off by default
BackOffPolicy backOffPolicy = this.loadBalancedRetryFactory.createBackOffPolicy(serviceId);
retryTemplate.setBackOffPolicy(backOffPolicy == null ? new NoBackOffPolicy() : backOffPolicy);
RetryListener[] retryListeners = this.loadBalancedRetryFactory.createRetryListeners(serviceId);
if (retryListeners != null && retryListeners.length != 0) {
retryTemplate.setListeners(retryListeners);
}
retryTemplate.setRetryPolicy(retryPolicy == null ? new NeverRetryPolicy()
: new InterceptorRetryPolicy(toHttpRequest(request), retryPolicy, loadBalancerClient, serviceId));
return retryTemplate;
}
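When an RPC call throws, InterceptorRetryPolicy#registerThrowable is invoked, and the loadbalancer policy's registerThrowable updates the retry state:
- First check whether the same instance can be retried. If canRetrySameServer == true, then sameServerCount++.
- If canRetrySameServer == false, then sameServerCount = 0 and nextServerCount++.
- Then check whether another instance may still be tried. If so, context.setServiceInstance(null) is executed so that the next attempt selects an instance again.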
@Override
public void registerThrowable(LoadBalancedRetryContext context, Throwable throwable) {
if (!canRetrySameServer(context) && canRetry(context)) {
// Reset same server since we are moving to a new ServiceInstance
sameServerCount = 0;
nextServerCount++;
if (!canRetryNextServer(context)) {
context.setExhaustedOnly();
}
else {
// We want the service instance to be set by
// `RetryLoadBalancerInterceptor`
// in order to get the entire data of the request
context.setServiceInstance(null);
}
}
else {
sameServerCount++;
}
}
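The interaction of the two counters is easier to follow with concrete numbers. The model below is a sketch under explicit assumptions: every attempt fails, every failure is retryable, and the same-instance check uses sameServerCount < maxSame while the next-instance check uses nextServerCount <= maxNext after the increment. Those comparison operators are an assumption about the policy and should be verified against the version in use. LbRetryCountDemo is a hypothetical name.

```java
class LbRetryCountDemo {
    // Counts how many attempts are made in total when every attempt fails.
    static int attempts(int maxSame, int maxNext) {
        int sameServerCount = 0;
        int nextServerCount = 0;
        int attempts = 0;
        boolean exhausted = false;
        while (!exhausted) {
            attempts++; // the attempt runs and fails
            if (sameServerCount < maxSame) {
                // still allowed to retry on the same instance
                sameServerCount++;
            } else {
                // move on to a newly chosen instance
                sameServerCount = 0;
                nextServerCount++;
                if (nextServerCount > maxNext) {
                    exhausted = true; // corresponds to context.setExhaustedOnly()
                }
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println("same=2,next=2 -> " + attempts(2, 2));
        System.out.println("same=0,next=1 -> " + attempts(0, 1));
    }
}
```

Under these assumptions the total is (maxSame + 1) * (maxNext + 1): the configuration shown earlier (2 and 2) allows up to 9 requests for one logical call, and the defaults (0 and 1) allow 2. This multiplies again with any Feign-level retry, which is worth keeping in mind when both are enabled.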
@Override
public Response execute(Request request, Request.Options options) throws IOException {
final URI originalUri = URI.create(request.url());
String serviceId = originalUri.getHost();
Assert.state(serviceId != null, "Request URI does not contain a valid hostname: " + originalUri);
// 1. Create the loadbalancer retry policy
final LoadBalancedRetryPolicy retryPolicy = loadBalancedRetryFactory.createRetryPolicy(serviceId,
loadBalancerClient);
RetryTemplate retryTemplate = buildRetryTemplate(serviceId, request, retryPolicy);
return retryTemplate.execute(context -> {
Request feignRequest = null;
ServiceInstance retrievedServiceInstance = null;
Set<LoadBalancerLifecycle> supportedLifecycleProcessors = LoadBalancerLifecycleValidator
.getSupportedLifecycleProcessors(
loadBalancerClientFactory.getInstances(serviceId, LoadBalancerLifecycle.class),
RetryableRequestContext.class, ResponseData.class, ServiceInstance.class);
String hint = getHint(serviceId);
DefaultRequest<RetryableRequestContext> lbRequest = new DefaultRequest<>(
new RetryableRequestContext(null, buildRequestData(request), hint));
// On retries the policy will choose the server and set it in the context
// and extract the server and update the request being made
if (context instanceof LoadBalancedRetryContext) {
LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext) context;
ServiceInstance serviceInstance = lbContext.getServiceInstance();
// Core of the loadbalancer flow: if serviceInstance == null, a new instance is selected
if (serviceInstance == null) {
if (LOG.isDebugEnabled()) {
LOG.debug("Service instance retrieved from LoadBalancedRetryContext: was null. "
+ "Reattempting service instance selection");
}
ServiceInstance previousServiceInstance = lbContext.getPreviousServiceInstance();
lbRequest.getContext().setPreviousServiceInstance(previousServiceInstance);
supportedLifecycleProcessors.forEach(lifecycle -> lifecycle.onStart(lbRequest));
retrievedServiceInstance = loadBalancerClient.choose(serviceId, lbRequest);
if (LOG.isDebugEnabled()) {
LOG.debug(String.format("Selected service instance: %s", retrievedServiceInstance));
}
lbContext.setServiceInstance(retrievedServiceInstance);
}
if (retrievedServiceInstance == null) {
if (LOG.isWarnEnabled()) {
LOG.warn("Service instance was not resolved, executing the original request");
}
org.springframework.cloud.client.loadbalancer.Response<ServiceInstance> lbResponse = new DefaultResponse(
retrievedServiceInstance);
supportedLifecycleProcessors.forEach(lifecycle -> lifecycle
.onComplete(new CompletionContext<ResponseData, ServiceInstance, RetryableRequestContext>(
CompletionContext.Status.DISCARD, lbRequest, lbResponse)));
feignRequest = request;
}
else {
if (LOG.isDebugEnabled()) {
LOG.debug(String.format("Using service instance from LoadBalancedRetryContext: %s",
retrievedServiceInstance));
}
String reconstructedUrl = loadBalancerClient.reconstructURI(retrievedServiceInstance, originalUri)
.toString();
feignRequest = buildRequest(request, reconstructedUrl);
}
}
org.springframework.cloud.client.loadbalancer.Response<ServiceInstance> lbResponse = new DefaultResponse(
retrievedServiceInstance);
Response response = LoadBalancerUtils.executeWithLoadBalancerLifecycleProcessing(delegate, options,
feignRequest, lbRequest, lbResponse, supportedLifecycleProcessors,
retrievedServiceInstance != null);
int responseStatus = response.status();
if (retryPolicy != null && retryPolicy.retryableStatusCode(responseStatus)) {
if (LOG.isDebugEnabled()) {
LOG.debug(String.format("Retrying on status code: %d", responseStatus));
}
byte[] byteArray = response.body() == null ? new byte[] {}
: StreamUtils.copyToByteArray(response.body().asInputStream());
response.close();
throw new LoadBalancerResponseStatusCodeException(serviceId, response, byteArray,
URI.create(request.url()));
}
return response;
}, new LoadBalancedRecoveryCallback<Response, Response>() {
@Override
protected Response createResponse(Response response, URI uri) {
return response;
}
});
}
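Loadbalancer load balancing
The load-balancer factory
- LoadBalancerClientFactory is the loadbalancer factory. Load balancers with different strategies are obtained from it.
- spring.cloud.loadbalancer.enabled turns load balancing on. It defaults to true.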
@Configuration(proxyBeanMethods = false)
@LoadBalancerClients
@EnableConfigurationProperties(LoadBalancerClientsProperties.class)
@AutoConfigureBefore({ ReactorLoadBalancerClientAutoConfiguration.class,
LoadBalancerBeanPostProcessorAutoConfiguration.class })
@ConditionalOnProperty(value = "spring.cloud.loadbalancer.enabled", havingValue = "true", matchIfMissing = true)
public class LoadBalancerAutoConfiguration {
private final ObjectProvider<List<LoadBalancerClientSpecification>> configurations;
public LoadBalancerAutoConfiguration(ObjectProvider<List<LoadBalancerClientSpecification>> configurations) {
this.configurations = configurations;
}
@Bean
@ConditionalOnMissingBean
public LoadBalancerZoneConfig zoneConfig(Environment environment) {
return new LoadBalancerZoneConfig(environment.getProperty("spring.cloud.loadbalancer.zone"));
}
@ConditionalOnMissingBean
@Bean
public LoadBalancerClientFactory loadBalancerClientFactory(LoadBalancerClientsProperties properties) {
LoadBalancerClientFactory clientFactory = new LoadBalancerClientFactory(properties);
clientFactory.setConfigurations(this.configurations.getIfAvailable(Collections::emptyList));
return clientFactory;
}
}
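Obtaining the load balancer
A bean of type ReactorServiceInstanceLoadBalancer is fetched from the Spring context of the target service. To plug in a custom load-balancing strategy, implement the ReactorServiceInstanceLoadBalancer interface, override its choose method, and register the implementation as a Spring bean.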
@Override
public ReactiveLoadBalancer<ServiceInstance> getInstance(String serviceId) {
return getInstance(serviceId, ReactorServiceInstanceLoadBalancer.class);
}
public <T> T getInstance(String name, Class<T> type) {
AnnotationConfigApplicationContext context = getContext(name);
try {
return context.getBean(type);
}
catch (NoSuchBeanDefinitionException e) {
// ignore
}
return null;
}
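Performing load balancing
Here we look at RandomLoadBalancer, which picks an instance at random. (Without custom configuration, loadbalancer uses RoundRobinLoadBalancer.)
- Obtain the ServiceInstanceListSupplier. In this setup it returns the service list from Nacos. The details are not explored here; Nacos service discovery is analyzed separately later.
- supplier.get(request).next() yields the list of instances that Nacos manages for the service.
- The serviceInstances list is handed to the load-balancing algorithm, which picks the instance to call.
- ThreadLocalRandom.current().nextInt(instances.size()) selects one instance at random.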
@Override
public Mono<Response<ServiceInstance>> choose(Request request) {
ServiceInstanceListSupplier supplier = serviceInstanceListSupplierProvider
.getIfAvailable(NoopServiceInstanceListSupplier::new);
return supplier.get(request).next()
.map(serviceInstances -> processInstanceResponse(supplier, serviceInstances));
}
private Response<ServiceInstance> processInstanceResponse(ServiceInstanceListSupplier supplier,
List<ServiceInstance> serviceInstances) {
Response<ServiceInstance> serviceInstanceResponse = getInstanceResponse(serviceInstances);
if (supplier instanceof SelectedInstanceCallback && serviceInstanceResponse.hasServer()) {
((SelectedInstanceCallback) supplier).selectedServiceInstance(serviceInstanceResponse.getServer());
}
return serviceInstanceResponse;
}
private Response<ServiceInstance> getInstanceResponse(List<ServiceInstance> instances) {
if (instances.isEmpty()) {
if (log.isWarnEnabled()) {
log.warn("No servers available for service: " + serviceId);
}
return new EmptyResponse();
}
int index = ThreadLocalRandom.current().nextInt(instances.size());
ServiceInstance instance = instances.get(index);
return new DefaultResponse(instance);
}
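The selection step can be isolated into a tiny strategy interface. This is a sketch with hypothetical names (Chooser, RandomChooser, RoundRobinChooser) and plain strings as instances. It is not the loadbalancer API, and the round-robin variant is included only for contrast.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

class ChooserDemo {
    // Hypothetical strategy interface: pick one instance from the current list.
    interface Chooser {
        String choose(List<String> instances);
    }

    // Same idea as the random selection above: a uniformly random index.
    static class RandomChooser implements Chooser {
        @Override
        public String choose(List<String> instances) {
            if (instances.isEmpty()) {
                return null; // nothing available, similar to an empty response
            }
            int index = ThreadLocalRandom.current().nextInt(instances.size());
            return instances.get(index);
        }
    }

    // Round robin: advance a shared position and wrap around the list.
    static class RoundRobinChooser implements Chooser {
        private final AtomicInteger position = new AtomicInteger();

        @Override
        public String choose(List<String> instances) {
            if (instances.isEmpty()) {
                return null;
            }
            int pos = position.getAndIncrement() & Integer.MAX_VALUE; // keep it non-negative
            return instances.get(pos % instances.size());
        }
    }

    static String roundRobinSequence() {
        List<String> instances = Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080");
        Chooser chooser = new RoundRobinChooser();
        return chooser.choose(instances) + "," + chooser.choose(instances) + "," + chooser.choose(instances);
    }

    public static void main(String[] args) {
        System.out.println(roundRobinSequence());
        System.out.println(new RandomChooser().choose(Collections.<String>emptyList()));
    }
}
```

Neither chooser knows which instance failed last. That is the reason a retry on the "next" instance can land on the instance that just threw, as noted in the retry section.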
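Remote call
- feignClient is the client that really executes the HTTP request (JDK HTTP, Apache HttpClient, OkHttp, asynchttp).
- feignClient.execute(feignRequest, options) issues the remote call. To use a custom HTTP client, implement the feign.Client interface and override its execute method.
- When the server fails and responds with status 500, no exception is thrown at this point. Many developers expect one, but the failure is a server-side error and the client simply wraps it in the Response object. An exception is thrown here only for client-side failures such as a request timeout. This distinction matters: a timeout is an error on the client side and does not by itself mean that the server failed.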
static Response executeWithLoadBalancerLifecycleProcessing(Client feignClient, Request.Options options,
Request feignRequest, org.springframework.cloud.client.loadbalancer.Request lbRequest,
org.springframework.cloud.client.loadbalancer.Response<ServiceInstance> lbResponse,
Set<LoadBalancerLifecycle> supportedLifecycleProcessors, boolean loadBalanced) throws IOException {
supportedLifecycleProcessors.forEach(lifecycle -> lifecycle.onStartRequest(lbRequest, lbResponse));
try {
Response response = feignClient.execute(feignRequest, options);
if (loadBalanced) {
supportedLifecycleProcessors.forEach(
lifecycle -> lifecycle.onComplete(new CompletionContext<>(CompletionContext.Status.SUCCESS,
lbRequest, lbResponse, buildResponseData(response))));
}
return response;
}
catch (Exception exception) {
if (loadBalanced) {
supportedLifecycleProcessors.forEach(lifecycle -> lifecycle.onComplete(
new CompletionContext<>(CompletionContext.Status.FAILED, exception, lbRequest, lbResponse)));
}
throw exception;
}
}
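Parsing the response
- When client.execute fails on the client side (timeout, provider unavailable, connection dropped), an exception is thrown. It is wrapped in a RetryableException and rethrown, which triggers Feign retry.
- Decoder is the decoder. Developers can implement it and override the decode method. The decoder checked here is null by default, in which case the flow continues.
- asyncResponseHandler.handleResponse is OpenFeign's default response handling.
  - errorDecoder.decode handles server-side errors carried in the Response. Only when the response headers contain Retry-After does it produce a RetryableException, and only then is Feign retry triggered.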
Object executeAndDecode(RequestTemplate template, Options options) throws Throwable {
Request request = targetRequest(template);
long start = System.nanoTime();
try {
response = client.execute(request, options);
// part of the code is omitted here
} catch (IOException e) {
if (logLevel != Logger.Level.NONE) {
logger.logIOException(metadata.configKey(), logLevel, e, elapsedTime(start));
}
// 1
throw errorExecuting(request, e);
}
if (decoder != null)
// 2
return decoder.decode(response, metadata.returnType());
CompletableFuture<Object> resultFuture = new CompletableFuture<>();
// 3
asyncResponseHandler.handleResponse(resultFuture, metadata.configKey(), response,
metadata.returnType(),
elapsedTime);
try {
if (!resultFuture.isDone())
throw new IllegalStateException("Response handling not done");
return resultFuture.join();
} catch (CompletionException e) {
Throwable cause = e.getCause();
if (cause != null)
throw cause;
throw e;
}
}
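The last lines of executeAndDecode rely on a CompletableFuture detail: join() reports a failed future by throwing CompletionException, so the code unwraps the cause before rethrowing it. A minimal standalone sketch of that pattern follows; JoinUnwrapDemo and joinAndUnwrap are hypothetical names, not part of Feign.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

class JoinUnwrapDemo {
    // Mirrors the tail of executeAndDecode: join, then rethrow the original cause.
    static Object joinAndUnwrap(CompletableFuture<Object> future) throws Throwable {
        if (!future.isDone()) {
            throw new IllegalStateException("Response handling not done");
        }
        try {
            return future.join();
        } catch (CompletionException e) {
            // join() wraps the exception passed to completeExceptionally
            Throwable cause = e.getCause();
            throw cause != null ? cause : e;
        }
    }
}
```

This is why the caller of a Feign method sees the exception produced by the error decoder itself rather than a CompletionException around it.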
handleResponse: processing the response
void handleResponse(CompletableFuture<Object> resultFuture,
        String configKey,
        Response response,
        Type returnType,
        long elapsedTime) {
    try {
        if (Response.class == returnType) {
            // part of the code omitted here (including the branch for 2xx responses)
        } else if (decode404 && response.status() == 404 && !isVoidType(returnType)) {
            // decode the 404 body like a normal result instead of raising an error
            final Object result = decode(response, returnType);
            resultFuture.complete(result);
        } else { // error case, e.g. response.status() == 500
            resultFuture.completeExceptionally(errorDecoder.decode(configKey, response));
        }
    } catch (final IOException e) {
        resultFuture.completeExceptionally(errorReading(response.request(), response, e));
    } catch (final Exception e) {
        resultFuture.completeExceptionally(e);
    }
}
ErrorDecoder#decode handles the error. Feign's retry is triggered only when the response headers returned by the server contain Retry-After.
public static class Default implements ErrorDecoder {
    private final RetryAfterDecoder retryAfterDecoder = new RetryAfterDecoder();

    @Override
    public Exception decode(String methodKey, Response response) {
        FeignException exception = errorStatus(methodKey, response);
        Date retryAfter = retryAfterDecoder.apply(firstOrNull(response.headers(), RETRY_AFTER));
        if (retryAfter != null) {
            return new RetryableException(
                    response.status(),
                    exception.getMessage(),
                    response.request().httpMethod(),
                    exception,
                    retryAfter,
                    response.request());
        }
        return exception;
    }

    private <T> T firstOrNull(Map<String, Collection<T>> map, String key) {
        if (map.containsKey(key) && !map.get(key).isEmpty()) {
            return map.get(key).iterator().next();
        }
        return null;
    }
}
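The role of the Retry-After header can be sketched in plain Java. This is a simplified model and not Feign's RetryAfterDecoder: the class RetryAfterModel and both method names are hypothetical, and the sketch assumes the header holds either a whole number of seconds or an RFC 1123 HTTP date.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

class RetryAfterModel {
    // Returns the instant to retry at, or empty when the header is absent or unparsable.
    static Optional<Instant> parse(String headerValue, Instant now) {
        if (headerValue == null || headerValue.isBlank()) {
            return Optional.empty();
        }
        String value = headerValue.trim();
        if (value.matches("\\d{1,9}")) {
            // delay-seconds form, e.g. "Retry-After: 120"
            return Optional.of(now.plus(Duration.ofSeconds(Long.parseLong(value))));
        }
        try {
            // HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
            return Optional.of(
                    ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant());
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // In this model, only a parsable Retry-After makes an error response retryable.
    static boolean isRetryable(String headerValue, Instant now) {
        return parse(headerValue, now).isPresent();
    }
}
```

A provider that wants Feign to retry a failed call therefore has to send the header explicitly; an error response without it is treated as a plain, non-retryable failure.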
OpenFeign retry
In SynchronousMethodHandler#invoke, Feign runs its retry logic through the Retryer component. A retry is triggered only when executeAndDecode throws a RetryableException.
public Object invoke(Object[] argv) throws Throwable {
    RequestTemplate template = buildTemplateFromArgs.create(argv);
    Options options = findOptions(argv);
    Retryer retryer = this.retryer.clone();
    while (true) {
        try {
            return executeAndDecode(template, options);
        } catch (RetryableException e) {
            try {
                retryer.continueOrPropagate(e);
            } catch (RetryableException th) {
                Throwable cause = th.getCause();
                if (propagationPolicy == UNWRAP && cause != null) {
                    throw cause;
                } else {
                    throw th;
                }
            }
            if (logLevel != Logger.Level.NONE) {
                logger.logRetry(metadata.configKey(), logLevel);
            }
            continue;
        }
    }
}
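The shape of this loop can be reproduced in a few lines of plain Java. The sketch below is a model only: RetryLoopModel and RetryableFailure are hypothetical names, with RetryableFailure playing the role of feign.RetryableException, and the sleep between rounds is left out.

```java
import java.util.concurrent.Callable;

class RetryLoopModel {
    // Hypothetical marker, playing the role of feign.RetryableException.
    static class RetryableFailure extends RuntimeException {
        RetryableFailure(String message) {
            super(message);
        }
    }

    // Calls the task up to maxAttempts times; only RetryableFailure starts another round.
    static <T> T invoke(Callable<T> task, int maxAttempts) throws Exception {
        int attempt = 1;
        while (true) {
            try {
                return task.call();
            } catch (RetryableFailure e) {
                if (attempt++ >= maxAttempts) {
                    throw e; // budget exhausted: propagate the last failure
                }
                // a real retryer would sleep here before the next round
            }
            // any other exception type is not caught and leaves the loop immediately
        }
    }
}
```

A task that fails twice with RetryableFailure and then succeeds is called three times in total, while a task that throws any other exception is called exactly once.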
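Retryer#continueOrPropagate
- It first checks whether the attempt count has reached the maximum number of attempts; if so, the exception is rethrown immediately.
- Otherwise it computes the wait before the next attempt. The interval grows exponentially with each attempt, at a rate of nextInterval *= 1.5 (1.5 being the backoff factor), until it reaches the maximum interval.
- Retryer works with four values: three settings and one counter.
  - period: retry interval
  - maxPeriod: maximum retry interval
  - maxAttempts: maximum number of attempts
  - attempt: current attempt number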
public void continueOrPropagate(RetryableException e) {
    if (attempt++ >= maxAttempts) {
        throw e;
    }
    long interval;
    if (e.retryAfter() != null) {
        interval = e.retryAfter().getTime() - currentTimeMillis();
        if (interval > maxPeriod) {
            interval = maxPeriod;
        }
        if (interval < 0) {
            return;
        }
    } else {
        interval = nextMaxInterval();
    }
    try {
        Thread.sleep(interval);
    } catch (InterruptedException ignored) {
        Thread.currentThread().interrupt();
        throw e;
    }
    sleptForMillis += interval;
}

long nextMaxInterval() {
    long interval = (long) (period * Math.pow(1.5, attempt - 1));
    return interval > maxPeriod ? maxPeriod : interval;
}
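To see what this backoff looks like in numbers, the sketch below replays the logic of continueOrPropagate and nextMaxInterval without sleeping. BackoffSchedule is a hypothetical helper, not part of Feign; it assumes that attempt starts at 1 and that the exception carries no Retry-After date.

```java
import java.util.ArrayList;
import java.util.List;

class BackoffSchedule {
    // Records the interval (in ms) that would be slept before each retry.
    static List<Long> intervals(long period, long maxPeriod, int maxAttempts) {
        List<Long> result = new ArrayList<>();
        int attempt = 1; // assumed initial value of the counter
        while (true) {
            if (attempt++ >= maxAttempts) {
                break; // the real method rethrows the exception here
            }
            // same formula as nextMaxInterval; attempt was already incremented
            long interval = (long) (period * Math.pow(1.5, attempt - 1));
            result.add(Math.min(interval, maxPeriod));
        }
        return result;
    }
}
```

With period = 100, maxPeriod = 1000 and maxAttempts = 5, the method returns 150, 225, 337 and 506: four waits, so five calls in total. Because attempt is incremented before the interval is computed, the first wait is already period × 1.5 rather than period.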
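Service discovery (Nacos integration)
Spring Cloud service discovery clients
Spring Cloud defines the service discovery contract; a custom service discovery mechanism has to implement the DiscoveryClient interface. All discovery clients are managed through CompositeDiscoveryClient, which delegates to them. It is also marked @Primary and is not meant to be overridden. You will notice two sets of composite discovery logic: the classes with the Reactive prefix are Spring Cloud's reactive discovery clients, and those without the prefix are the traditional non-reactive (synchronous) discovery clients.
Because loadbalancer 3.1.1 integrates with the reactive discovery client, the analysis here follows the reactive path.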
@Configuration(proxyBeanMethods = false)
@ConditionalOnDiscoveryEnabled
@ConditionalOnReactiveDiscoveryEnabled
public class ReactiveCompositeDiscoveryClientAutoConfiguration {
    @Bean
    @Primary
    public ReactiveCompositeDiscoveryClient reactiveCompositeDiscoveryClient(
            List<ReactiveDiscoveryClient> discoveryClients) {
        return new ReactiveCompositeDiscoveryClient(discoveryClients);
    }
}
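The delegation idea behind the composite client can be illustrated with a small blocking model. Everything here is hypothetical (CompositeDiscoveryModel, MiniDiscoveryClient, fixed), and the sketch assumes the composite asks its delegates in order and stops at the first one that returns a non-empty list; check the Spring Cloud source before relying on that detail.

```java
import java.util.List;
import java.util.Map;

class CompositeDiscoveryModel {
    // Hypothetical stand-in for a discovery client: service id -> "host:port" entries.
    interface MiniDiscoveryClient {
        List<String> getInstances(String serviceId);
    }

    // Asks each delegate in order and returns the first non-empty answer.
    static List<String> getInstances(List<MiniDiscoveryClient> delegates, String serviceId) {
        for (MiniDiscoveryClient delegate : delegates) {
            List<String> instances = delegate.getInstances(serviceId);
            if (instances != null && !instances.isEmpty()) {
                return instances;
            }
        }
        return List.of(); // no delegate knows the service
    }

    // Convenience factory backed by a fixed map, used for the demo only.
    static MiniDiscoveryClient fixed(Map<String, List<String>> registry) {
        return serviceId -> registry.getOrDefault(serviceId, List.of());
    }
}
```

In this model a Nacos-backed client is simply one delegate in the list, and callers only ever talk to the composite.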
Loadbalancer integration
In the LoadBalancerClientConfiguration.ReactiveSupportConfiguration configuration class, withDiscoveryClient() creates the client used to query service discovery.
@Bean
@ConditionalOnBean(ReactiveDiscoveryClient.class)
@ConditionalOnMissingBean
@Conditional(DefaultConfigurationCondition.class)
public ServiceInstanceListSupplier discoveryClientServiceInstanceListSupplier(
        ConfigurableApplicationContext context) {
    return ServiceInstanceListSupplier.builder().withDiscoveryClient().withCaching().build(context);
}
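ServiceInstanceListSupplierBuilder#withDiscoveryClient
- Loadbalancer obtains the ReactiveDiscoveryClient from the Spring context (the reactive discovery client ReactiveCompositeDiscoveryClient).
- The reactive discovery client is handed to DiscoveryClientServiceInstanceListSupplier as its delegate, so the service list is queried indirectly through it.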
public ServiceInstanceListSupplierBuilder withDiscoveryClient() {
    if (baseCreator != null && LOG.isWarnEnabled()) {
        LOG.warn("Overriding a previously set baseCreator with a ReactiveDiscoveryClient baseCreator.");
    }
    this.baseCreator = context -> {
        ReactiveDiscoveryClient discoveryClient = context.getBean(ReactiveDiscoveryClient.class);
        return new DiscoveryClientServiceInstanceListSupplier(discoveryClient, context.getEnvironment());
    };
    return this;
}
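DiscoveryClientServiceInstanceListSupplier builds serviceInstances in its constructor. Note that serviceInstances is a Flux, so the query result is returned reactively; if the lookup times out or fails, an empty list is emitted instead.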
private final Flux<List<ServiceInstance>> serviceInstances;

public DiscoveryClientServiceInstanceListSupplier(ReactiveDiscoveryClient delegate, Environment environment) {
    this.serviceId = environment.getProperty(PROPERTY_NAME);
    resolveTimeout(environment);
    this.serviceInstances = Flux
            .defer(() -> delegate.getInstances(serviceId).collectList().flux().timeout(timeout, Flux.defer(() -> {
                logTimeout();
                return Flux.just(new ArrayList<>());
            })).onErrorResume(error -> {
                logException(error);
                return Flux.just(new ArrayList<>());
            }));
}

@Override
public Flux<List<ServiceInstance>> get() {
    return serviceInstances;
}
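The constructor above degrades both a timeout and an error to an empty instance list. The same idea can be sketched without Reactor, using CompletableFuture from the JDK. InstanceLookupModel and lookup are hypothetical names, and this is an analogy to the Flux pipeline, not a replacement for it.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class InstanceLookupModel {
    // Runs the lookup asynchronously; a timeout or an error both degrade to an empty list.
    static List<String> lookup(Supplier<List<String>> registryCall, long timeoutMillis) {
        return CompletableFuture.supplyAsync(registryCall)
                // plays the role of timeout(timeout, fallback) in the Flux pipeline
                .completeOnTimeout(List.of(), timeoutMillis, TimeUnit.MILLISECONDS)
                // plays the role of onErrorResume(...)
                .exceptionally(error -> List.of())
                .join();
    }
}
```

The practical consequence is the same in both versions: a slow or broken registry does not raise an exception at this point, it just yields no instances, and the failure shows up later as "no instance available".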
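Now let us return to the RandomLoadBalancer load-balancing algorithm. With the above in mind, the supplier.get(request) call becomes clear at a glance.
The method returns Flux<List<ServiceInstance>>, which in turn reactively invokes NacosReactiveDiscoveryClient#getInstances, and that calls the Nacos API to fetch the list of service instances registered in Nacos.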
@Override
public Mono<Response<ServiceInstance>> choose(Request request) {
    ServiceInstanceListSupplier supplier = serviceInstanceListSupplierProvider
            .getIfAvailable(NoopServiceInstanceListSupplier::new);
    return supplier.get(request).next()
            .map(serviceInstances -> processInstanceResponse(supplier, serviceInstances));
}
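The choose method above only looks at the current instance list; it keeps no record of which instance failed last time. The plain-Java model below contrasts such a stateless random choice with a variant that skips the instance that just failed. ChooseModel and both methods are hypothetical, and chooseExcluding is an illustration of a possible extension, not something the loadbalancer provides out of the box.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ChooseModel {
    // Stateless random choice: earlier failures have no influence on the result.
    static String choose(List<String> instances, Random random) {
        if (instances.isEmpty()) {
            return null;
        }
        return instances.get(random.nextInt(instances.size()));
    }

    // Hypothetical variant: skip the instance that just failed when alternatives exist.
    static String chooseExcluding(List<String> instances, String lastFailed, Random random) {
        List<String> candidates = new ArrayList<>(instances);
        candidates.remove(lastFailed);
        // fall back to the full list if the failed instance was the only one
        return choose(candidates.isEmpty() ? instances : candidates, random);
    }
}
```

With two instances, the stateless choose has roughly a one in two chance of returning the instance that just failed, whereas chooseExcluding never returns it as long as another instance is available.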
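Key questions
- Which should you choose, OpenFeign retry or loadbalancer retry?
  - OpenFeign retry is implemented by its Retryer component, which developers can replace or extend with their own implementation. It does not retry by default.
  - Loadbalancer retry is built on spring-retry, and developers cannot override its internal logic.
  - Because OpenFeign's retry mechanism is extensible and sits at the outermost layer, covering a wider scope, prefer OpenFeign retry and turn loadbalancer retry off.
- What happens to OpenFeign's retry mechanism once a custom decoder is used?
  - After OpenFeign receives the server's response, a server-side error is retried only if the response headers contain Retry-After; otherwise there is no retry. If the custom decoder does not include this logic, OpenFeign's retry behaviour is affected.
- Can OpenFeign retry and loadbalancer retry both be enabled?
  - Do not enable both. Otherwise the retry counts of the two layers can multiply (a Cartesian product of retries).
- Can OpenFeign or loadbalancer retry avoid the node that failed on the previous attempt?
  - Neither of them can.
  - An OpenFeign retry simply calls choose again to pick an instance.
  - Loadbalancer retries according to the maxRetriesOnSameServiceInstance and maxRetriesOnNextServiceInstance settings. But even with maxRetriesOnSameServiceInstance=0 and maxRetriesOnNextServiceInstance>0 it essentially just calls choose again. Because the load balancer's choose method keeps no record of instance health, the previously failed node still cannot be excluded.
- Which takes priority, Sentinel circuit breaking or retry?
  - Sentinel circuit breaking takes priority, because in SentinelInvocationHandler SphU.entry is invoked before the remote call.
- Can exceptions raised in OpenFeign or loadbalancer trigger Sentinel circuit breaking?
  - Yes. Any exception from OpenFeign or loadbalancer that propagates upward is caught by SentinelInvocationHandler and triggers the fallback logic.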
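The warning about enabling both retry layers can be put into rough numbers. The sketch below is arithmetic only, under two loudly stated assumptions: every call fails, and the layers simply multiply, with the loadbalancer making (maxRetriesOnSameServiceInstance + 1) × (maxRetriesOnNextServiceInstance + 1) attempts for each OpenFeign attempt. RetryBudget is a hypothetical helper; the exact counts in a real system depend on which exceptions each layer treats as retryable.

```java
class RetryBudget {
    // Attempts made by the load-balancer layer for one Feign-level attempt,
    // assuming every call fails and each retry budget is fully used.
    static int loadBalancerAttempts(int retriesOnSameInstance, int retriesOnNextInstance) {
        return (retriesOnSameInstance + 1) * (retriesOnNextInstance + 1);
    }

    // Worst-case number of HTTP calls when both retry layers are enabled.
    static int worstCaseCalls(int feignMaxAttempts, int retriesOnSameInstance,
            int retriesOnNextInstance) {
        return feignMaxAttempts * loadBalancerAttempts(retriesOnSameInstance, retriesOnNextInstance);
    }
}
```

Under these assumptions, an OpenFeign maxAttempts of 5 combined with maxRetriesOnSameServiceInstance = 2 and maxRetriesOnNextServiceInstance = 2 gives 5 × 3 × 3 = 45 calls for a single business request, which is why only one of the two layers should retry.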