Dubbo Cluster Fault Tolerance

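Introduction

In production, service consumers and service providers are usually both deployed as clusters, and for each call the consumer picks one instance from the provider cluster according to some strategy. When a call fails, Dubbo offers several fault-tolerance schemes; the default is failover (retry on failure).

Dubbo provides the following cluster fault-tolerance modes: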

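Mode	Description
Failover Cluster	Automatic retry on failure. When a call fails, Dubbo switches to another provider instance and retries. Typically used for idempotent operations; retries add latency. This is the default.
Failfast Cluster	Fail fast. Only one call is made, and a failure is reported immediately. Typically used for non-idempotent operations.
Failsafe Cluster	Fail safe. When a call throws an exception, the exception is ignored. Typically used for operations such as writing audit logs.
Forking Cluster	Parallel invocation. Several servers are called in parallel, and the call returns as soon as one of them succeeds. Typically used for operations with strict latency requirements, at the cost of extra service resources.
Failback Cluster	Automatic recovery on failure. Failed requests are recorded in the background and resent periodically. Typically used for message notifications.
Broadcast Cluster	Broadcast invocation. Every provider is called one by one, and the call is marked as failed if any of them reports an error. Typically used to tell all providers to refresh local resources such as caches or logs.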
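
To make the differences in the table concrete, here is a minimal, Dubbo-free sketch of three of the modes. The class ClusterModesSketch and its methods are hypothetical names invented for illustration, and providers are modelled as plain Supplier objects; this is not how Dubbo implements the modes internally, only an approximation of the behavior described above.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of three modes from the table; not Dubbo code. */
final class ClusterModesSketch {

    /** Failfast: a single attempt; any failure propagates to the caller. */
    static <T> T failfast(Supplier<T> call) {
        return call.get();
    }

    /** Failsafe: a single attempt; a failure is swallowed and the fallback is returned. */
    static <T> T failsafe(Supplier<T> call, T fallback) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            return fallback; // the error is ignored on purpose
        }
    }

    /** Broadcast: call every provider in turn; fail afterwards if any of them failed. */
    static <T> T broadcast(List<Supplier<T>> providers) {
        RuntimeException failure = null;
        T last = null;
        for (Supplier<T> provider : providers) {
            try {
                last = provider.get();
            } catch (RuntimeException e) {
                failure = e; // remember the failure but keep calling the others
            }
        }
        if (failure != null) {
            throw failure;
        }
        return last;
    }
}
```

In this sketch, failsafe is the only method that never throws when the provider fails, which is why that mode suits best-effort work such as audit logging.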

Configuration examples

Configuration shared by the consumer and the provider

/**
 * Cluster strategy, legal values include: failover, failfast, failsafe, failback, forking
 */
String cluster() default "";

public class ForkingCluster implements Cluster {
    public final static String NAME = "forking";
}

Consumer configuration

@Reference(interfaceName = "com.xxx.XxxService", cluster = ForkingCluster.NAME)
private XxxService xxxService;

<dubbo:reference cluster="failsafe" />

Provider configuration

@Service(cluster = FailoverCluster.NAME)
public class XxxServiceImpl implements XxxService {
    // service methods omitted
}

<dubbo:service cluster="failsafe" />

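Source code analysis

The following uses FailoverCluster, Dubbo's default cluster fault-tolerance mode, as an example to walk through the source code. The class that implements automatic retry on failure is FailoverClusterInvoker.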

org.apache.dubbo.rpc.cluster.support.FailoverClusterInvoker

public Result doInvoke(Invocation invocation, final List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    
    // All service providers
    List<Invoker<T>> copyInvokers = invokers;
    checkInvokers(copyInvokers, invocation);
    String methodName = RpcUtils.getMethodName(invocation);
    
    // Total number of attempts = configured retries + 1
    int len = getUrl().getMethodParameter(methodName, Constants.RETRIES_KEY, Constants.DEFAULT_RETRIES) + 1;
    if (len <= 0) {
        len = 1;
    }
    
    RpcException le = null;
    List<Invoker<T>> invoked = new ArrayList<Invoker<T>>(copyInvokers.size()); 
    Set<String> providers = new HashSet<String>(len);
    
    // Invoke in a loop; stop retrying as soon as one call succeeds
    for (int i = 0; i < len; i++) {
        // On a retry, list the providers again in case the provider list changed during the call
        if (i > 0) {
            checkWhetherDestroyed();
            copyInvokers = list(invocation);
            // check again
            checkInvokers(copyInvokers, invocation);
        }
        
        // Select one provider through the load-balancing strategy
        Invoker<T> invoker = select(loadbalance, invocation, copyInvokers, invoked);
        invoked.add(invoker);
        RpcContext.getContext().setInvokers((List) invoked);
        try {
            // On success, return immediately and skip the remaining retries
            Result result = invoker.invoke(invocation);
            return result;
        } catch (RpcException e) {
            if (e.isBiz()) { // biz exception.
                throw e;
            }
            le = e;
        } catch (Throwable e) {
            le = new RpcException(e.getMessage(), e);
        } finally {
            providers.add(invoker.getUrl().getAddress());
        }
    }
    
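    // Every attempt failed: throw an exception to mark the call as failed (message abridged here)
    throw new RpcException("Failed to invoke the method " + methodName + ", tried providers " + providers, le);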
}
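
The retry loop above can be reduced to a small, self-contained sketch that runs without Dubbo on the classpath. Everything here is an assumption made for illustration: FailoverSketch is a hypothetical class, providers are modelled as Function objects, and the select helper is a naive stand-in for Dubbo's load balancing that simply prefers providers not yet tried.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical, simplified model of the failover loop; not Dubbo code. */
final class FailoverSketch {

    /** Makes up to retries + 1 attempts, preferring providers that have not been tried yet. */
    static <Q, R> R invoke(List<Function<Q, R>> providers, Q request, int retries) {
        if (providers.isEmpty()) {
            throw new IllegalStateException("no provider available");
        }
        int attempts = Math.max(retries + 1, 1); // at least one attempt
        List<Function<Q, R>> invoked = new ArrayList<>();
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            Function<Q, R> provider = select(providers, invoked, i);
            invoked.add(provider);
            try {
                return provider.apply(request); // success: no further retries
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next provider
            }
        }
        throw new IllegalStateException("all " + attempts + " attempts failed", last);
    }

    /** Stand-in for load balancing: first provider not yet invoked, otherwise round-robin. */
    private static <Q, R> Function<Q, R> select(
            List<Function<Q, R>> providers, List<Function<Q, R>> invoked, int attempt) {
        for (Function<Q, R> provider : providers) {
            if (!invoked.contains(provider)) {
                return provider;
            }
        }
        return providers.get(attempt % providers.size());
    }
}
```

With one failing and one healthy provider and retries set to 2, the sketch calls the failing provider once, then the healthy one, and returns its result without using the third attempt.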

checkInvokers verifies that the provider list is not empty and throws an exception if it is.

protected void checkInvokers(List<Invoker<T>> invokers, Invocation invocation) {
    if (CollectionUtils.isEmpty(invokers)) {
        throw new RpcException("");
    }
}

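Custom cluster fault-tolerance mode

Dubbo ships with a rich set of built-in cluster fault-tolerance modes. If a project has more specific requirements, developers can also provide their own implementation of the org.apache.dubbo.rpc.cluster.Cluster extension interface.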

/**
 * Cluster. (SPI, Singleton, ThreadSafe)
 */
@SPI(FailoverCluster.NAME)
public interface Cluster {

    /**
     * Merge the directory invokers to a virtual invoker.
     */
    @Adaptive
    <T> Invoker<T> join(Directory<T> directory) throws RpcException;

}

Steps

1. Define a custom implementation of the Cluster extension interface

public class CustomCluster implements Cluster {

    public final static String NAME = "custom";

    @Override
    public <T> Invoker<T> join(Directory<T> directory) throws RpcException {
        return new CustomClusterInvoker<>(directory);
    }

}

2. Write the ClusterInvoker by extending AbstractClusterInvoker

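public class CustomClusterInvoker<T> extends AbstractClusterInvoker<T> {

    public CustomClusterInvoker(Directory<T> directory) {
        super(directory);
    }

    @Override
    public Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
        checkInvokers(invokers, invocation);
        // Pick one provider through the load balancer (no providers excluded yet)
        Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
        // Fault-tolerance logic for the actual scenario goes here...
        return invoker.invoke(invocation);
    }

}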

3. In the META-INF/dubbo folder, create a file named after the fully qualified name of the extension interface, org.apache.dubbo.rpc.cluster.Cluster, with the following content

custom=com.xxx.cluster.CustomCluster

4. Consumer configuration

@Reference(interfaceName = "com.xxx.XxxService", cluster = CustomCluster.NAME)
private XxxService xxxService;
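
The name=class line from step 3 is what ties the string "custom" used in step 4 to the implementation class. The sketch below illustrates that idea in plain Java; it is not Dubbo's extension loader. ExtensionLookupSketch, Greeter and CustomGreeter are hypothetical names, and java.util.Properties plus reflection are used here only as a simple way to show a name being resolved to an instance.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical illustration of resolving an extension name to an instance; not Dubbo code. */
final class ExtensionLookupSketch {

    /** Toy extension point standing in for an SPI interface such as Cluster. */
    interface Greeter {
        String greet();
    }

    /** Toy implementation that the mapping file registers under the name "custom". */
    static final class CustomGreeter implements Greeter {
        @Override
        public String greet() {
            return "custom";
        }
    }

    /** Parses name=class lines and instantiates the class registered under the given name. */
    static Greeter load(String fileContent, String name) {
        try {
            Properties mapping = new Properties();
            mapping.load(new StringReader(fileContent));
            String className = mapping.getProperty(name);
            if (className == null) {
                throw new IllegalArgumentException("no extension named " + name);
            }
            return (Greeter) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load extension " + name, e);
        }
    }
}
```

If the name passed to load has no entry in the mapping, the sketch fails with an IllegalArgumentException, which mirrors why the key in the file must match the cluster value configured on the consumer.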