Understanding the Hystrix Circuit-Breaking Mechanism in One Article


Initialization

In Hystrix, the HystrixCircuitBreaker class represents the circuit breaker. AbstractCommand holds it as a field and initializes it when the command itself is initialized:

 this.circuitBreaker = initCircuitBreaker(this.properties.circuitBreakerEnabled().get(), circuitBreaker, this.commandGroup, this.commandKey, this.properties, this.metrics);

The initCircuitBreaker() method performs the initialization. Here is how it works:

private static HystrixCircuitBreaker initCircuitBreaker(boolean enabled, HystrixCircuitBreaker fromConstructor,
                                                            HystrixCommandGroupKey groupKey, HystrixCommandKey commandKey,
                                                            HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
        if (enabled) {
            if (fromConstructor == null) {
                // get the default implementation of HystrixCircuitBreaker
                return HystrixCircuitBreaker.Factory.getInstance(commandKey, groupKey, properties, metrics);
            } else {
                return fromConstructor;
            }
        } else {
            return new NoOpCircuitBreaker();
        }
    }

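When the circuit breaker is enabled and none was passed to the constructor, the breaker is obtained from HystrixCircuitBreaker.Factory.getInstance(commandKey, groupKey, properties, metrics). If the circuit breaker is disabled, a NoOpCircuitBreaker is used instead. Following getInstance():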

 public static HystrixCircuitBreaker getInstance(HystrixCommandKey key, HystrixCommandGroupKey group, HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
            // this should find it for all but the first time
            HystrixCircuitBreaker previouslyCached = circuitBreakersByCommand.get(key.name());
            if (previouslyCached != null) {
                return previouslyCached;
            }

            // if we get here this is the first time so we need to initialize

            // Create and add to the map ... use putIfAbsent to atomically handle the possible race-condition of
            // 2 threads hitting this point at the same time and let ConcurrentHashMap provide us our thread-safety
            // If 2 threads hit here only one will get added and the other will get a non-null response instead.
            HystrixCircuitBreaker cbForCommand = circuitBreakersByCommand.putIfAbsent(key.name(), new HystrixCircuitBreakerImpl(key, group, properties, metrics));
            if (cbForCommand == null) {
                // this means the putIfAbsent step just created a new one so let's retrieve and return it
                return circuitBreakersByCommand.get(key.name());
            } else {
                // this means a race occurred and while attempting to 'put' another one got there before
                // and we instead retrieved it and will now return it
                return cbForCommand;
            }
        }

As the code shows, each command key maps to exactly one HystrixCircuitBreaker, and all breakers are kept in a ConcurrentHashMap keyed by the command key's name:

private static ConcurrentHashMap<String, HystrixCircuitBreaker> circuitBreakersByCommand = new ConcurrentHashMap<String, HystrixCircuitBreaker>();
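The get-then-putIfAbsent pattern used by getInstance() can be tried in isolation. The following is a minimal sketch, not Hystrix code: DemoBreaker and DemoBreakerRegistry are hypothetical names introduced only for illustration, and the breaker is reduced to an object that remembers its key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a circuit breaker; only its key matters here.
final class DemoBreaker {
    final String key;

    DemoBreaker(String key) {
        this.key = key;
    }
}

// Hypothetical registry that keeps one breaker per command key.
final class DemoBreakerRegistry {
    private static final ConcurrentMap<String, DemoBreaker> BY_KEY = new ConcurrentHashMap<>();

    static DemoBreaker getInstance(String key) {
        // Fast path: every call except the first one for a key ends here.
        DemoBreaker cached = BY_KEY.get(key);
        if (cached != null) {
            return cached;
        }
        // Slow path: putIfAbsent is atomic, so if two threads race,
        // only one candidate is stored and both threads return that one.
        DemoBreaker candidate = new DemoBreaker(key);
        DemoBreaker existing = BY_KEY.putIfAbsent(key, candidate);
        return existing == null ? candidate : existing;
    }

    static int size() {
        return BY_KEY.size();
    }
}
```

Calling DemoBreakerRegistry.getInstance("orderCommand") twice returns the same object, while a different key produces a separate breaker. The price of this pattern is that a losing thread may construct a candidate that is immediately discarded.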

How the breaker listens for failure information

The circuit breaker is implemented by HystrixCircuitBreakerImpl. Here is its constructor:

protected HystrixCircuitBreakerImpl(HystrixCommandKey key, HystrixCommandGroupKey commandGroup, final HystrixCommandProperties properties, HystrixCommandMetrics metrics) {
            this.properties = properties;
            this.metrics = metrics;

            //On a timer, this will set the circuit between OPEN/CLOSED as command executions occur
            Subscription s = subscribeToStream();
            activeSubscription.set(s);
        }

The line Subscription s = subscribeToStream(); is the important one: this is how the breaker listens for failure information. Here is its implementation:

  private Subscription subscribeToStream() {
            /*
             * This stream will recalculate the OPEN/CLOSED status on every onNext from the health stream
             */
            return metrics.getHealthCountsStream()
                    .observe()
                    .subscribe(new Subscriber<HealthCounts>() {
                        @Override
                        public void onCompleted() {

                        }

                        @Override
                        public void onError(Throwable e) {

                        }

                        @Override
                        public void onNext(HealthCounts hc) {
                            // check if we are past the statisticalWindowVolumeThreshold
                            if (hc.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
                                // we are not past the minimum volume threshold for the stat window,
                                // so no change to circuit status.
                                // if it was CLOSED, it stays CLOSED
                                // if it was half-open, we need to wait for a successful command execution
                                // if it was open, we need to wait for sleep window to elapse
                            } else {
                                if (hc.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
                                    //we are not past the minimum error threshold for the stat window,
                                    // so no change to circuit status.
                                    // if it was CLOSED, it stays CLOSED
                                    // if it was half-open, we need to wait for a successful command execution
                                    // if it was open, we need to wait for sleep window to elapse
                                } else {
                                    // our failure rate is too high, we need to set the state to OPEN
                                    if (status.compareAndSet(Status.CLOSED, Status.OPEN)) {
                                        circuitOpened.set(System.currentTimeMillis());
                                    }
                                }
                            }
                        }
                    });
        }

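The circuit breaker subscribes to the health-counts stream of metrics. Each time new statistics arrive, onNext() is called back. It checks the statistics against the configured parameters and opens the circuit when they are exceeded.

The first check is hc.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get(). If the total number of requests in the most recent statistics window (10s by default) is below circuitBreakerRequestVolumeThreshold (20 by default), nothing happens. Only when totalRequests >= circuitBreakerRequestVolumeThreshold does the logic go on to the second check.

The second check is if (hc.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()). If the share of failed requests in the window (for example 5 of 25 requests, 20%) is below circuitBreakerErrorThresholdPercentage (50% by default), again nothing happens. Otherwise, when the share of failed requests (for example 20 of 25 requests, 80%) is >= circuitBreakerErrorThresholdPercentage, the circuit is opened: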
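The two checks can be written as a small pure function, which makes the numbers above easy to verify. This is a sketch under stated assumptions: TripDecision and its method names are hypothetical, and the truncation of the error percentage to an integer is an assumption about how the percentage is computed, not something shown in the excerpt above.

```java
// Hypothetical helper that mirrors the two threshold checks in onNext().
final class TripDecision {

    // Assumption: the error percentage is an integer, truncated toward zero.
    static int errorPercentage(long totalRequests, long errorCount) {
        if (totalRequests <= 0) {
            return 0;
        }
        return (int) ((double) errorCount / totalRequests * 100);
    }

    // Returns true when the statistics for one window would open the circuit.
    static boolean shouldOpen(long totalRequests, long errorCount,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        // Check 1: too little traffic in the window, so no decision is made.
        if (totalRequests < requestVolumeThreshold) {
            return false;
        }
        // Check 2: open only when the error share reaches the threshold.
        return errorPercentage(totalRequests, errorCount) >= errorThresholdPercentage;
    }
}
```

With the defaults of 20 requests and 50%, shouldOpen(25, 5, 20, 50) is false and shouldOpen(25, 20, 20, 50) is true. A window with 10 requests that all failed does not open the circuit, because the request volume threshold is not reached.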

if (status.compareAndSet(Status.CLOSED, Status.OPEN)) {
      circuitOpened.set(System.currentTimeMillis());
     }

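This logic sets the breaker's status to OPEN, but only if it is currently CLOSED, and records the current time in circuitOpened. The default value of circuitOpened is -1, which means the circuit is closed.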

How the breaker recovers automatically after opening

 private boolean isAfterSleepWindow() {
            final long circuitOpenTime = circuitOpened.get();
            final long currentTime = System.currentTimeMillis();
            final long sleepWindowTime = properties.circuitBreakerSleepWindowInMilliseconds().get();
            return currentTime > circuitOpenTime + sleepWindowTime;
        }
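The comparison in isAfterSleepWindow() is strict, so the window has elapsed only when the current time is later than the opening time plus the sleep window. A minimal sketch with explicit timestamps makes the boundary visible. SleepWindowCheck is a hypothetical name, and all values are passed in as parameters instead of being read from Hystrix properties.

```java
// Hypothetical helper: the same comparison as isAfterSleepWindow(),
// with every input passed explicitly so the boundary can be tested.
final class SleepWindowCheck {

    static boolean isAfterSleepWindow(long circuitOpenedMillis, long nowMillis, long sleepWindowMillis) {
        // Strictly greater: at exactly opened + window the circuit still rejects.
        return nowMillis > circuitOpenedMillis + sleepWindowMillis;
    }
}
```

For a circuit opened at time 0 with a 5000 ms sleep window, the check is false at 5000 ms and true at 5001 ms.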

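circuitBreakerSleepWindowInMilliseconds defaults to 5 seconds. It can be changed, but the default is usually fine.

Suppose the circuit last opened at 20:00:00 and the current time is 20:00:06. Then 20:00:06 > 20:00:00 + circuitBreakerSleepWindowInMilliseconds (5s) = 20:00:05, so more than 5 seconds have passed since the circuit opened. The breaker's status changes from OPEN to HALF_OPEN, and this one request is let through as a trial.

If the trial request does not succeed (it is rejected, times out, or fails), execution reaches handleFallback, which calls circuitBreaker.markNonSuccess():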

 @Override
        public void markNonSuccess() {
            if (status.compareAndSet(Status.HALF_OPEN, Status.OPEN)) {
                //This thread wins the race to re-open the circuit - it resets the start time for the sleep window
                circuitOpened.set(System.currentTimeMillis());
            }
        }

This method changes the status from HALF_OPEN back to OPEN and sets circuitOpened to the current time, which restarts the sleep window.

If the trial request succeeds, the markEmits or markOnCompleted callback runs and then calls circuitBreaker.markSuccess():

 @Override
        public void markSuccess() {
            if (status.compareAndSet(Status.HALF_OPEN, Status.CLOSED)) {
                //This thread wins the race to close the circuit - it resets the stream to start it over from 0
                metrics.resetStream();
                Subscription previousSubscription = activeSubscription.get();
                if (previousSubscription != null) {
                    previousSubscription.unsubscribe();
                }
                Subscription newSubscription = subscribeToStream();
                activeSubscription.set(newSubscription);
                circuitOpened.set(-1L);
            }
        }

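This method changes the status from HALF_OPEN to CLOSED and sets circuitOpened back to -1. It also resets the metrics stream and resubscribes to it, so the statistics start again from zero.

[Figure: overall flow diagram]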
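The three states and the transitions between them can be put together in one small class. This is a simplified sketch, not the Hystrix implementation: SketchBreaker, onThresholdExceeded() and tryAcquire() are hypothetical names, the clock is injected so that time can be controlled, and the metrics stream is left out entirely.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;

// Hypothetical, simplified breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED or OPEN.
final class SketchBreaker {
    enum Status { CLOSED, OPEN, HALF_OPEN }

    private final AtomicReference<Status> status = new AtomicReference<>(Status.CLOSED);
    private final AtomicLong circuitOpened = new AtomicLong(-1L); // -1 means closed
    private final LongSupplier clock;        // assumption: injected time source
    private final long sleepWindowMillis;

    SketchBreaker(LongSupplier clock, long sleepWindowMillis) {
        this.clock = clock;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    Status status() {
        return status.get();
    }

    // Called when the statistics exceed both thresholds.
    void onThresholdExceeded() {
        if (status.compareAndSet(Status.CLOSED, Status.OPEN)) {
            circuitOpened.set(clock.getAsLong());
        }
    }

    // Hypothetical gate checked before each request.
    boolean tryAcquire() {
        if (status.get() == Status.CLOSED) {
            return true;
        }
        boolean afterSleepWindow = clock.getAsLong() > circuitOpened.get() + sleepWindowMillis;
        // Only the caller that wins the OPEN -> HALF_OPEN race runs the trial request.
        return afterSleepWindow && status.compareAndSet(Status.OPEN, Status.HALF_OPEN);
    }

    // Trial request succeeded: close the circuit.
    void markSuccess() {
        if (status.compareAndSet(Status.HALF_OPEN, Status.CLOSED)) {
            circuitOpened.set(-1L);
        }
    }

    // Trial request did not succeed: re-open and restart the sleep window.
    void markNonSuccess() {
        if (status.compareAndSet(Status.HALF_OPEN, Status.OPEN)) {
            circuitOpened.set(clock.getAsLong());
        }
    }
}
```

Because every transition is a compareAndSet on the status, concurrent callers cannot both win the same transition. In the HALF_OPEN state tryAcquire() returns false for everyone except the one caller that moved the status out of OPEN, so only a single trial request is in flight.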