
Redis Cluster Request Routing

Request Redirection

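When a Redis client issues any key-based command, it computes the key's slot, looks up the node that owns that slot, and sends the command to that node. A MOVED redirection can occur along the way: the key's slot has been migrated and the migration has already completed, but the client's local slot<->node mapping cache has not been updated. The Redis server then replies with a MOVED redirection that carries the details (the slot and the address of its new node), which the client can use to update its local cache and resend the command to the new node.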

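Note: in Jedis, ASK and MOVED redirections are handled differently. ASK means the slot is still being migrated: the client does not update its local cache and only uses the reply to send one new request to the indicated node. MOVED means the migration has finished but the local cache has not been refreshed, so the client must refresh its cache with the latest mapping.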
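To make the difference concrete, here is a minimal sketch of how a client might react to the two error replies. RedirectSketch and its slotCache field are hypothetical names, not Jedis classes, and the sketch updates a single cache entry on MOVED, whereas Jedis reloads the whole mapping with cluster slots.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Jedis) showing how MOVED and ASK differ for the client. */
class RedirectSketch {
    // Local slot -> "host:port" cache, as a cluster client might keep it.
    final Map<Integer, String> slotCache = new HashMap<>();

    /**
     * Handles an error reply such as "MOVED 3999 127.0.0.1:6381" or
     * "ASK 3999 127.0.0.1:6381" and returns the node to resend the command to.
     */
    String handle(String errorReply) {
        String[] parts = errorReply.split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("not a redirection: " + errorReply);
        }
        int slot = Integer.parseInt(parts[1]);
        String target = parts[2];
        switch (parts[0]) {
            case "MOVED":
                // Migration has finished: remember the new owner of the slot.
                slotCache.put(slot, target);
                return target;
            case "ASK":
                // Migration in progress: one-off redirect, cache left untouched.
                // A real client sends ASKING to the target before the command.
                return target;
            default:
                throw new IllegalArgumentException("not a redirection: " + errorReply);
        }
    }
}
```

After an ASK reply the cached owner of the slot stays as it was; only a MOVED reply changes it.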

Key Slot Calculation

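By default, a key's slot is the CRC16 hash of the key modulo 16384. Normally the whole key is hashed. Some batch operations, however, such as pipelines, mget, and mset, cannot span slots, so Redis provides the hash_tag naming format, for example test1:{abc}:test2. When the slot is computed for such a key, only the tag inside {} is hashed; that tag is called the hash_tag. Hash tags are a good solution for transactions, Lua scripts, and pipelined batch operations, because keys that share a tag land in the same slot.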
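The calculation can be sketched in a few lines of plain Java. SlotSketch is an illustrative name, not the Jedis implementation (Jedis ships its own JedisClusterCRC16); the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), and the sketch assumes the usual hash tag rule that only the first {...} pair with non-empty content counts.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative slot calculation; a sketch, not the Jedis implementation. */
class SlotSketch {
    /** Bitwise CRC16/XMODEM: polynomial 0x1021, initial value 0, no reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Slot of a key: hash only the hash tag if the key has a non-empty one. */
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                // Non-empty tag found: only the text between the braces is hashed.
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println(slot("foo")); // 12182
        // Same tag, same slot, so these two keys can be used together in mget/mset.
        System.out.println(slot("test1:{abc}:test2") == slot("other:{abc}"));
    }
}
```

Keys such as test1:{abc}:test2 and other:{abc} hash only abc, which is what lets a multi-key command address them on one node.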

Jedis Client Analysis

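The Jedis client maintains a slot→node mapping so that it can resolve a key to its node locally, which keeps I/O efficiency as high as possible; MOVED redirections help the client keep this mapping up to date. Jedis works with Redis Cluster as follows: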

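1. When JedisCluster is initialized, it picks a node at random and sends it the cluster slots command to initialize the local slot-to-node cache (JedisClusterInfoCache).

2. JedisCluster parses the cluster slots response, stores the result in JedisClusterInfoCache, and creates a separate JedisPool connection pool for each node.

3. It executes the key command. This part is more involved; the flow is:

**a.** Compute the slot, look up the target node's connection in the slot cache, and send the command.

**b.** On a connection error, re-execute the key command on a random connection; each retry uses up one of the maxAttempts attempts. (In the 2.9.0 listing below, the retry passes the tryRandomNode flag through unchanged, so a random node is used only if the flag was already set.)

**c.** On a MOVED redirection error, refresh the slot cache with the cluster slots command (the renewSlotCache method).

**d.** Repeat steps a to c until the command succeeds, or throw JedisClusterMaxRedirectionsException once the remaining attempts reach 0.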

The relevant implementation is shown below (Jedis 2.9.0):

// From redis.clients.jedis.JedisClusterCommand
public abstract class JedisClusterCommand<T> {
    // handler for connections to the cluster nodes
    private JedisClusterConnectionHandler connectionHandler;
    // maximum number of attempts (5 by default)
    private int maxAttempts;
    private ThreadLocal<Jedis> askConnection = new ThreadLocal<Jedis>();

    public JedisClusterCommand(JedisClusterConnectionHandler connectionHandler, int maxAttempts) {
        this.connectionHandler = connectionHandler;
        this.maxAttempts = maxAttempts;
    }
    // template callback method, implemented by each command
    public abstract T execute(Jedis var1);

    public T run(String key) {
        if (key == null) {
            throw new JedisClusterException("No way to dispatch this command to Redis Cluster.");
        } else {
            return this.runWithRetries(SafeEncoder.encode(key), this.maxAttempts, false, false);
        }
    }
    // execute the command with retries
    private T runWithRetries(byte[] key, int attempts, boolean tryRandomNode, boolean asking) {
        // no attempts left: throw JedisClusterMaxRedirectionsException
        if (attempts <= 0) {
            throw new JedisClusterMaxRedirectionsException("Too many Cluster redirections?");
        } else {
            Jedis connection = null;

            Object var7;
            try {
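                if (asking) {// the previous attempt got an ASK redirection: reuse the connection to the ASK target and send ASKING first
                    connection = (Jedis)this.askConnection.get();
                    connection.asking();
                    asking = false;
                } else if (tryRandomNode) {// tryRandomNode is set: take a connection to a random live node instead of using the slot cache
                    connection = this.connectionHandler.getConnection();
                } else {
                    // look up the target connection in the slot cache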
                    connection = this.connectionHandler.getConnectionFromSlot(JedisClusterCRC16.getSlot(key));
                }

                Object var6 = this.execute(connection);
                return var6;
            } catch (JedisNoReachableClusterNodeException var13) {
                throw var13;
            } catch (JedisConnectionException var14) {
                // connection error: release the connection
                this.releaseConnection(connection);
                connection = null;
                if (attempts <= 1) {

                    this.connectionHandler.renewSlotCache();
                    throw var14;
                }

                var7 = this.runWithRetries(key, attempts - 1, tryRandomNode, asking);
                return var7;
            } catch (JedisRedirectionException var15) {
                if (var15 instanceof JedisMovedDataException) {
                    // MOVED redirection: run cluster slots again and refresh the cached cluster layout
                    this.connectionHandler.renewSlotCache(connection);
                }

                this.releaseConnection(connection);
                connection = null;
                if (var15 instanceof JedisAskDataException) {
                    asking = true;
                    this.askConnection.set(this.connectionHandler.getConnectionFromNode(var15.getTargetNode()));
                } else if (!(var15 instanceof JedisMovedDataException)) {
                    // Neither ASK nor MOVED: unknown redirection type, fail immediately.
                    // A MOVED reply has already refreshed the slot-node mapping via renewSlotCache above
                    // and falls through to the retry below.
                    throw new JedisClusterException(var15);
                }
                // each retry uses up one of the remaining attempts (attempts - 1)
                var7 = this.runWithRetries(key, attempts - 1, false, asking);
            } finally {
                this.releaseConnection(connection);
            }

            return var7;
        }
    }
    // releaseConnection(Jedis) and the remaining members are omitted from this excerpt
}
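The recursive listing can be hard to follow, so the MOVED path is condensed here into a loop over a simulated cluster. This is a sketch under simplifying assumptions: RetrySketch, MovedException, owners, and send are hypothetical names, the "cluster" is just a map, and ASK handling and connection errors are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Simplified, iterative model of the MOVED retry flow; names are hypothetical, not Jedis APIs. */
class RetrySketch {
    /** Stand-in for a MOVED reply. */
    static class MovedException extends RuntimeException {
        MovedException(String target) { super("MOVED " + target); }
    }

    final Map<Integer, String> owners = new HashMap<>();    // server side: real slot -> node
    final Map<Integer, String> slotCache = new HashMap<>(); // client side: possibly stale
    int cacheRefreshes = 0;

    /** Simulated node: serves the slot only if it really owns it, otherwise answers MOVED. */
    String send(String node, int slot) {
        String owner = owners.get(slot);
        if (!Objects.equals(node, owner)) {
            throw new MovedException(owner);
        }
        return "OK from " + node;
    }

    /** Stand-in for renewSlotCache: reload the whole mapping, as cluster slots would. */
    void renewSlotCache() {
        slotCache.clear();
        slotCache.putAll(owners);
        cacheRefreshes++;
    }

    /** Try the cached node; on MOVED refresh the cache and retry with one attempt fewer. */
    String run(int slot, int maxAttempts) {
        for (int attempts = maxAttempts; attempts > 0; attempts--) {
            try {
                return send(slotCache.get(slot), slot);
            } catch (MovedException e) {
                renewSlotCache();
            }
        }
        throw new IllegalStateException("Too many Cluster redirections?");
    }
}
```

With a stale cache entry, the first attempt is answered with MOVED, the cache is reloaded once, and the second attempt succeeds on the new owner.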

Problem analysis:

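1. JedisCluster keeps a mapping from slots to cluster nodes and a separate JedisPool for every node, and each pool holds multiple connections. In a very large cluster this adds up to a large number of connections and significant memory use.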

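2. A common exception is JedisClusterMaxRedirectionsException (too many redirections). When a node is down or a connection times out, JedisConnectionException is thrown and triggers a retry; once the remaining attempts drop to 0, JedisClusterMaxRedirectionsException is thrown. (In the 2.9.0 listing above, a connection failure on the last attempt instead refreshes the slot cache and rethrows the original JedisConnectionException.)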

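3. JedisConnectionException. When Jedis catches this exception it treats the node connection as faulty and retries on a random node, which in turn refreshes the local JedisClusterInfoCache. The following situations raise it:

a. A socket error occurs on the Jedis connection to a node.

b. Any command or Lua script hits a read or write timeout.

c. In older versions of Jedis, timing out while borrowing a Jedis object from the JedisPool also raised it; since 2.8.1 a pool timeout throws JedisException instead, which avoids triggering the random retry.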

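4. Redis Cluster supports automatic failover, but failover takes some time. While a node is down, every command routed to it triggers the random retry, and each MOVED redirection calls renewSlotCache, which ends up in the renewClusterSlots method of JedisClusterInfoCache. The code is as follows: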


public void renewClusterSlots(Jedis jedis) {
    if (!this.rediscovering) {
        try {
            // acquire the write lock
            this.w.lock();
            this.rediscovering = true;
            if (jedis != null) {
                try {
                    this.discoverClusterSlots(jedis);
                    return;
                } catch (JedisException var17) {
                    ;
                }
            }
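            // if no connection was passed in (or discovery on it failed), fall back to the random retry below
            // take the connection pools in shuffled order and send cluster slots (wrapped in discoverClusterSlots) to fetch the slot assignment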
            Iterator var2 = this.getShuffledNodesPool().iterator();

            while(var2.hasNext()) {
                JedisPool jp = (JedisPool)var2.next();

                try {
                    jedis = jp.getResource();
                    this.discoverClusterSlots(jedis);
                    return;
                } catch (JedisConnectionException var15) {
                    ;
                } finally {
                    if (jedis != null) {
                        jedis.close();
                    }

                }
            }
        } finally {
            this.rediscovering = false;
            this.w.unlock();
        }
    }

}

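Errors on a few individual nodes can therefore cause the slot cache to be refreshed frequently, with many cluster slots calls; under high concurrency this puts excessive load on the Redis nodes. The larger the cluster's slot<->node mapping, the more data cluster slots returns and the more bandwidth it uses, which makes the problem worse. When a JedisConnectionException occurs, 5 additional commands are sent: 4 retries of the command plus 1 cluster slots call. Only one cluster slots call is made because the rediscovering flag ensures that only one thread updates the cache at a time.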
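The role of the rediscovering flag can be isolated in a small stand-alone sketch. RefreshGuardSketch and its members are hypothetical names modelled on the listing above, not the Jedis class itself. Note that the flag is checked before the lock is taken, so it is a best-effort shortcut: the write lock is what actually serializes updates, and the flag lets callers that arrive during a refresh skip their own cluster slots call.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the "rediscovering" guard; illustrative only, not the Jedis class. */
class RefreshGuardSketch {
    private final ReentrantReadWriteLock.WriteLock w = new ReentrantReadWriteLock().writeLock();
    private volatile boolean rediscovering = false;
    int refreshCount = 0;

    /** Runs the refresh unless one is already in progress; returns whether it ran. */
    boolean renew(Runnable discover) {
        if (rediscovering) {
            return false; // a refresh is already in flight: skip instead of queueing another
        }
        w.lock();
        try {
            rediscovering = true;
            refreshCount++;
            discover.run(); // stands in for sending cluster slots and rebuilding the mapping
            return true;
        } finally {
            rediscovering = false;
            w.unlock();
        }
    }
}
```

A call that arrives while a refresh is running returns false immediately, so a burst of failing commands results in far fewer cluster slots requests than commands.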