RedisCluster Source Code Analysis

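Our project has used the redis-client jar for a long time, yet I knew nothing about how it is implemented; I only called its API to get features working. Wanting to understand not just what it does but why it works, I read through the main parts of the source.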

I started the source code tour with the following questions in mind.

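  • How is the cluster initialized?

  • What exactly is a Jedis object?

  • What happens during a single set call?

  • When a master node in the Redis cluster goes down and one of its replicas is promoted to master, how does the client handle it?

  • When a master node and all of its replicas go down, how does the client handle it?

  • ...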

Initializing the cluster

public class JedisCluster extends BinaryJedisCluster implements JedisClusterCommands,
 MultiKeyJedisClusterCommands, JedisClusterScriptingCommands {  

    public JedisCluster(Set<HostAndPort> jedisClusterNode, int connectionTimeout,
                   int soTimeout, int maxAttempts, String password, String clientName, 
                   GenericObjectPoolConfig poolConfig) {   
               // Call the parent class constructor
               super(jedisClusterNode, connectionTimeout, soTimeout, maxAttempts, password, 
                     clientName, poolConfig);}
}

Calling the parent class constructor

public BinaryJedisCluster(Set<HostAndPort> jedisClusterNode, int connectionTimeout, 
                          int soTimeout, int maxAttempts, String password, String clientName, 
                          GenericObjectPoolConfig poolConfig) {    
   // Initialize the cluster connection handler
   this.connectionHandler = new JedisSlotBasedConnectionHandler(jedisClusterNode, poolConfig, 
                                       connectionTimeout, soTimeout, password, clientName);   
   // Maximum number of attempts when executing a command
   this.maxAttempts = maxAttempts;
}

Cluster connection handler

The handler initializes the cluster information and stores it in a field of type JedisClusterInfoCache.

public abstract class JedisClusterConnectionHandler implements Closeable {
    protected final JedisClusterInfoCache cache;

    public JedisClusterConnectionHandler(Set<HostAndPort> nodes, GenericObjectPoolConfig poolConfig, int connectionTimeout, int soTimeout, String password, String clientName) {
        // Initialize the cluster info cache object
        this.cache = new JedisClusterInfoCache(poolConfig, connectionTimeout, soTimeout, password, clientName);
        this.initializeSlotsCache(nodes, poolConfig, connectionTimeout, soTimeout, password, clientName);
    }
    
    //...
}

Cluster info cache class

It maintains the nodes and slots mappings.

public class JedisClusterInfoCache {
    // The two most important fields: nodes and slots
    private final Map<String, JedisPool> nodes;  // cluster node "IP:PORT" -> JedisPool
    private final Map<Integer, JedisPool> slots; // cluster slot [0-16383] -> JedisPool
    private final ReentrantReadWriteLock rwl;
    private final Lock r;
    private final Lock w;
    private volatile boolean rediscovering;
    private final GenericObjectPoolConfig poolConfig;
    private int connectionTimeout;
    private int soTimeout;
    private String password;
    private String clientName;
    private static final int MASTER_NODE_INDEX = 2;
    
    // Initialize nodes and slots
    public void discoverClusterNodesAndSlots(Jedis jedis) {
        this.w.lock();
        try {
            this.reset();
            // Fetch the slot layout and node information of the whole cluster
            List<Object> slots = jedis.clusterSlots();
            // Each Object is a List with the following entries:
            // (0) start slot, (1) end slot, (2) master IP and port (a List), (3+) replica IP and port (one List per replica)
            Iterator var3 = slots.iterator();
            while(true) {
                List slotInfo;
                // Advance to the next entry whose size is greater than 2
                do {
                    if (!var3.hasNext()) {
                        return;
                    }
                    Object slotInfoObj = var3.next();
                    slotInfo = (List)slotInfoObj; 
                } while(slotInfo.size() <= 2);
                // Work out which slots this entry covers
                List<Integer> slotNums = this.getAssignedSlotArray(slotInfo);
                int size = slotInfo.size();
                for(int i = 2; i < size; ++i) {
                    // Get the List describing the master or one of the replicas
                    List<Object> hostInfos = (List)slotInfo.get(i);
                    if (hostInfos.size() > 0) {
                        HostAndPort targetNode = this.generateHostAndPort(hostInfos);
                        // Populate the nodes field of JedisClusterInfoCache
                        this.setupNodeIfNotExist(targetNode);
                        if (i == 2) { // index 2 is the master node
                            // Populate the slots field of JedisClusterInfoCache
                            this.assignSlotsToNode(slotNums, targetNode);
                        }
                    }
                }
            }
        } finally {
            this.w.unlock();
        }
    }
    // Rebuild slots; nodes only gains entries for masters that were not cached before
    public void renewClusterSlots(Jedis jedis) {
        if (!this.rediscovering) {
            try {
                this.w.lock();
                if (!this.rediscovering) {
                    this.rediscovering = true;
                    try {
                        if (jedis != null) {
                            try {
                                this.discoverClusterSlots(jedis);
                                return;
                            } catch (JedisException var26) {
                            }
                        }
                        // If jedis is null (or the attempt above failed), shuffle all JedisPools and try them one by one until the cache is rebuilt
                        Iterator var2 = this.getShuffledNodesPool().iterator();
                        while(var2.hasNext()) {
                            JedisPool jp = (JedisPool)var2.next();
                            Jedis j = null;
                            try {
                                j = jp.getResource();
                                this.discoverClusterSlots(j);
                                return;
                            } catch (JedisConnectionException var24) {
                            } finally {
                                if (j != null) {
                                    j.close();
                                }
                            }
                        }
                    } finally {
                        this.rediscovering = false;
                    }
                }
            } finally {
                this.w.unlock();
            }
        }
    }
   // Rebuild the cache using a single connection
   private void discoverClusterSlots(Jedis jedis) {
        // Important! Important! Important!
        List<Object> slots = jedis.clusterSlots();
        this.slots.clear();
        Iterator var3 = slots.iterator();
        while(var3.hasNext()) {
            Object slotInfoObj = var3.next();
            List<Object> slotInfo = (List)slotInfoObj;
            if (slotInfo.size() > 2) {
                List<Integer> slotNums = this.getAssignedSlotArray(slotInfo);
                List<Object> hostInfos = (List)slotInfo.get(2);
                if (!hostInfos.isEmpty()) {
                    HostAndPort targetNode = this.generateHostAndPort(hostInfos);
                    this.assignSlotsToNode(slotNums, targetNode);
                }
            }
        }
    }
    // Set up the slot -> JedisPool mapping
    public void assignSlotsToNode(List<Integer> targetSlots, HostAndPort targetNode) {
        this.w.lock();
        try {
            JedisPool targetPool = this.setupNodeIfNotExist(targetNode);
            Iterator var4 = targetSlots.iterator();
            while(var4.hasNext()) {
                Integer slot = (Integer)var4.next();
                this.slots.put(slot, targetPool);
            }
        } finally {
            this.w.unlock();
        }
    }
    // Set up the node -> JedisPool mapping
    public JedisPool setupNodeIfNotExist(HostAndPort node) {
        this.w.lock();
        JedisPool nodePool;
        try {
            String nodeKey = getNodeKey(node); // nodeKey has the form ip:port
            JedisPool existingPool = (JedisPool)this.nodes.get(nodeKey);
            if (existingPool == null) {
                nodePool = new JedisPool(this.poolConfig, node.getHost(), node.getPort(), this.connectionTimeout, this.soTimeout, this.password, 0, this.clientName, false, (SSLSocketFactory)null, (SSLParameters)null, (HostnameVerifier)null);
                // Map the node to its JedisPool; each node has exactly one JedisPool object
                this.nodes.put(nodeKey, nodePool);
                JedisPool var5 = nodePool;
                return var5;
            }
            nodePool = existingPool;
        } finally {
            this.w.unlock();
        }
        return nodePool;
    }

}
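
The relationship between the two maps can be illustrated with a stripped-down sketch. This is not the Jedis implementation: FakePool is a hypothetical stand-in for JedisPool, the class and method names are invented for illustration, the node addresses are made up, and locking is left out. It only shows that every slot in a range shares the one pool object of its master, while replicas appear in nodes but not in slots.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JedisPool: it only remembers which node it belongs to.
class FakePool {
    final String nodeKey;
    FakePool(String nodeKey) { this.nodeKey = nodeKey; }
}

class SlotCacheSketch {
    private final Map<String, FakePool> nodes = new HashMap<>();   // "ip:port" -> pool
    private final Map<Integer, FakePool> slots = new HashMap<>();  // slot -> pool

    // Mirrors setupNodeIfNotExist: a pool is created only once per node.
    FakePool setupNodeIfNotExist(String nodeKey) {
        return nodes.computeIfAbsent(nodeKey, FakePool::new);
    }

    // Mirrors assignSlotsToNode: every slot in the range points to the same pool object.
    void assignSlots(int first, int last, String nodeKey) {
        FakePool pool = setupNodeIfNotExist(nodeKey);
        for (int slot = first; slot <= last; slot++) {
            slots.put(slot, pool);
        }
    }

    FakePool getSlotPool(int slot) { return slots.get(slot); }
    int nodeCount() { return nodes.size(); }
    int slotCount() { return slots.size(); }

    public static void main(String[] args) {
        SlotCacheSketch cache = new SlotCacheSketch();
        cache.assignSlots(0, 5460, "10.0.0.1:7000");       // master 1
        cache.setupNodeIfNotExist("10.0.0.4:7003");        // a replica: cached in nodes only
        cache.assignSlots(5461, 10922, "10.0.0.2:7001");   // master 2
        cache.assignSlots(10923, 16383, "10.0.0.3:7002");  // master 3
        System.out.println(cache.nodeCount() + " nodes, " + cache.slotCount() + " slots");
        System.out.println("slot 100 -> " + cache.getSlotPool(100).nodeKey);
    }
}
```

Storing one map entry per slot costs 16384 entries, but it makes the lookup in getSlotPool a single map access.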

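Initializing the cache

The code iterates over the configured cluster nodes (masters and replicas alike) and tries to initialize the cache from the current node. If that succeeds, it breaks out of the loop; if it fails with a connection exception, it moves on and tries a connection to the next node.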

private void initializeSlotsCache(Set<HostAndPort> startNodes, GenericObjectPoolConfig poolConfig, int connectionTimeout, int soTimeout, String password, String clientName) {
        Iterator var7 = startNodes.iterator();
        while(var7.hasNext()) {
            HostAndPort hostAndPort = (HostAndPort)var7.next();
            Jedis jedis = null;
            try {
                jedis = new Jedis(hostAndPort.getHost(), hostAndPort.getPort(), connectionTimeout, soTimeout);
                if (password != null) { // authenticate with the cluster password
                    jedis.auth(password);
                }
                if (clientName != null) {
                    jedis.clientSetname(clientName);
                }
                // The single most important initialization step; see JedisClusterInfoCache
                this.cache.discoverClusterNodesAndSlots(jedis);
                break;
            } catch (JedisConnectionException var14) {
            } finally {
                if (jedis != null) {
                    jedis.close();
                }
            }
        }
    }
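
The "first reachable seed node wins" pattern above can be sketched without any Redis dependency. In this sketch the discover callback is a hypothetical stand-in for connecting to a node and calling discoverClusterNodesAndSlots; returning false plays the role of the swallowed JedisConnectionException. All names and addresses are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class SeedNodeFallbackSketch {
    // Try each seed node in order and return the first one for which discovery
    // succeeds, or null when every node fails.
    static String initFromFirstReachable(List<String> seeds, Predicate<String> discover) {
        for (String seed : seeds) {
            if (discover.test(seed)) {
                return seed; // corresponds to the `break` in initializeSlotsCache
            }
            // A failure is ignored here, just like the empty catch block above.
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> seeds = Arrays.asList("10.0.0.1:7000", "10.0.0.2:7001", "10.0.0.3:7002");
        // Pretend the first seed node is down.
        String used = initFromFirstReachable(seeds, s -> !s.equals("10.0.0.1:7000"));
        System.out.println("initialized from " + used);
    }
}
```

One consequence worth noting, judging only from the listing above: if every seed node fails, the loop simply ends and the cache stays empty without an exception being raised at that point.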

Executing a command

Take the set method as an example.

public String set(final String key, final String value) {
        return (String)(new JedisClusterCommand<String>(this.connectionHandler, this.maxAttempts) {
            public String execute(Jedis connection) {
                return connection.set(key, value);
            }
        }).run(key);
    }

Cluster command class (its run method is especially important)

public abstract class JedisClusterCommand<T> {
    // Holds the cluster connection handler
    private final JedisClusterConnectionHandler connectionHandler;
    // Maximum number of attempts; one is used up each time execution hits a connection exception or a redirection
    private final int maxAttempts;

    public JedisClusterCommand(JedisClusterConnectionHandler connectionHandler, int maxAttempts) {
        this.connectionHandler = connectionHandler;
        this.maxAttempts = maxAttempts;
    }

    public abstract T execute(Jedis var1);

    // Execute the command
    public T run(String key) {
        return this.runWithRetries(JedisClusterCRC16.getSlot(key), this.maxAttempts, false, (JedisRedirectionException)null);
    }
    
    private T runWithRetries(int slot, int attempts, boolean tryRandomNode, JedisRedirectionException redirect) {
        if (attempts <= 0) { // all attempts are used up: throw
            throw new JedisClusterMaxAttemptsException("No more cluster attempts left.");
        } else {
            Jedis connection = null;
            Object var7;
            try {
                if (redirect != null) {
                    connection = this.connectionHandler.getConnectionFromNode(redirect.getTargetNode());
                    if (redirect instanceof JedisAskDataException) {
                        connection.asking();
                    }
                } else if (tryRandomNode) {
                    connection = this.connectionHandler.getConnection();
                } else {
                    // Most calls take this branch: ask the connection handler for the connection that serves this slot
                    connection = this.connectionHandler.getConnectionFromSlot(slot);
                }
                // Actually execute the command
                Object var6 = this.execute(connection);
                return var6;
            } catch (JedisNoReachableClusterNodeException var13) {
                throw var13;
            } catch (JedisConnectionException var14) {
                this.releaseConnection(connection);
                connection = null;
                if (attempts <= 1) { 
                    // Rebuild the node -> JedisPool and slot -> JedisPool mappings
                    this.connectionHandler.renewSlotCache();
                }
                // Retry
                var7 = this.runWithRetries(slot, attempts - 1, tryRandomNode, redirect);
            } catch (JedisRedirectionException var15) {
                if (var15 instanceof JedisMovedDataException) {
                    this.connectionHandler.renewSlotCache(connection);
                }
                this.releaseConnection(connection);
                connection = null;
                // Retry
                var7 = this.runWithRetries(slot, attempts - 1, false, var15);
                return var7;
            } finally {
                // Release the connection; this is why callers never need to close connections themselves when using the cluster client
                this.releaseConnection(connection);
            }
            return var7;
        }
    }
    //...
}
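
The run method starts from JedisClusterCRC16.getSlot(key). The Redis Cluster specification defines the slot as CRC16(key) mod 16384, using the XMODEM variant of CRC16, and hashes only the hash tag when the key contains a non-empty {...} section. The following is a minimal, unoptimized sketch of that rule; it is not the Jedis code (which may well use a lookup table), and the class and method names are invented.

```java
import java.nio.charset.StandardCharsets;

class KeySlotSketch {
    // Bitwise CRC16, XMODEM variant: polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // Slot of a key: if there is a non-empty hash tag between the first '{' and
    // the first '}' after it, only the tag is hashed.
    static int slotOf(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println("foo -> " + slotOf("foo"));
        // Keys sharing a hash tag land in the same slot.
        System.out.println(slotOf("{user1000}.following") == slotOf("{user1000}.followers"));
    }
}
```

Hash tags matter for multi-key commands in a cluster, because all keys of one command have to live in the same slot.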

Getting a connection

The cluster connection handler returns the connection that serves a given slot.

    public Jedis getConnectionFromSlot(int slot) {
        // First look up the JedisPool for this slot
        JedisPool connectionPool = this.cache.getSlotPool(slot);
        if (connectionPool != null) {
            return connectionPool.getResource();
        } else {
            // Rebuild the cached node -> JedisPool and slot -> JedisPool mappings
            this.renewSlotCache();
            // Look up the JedisPool for this slot again
            connectionPool = this.cache.getSlotPool(slot);
            return connectionPool != null ? connectionPool.getResource() : this.getConnection();
        }
    }

Rebuilding the cache

The rebuild goes through the cluster connection handler into the cluster info cache class.

   public void renewSlotCache() {
        this.cache.renewClusterSlots((Jedis)null);
    }

Back to the set command

connection.set(key, value);

Jedis

A Jedis object is essentially a connection.

public class Jedis extends BinaryJedis implements JedisCommands, MultiKeyCommands, AdvancedJedisCommands, ScriptingCommands, BasicCommands, ClusterCommands, SentinelCommands, ModuleCommands {   
    //...
    public String set(String key, String value) {
        this.checkIsInMultiOrPipeline();
        this.client.set(key, value);
        return this.client.getStatusCodeReply();
    }
    //...
}

BinaryJedis

public class BinaryJedis implements BasicCommands, BinaryJedisCommands, MultiKeyBinaryCommands, AdvancedBinaryJedisCommands, BinaryScriptingCommands, Closeable {
    protected Client client;
    protected Transaction transaction;
    protected Pipeline pipeline;
    //...
}

Client

public class Client extends BinaryClient implements Commands {
     //...
}

BinaryClient

public class BinaryClient extends Connection {
    private boolean isInMulti;
    private String password;
    private int db;
    private boolean isInWatch;
    //...
}

Connection

public class Connection implements Closeable {
    private static final byte[][] EMPTY_ARGS = new byte[0][];
    private String host = "localhost";
    private int port = 6379;
    private Socket socket;
    private RedisOutputStream outputStream;
    private RedisInputStream inputStream;
    private int connectionTimeout = 2000;
    private int soTimeout = 2000;
    private boolean broken = false;
    private boolean ssl;
    private SSLSocketFactory sslSocketFactory;
    private SSLParameters sslParameters;
    private HostnameVerifier hostnameVerifier;
    //...
}

Tracing the execution flow

public class Jedis extends BinaryJedis implements JedisCommands, MultiKeyCommands, AdvancedJedisCommands, ScriptingCommands, BasicCommands, ClusterCommands, SentinelCommands, ModuleCommands {   
    //...
    public String set(String key, String value) {
        this.checkIsInMultiOrPipeline();
        // Send the command over the socket
        this.client.set(key, value);
        // Read the reply from the socket
        return this.client.getStatusCodeReply();
    }
    //...
}

client.set eventually reaches the following method of the Connection class

 public void sendCommand(ProtocolCommand cmd, byte[]... args) {
        try {
            this.connect(); // open a new connection if not connected yet
            Protocol.sendCommand(this.outputStream, cmd, args);
        } catch (JedisConnectionException var6) {
            JedisConnectionException ex = var6;
            try {
                String errorMessage = Protocol.readErrorLineIfPossible(this.inputStream);
                if (errorMessage != null && errorMessage.length() > 0) {
                    ex = new JedisConnectionException(errorMessage, ex.getCause());
                }
            } catch (Exception var5) {
            }
            this.broken = true;
            throw ex;
        }
    }

Checking for and opening the socket connection

public void connect() {
        if (!this.isConnected()) { // not connected yet
            try {
                this.socket = new Socket(); // create a Socket
                this.socket.setReuseAddress(true);
                this.socket.setKeepAlive(true);
                this.socket.setTcpNoDelay(true);
                this.socket.setSoLinger(true, 0);
                this.socket.connect(new InetSocketAddress(this.host, this.port), this.connectionTimeout);
                this.socket.setSoTimeout(this.soTimeout);
                if (this.ssl) {
                    if (null == this.sslSocketFactory) {
                        this.sslSocketFactory = (SSLSocketFactory)SSLSocketFactory.getDefault();
                    }

                    this.socket = this.sslSocketFactory.createSocket(this.socket, this.host, this.port, true);
                    if (null != this.sslParameters) {
                        ((SSLSocket)this.socket).setSSLParameters(this.sslParameters);
                    }

                    if (null != this.hostnameVerifier && !this.hostnameVerifier.verify(this.host, ((SSLSocket)this.socket).getSession())) {
                        String message = String.format("The connection to '%s' failed ssl/tls hostname verification.", this.host);
                        throw new JedisConnectionException(message);
                    }
                }
                // Wrap the socket's output and input streams
                this.outputStream = new RedisOutputStream(this.socket.getOutputStream());
                this.inputStream = new RedisInputStream(this.socket.getInputStream());
            } catch (IOException var2) {
                this.broken = true;
                throw new JedisConnectionException("Failed connecting to host " + this.host + ":" + this.port, var2);
            }
        }
    }

client.getStatusCodeReply eventually reaches the following methods of the Connection class

    public String getStatusCodeReply() {
        this.flush(); // flush the data buffered in the output stream
        byte[] resp = (byte[])((byte[])this.readProtocolWithCheckingBroken()); // read the reply
        return null == resp ? null : SafeEncoder.encode(resp); // decode the bytes into a String
    }

    protected Object readProtocolWithCheckingBroken() {
        try {
            return Protocol.read(this.inputStream);
        } catch (JedisConnectionException var2) {
            this.broken = true;
            throw var2;
        }
    }
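
What travels over the socket in these two steps is the Redis serialization protocol (RESP). As a rough illustration, a command is sent as an array of bulk strings, and a successful SET is answered with the simple string +OK. The sketch below works on Java strings for readability, whereas Jedis itself writes and reads bytes through RedisOutputStream and RedisInputStream; the class and method names are invented.

```java
import java.nio.charset.StandardCharsets;

class RespSketch {
    // Encode a command as a RESP array of bulk strings:
    // *<argc>\r\n followed by $<byte length>\r\n<arg>\r\n for each argument.
    static String encode(String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append('*').append(args.length).append("\r\n");
        for (String arg : args) {
            int byteLength = arg.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLength).append("\r\n").append(arg).append("\r\n");
        }
        return sb.toString();
    }

    // Parse a simple-string reply such as "+OK\r\n". Error replies start with '-'.
    static String parseStatus(String reply) {
        if (!reply.endsWith("\r\n")) {
            throw new IllegalArgumentException("reply is not terminated by CRLF");
        }
        String body = reply.substring(1, reply.length() - 2);
        if (reply.startsWith("-")) {
            throw new IllegalStateException(body); // the server reported an error
        }
        if (!reply.startsWith("+")) {
            throw new IllegalArgumentException("not a status reply");
        }
        return body;
    }

    public static void main(String[] args) {
        String wire = encode("SET", "key", "value");
        // Make the line terminators visible when printing.
        System.out.println(wire.replace("\r\n", "\\r\\n"));
        System.out.println(parseStatus("+OK\r\n"));
    }
}
```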

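Cluster topology changes

As JedisClusterCommand.runWithRetries above shows, the client rebuilds the slots-to-JedisPool mapping in two situations: when a command fails with a connection exception on its last remaining attempt, and when the server answers with a MOVED redirection. During the rebuild, a JedisPool is also created in nodes for any master that is not cached yet.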
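
The retry bookkeeping for connection failures can be replayed with a small simulation. It follows the shape of runWithRetries as listed earlier but is not the Jedis code: ConnectionFailure is a hypothetical stand-in for JedisConnectionException, IllegalStateException stands in for JedisClusterMaxAttemptsException, and a counter replaces the real renewSlotCache call.

```java
import java.util.function.Supplier;

class RetrySketch {
    // Hypothetical stand-in for JedisConnectionException.
    static class ConnectionFailure extends RuntimeException {}

    int callCount = 0;   // how often the command was actually attempted
    int renewCount = 0;  // how often the slot cache was rebuilt

    <T> T runWithRetries(Supplier<T> command, int attempts) {
        if (attempts <= 0) {
            throw new IllegalStateException("No more cluster attempts left.");
        }
        try {
            callCount++;
            return command.get();
        } catch (ConnectionFailure e) {
            if (attempts <= 1) {
                renewCount++; // stands in for connectionHandler.renewSlotCache()
            }
            return runWithRetries(command, attempts - 1);
        }
    }

    public static void main(String[] args) {
        RetrySketch sketch = new RetrySketch();
        try {
            // A node that is permanently down, with maxAttempts = 3.
            sketch.runWithRetries(() -> { throw new ConnectionFailure(); }, 3);
        } catch (IllegalStateException e) {
            System.out.println("gave up: " + e.getMessage());
        }
        System.out.println("calls=" + sketch.callCount + ", renews=" + sketch.renewCount);
    }
}
```

Read this way, the call that triggers the rebuild on its last attempt still fails, because the following recursive call starts with zero attempts left; it is later calls that benefit from the refreshed cache. This observation rests only on the listing above, and other Jedis versions may behave differently.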

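Failover

Suppose that after JedisCluster has been initialized, a master node in the cluster goes down and one of its replicas is promoted. The master responsible for the affected slots has now changed. The nodes-to-JedisPool mapping can stay as it is: surviving nodes keep their original JedisPool objects, and the JedisPool of the promoted replica is already cached in nodes. The slots-to-JedisPool mapping, however, must be rebuilt. The failed master and the newly promoted master have two different JedisPool objects, and slots caches JedisPool objects, so the affected slots keep pointing at the failed master's pool until the mapping is rebuilt.

So when does the rebuild happen?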

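It happens when a thread executes a command on a slot that was managed by the failed-over node.
All of the code discussed below comes from JedisClusterCommand.runWithRetries.
1. A connection object can still be obtained from the pool, but it targets the IP and port of the old, failed master:
   connection = this.connectionHandler.getConnectionFromSlot(slot);
2. Then Object var6 = this.execute(connection); runs. Under the hood this needs a usable connection, and because the old master is down it is bound to fail.
3. The retries against the old master all fail, so the slots-to-JedisPool mapping is rebuilt:
   this.connectionHandler.renewSlotCache();
   Note that this uses a JedisPool that is still usable, namely the replica's. The original nodes-to-JedisPool mapping contains the replicas as well as the masters, so the JedisPool for the replica's IP and port (it is the master now) is taken from that mapping.
[Jedis is not thread-safe. Several different Jedis instances may all need to trigger a rebuild at the same time, but the rebuild is guarded by a lock underneath, which keeps it thread-safe.]
[As the cluster initialization in discoverClusterNodesAndSlots(jedis) shows, the JedisPool objects in the slots-to-JedisPool mapping are those of master nodes.]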
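
The effect of a failover on the two maps can be replayed in a few lines. This is a simplified model rather than the Jedis code: Pool is a hypothetical stand-in for JedisPool, a single slot range is used, the addresses are made up, and locking is omitted. It shows that after the rebuild the slots point at the pool object that was already cached for the replica, and that the old master's pool remains in nodes.

```java
import java.util.HashMap;
import java.util.Map;

class FailoverSketch {
    // Hypothetical stand-in for JedisPool.
    static class Pool {
        final String nodeKey;
        Pool(String nodeKey) { this.nodeKey = nodeKey; }
    }

    final Map<String, Pool> nodes = new HashMap<>();   // "ip:port" -> pool
    final Map<Integer, Pool> slots = new HashMap<>();  // slot -> pool

    // Modeled on discoverClusterSlots: drop all slot mappings, then point the
    // range at the pool of whichever node the cluster now reports as master.
    void rebuildSlots(int first, int last, String masterKey) {
        slots.clear();
        Pool pool = nodes.computeIfAbsent(masterKey, Pool::new);
        for (int slot = first; slot <= last; slot++) {
            slots.put(slot, pool);
        }
    }

    public static void main(String[] args) {
        FailoverSketch cache = new FailoverSketch();
        // The replica's pool has been cached in nodes since initialization.
        cache.nodes.put("10.0.0.4:7003", new Pool("10.0.0.4:7003"));
        cache.rebuildSlots(0, 5460, "10.0.0.1:7000");
        Pool replicaPool = cache.nodes.get("10.0.0.4:7003");

        // Failover: the cluster now reports the former replica as master.
        cache.rebuildSlots(0, 5460, "10.0.0.4:7003");
        System.out.println("slot 42 uses the cached replica pool: " + (cache.slots.get(42) == replicaPool));
        System.out.println("old master still cached in nodes: " + cache.nodes.containsKey("10.0.0.1:7000"));
    }
}
```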

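Adding nodes

Suppose that after JedisCluster has been initialized, nodes are added to the cluster. The slot ranges managed by the individual nodes are then no longer what they were. In the nodes-to-JedisPool mapping, existing nodes keep their original JedisPool objects, while the JedisPool of each added node is newly created and cached in nodes. The slots-to-JedisPool mapping then has to be rebuilt once more.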

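Given how the rebuild works during failover, the rest follows easily: a newly added node needs a new JedisPool object, which is cached in the nodes-to-JedisPool mapping, and the slots-to-JedisPool mapping is rebuilt. If the cluster is scaled out with downtime, both mappings are simply built when the cluster client is initialized. If it is scaled out without downtime, the rebuild happens while commands are being executed. It is triggered inside runWithRetries, which public T run(String key) calls, by the handlers for JedisConnectionException (caused by a failure at the socket level) and JedisRedirectionException, more precisely its MOVED form JedisMovedDataException. The latter is raised when the requested data is not on the node that was contacted; it is common after adding or removing nodes, because the slot computed from the key's hash still resolves to a connection from the old JedisPool.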

Removing nodes works the same way.
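
On the wire, a redirection is an error reply of the form MOVED <slot> <host>:<port> or ASK <slot> <host>:<port>, as described in the Redis Cluster specification. The sketch below parses such a message and records the distinction that runWithRetries relies on: only MOVED refreshes the slot cache, while ASK just redirects the next attempt. The class, its fields, and the method names are invented for illustration and are not the Jedis exception classes.

```java
class RedirectSketch {
    final boolean ask;   // true for ASK, false for MOVED
    final int slot;
    final String host;
    final int port;

    private RedirectSketch(boolean ask, int slot, String host, int port) {
        this.ask = ask;
        this.slot = slot;
        this.host = host;
        this.port = port;
    }

    // Parse the text of a redirection error, e.g. "MOVED 3999 127.0.0.1:6381".
    static RedirectSketch parse(String message) {
        String[] parts = message.trim().split(" ");
        if (parts.length != 3 || !(parts[0].equals("MOVED") || parts[0].equals("ASK"))) {
            throw new IllegalArgumentException("not a redirection: " + message);
        }
        int colon = parts[2].lastIndexOf(':'); // split host and port at the last colon
        if (colon < 0) {
            throw new IllegalArgumentException("missing port: " + message);
        }
        String host = parts[2].substring(0, colon);
        int port = Integer.parseInt(parts[2].substring(colon + 1));
        return new RedirectSketch(parts[0].equals("ASK"), Integer.parseInt(parts[1]), host, port);
    }

    // MOVED means the slot has a new owner, so the cached mapping is stale.
    // ASK only applies to the next request while a slot is being migrated.
    boolean shouldRenewSlotCache() {
        return !ask;
    }

    public static void main(String[] args) {
        RedirectSketch moved = parse("MOVED 3999 127.0.0.1:6381");
        System.out.println(moved.slot + " -> " + moved.host + ":" + moved.port
                + ", renew cache: " + moved.shouldRenewSlotCache());
    }
}
```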

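Master and replicas all down

If the master and all replicas that manage some range of slots go down, the whole cluster becomes unavailable (this is the behavior under Redis's default cluster-require-full-coverage yes setting). On the client side, judging from the runWithRetries code above, a command for one of the affected slots keeps failing with connection exceptions until its attempts are used up, and then JedisClusterMaxAttemptsException is thrown.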

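What problems does a cluster solve

  • Capacity: a single instance is limited by the resources of one server, such as memory and disk; a cluster can make full use of the resources of several servers.

  • High availability: a master-replica structure with failover.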