Seata Communication Module Analysis


Server

Overview

In Seata, the client and server communicate over TCP using Netty. In the server module, the RemotingServer interface defines the basic server capabilities; AbstractNettyRemoting implements the common plumbing such as remote message processing, synchronous/asynchronous sending, and timeout management for pending requests; AbstractNettyRemotingServer builds on AbstractNettyRemoting by defining the ServerHandler that handles inbound messages and by implementing the business-facing methods of RemotingServer; NettyRemotingServer is then responsible for wiring up the concrete message processors.


AbstractNettyRemoting

In AbstractNettyRemoting, every pending synchronous request is kept in a ConcurrentHashMap, keyed by the message id, with a custom MessageFuture as the value. When a synchronous message is sent, a MessageFuture is created first; its internal CompletableFuture provides both the blocking wait for the result and the ability to complete it from another thread (much like Netty's ChannelPromise). The constructor of AbstractNettyRemoting also schedules a periodic task that sweeps this map and removes synchronous requests that have already timed out.
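
The pattern can be summarized with a minimal sketch, assuming simplified names (illustrative only, not the exact Seata classes): the future wraps a CompletableFuture that the sender blocks on and the response handler completes, while a scheduled sweeper removes expired entries from the map.

import java.util.concurrent.*;

// Illustrative sketch only: field and method names are simplified, not the exact Seata classes.
class SyncFutureSketch {
    static class MessageFuture {
        final long start = System.currentTimeMillis();
        final long timeoutMillis;
        final CompletableFuture<Object> origin = new CompletableFuture<>();

        MessageFuture(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }
        boolean isTimeout() { return System.currentTimeMillis() - start > timeoutMillis; }
        Object get(long t, TimeUnit u) throws Exception { return origin.get(t, u); }  // sender blocks here
        void setResult(Object result) { origin.complete(result); }                    // response handler completes it
    }

    final ConcurrentHashMap<Integer, MessageFuture> futures = new ConcurrentHashMap<>();
    final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    // Periodic sweep, analogous to the cleanup task scheduled in AbstractNettyRemoting's constructor:
    // expired entries are removed and completed with a TimeoutException so the map cannot leak.
    void startSweeper() {
        timer.scheduleAtFixedRate(() -> futures.forEach((id, future) -> {
            if (future.isTimeout()) {
                futures.remove(id);
                future.setResult(new TimeoutException("request " + id + " timed out"));
            }
        }), 3000, 3000, TimeUnit.MILLISECONDS);
    }
}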

protected Object sendSync(Channel channel, RpcMessage rpcMessage, long timeoutMillis) throws TimeoutException {
    if (timeoutMillis <= 0) {
        throw new FrameworkException("timeout should more than 0ms");
    }
    if (channel == null) {
        LOGGER.warn("sendSync nothing, caused by null channel.");
        return null;
    }

    MessageFuture messageFuture = new MessageFuture();
    messageFuture.setRequestMessage(rpcMessage);
    messageFuture.setTimeout(timeoutMillis);
    futures.put(rpcMessage.getId(), messageFuture);

    channelWritableCheck(channel, rpcMessage.getBody());

    String remoteAddr = ChannelUtil.getAddressFromChannel(channel);
    doBeforeRpcHooks(remoteAddr, rpcMessage);

    channel.writeAndFlush(rpcMessage).addListener((ChannelFutureListener) future -> {
        if (!future.isSuccess()) {
            MessageFuture messageFuture1 = futures.remove(rpcMessage.getId());
            if (messageFuture1 != null) {
                messageFuture1.setResultMessage(future.cause());
            }
            destroyChannel(future.channel());
        }
    });

    try {
        Object result = messageFuture.get(timeoutMillis, TimeUnit.MILLISECONDS);
        doAfterRpcHooks(remoteAddr, rpcMessage, result);
        return result;
    } catch (Exception exx) {
        LOGGER.error("wait response error:{},ip:{},request:{}", exx.getMessage(), channel.remoteAddress(),
            rpcMessage.getBody());
        if (exx instanceof TimeoutException) {
            throw (TimeoutException) exx;
        } else {
            throw new RuntimeException(exx);
        }
    }
}

Before a request is written out, the channel is first checked for writability, then user-defined hooks are invoked (hooks are loaded through EnhancedServiceLoader, Seata's SPI mechanism), and finally a write listener is registered that destroys the channel if the send fails. The writability check is worth a closer look: it synchronizes on a lock, and when the channel is not writable the sender calls wait to release it; once the channel's channelWritabilityChanged callback fires, all pending writers are notified (the wake-up side is sketched after the method below). Globally, all channels share the same lock, which sacrifices some concurrency, but an unwritable channel usually means I/O is the bottleneck at that moment. If data kept being written regardless, it would queue up in the channel's ChannelOutboundBuffer, and that backlog can spiral into an OOM. Sharing one lock is the simplest, most direct way to avoid that problem.

private void channelWritableCheck(Channel channel, Object msg) {
    int tryTimes = 0;
    synchronized (lock) {
        while (!channel.isWritable()) {
            try {
                tryTimes++;
                if (tryTimes > NettyClientConfig.getMaxNotWriteableRetry()) {
                    destroyChannel(channel);
                    throw new FrameworkException("msg:" + ((msg == null) ? "null" : msg.toString()),
                        FrameworkErrorCode.ChannelIsNotWritable);
                }
                lock.wait(NOT_WRITEABLE_CHECK_MILLS);
            } catch (InterruptedException exx) {
                LOGGER.error(exx.getMessage());
            }
        }
    }
}
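
The other half of this handshake lives in the channel handler: when Netty fires channelWritabilityChanged (i.e. the outbound buffer has drained below the low water mark), every thread parked in channelWritableCheck gets woken up. A hedged sketch of that callback, assuming the same shared lock object:

import io.netty.channel.ChannelDuplexHandler;
import io.netty.channel.ChannelHandlerContext;

// Sketch of the wake-up side, assuming the shared `lock` object used by channelWritableCheck.
class WritabilityHandlerSketch extends ChannelDuplexHandler {
    private final Object lock;

    WritabilityHandlerSketch(Object lock) { this.lock = lock; }

    @Override
    public void channelWritabilityChanged(ChannelHandlerContext ctx) throws Exception {
        synchronized (lock) {
            if (ctx.channel().isWritable()) {
                lock.notifyAll();   // release every writer waiting in channelWritableCheck
            }
        }
        super.channelWritabilityChanged(ctx);
    }
}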

Asynchronous sending is similar to synchronous sending, minus the step of putting the message into the futures map. Message handling, in turn, is quite similar to Sentinel's: the type code of the message body is read, and the corresponding processor is looked up in a processor map (how processors get registered is sketched after the code below). The difference is that Sentinel handles requests directly on the Netty I/O thread, while the Seata server can dispatch the work to an ExecutorService, mainly so that blocking operations such as database access do not stall the communication I/O threads.

protected void processMessage(ChannelHandlerContext ctx, RpcMessage rpcMessage) throws Exception {
    if (LOGGER.isDebugEnabled()) {
        LOGGER.debug(String.format("%s msgId:%s, body:%s", this, rpcMessage.getId(), rpcMessage.getBody()));
    }
    Object body = rpcMessage.getBody();
    if (body instanceof MessageTypeAware) {
        MessageTypeAware messageTypeAware = (MessageTypeAware) body;
        final Pair<RemotingProcessor, ExecutorService> pair = this.processorTable.get((int) messageTypeAware.getTypeCode());
        if (pair != null) {
            if (pair.getSecond() != null) {
                try {
                    pair.getSecond().execute(() -> {
                        try {
                            pair.getFirst().process(ctx, rpcMessage);
                        } catch (Throwable th) {
                            LOGGER.error(FrameworkErrorCode.NetDispatch.getErrCode(), th.getMessage(), th);
                        } finally {
                            MDC.clear();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    LOGGER.error(FrameworkErrorCode.ThreadPoolFull.getErrCode(),
                        "thread pool is full, current max pool size is " + messageExecutor.getActiveCount());
                    if (allowDumpStack) {
                        String name = ManagementFactory.getRuntimeMXBean().getName();
                        String pid = name.split("@")[0];
                        int idx = new Random().nextInt(100);
                        try {
                            Runtime.getRuntime().exec("jstack " + pid + " >d:/" + idx + ".log");
                        } catch (IOException exx) {
                            LOGGER.error(exx.getMessage());
                        }
                        allowDumpStack = false;
                    }
                }
            } else {
                try {
                    pair.getFirst().process(ctx, rpcMessage);
                } catch (Throwable th) {
                    LOGGER.error(FrameworkErrorCode.NetDispatch.getErrCode(), th.getMessage(), th);
                }
            }
        } else {
            LOGGER.error("This message type [{}] has no processor.", messageTypeAware.getTypeCode());
        }
    } else {
        LOGGER.error("This rpcMessage body[{}] is not MessageTypeAware type.", body);
    }
}
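
The processorTable itself is filled by the concrete subclasses (NettyRemotingServer on the server side, the client classes on the client side) through a register call that binds a message type code to a processor and an optional executor. Conceptually it is no more than the following sketch:

import java.util.concurrent.ExecutorService;

// Sketch of the registration mechanism; Pair, RemotingProcessor and processorTable are the ones used above.
public void registerProcessor(int messageType, RemotingProcessor processor, ExecutorService executor) {
    // a null executor means the processor runs directly on the Netty I/O thread (see processMessage above)
    processorTable.put(messageType, new Pair<>(processor, executor));
}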

AbstractNettyRemotingServer

AbstractNettyRemotingServer extends AbstractNettyRemoting and is responsible for initializing the Netty server endpoint (including registering the ServerHandler) and registering the message processors. Two things are worth looking at here: the ServerBootstrap configuration, and the ChannelManager used for connection management.

Compared with Sentinel's configuration, Seata enables TCP keepalive and pairs it with an IdleStateHandler to remove idle TCP connections (a sketch of that idle handling follows the bootstrap configuration below). Seata also sets write-buffer water marks: as mentioned earlier, AbstractNettyRemoting checks whether a Channel is writable, and writability is determined by comparing the amount of data pending in the outbound buffer against these water marks. Above the high water mark the channel becomes unwritable, and it stays that way until the buffer drains below the low water mark.

this.serverBootstrap.group(this.eventLoopGroupBoss, this.eventLoopGroupWorker)
    .channel(NettyServerConfig.SERVER_CHANNEL_CLAZZ)
    .option(ChannelOption.SO_BACKLOG, nettyServerConfig.getSoBackLogSize())
    .option(ChannelOption.SO_REUSEADDR, true)
    .childOption(ChannelOption.SO_KEEPALIVE, true)
    .childOption(ChannelOption.TCP_NODELAY, true)
    .childOption(ChannelOption.SO_SNDBUF, nettyServerConfig.getServerSocketSendBufSize())
    .childOption(ChannelOption.SO_RCVBUF, nettyServerConfig.getServerSocketResvBufSize())
    .childOption(ChannelOption.WRITE_BUFFER_WATER_MARK,
        new WriteBufferWaterMark(nettyServerConfig.getWriteBufferLowWaterMark(),
            nettyServerConfig.getWriteBufferHighWaterMark()))
    .localAddress(new InetSocketAddress(listenPort))
    .childHandler(new ChannelInitializer<SocketChannel>() {
        @Override
        public void initChannel(SocketChannel ch) {
            ch.pipeline().addLast(new IdleStateHandler(nettyServerConfig.getChannelMaxReadIdleSeconds(), 0, 0))
                .addLast(new ProtocolV1Decoder())
                .addLast(new ProtocolV1Encoder());
            if (channelHandlers != null) {
                addChannelPipelineLast(ch, channelHandlers);
            }

        }
    });
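
The IdleStateHandler configured above only raises an IdleStateEvent after channelMaxReadIdleSeconds without inbound data; it is the handler further down the pipeline that reacts to the event and tears the connection down (the real ServerHandler presumably also cleans up the corresponding RpcContext). A minimal sketch of that reaction:

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.handler.timeout.IdleState;
import io.netty.handler.timeout.IdleStateEvent;

// Sketch: reacting to the read-idle event raised by IdleStateHandler and dropping the dead connection.
class IdleCloseHandlerSketch extends ChannelInboundHandlerAdapter {
    @Override
    public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exception {
        if (evt instanceof IdleStateEvent && ((IdleStateEvent) evt).state() == IdleState.READER_IDLE) {
            // nothing (not even a heartbeat) was read within channelMaxReadIdleSeconds: close the channel
            ctx.channel().close();
        } else {
            super.userEventTriggered(ctx, evt);
        }
    }
}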

ChannelManager is the server-side component that manages remote connections. It uses three ConcurrentHashMaps: IDENTIFIED_CHANNELS, RM_CHANNELS and TM_CHANNELS. In all three maps the values ultimately stored are not the Channels themselves but RpcContext objects. An RpcContext is a thin wrapper around a Channel: each RpcContext corresponds to one remote Channel and carries its basic information, such as the application id and client id of that Channel. RpcContext itself also holds several ConcurrentMaps: clientIDHolderMap, clientTMHolderMap and clientRMHolderMap. clientIDHolderMap maps every Channel to its RpcContext, clientTMHolderMap maps port numbers to RpcContexts, and clientRMHolderMap maps resource ids to their RpcContexts (more precisely, to a map of port number to RpcContext). These inner maps are in fact drawn from the three maps in ChannelManager.

private static final ConcurrentMap<Channel, RpcContext> IDENTIFIED_CHANNELS = new ConcurrentHashMap<>();

/**
 * resourceId -> applicationId -> ip -> port -> RpcContext
 */
private static final ConcurrentMap<String, ConcurrentMap<String, ConcurrentMap<String,
    ConcurrentMap<Integer, RpcContext>>>> RM_CHANNELS = new ConcurrentHashMap<>();

/**
 * ip+appname,port
 */
private static final ConcurrentMap<String, ConcurrentMap<Integer, RpcContext>> TM_CHANNELS
    = new ConcurrentHashMap<>();
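
To make the nesting concrete, here is a simplified, hypothetical sketch of what registering an RM channel boils down to: walk the resourceId → applicationId → ip → port path, creating intermediate maps on demand, and store the RpcContext at the leaf (the real registration logic in ChannelManager does considerably more, e.g. sharing holder maps with RpcContext).

// Hypothetical, simplified sketch of how an RM registration lands in RM_CHANNELS.
static void registerRm(String resourceId, String applicationId, String ip, int port, RpcContext context) {
    RM_CHANNELS
        .computeIfAbsent(resourceId, k -> new ConcurrentHashMap<>())
        .computeIfAbsent(applicationId, k -> new ConcurrentHashMap<>())
        .computeIfAbsent(ip, k -> new ConcurrentHashMap<>())
        .put(port, context);
}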

These maps are used throughout Seata's business flow to find the Channel belonging to a given remote application. For example, when the remote RMs need to be told to commit in phase two, the server has to locate every RM involved in the current transaction and send each of them a commit request.

public static Channel getChannel(String resourceId, String clientId) {
    Channel resultChannel = null;

    // the clientId is composed of applicationId:ip:port
    String[] clientIdInfo = readClientId(clientId);

    if (clientIdInfo == null || clientIdInfo.length != 3) {
        throw new FrameworkException("Invalid Client ID: " + clientId);
    }

    String targetApplicationId = clientIdInfo[0];
    String targetIP = clientIdInfo[1];
    int targetPort = Integer.parseInt(clientIdInfo[2]);

    // first, find all applications that have registered this resource
    ConcurrentMap<String, ConcurrentMap<String, ConcurrentMap<Integer,
        RpcContext>>> applicationIdMap = RM_CHANNELS.get(resourceId);

    if (targetApplicationId == null || applicationIdMap == null ||  applicationIdMap.isEmpty()) {
        if (LOGGER.isInfoEnabled()) {
            LOGGER.info("No channel is available for resource[{}]", resourceId);
        }
        return null;
    }

    ConcurrentMap<String, ConcurrentMap<Integer, RpcContext>> ipMap = applicationIdMap.get(targetApplicationId);

    if (ipMap != null && !ipMap.isEmpty()) {
        // Firstly, try to find the original channel through which the branch was registered.
        // i.e. try to locate the channel matching the clientId used at registration time
        ConcurrentMap<Integer, RpcContext> portMapOnTargetIP = ipMap.get(targetIP);
        if (portMapOnTargetIP != null && !portMapOnTargetIP.isEmpty()) {
            RpcContext exactRpcContext = portMapOnTargetIP.get(targetPort);
            if (exactRpcContext != null) {
                Channel channel = exactRpcContext.getChannel();
                if (channel.isActive()) {
                    resultChannel = channel;
                    if (LOGGER.isDebugEnabled()) {
                        LOGGER.debug("Just got exactly the one {} for {}", channel, clientId);
                    }
                } else {
                    if (portMapOnTargetIP.remove(targetPort, exactRpcContext)) {
                        if (LOGGER.isInfoEnabled()) {
                            LOGGER.info("Removed inactive {}", channel);
                        }
                    }
                }
            }

            // The original channel was broken, try another one.
            // the original channel is gone, so try to find an active one on another port of the same instance
            if (resultChannel == null) {
                for (ConcurrentMap.Entry<Integer, RpcContext> portMapOnTargetIPEntry : portMapOnTargetIP
                    .entrySet()) {
                    Channel channel = portMapOnTargetIPEntry.getValue().getChannel();

                    if (channel.isActive()) {
                        resultChannel = channel;
                        if (LOGGER.isInfoEnabled()) {
                            LOGGER.info(
                                "Choose {} on the same IP[{}] as alternative of {}", channel, targetIP, clientId);
                        }
                        break;
                    } else {
                        if (portMapOnTargetIP.remove(portMapOnTargetIPEntry.getKey(),
                            portMapOnTargetIPEntry.getValue())) {
                            if (LOGGER.isInfoEnabled()) {
                                LOGGER.info("Removed inactive {}", channel);
                            }
                        }
                    }
                }
            }
        }

        // No channel on this app node, try another one.
        // i.e. look on other nodes of the same application (same application id, different IP)
        if (resultChannel == null) {
            for (ConcurrentMap.Entry<String, ConcurrentMap<Integer, RpcContext>> ipMapEntry : ipMap
                .entrySet()) {
                if (ipMapEntry.getKey().equals(targetIP)) { continue; }

                ConcurrentMap<Integer, RpcContext> portMapOnOtherIP = ipMapEntry.getValue();
                if (portMapOnOtherIP == null || portMapOnOtherIP.isEmpty()) {
                    continue;
                }

                for (ConcurrentMap.Entry<Integer, RpcContext> portMapOnOtherIPEntry : portMapOnOtherIP.entrySet()) {
                    Channel channel = portMapOnOtherIPEntry.getValue().getChannel();

                    if (channel.isActive()) {
                        resultChannel = channel;
                        if (LOGGER.isInfoEnabled()) {
                            LOGGER.info("Choose {} on the same application[{}] as alternative of {}", channel, targetApplicationId, clientId);
                        }
                        break;
                    } else {
                        if (portMapOnOtherIP.remove(portMapOnOtherIPEntry.getKey(),
                            portMapOnOtherIPEntry.getValue())) {
                            if (LOGGER.isInfoEnabled()) {
                                LOGGER.info("Removed inactive {}", channel);
                            }
                        }
                    }
                }
                if (resultChannel != null) { break; }
            }
        }
    }

    // If no channel was found within the target application, fall back to any other application that registered
    // the same resource. This works for AT mode, but in TCC mode another application may not contain the
    // phase-two code for this resource.
    if (resultChannel == null) {
        resultChannel = tryOtherApp(applicationIdMap, targetApplicationId);

        if (resultChannel == null) {
            if (LOGGER.isInfoEnabled()) {
                LOGGER.info("No channel is available for resource[{}] as alternative of {}", resourceId, clientId);
            }
        } else {
            if (LOGGER.isInfoEnabled()) {
                LOGGER.info("Choose {} on the same resource[{}] as alternative of {}", resultChannel, resourceId, clientId);
            }
        }
    }

    return resultChannel;

}

NettyRemotingServer is the concrete class that registers the message processors and actually starts the server endpoint. Its code is straightforward, so it is not covered in detail; one small note is that in both Seata and Sentinel, the class responsible for starting the server uses an atomic flag to guarantee it is started only once.
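
That start-once guard is the usual compare-and-set on an atomic flag; a minimal sketch:

import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of a start-once guard on an atomic flag.
class StartOnceSketch {
    private final AtomicBoolean initialized = new AtomicBoolean(false);

    public void init() {
        if (!initialized.compareAndSet(false, true)) {
            return;   // another thread has already started the endpoint
        }
        // ... register processors and bootstrap the Netty server exactly once ...
    }
}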

Client

Overview

On the client side, basic message sending and processing also rely on the functionality provided by AbstractNettyRemoting. Unlike AbstractNettyRemotingServer, AbstractNettyRemotingClient adds some client-side optimizations around message sending and connection establishment. Compared with RemotingServer, the RemotingClient interface additionally exposes methods for registering message processors, implemented by TmNettyRemotingClient and RmNettyRemotingClient respectively.


AbstractNettyRemotingClient

The init method adds two things: a periodic reconnect task and the thread pool used for batched request sending. The "reconnect" here differs from reacting to an actual disconnect: by design, each Seata client keeps a connection to every current server, and the server list is usually a dynamic list obtained from a service registry. The client therefore periodically pulls the list and checks that a connection to each server exists, rather than listening for client disconnect events the way Sentinel does.

@Override
public void init() {
    timerExecutor.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            clientChannelManager.reconnect(getTransactionServiceGroup());
        }
    }, SCHEDULE_DELAY_MILLS, SCHEDULE_INTERVAL_MILLS, TimeUnit.MILLISECONDS);
    if (NettyClientConfig.isEnableClientBatchSendRequest()) {
        mergeSendExecutorService = new ThreadPoolExecutor(MAX_MERGE_SEND_THREAD,
            MAX_MERGE_SEND_THREAD,
            KEEP_ALIVE_TIME, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>(),
            new NamedThreadFactory(getThreadPrefix(), MAX_MERGE_SEND_THREAD));
        mergeSendExecutorService.submit(new MergedSendRunnable());
    }
    super.init();
    clientBootstrap.start();
}

When sending a synchronous message, unlike the server, the client first picks a target server address from the server list. If batch sending is enabled, besides being put into the futures map, the message is also added to the blocking queue associated with that server address; the actual sending is done by MergedSendRunnable, which polls the map of queues and sends the messages in merged batches (a sketch of that loop follows the code below). Note the mergeLock: after finishing one round of sending, the merge thread calls wait to avoid busy-spinning and burning CPU, and the sending threads wake it up again based on the isSending flag. So the messages merged into one batch are effectively those that accumulate between the moment the merge thread wakes up and the moment the current round of sending completes.

@Override
public Object sendSyncRequest(Object msg) throws TimeoutException {
    String serverAddress = loadBalance(getTransactionServiceGroup(), msg);
    int timeoutMillis = NettyClientConfig.getRpcRequestTimeout();
    RpcMessage rpcMessage = buildRequestMessage(msg, ProtocolConstants.MSGTYPE_RESQUEST_SYNC);

    // send batch message
    // put message into basketMap, @see MergedSendRunnable
    if (NettyClientConfig.isEnableClientBatchSendRequest()) {

        // send batch message is sync request, needs to create messageFuture and put it in futures.
        MessageFuture messageFuture = new MessageFuture();
        messageFuture.setRequestMessage(rpcMessage);
        messageFuture.setTimeout(timeoutMillis);
        futures.put(rpcMessage.getId(), messageFuture);

        // put message into basketMap
        BlockingQueue<RpcMessage> basket = CollectionUtils.computeIfAbsent(basketMap, serverAddress,
            key -> new LinkedBlockingQueue<>());
        if (!basket.offer(rpcMessage)) {
            LOGGER.error("put message into basketMap offer failed, serverAddress:{},rpcMessage:{}",
                    serverAddress, rpcMessage);
            return null;
        }
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("offer message: {}", rpcMessage.getBody());
        }
        if (!isSending) {
            synchronized (mergeLock) {
                mergeLock.notifyAll();
            }
        }

        try {
            return messageFuture.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception exx) {
            LOGGER.error("wait response error:{},ip:{},request:{}",
                exx.getMessage(), serverAddress, rpcMessage.getBody());
            if (exx instanceof TimeoutException) {
                throw (TimeoutException) exx;
            } else {
                throw new RuntimeException(exx);
            }
        }

    } else {
        Channel channel = clientChannelManager.acquireChannel(serverAddress);
        return super.sendSync(channel, rpcMessage, timeoutMillis);
    }

}
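
MergedSendRunnable itself is not shown above. In outline, it parks on mergeLock, and each time it wakes up it drains every server's basket queue into one merged message and writes that single frame. The sketch below follows the Seata sources loosely (MergedWarpMessage is the batch container, MAX_MERGE_SEND_MILLS the park timeout); the real runnable also tracks the child message ids so responses can be fanned back out to their individual MessageFutures.

import io.netty.channel.Channel;

// Outline of the merge-send loop, loosely following the Seata sources; not the exact implementation.
class MergedSendRunnableSketch implements Runnable {
    @Override
    public void run() {
        while (true) {
            synchronized (mergeLock) {
                try {
                    mergeLock.wait(MAX_MERGE_SEND_MILLS);   // parked until a sender calls notifyAll (or timeout)
                } catch (InterruptedException ignore) {
                }
            }
            isSending = true;
            basketMap.forEach((address, basket) -> {
                if (basket.isEmpty()) {
                    return;
                }
                MergedWarpMessage merged = new MergedWarpMessage();
                while (!basket.isEmpty()) {
                    RpcMessage msg = basket.poll();          // drain everything queued for this server
                    merged.msgs.add((AbstractMessage) msg.getBody());
                    merged.msgIds.add(msg.getId());
                }
                Channel channel = clientChannelManager.acquireChannel(address);
                sendAsyncRequest(channel, merged);           // one frame carries the whole batch
            });
            isSending = false;                               // new senders must notify again to wake this loop
        }
    }
}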

Next, the Netty configuration the client uses to connect to remote servers. Note that when the client enables the enableNative option and is not running on macOS, the epoll mode is set to edge-triggered; the alternative is level-triggered. Edge-triggered means an event is delivered only once, while level-triggered keeps firing as long as the event has not been fully consumed. For example, if the read buffer still contains unread data, edge-triggered mode will not fire another readable event, whereas level-triggered mode will. The client also enables TCP_QUICKACK to speed up ACK responses.

this.bootstrap.group(this.eventLoopGroupWorker).channel(
    nettyClientConfig.getClientChannelClazz()).option(
    ChannelOption.TCP_NODELAY, true).option(ChannelOption.SO_KEEPALIVE, true).option(
    ChannelOption.CONNECT_TIMEOUT_MILLIS, nettyClientConfig.getConnectTimeoutMillis()).option(
    ChannelOption.SO_SNDBUF, nettyClientConfig.getClientSocketSndBufSize()).option(ChannelOption.SO_RCVBUF,
    nettyClientConfig.getClientSocketRcvBufSize());

if (nettyClientConfig.enableNative()) {
    if (PlatformDependent.isOsx()) {
        if (LOGGER.isInfoEnabled()) {
            LOGGER.info("client run on macOS");
        }
    } else {
        bootstrap.option(EpollChannelOption.EPOLL_MODE, EpollMode.EDGE_TRIGGERED)
            .option(EpollChannelOption.TCP_QUICKACK, true);
    }
}

Finally, the client-side connection manager, NettyClientChannelManager. It maintains a channel pool backed by a GenericKeyedObjectPool and keeps three maps: channelLocks holds a lock object per server address, used to ensure that only one connection is established to a given server address at a time; channels holds all channels; and poolKeyMap holds the mapping from server address to NettyPoolKey, which also serves as the key of the channel pool. A NettyPoolKey is built from the server address, and each client implementation builds it differently; the key contains the client's role, the address, and a message, which is the initial registration request.
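
For instance, the RM client has to carry its resource ids in the registration message so that a reconnect re-registers the same resources. Roughly how its pool key function might look (applicationId, transactionServiceGroup and getMergedResourceKeys are assumed members of the RM client):

import java.util.function.Function;

// Hedged sketch of an RM-side pool key function. The key carries the transaction role, the server
// address, and the initial RegisterRMRequest that is sent right after the channel is created.
Function<String, NettyPoolKey> rmPoolKeyFunction() {
    return serverAddress -> {
        RegisterRMRequest message = new RegisterRMRequest(applicationId, transactionServiceGroup);
        message.setResourceIds(getMergedResourceKeys());   // all resource ids registered locally so far
        return new NettyPoolKey(NettyPoolKey.TransactionRole.RMROLE, serverAddress, message);
    };
}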

Let's walk through the main flow of the ChannelManager by acquiring a connection. First the Channel is looked up in channels, then checked for liveness. In getExistAliveChannel, if the Channel is inactive the code does not immediately start connecting; instead it waits for a short while to see whether a connection is still being processed. Otherwise, it obtains the lock object for that server address and establishes the connection under that lock.

Channel acquireChannel(String serverAddress) {
    Channel channelToServer = channels.get(serverAddress);
    if (channelToServer != null) {
        channelToServer = getExistAliveChannel(channelToServer, serverAddress);
        if (channelToServer != null) {
            return channelToServer;
        }
    }
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("will connect to " + serverAddress);
    }
    Object lockObj = CollectionUtils.computeIfAbsent(channelLocks, serverAddress, key -> new Object());
    synchronized (lockObj) {
        return doConnect(serverAddress);
    }
}

In doConnect we see that, before connecting, it checks once more whether a live Channel already exists; only if not does it go on to acquire a connection. Connections are obtained from the object pool, and if the existing NettyPoolKey for this address belongs to an RM, the key's registration message is first updated with the latest resource ids.

private Channel doConnect(String serverAddress) {
    Channel channelToServer = channels.get(serverAddress);
    if (channelToServer != null && channelToServer.isActive()) {
        return channelToServer;
    }
    Channel channelFromPool;
    try {
        NettyPoolKey currentPoolKey = poolKeyFunction.apply(serverAddress);
        NettyPoolKey previousPoolKey = poolKeyMap.putIfAbsent(serverAddress, currentPoolKey);
        if (previousPoolKey != null && previousPoolKey.getMessage() instanceof RegisterRMRequest) {
            RegisterRMRequest registerRMRequest = (RegisterRMRequest) currentPoolKey.getMessage();
            ((RegisterRMRequest) previousPoolKey.getMessage()).setResourceIds(registerRMRequest.getResourceIds());
        }
        channelFromPool = nettyClientKeyPool.borrowObject(poolKeyMap.get(serverAddress));
        channels.put(serverAddress, channelFromPool);
    } catch (Exception exx) {
        LOGGER.error("{} register RM failed.",FrameworkErrorCode.RegisterRM.getErrCode(), exx);
        throw new FrameworkException("can not register RM,err:" + exx.getMessage());
    }
    return channelFromPool;
}

NettyPoolableFactory is the object factory that actually creates and destroys connections. When creating a connection it also sends the registration request carried in the NettyPoolKey, and invokes the registration success/failure callbacks based on the response.

@Override
public Channel makeObject(NettyPoolKey key) {
    InetSocketAddress address = NetUtil.toInetSocketAddress(key.getAddress());
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("NettyPool create channel to " + key);
    }
    Channel tmpChannel = clientBootstrap.getNewChannel(address);
    long start = System.currentTimeMillis();
    Object response;
    Channel channelToServer = null;
    if (key.getMessage() == null) {
        throw new FrameworkException("register msg is null, role:" + key.getTransactionRole().name());
    }
    try {
        response = rpcRemotingClient.sendSyncRequest(tmpChannel, key.getMessage());
        if (!isRegisterSuccess(response, key.getTransactionRole())) {
            rpcRemotingClient.onRegisterMsgFail(key.getAddress(), tmpChannel, response, key.getMessage());
        } else {
            channelToServer = tmpChannel;
            rpcRemotingClient.onRegisterMsgSuccess(key.getAddress(), tmpChannel, response, key.getMessage());
        }
    } catch (Exception exx) {
        if (tmpChannel != null) {
            tmpChannel.close();
        }
        throw new FrameworkException(
            "register " + key.getTransactionRole().name() + " error, errMsg:" + exx.getMessage());
    }
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("register success, cost " + (System.currentTimeMillis() - start) + " ms, version:" + getVersion(
            response, key.getTransactionRole()) + ",role:" + key.getTransactionRole().name() + ",channel:"
            + channelToServer);
    }
    return channelToServer;
}

The concrete TmNettyRemotingClient and RmNettyRemotingClient do not do much beyond defining some basic parameters, similar to the server side, so they are not covered in detail.

Protocol

In Seata, the messages exchanged between client and server are ultimately abstracted into an RpcMessage object, which contains the message id, the message type, the codec (serialization) type, the compressor, the headers, and the actual message body.
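
Reading the decoder below, the on-wire layout of a v1 frame can be reconstructed approximately as follows:

// Approximate v1 frame layout, reconstructed from ProtocolV1Decoder.decodeFrame below:
//
// +-------+---------+------------+------------+---------+-------+------------+-----------+----------+------+
// | magic | version | fullLength | headLength | msgType | codec | compressor | requestId | head map | body |
// |  2 B  |   1 B   |    4 B     |    2 B     |   1 B   |  1 B  |    1 B     |    4 B    | variable | var. |
// +-------+---------+------------+------------+---------+-------+------------+-----------+----------+------+
//
// The fixed header is 16 bytes; headLength covers the fixed header plus the optional head map,
// and the body occupies the remaining fullLength - headLength bytes.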


ProtocolV1Decoder, used for decoding, extends LengthFieldBasedFrameDecoder. The decoding process is fairly straightforward: parse the header first, then read the message body, decide from the compressor field whether to decompress it, and finally pick the deserializer according to the serialization (codec) field.

public Object decodeFrame(ByteBuf frame) {
    byte b0 = frame.readByte();
    byte b1 = frame.readByte();
    if (ProtocolConstants.MAGIC_CODE_BYTES[0] != b0
            || ProtocolConstants.MAGIC_CODE_BYTES[1] != b1) {
        throw new IllegalArgumentException("Unknown magic code: " + b0 + ", " + b1);
    }

    byte version = frame.readByte();
    // TODO  check version compatible here

    int fullLength = frame.readInt();
    short headLength = frame.readShort();
    byte messageType = frame.readByte();
    byte codecType = frame.readByte();
    byte compressorType = frame.readByte();
    int requestId = frame.readInt();

    RpcMessage rpcMessage = new RpcMessage();
    rpcMessage.setCodec(codecType);
    rpcMessage.setId(requestId);
    rpcMessage.setCompressor(compressorType);
    rpcMessage.setMessageType(messageType);

    // direct read head with zero-copy
    int headMapLength = headLength - ProtocolConstants.V1_HEAD_LENGTH;
    if (headMapLength > 0) {
        Map<String, String> map = HeadMapSerializer.getInstance().decode(frame, headMapLength);
        rpcMessage.getHeadMap().putAll(map);
    }

    // read body
    if (messageType == ProtocolConstants.MSGTYPE_HEARTBEAT_REQUEST) {
        rpcMessage.setBody(HeartbeatMessage.PING);
    } else if (messageType == ProtocolConstants.MSGTYPE_HEARTBEAT_RESPONSE) {
        rpcMessage.setBody(HeartbeatMessage.PONG);
    } else {
        int bodyLength = fullLength - headLength;
        if (bodyLength > 0) {
            byte[] bs = new byte[bodyLength];
            frame.readBytes(bs);
            Compressor compressor = CompressorFactory.getCompressor(compressorType);
            bs = compressor.decompress(bs);
            Serializer serializer = EnhancedServiceLoader.load(Serializer.class, SerializerType.getByCode(rpcMessage.getCodec()).name());
            rpcMessage.setBody(serializer.deserialize(bs));
        }
    }

    return rpcMessage;
}

Regarding Protobuf, the difference from Sentinel is that Sentinel uses Protobuf for the outer layer and an `any` field to carry the message body, whereas Seata uses its own serialization for the outer frame and lets the inner body pick its serialization format. This is more flexible: taking Protobuf as an example, different payloads can be serialized in whatever way suits them, which makes transmission more efficient.

@Override
public <T> T deserialize(byte[] bytes) {
    if (bytes == null) {
        throw new NullPointerException();
    }
    ByteBuffer byteBuffer = ByteBuffer.wrap(bytes);
    // read the class (descriptor) name
    int clazzNameLength = byteBuffer.getInt();
    byte[] clazzName = new byte[clazzNameLength];
    byteBuffer.get(clazzName);
    // read the serialized object payload
    byte[] body = new byte[bytes.length - clazzNameLength - 4];
    byteBuffer.get(body);
    final String descriptorName = new String(clazzName, UTF8);
    Class protobufClazz = ProtobufConvertManager.getInstance().fetchProtoClass(descriptorName);
    Object protobufObject = ProtobufInnerSerializer.deserializeContent(protobufClazz.getName(), body);
    //translate back to core model
    final PbConvertor pbConvertor = ProtobufConvertManager.getInstance().fetchReversedConvertor(protobufClazz.getName());
    Object newBody = pbConvertor.convert2Model(protobufObject);
    return (T)newBody;
}
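
The write side presumably mirrors this layout: a 4-byte name length, the descriptor name, then the Protobuf-encoded payload, so the receiver knows which generated class to parse the body with. A hedged sketch (fetchConvertor/convert2Proto are assumed to be the forward counterparts of the fetchReversedConvertor/convert2Model calls above, and the name written must be whatever fetchProtoClass expects on the read side):

import java.nio.ByteBuffer;

// Hedged sketch of the write side, mirroring the layout read back by deserialize above.
public <T> byte[] serialize(T model) {
    // forward conversion from the core model to the generated protobuf object (assumed API)
    PbConvertor pbConvertor = ProtobufConvertManager.getInstance().fetchConvertor(model.getClass().getName());
    Object protobufObject = pbConvertor.convert2Proto(model);

    byte[] body = ProtobufInnerSerializer.serializeContent(protobufObject);
    // simplified: the real code writes the proto descriptor name that fetchProtoClass resolves on the read side
    byte[] name = protobufObject.getClass().getName().getBytes(UTF8);

    ByteBuffer buffer = ByteBuffer.allocate(4 + name.length + body.length);
    buffer.putInt(name.length);   // length prefix consumed first by deserialize
    buffer.put(name);
    buffer.put(body);
    return buffer.array();
}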