Preface
I have recently started reading the Kafka source code, beginning with network I/O. This article starts from the client code to see how Kafka handles network requests on the client side.
Main Text
Kafka's client network I/O is built on Java NIO, so let's first review the important Java NIO classes:
- SocketChannel: the client-side channel. Reads and writes of the underlying bytes happen on the channel; the common calls are channel.read(buffer) and channel.write(buffer), which read into a buffer and send data from a buffer respectively
- Selector: the selector. Channels produce read and write events, and the Selector listens for those events through selection keys
- SelectionKey: the selection key. Registering a channel with channel.register(selector) returns a SelectionKey; when a read or write event fires, the SelectionKey leads back to the corresponding channel so the matching read or write method can run
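The interplay of these three classes can be sketched with an in-process Pipe standing in for a real socket, so the example runs without a network; the class and method names below are illustrative, not from Kafka:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

public class NioBasics {
    // Demonstrates the Selector/SelectionKey/Channel trio using an in-process
    // Pipe instead of a real socket, so the example is self-contained.
    public static String readViaSelector(String message) throws Exception {
        Pipe pipe = Pipe.open();
        pipe.sink().write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));

        Selector selector = Selector.open();
        pipe.source().configureBlocking(false);
        // register() returns the SelectionKey linking this channel to the selector
        SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);

        selector.select();                  // blocks until an event is ready
        ByteBuffer buffer = ByteBuffer.allocate(64);
        pipe.source().read(buffer);         // channel.read(buffer) fills the buffer
        buffer.flip();

        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        key.cancel();
        selector.close();
        pipe.sink().close();
        pipe.source().close();
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readViaSelector("hello nio"));
    }
}
```

The same register/select/read cycle is what Kafka's wrapper classes drive on real SocketChannels.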
I will not repeat how to use NIO in detail here; the Netty articles have covered it thoroughly.
Kafka implements its client network I/O by wrapping these three core classes. Let's look at the concrete implementation.
KafkaChannel
KafkaChannel wraps a SocketChannel and is likewise responsible for reading and writing the underlying bytes
//unique connection id
private final String id;
//wraps the selection key and the channel
private final TransportLayer transportLayer;
private final int maxReceiveSize;
//allocates memory for receive buffers
private final MemoryPool memoryPool;
//the in-progress receive
private NetworkReceive receive;
//the in-progress send
private Send send;
TransportLayer has several implementations, including SSL-based modes and a plaintext mode; here we look at PlaintextTransportLayer, the plaintext implementation
public class PlaintextTransportLayer implements TransportLayer {
private final SelectionKey key;
private final SocketChannel socketChannel;
PlaintextTransportLayer wraps the SelectionKey and the SocketChannel; the actual write and read calls are made here
public void setSend(Send send) {
if (this.send != null)
throw new IllegalStateException("Attempt to begin a send operation with prior send operation still in progress, connection id is " + id);
this.send = send;
this.transportLayer.addInterestOps(SelectionKey.OP_WRITE);
}
This is KafkaChannel's setSend method. The send field holds the data that is about to go out; setSend assigns it and registers interest in OP_WRITE so the event loop will send it. From this we can see that a KafkaChannel handles only one outgoing request at a time.
The Send object here is essentially a wrapper around the ByteBuffer (the send buffer); it carries the destination node ID, the buffer size, whether sending has completed, and so on.
So to send data to the server, you only call setSend to add the write interest and let the event loop drive the actual send. Next, let's look at the real read and write methods.
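The one-in-flight-send rule can be sketched in isolation. This is a hypothetical stand-in, not Kafka's actual class: the ByteBuffer stands in for a Send, and a boolean stands in for OP_WRITE registration.

```java
import java.nio.ByteBuffer;

public class SingleSendChannel {
    private ByteBuffer send;        // stands in for Kafka's Send object
    private boolean writeInterest;  // stands in for OP_WRITE being registered

    public void setSend(ByteBuffer send) {
        // a second setSend before the first send completes is a programming error
        if (this.send != null)
            throw new IllegalStateException("previous send still in progress");
        this.send = send;
        this.writeInterest = true;  // addInterestOps(SelectionKey.OP_WRITE)
    }

    // called by the event loop once the key reports writable; here the "write"
    // just drains the buffer to keep the sketch self-contained
    public boolean write() {
        if (send == null)
            return false;
        send.position(send.limit());  // pretend the bytes went to the socket
        send = null;
        writeInterest = false;        // removeInterestOps(SelectionKey.OP_WRITE)
        return true;
    }

    public boolean hasPendingSend() { return send != null; }

    public static void main(String[] args) {
        SingleSendChannel ch = new SingleSendChannel();
        ch.setSend(ByteBuffer.wrap(new byte[]{1, 2, 3}));
        try {
            ch.setSend(ByteBuffer.wrap(new byte[]{4}));
        } catch (IllegalStateException e) {
            System.out.println("second send rejected: " + e.getMessage());
        }
    }
}
```

Only after write() clears the field may the caller stage the next request, which is exactly why the higher layers must queue requests per connection.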
The read method
public NetworkReceive read() throws IOException {
NetworkReceive result = null;
if (receive == null) {
receive = new NetworkReceive(maxReceiveSize, id, memoryPool);
}
receive(receive);
if (receive.complete()) {
receive.payload().rewind();
result = receive;
receive = null;
} else if (receive.requiredMemoryAmountKnown() && !receive.memoryAllocated() && isInMutableState()) {
//pool must be out of memory, mute ourselves.
mute();
}
return result;
}
read calls the receive method; NetworkReceive is essentially a wrapper around a buffer, and once the read is judged complete, the NetworkReceive object is returned
private long receive(NetworkReceive receive) throws IOException {
return receive.readFrom(transportLayer);
}
receive simply delegates to NetworkReceive's readFrom method, passing in transportLayer. Since transportLayer holds the SocketChannel, we can already tell that readFrom must end up calling SocketChannel.read(buffer)
public long readFrom(ScatteringByteChannel channel) throws IOException {
int read = 0;
//check whether the size header still has space
if (size.hasRemaining()) {
int bytesRead = channel.read(size);
if (bytesRead < 0)
throw new EOFException();
read += bytesRead;
//if no space remains, the 4-byte size header is complete
if (!size.hasRemaining()) {
size.rewind();
int receiveSize = size.getInt();
if (receiveSize < 0)
throw new InvalidReceiveException("Invalid receive (size = " + receiveSize + ")");
if (maxSize != UNLIMITED && receiveSize > maxSize)
throw new InvalidReceiveException("Invalid receive (size = " + receiveSize + " larger than " + maxSize + ")");
requestedBufferSize = receiveSize;
if (receiveSize == 0) {
buffer = EMPTY_BUFFER;
}
}
}
//the payload buffer has not been allocated yet
if (buffer == null && requestedBufferSize != -1) {
buffer = memoryPool.tryAllocate(requestedBufferSize);
if (buffer == null)
log.trace("Broker low on memory - could not allocate buffer of size {} for source {}", requestedBufferSize, source);
}
if (buffer != null) {
int bytesRead = channel.read(buffer);
if (bytesRead < 0)
throw new EOFException();
read += bytesRead;
}
return read;
}
The steps are as follows:
- Check whether size still has space. size is a 4-byte buffer that describes the packet length; if it has space, more header bytes can be read from the channel
- Check whether size is full. If it is, the packet length is known, and a buffer of that length is requested
- Call read to read the payload
This is the classic framing code: because TCP is a byte stream, messages can be split or merged on the wire, so the first 4 bytes describe the packet length and then exactly that many bytes are read.
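The framing described above can be sketched independently of Kafka. This LengthPrefixedReader is an illustrative stand-in for NetworkReceive (the names are mine, not Kafka's), reading a 4-byte big-endian length and then exactly that many payload bytes:

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedReader {
    private final ByteBuffer size = ByteBuffer.allocate(4);
    private ByteBuffer payload;

    // returns true once a full frame has been read; may need several calls
    public boolean readFrom(ReadableByteChannel channel) throws Exception {
        if (size.hasRemaining()) {
            channel.read(size);
            if (!size.hasRemaining()) {               // header complete
                size.rewind();
                payload = ByteBuffer.allocate(size.getInt());
            }
        }
        if (payload != null)
            channel.read(payload);
        return payload != null && !payload.hasRemaining();
    }

    public ByteBuffer payload() {
        payload.rewind();
        return payload;
    }

    public static void main(String[] args) throws Exception {
        // build a [size][payload] frame and feed it through an in-memory channel
        byte[] body = "kafka".getBytes(StandardCharsets.UTF_8);
        ByteBuffer frame = ByteBuffer.allocate(4 + body.length);
        frame.putInt(body.length).put(body).flip();
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(frame.array()));

        LengthPrefixedReader reader = new LengthPrefixedReader();
        while (!reader.readFrom(ch)) { }              // on a real socket this may loop
        byte[] out = new byte[reader.payload().remaining()];
        reader.payload().get(out);
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

On a real non-blocking socket, readFrom may return false many times before a frame completes, which is exactly why NetworkReceive keeps its partial state across poll iterations.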
public boolean complete() {
return !size.hasRemaining() && buffer != null && !buffer.hasRemaining();
}
complete decides whether the read has finished: when neither size nor buffer has space remaining, a full packet has been read
The write method
public Send write() throws IOException {
Send result = null;
if (send != null && send(send)) {
result = send;
send = null;
}
return result;
}
It calls the send method to push the Send out and, once it completes, clears the field
private boolean send(Send send) throws IOException {
midWrite = true;
send.writeTo(transportLayer);
if (send.completed()) {
midWrite = false;
transportLayer.removeInterestOps(SelectionKey.OP_WRITE);
}
return send.completed();
}
The write path is straightforward: writeTo is likewise passed the transportLayer; after writing, it checks whether the send has completed, and if so removes interest in OP_WRITE
@Override
public long writeTo(GatheringByteChannel channel) throws IOException {
long written = channel.write(buffers);
if (written < 0)
throw new EOFException("Wrote negative bytes to channel. This shouldn't happen.");
remaining -= written;
pending = TransportLayers.hasPendingWrites(channel);
return written;
}
At its core this calls SocketChannel.write(buffers)
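The gathering write can be demonstrated with an in-process Pipe, whose sink channel implements GatheringByteChannel. The framing here mirrors a Send's size-header-plus-payload buffers, though the class itself is only a sketch with made-up names:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class GatheringSend {
    // frames a message as [4-byte size][payload], writes it with one gathering
    // write, then reads it back and returns the decoded payload bytes
    public static byte[] sendFramed(String message) throws Exception {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer header = ByteBuffer.allocate(4).putInt(body.length);
        header.flip();
        ByteBuffer[] buffers = { header, ByteBuffer.wrap(body) };

        Pipe pipe = Pipe.open();
        long remaining = 4 + body.length;
        while (remaining > 0)                        // write() may be partial
            remaining -= pipe.sink().write(buffers); // gathering write, like writeTo
        pipe.sink().close();

        ByteBuffer received = ByteBuffer.allocate(4 + body.length);
        while (received.hasRemaining() && pipe.source().read(received) >= 0) { }
        received.flip();
        int size = received.getInt();                // the header framing round-trips
        byte[] out = new byte[size];
        received.get(out);
        pipe.source().close();
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new String(sendFramed("hello"), StandardCharsets.UTF_8));
    }
}
```

Passing both buffers to one write call lets the kernel send header and payload together instead of in two system calls, which is the point of the gathering API.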
To summarize: KafkaChannel wraps SocketChannel and SelectionKey and stages its sends. The data to be sent is stored in the send field, OP_WRITE interest is registered, and the event loop then drives the write, which ultimately calls into SocketChannel.
Selector
Kafka's Selector wraps the Java NIO Selector. Let's see how it cooperates with KafkaChannel
// the underlying Java NIO selector
private final java.nio.channels.Selector nioSelector;
private final Map<String, KafkaChannel> channels;
private final List<Send> completedSends;
private final List<NetworkReceive> completedReceives;
A Map stores the id-to-KafkaChannel mapping so the right KafkaChannel can be found quickly, and two lists hold the completed sends and the completed receives respectively
public void connect(String id, InetSocketAddress address, int sendBufferSize, int receiveBufferSize) throws IOException {
ensureNotRegistered(id);
//open a SocketChannel
SocketChannel socketChannel = SocketChannel.open();
SelectionKey key = null;
try {
//configure the channel options
configureSocketChannel(socketChannel, sendBufferSize, receiveBufferSize);
//connect to the remote address
boolean connected = doConnect(socketChannel, address);
//register the channel
key = registerChannel(id, socketChannel, SelectionKey.OP_CONNECT);
if (connected) {
// OP_CONNECT won't trigger for immediately connected channels
log.debug("Immediately connected to node {}", id);
immediatelyConnectedKeys.add(key);
key.interestOps(0);
}
} catch (IOException | RuntimeException e) {
if (key != null)
immediatelyConnectedKeys.remove(key);
channels.remove(id);
socketChannel.close();
throw e;
}
}
connect looks much like plain NIO code: it opens a SocketChannel and registers it with the selector
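The non-blocking connect pattern, including the immediately-connected case noted in the source, can be tried locally. The server socket below merely stands in for a broker, and everything else is a simplified sketch with names of my own:

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NonBlockingConnect {
    // Returns true once the connection is established. A local listening socket
    // stands in for a broker so the example is self-contained.
    public static boolean connectLocally() throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port

        SocketChannel client = SocketChannel.open();
        client.configureBlocking(false);
        // may complete immediately for a loopback address; if so, OP_CONNECT
        // will never fire, which is the "immediately connected" case above
        boolean connected = client.connect(server.getLocalAddress());

        if (!connected) {
            Selector selector = Selector.open();
            SelectionKey key = client.register(selector, SelectionKey.OP_CONNECT);
            selector.select(5000);                   // wait for OP_CONNECT
            connected = key.isConnectable() && client.finishConnect();
            selector.close();
        }
        SocketChannel accepted = server.accept();    // drain the pending connection
        accepted.close();
        client.close();
        server.close();
        return connected;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(connectLocally());
    }
}
```

Kafka must handle both outcomes of connect(), which is why immediately connected keys get a dedicated set and an interestOps of 0.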
private void configureSocketChannel(SocketChannel socketChannel, int sendBufferSize, int receiveBufferSize)
throws IOException {
socketChannel.configureBlocking(false);
Socket socket = socketChannel.socket();
socket.setKeepAlive(true);
if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
socket.setSendBufferSize(sendBufferSize);
if (receiveBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
socket.setReceiveBufferSize(receiveBufferSize);
socket.setTcpNoDelay(true);
}
protected SelectionKey registerChannel(String id, SocketChannel socketChannel, int interestedOps) throws IOException {
SelectionKey key = socketChannel.register(nioSelector, interestedOps);
KafkaChannel channel = buildAndAttachKafkaChannel(socketChannel, id, key);
//add to the channels map
this.channels.put(id, channel);
if (idleExpiryManager != null)
idleExpiryManager.update(channel.id(), time.nanoseconds());
return key;
}
configureBlocking(false) puts the channel in non-blocking mode. registerChannel not only calls NIO's register; it also puts the KafkaChannel into the map
private int select(long timeoutMs) throws IOException {
if (timeoutMs < 0L)
throw new IllegalArgumentException("timeout should be >= 0");
if (timeoutMs == 0L)
return this.nioSelector.selectNow();
else
return this.nioSelector.select(timeoutMs);
}
select is a thin wrapper over nioSelector.select
public void poll(long timeout) throws IOException {
.....
long startSelect = time.nanoseconds();
int numReadyKeys = select(timeout);
long endSelect = time.nanoseconds();
this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());
if (numReadyKeys > 0 || !immediatelyConnectedKeys.isEmpty() || dataInBuffers) {
Set<SelectionKey> readyKeys = this.nioSelector.selectedKeys();
pollSelectionKeys(readyKeys, false, endSelect);
// Clear all selected keys so that they are included in the ready count for the next select
readyKeys.clear();
pollSelectionKeys(immediatelyConnectedKeys, true, endSelect);
immediatelyConnectedKeys.clear();
.....
poll is the selector's core method: it calls select to collect ready events, then hands them to pollSelectionKeys for processing
void pollSelectionKeys(Set<SelectionKey> selectionKeys,
boolean isImmediatelyConnected,
long currentTimeNanos) {
.....
attemptRead(key, channel);
if (channel.ready() && key.isWritable() && !channel.maybeBeginClientReauthentication(
() -> channelStartTimeNanos != 0 ? channelStartTimeNanos : currentTimeNanos)) {
Send send;
try {
send = channel.write();
} catch (Exception e) {
sendFailed = true;
throw e;
}
if (send != null) {
this.completedSends.add(send);
this.sensors.recordBytesSent(channel.id(), send.size());
}
}
Here we only care about read and write events. A read event goes to attemptRead; a write event goes to the channel.write() analyzed above, and if the returned Send is non-null it is added to completedSends
private void attemptRead(SelectionKey key, KafkaChannel channel) throws IOException {
//if channel is ready and has bytes to read from socket or buffer, and has no
//previous receive(s) already staged or otherwise in progress then read from it
if (channel.ready() && (key.isReadable() || channel.hasBytesBuffered()) && !hasStagedReceive(channel)
&& !explicitlyMutedChannels.contains(channel)) {
NetworkReceive networkReceive;
while ((networkReceive = channel.read()) != null) {
madeReadProgressLastPoll = true;
addToStagedReceives(channel, networkReceive);
}
if (channel.isMute()) {
outOfMemory = true; //channel has muted itself due to memory pressure.
} else {
madeReadProgressLastPoll = true;
}
}
}
attemptRead reads the available data in batches
while ((networkReceive = channel.read()) != null) {
channel.read() is called repeatedly until there is no more data to read, and each completed receive is passed to addToStagedReceives
private void addToStagedReceives(KafkaChannel channel, NetworkReceive receive) {
if (!stagedReceives.containsKey(channel))
stagedReceives.put(channel, new ArrayDeque<>());
Deque<NetworkReceive> deque = stagedReceives.get(channel);
deque.add(receive);
}
Each receive is queued in a Deque keyed by its KafkaChannel; at the end of poll, these staged receives are moved into completedReceives to await processing
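A minimal sketch of this per-channel staging, with plain strings standing in for NetworkReceive and the class name my own:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class StagedReceives {
    // one FIFO queue of receives per channel id, preserving arrival order
    private final Map<String, Deque<String>> staged = new HashMap<>();

    public void add(String channelId, String receive) {
        staged.computeIfAbsent(channelId, k -> new ArrayDeque<>()).add(receive);
    }

    // drains the oldest staged receive for a channel, or null if none remain
    public String pollOne(String channelId) {
        Deque<String> deque = staged.get(channelId);
        return deque == null ? null : deque.poll();
    }

    public static void main(String[] args) {
        StagedReceives s = new StagedReceives();
        s.add("node-1", "first");
        s.add("node-1", "second");
        System.out.println(s.pollOne("node-1"));
    }
}
```

Queuing per channel keeps responses from one broker in order even when several receives are drained in a single poll.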
public void send(Send send) {
//find the channel for this destination
String connectionId = send.destination();
KafkaChannel channel = openOrClosingChannelOrFail(connectionId);
if (closingChannels.containsKey(connectionId)) {
// ensure notification via `disconnected`, leave channel in the state in which closing was triggered
this.failedSends.add(connectionId);
} else {
try {
channel.setSend(send);
} catch (Exception e) {
// update the state for consistency, the channel will be discarded after `close`
channel.state(ChannelState.FAILED_SEND);
// ensure notification via `disconnected` when `failedSends` are processed in the next poll
this.failedSends.add(connectionId);
close(channel, CloseMode.DISCARD_NO_NOTIFY);
if (!(e instanceof CancelledKeyException)) {
log.error("Unexpected exception during send, closing connection {} and rethrowing exception {}",
connectionId, e);
throw e;
}
}
}
}
This is the selector's send method: it looks up the corresponding channel and calls its setSend
Summary
Selector.poll runs the event loop, while Selector.send submits an outgoing request; each channel can only have one request in flight at a time. For both read and write events, the network is polled until every Send has gone out and all readable data has been consumed.
Of course, Selector and KafkaChannel only move bytes. They maintain two collections for successfully sent and successfully received messages but do not process them here; from the client's point of view we still need to know whether a send succeeded and to handle the received data, and that part will be covered in detail later.