Preface
I have recently started reading the Kafka source code, beginning with network I/O. This article starts from the client code to see how Kafka handles network requests on the client side.
Main Text
Kafka's client network I/O is built on Java NIO, so let's first review the important Java NIO classes:
- SocketChannel: the client-side channel. Reads and writes of the underlying bytes happen on the channel; the common calls are channel.read(buffer) and channel.write(buffer), which read into a buffer and send data from a buffer respectively
- Selector: the selector. Channels produce read and write events, and the Selector listens for those events through selection keys
- SelectionKey: the selection key. Registering a channel with channel.register(selector) returns a SelectionKey; when a read or write event fires, the SelectionKey leads back to the corresponding channel so the matching read or write method can run
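The interplay of these three classes can be sketched with an in-process Pipe standing in for a real socket, so the example runs without a network; the class and method names below are illustrative, not from Kafka:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

public class NioBasics {
    // Demonstrates the Selector/SelectionKey/Channel trio using an in-process
    // Pipe instead of a real socket, so the example is self-contained.
    public static String readViaSelector(String message) throws Exception {
        Pipe pipe = Pipe.open();
        pipe.sink().write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));

        Selector selector = Selector.open();
        pipe.source().configureBlocking(false);
        // register() returns the SelectionKey linking this channel to the selector
        SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);

        selector.select();                  // blocks until an event is ready
        ByteBuffer buffer = ByteBuffer.allocate(64);
        pipe.source().read(buffer);         // channel.read(buffer) fills the buffer
        buffer.flip();

        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        key.cancel();
        selector.close();
        pipe.sink().close();
        pipe.source().close();
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readViaSelector("hello nio"));
    }
}
```

The same register/select/read cycle is what Kafka's wrapper classes drive on real SocketChannels.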
I will not repeat how to use NIO in detail here; the Netty articles have covered it thoroughly.
Kafka implements its client network I/O by wrapping these three core classes. Let's look at the concrete implementation.
KafkaChannel
KafkaChannel wraps a SocketChannel and is likewise responsible for reading and writing the underlying bytes
//unique connection id
private final String id;
//wraps the selection key and the channel
private final TransportLayer transportLayer;
private final int maxReceiveSize;
//allocates memory for receive buffers
private final MemoryPool memoryPool;
//the in-progress receive
private NetworkReceive receive;
//the in-progress send
private Send send;
TransportLayer has several implementations, including SSL-based modes and a plaintext mode; here we look at PlaintextTransportLayer, the plaintext implementation
public class PlaintextTransportLayer implements TransportLayer {
private final SelectionKey key;
private final SocketChannel socketChannel;
PlaintextTransportLayer wraps the SelectionKey and the SocketChannel; the actual write and read calls are made here
public void setSend(Send send) {
if (this.send != null)
throw new IllegalStateException("Attempt to begin a send operation with prior send operation still in progress, connection id is " + id);
this.send = send;
this.transportLayer.addInterestOps(SelectionKey.OP_WRITE);
}
This is KafkaChannel's setSend method. The send field holds the data that is about to go out; setSend assigns it and registers interest in OP_WRITE so the event loop will send it. From this we can see that a KafkaChannel handles only one outgoing request at a time.
The Send object here is essentially a wrapper around the ByteBuffer (the send buffer); it carries the destination node ID, the buffer size, whether sending has completed, and so on.
So to send data to the server, you only call setSend to add the write interest and let the event loop drive the actual send. Next, let's look at the real read and write methods.
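The one-in-flight-send rule can be sketched in isolation. This is a hypothetical stand-in, not Kafka's actual class: the ByteBuffer stands in for a Send, and a boolean stands in for OP_WRITE registration.

```java
import java.nio.ByteBuffer;

public class SingleSendChannel {
    private ByteBuffer send;        // stands in for Kafka's Send object
    private boolean writeInterest;  // stands in for OP_WRITE being registered

    public void setSend(ByteBuffer send) {
        // a second setSend before the first send completes is a programming error
        if (this.send != null)
            throw new IllegalStateException("previous send still in progress");
        this.send = send;
        this.writeInterest = true;  // addInterestOps(SelectionKey.OP_WRITE)
    }

    // called by the event loop once the key reports writable; here the "write"
    // just drains the buffer to keep the sketch self-contained
    public boolean write() {
        if (send == null)
            return false;
        send.position(send.limit());  // pretend the bytes went to the socket
        send = null;
        writeInterest = false;        // removeInterestOps(SelectionKey.OP_WRITE)
        return true;
    }

    public boolean hasPendingSend() { return send != null; }

    public static void main(String[] args) {
        SingleSendChannel ch = new SingleSendChannel();
        ch.setSend(ByteBuffer.wrap(new byte[]{1, 2, 3}));
        try {
            ch.setSend(ByteBuffer.wrap(new byte[]{4}));
        } catch (IllegalStateException e) {
            System.out.println("second send rejected: " + e.getMessage());
        }
    }
}
```

Only after write() clears the field may the caller stage the next request, which is exactly why the higher layers must queue requests per connection.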
The read method
public NetworkReceive read() throws IOException {
NetworkReceive result = null;
if (receive == null) {
receive = new NetworkReceive(maxReceiveSize, id, memoryPool);
}
receive(receive);
if (receive.complete()) {
receive.payload().rewind();
result = receive;
receive = null;
} else if (receive.requiredMemoryAmountKnown() && !receive.memoryAllocated() && isInMutableState()) {
//pool must be out of memory, mute ourselves.
mute();
}
return result;
}
read calls the receive method; NetworkReceive is essentially a wrapper around a buffer, and once the read is judged complete, the NetworkReceive object is returned
private long receive(NetworkReceive receive) throws IOException {
return receive.readFrom(transportLayer);
}
receive simply delegates to NetworkReceive's readFrom method, passing in transportLayer. Since transportLayer holds the SocketChannel, we can already tell that readFrom must end up calling SocketChannel.read(buffer)
public long readFrom(ScatteringByteChannel channel) throws IOException {
int read = 0;
//check whether the size header still has space
if (size.hasRemaining()) {
int bytesRead = channel.read(size);
if (bytesRead < 0)
throw new EOFException();
read += bytesRead;
//if no space remains, the 4-byte size header is complete
if (!size.hasRemaining()) {
size.rewind();
int receiveSize = size.getInt();
if (receiveSize < 0)
throw new InvalidReceiveException("Invalid receive (size = " + receiveSize + ")");
if (maxSize != UNLIMITED && receiveSize > maxSize)
throw new InvalidReceiveException("Invalid receive (size = " + receiveSize + " larger than " + maxSize + ")");
requestedBufferSize = receiveSize;
if (receiveSize == 0) {
buffer = EMPTY_BUFFER;
}
}
}
//the payload buffer has not been allocated yet
if (buffer == null && requestedBufferSize != -1) {
buffer = memoryPool.tryAllocate(requestedBufferSize);
if (buffer == null)
log.trace("Broker low on memory - could not allocate buffer of size {} for source {}", requestedBufferSize, source);
}
if (buffer != null) {
int bytesRead = channel.read(buffer);
if (bytesRead < 0)
throw new EOFException();
read += bytesRead;
}
return read;
}
The steps are as follows:
- Check whether size still has space. size is a 4-byte buffer that describes the packet length; if it has space, more header bytes can be read from the channel
- Check whether size is full. If it is, the packet length is known, and a buffer of that length is requested
- Call read to read the payload
This is the classic framing code: because TCP is a byte stream, messages can be split or merged on the wire, so the first 4 bytes describe the packet length and then exactly that many bytes are read.
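The framing described above can be sketched independently of Kafka. This LengthPrefixedReader is an illustrative stand-in for NetworkReceive (the names are mine, not Kafka's), reading a 4-byte big-endian length and then exactly that many payload bytes:

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedReader {
    private final ByteBuffer size = ByteBuffer.allocate(4);
    private ByteBuffer payload;

    // returns true once a full frame has been read; may need several calls
    public boolean readFrom(ReadableByteChannel channel) throws Exception {
        if (size.hasRemaining()) {
            channel.read(size);
            if (!size.hasRemaining()) {               // header complete
                size.rewind();
                payload = ByteBuffer.allocate(size.getInt());
            }
        }
        if (payload != null)
            channel.read(payload);
        return payload != null && !payload.hasRemaining();
    }

    public ByteBuffer payload() {
        payload.rewind();
        return payload;
    }

    public static void main(String[] args) throws Exception {
        // build a [size][payload] frame and feed it through an in-memory channel
        byte[] body = "kafka".getBytes(StandardCharsets.UTF_8);
        ByteBuffer frame = ByteBuffer.allocate(4 + body.length);
        frame.putInt(body.length).put(body).flip();
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(frame.array()));

        LengthPrefixedReader reader = new LengthPrefixedReader();
        while (!reader.readFrom(ch)) { }              // on a real socket this may loop
        byte[] out = new byte[reader.payload().remaining()];
        reader.payload().get(out);
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

On a real non-blocking socket, readFrom may return false many times before a frame completes, which is exactly why NetworkReceive keeps its partial state across poll iterations.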
public boolean complete() {
return !size.hasRemaining() && buffer != null && !buffer.hasRemaining();
}
complete decides whether the read has finished: when neither size nor buffer has space remaining, a full packet has been read
The write method
public Send write() throws IOException {
Send result = null;
if (send != null && send(send)) {
result = send;
send = null;
}
return result;
}
It calls the send method to push the Send out and, once it completes, clears the field
private boolean send(Send send) throws IOException {
midWrite = true;
send.writeTo(transportLayer);
if (send.completed()) {
midWrite = false;
transportLayer.removeInterestOps(SelectionKey.OP_WRITE);
}
return send.completed();
}
The write path is straightforward: writeTo is likewise passed the transportLayer; after writing, it checks whether the send has completed, and if so removes interest in OP_WRITE
@Override
public long writeTo(GatheringByteChannel channel) throws IOException {
long written = channel.write(buffers);
if (written < 0)
throw new EOFException("Wrote negative bytes to channel. This shouldn't happen.");
remaining -= written;
pending = TransportLayers.hasPendingWrites(channel);
return written;
}
At its core this calls SocketChannel.write(buffers)
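The gathering write can be demonstrated with an in-process Pipe, whose sink channel implements GatheringByteChannel. The framing here mirrors a Send's size-header-plus-payload buffers, though the class itself is only a sketch with made-up names:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class GatheringSend {
    // frames a message as [4-byte size][payload], writes it with one gathering
    // write, then reads it back and returns the decoded payload bytes
    public static byte[] sendFramed(String message) throws Exception {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer header = ByteBuffer.allocate(4).putInt(body.length);
        header.flip();
        ByteBuffer[] buffers = { header, ByteBuffer.wrap(body) };

        Pipe pipe = Pipe.open();
        long remaining = 4 + body.length;
        while (remaining > 0)                        // write() may be partial
            remaining -= pipe.sink().write(buffers); // gathering write, like writeTo
        pipe.sink().close();

        ByteBuffer received = ByteBuffer.allocate(4 + body.length);
        while (received.hasRemaining() && pipe.source().read(received) >= 0) { }
        received.flip();
        int size = received.getInt();                // the header framing round-trips
        byte[] out = new byte[size];
        received.get(out);
        pipe.source().close();
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new String(sendFramed("hello"), StandardCharsets.UTF_8));
    }
}
```

Passing both buffers to one write call lets the kernel send header and payload together instead of in two system calls, which is the point of the gathering API.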
To summarize: KafkaChannel wraps SocketChannel and SelectionKey and stages its sends. The data to be sent is stored in the send field, OP_WRITE interest is registered, and the event loop then drives the write, which ultimately calls into SocketChannel.
Selector
Kafka's Selector wraps the Java NIO Selector. Let's see how it cooperates with KafkaChannel
// the underlying Java NIO selector
private final java.nio.channels.Selector nioSelector;
private final Map<String, KafkaChannel> channels;
private final List<Send> completedSends;
private final List<NetworkReceive> completedReceives;
A Map stores the id-to-KafkaChannel mapping so the right KafkaChannel can be found quickly, and two lists hold the completed sends and the completed receives respectively
public void connect(String id, InetSocketAddress address, int sendBufferSize, int receiveBufferSize) throws IOException {
ensureNotRegistered(id);
//open a SocketChannel
SocketChannel socketChannel = SocketChannel.open();
SelectionKey key = null;
try {
//configure the channel options
configureSocketChannel(socketChannel, sendBufferSize, receiveBufferSize);
//connect to the remote address
boolean connected = doConnect(socketChannel, address);
//register the channel
key = registerChannel(id, socketChannel, SelectionKey.OP_CONNECT);
if (connected) {
// OP_CONNECT won't trigger for immediately connected channels
log.debug("Immediately connected to node {}", id);
immediatelyConnectedKeys.add(key);
key.interestOps(0);
}
} catch (IOException | RuntimeException e) {
if (key != null)
immediatelyConnectedKeys.remove(key);
channels.remove(id);
socketChannel.close();
throw e;
}
}
connect looks much like plain NIO code: it opens a SocketChannel and registers it with the selector
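The non-blocking connect pattern, including the immediately-connected case noted in the source, can be tried locally. The server socket below merely stands in for a broker, and everything else is a simplified sketch with names of my own:

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NonBlockingConnect {
    // Returns true once the connection is established. A local listening socket
    // stands in for a broker so the example is self-contained.
    public static boolean connectLocally() throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port

        SocketChannel client = SocketChannel.open();
        client.configureBlocking(false);
        // may complete immediately for a loopback address; if so, OP_CONNECT
        // will never fire, which is the "immediately connected" case above
        boolean connected = client.connect(server.getLocalAddress());

        if (!connected) {
            Selector selector = Selector.open();
            SelectionKey key = client.register(selector, SelectionKey.OP_CONNECT);
            selector.select(5000);                   // wait for OP_CONNECT
            connected = key.isConnectable() && client.finishConnect();
            selector.close();
        }
        SocketChannel accepted = server.accept();    // drain the pending connection
        accepted.close();
        client.close();
        server.close();
        return connected;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(connectLocally());
    }
}
```

Kafka must handle both outcomes of connect(), which is why immediately connected keys get a dedicated set and an interestOps of 0.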
private void configureSocketChannel(SocketChannel socketChannel, int sendBufferSize, int receiveBufferSize)
throws IOException {
socketChannel.configureBlocking(false);
Socket socket = socketChannel.socket();
socket.setKeepAlive(true);
if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
socket.setSendBufferSize(sendBufferSize);
if (receiveBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
socket.setReceiveBufferSize(receiveBufferSize);
socket.setTcpNoDelay(true);
}
protected SelectionKey registerChannel(String id, SocketChannel socketChannel, int interestedOps) throws IOException {
SelectionKey key = socketChannel.register(nioSelector, interestedOps);
KafkaChannel channel = buildAndAttachKafkaChannel(socketChannel, id, key);
//add to the channels map
this.channels.put(id, channel);
if (idleExpiryManager != null)
idleExpiryManager.update(channel.id(), time.nanoseconds());
return key;
}
configureBlocking(false) puts the channel in non-blocking mode. registerChannel not only calls NIO's register; it also puts the KafkaChannel into the map
private int select(long timeoutMs) throws IOException {
if (timeoutMs < 0L)
throw new IllegalArgumentException("timeout should be >= 0");
if (timeoutMs == 0L)
return this.nioSelector.selectNow();
else
return this.nioSelector.select(timeoutMs);
}
select is a thin wrapper over nioSelector.select
public void poll(long timeout) throws IOException {
.....
long startSelect = time.nanoseconds();
int numReadyKeys = select(timeout);
long endSelect = time.nanoseconds();
this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());
if (numReadyKeys > 0 || !immediatelyConnectedKeys.isEmpty() || dataInBuffers) {
Set<SelectionKey> readyKeys = this.nioSelector.selectedKeys();
pollSelectionKeys(readyKeys, false, endSelect);
// Clear all selected keys so that they are included in the ready count for the next select
readyKeys.clear();
pollSelectionKeys(immediatelyConnectedKeys, true, endSelect);
immediatelyConnectedKeys.clear();
.....
poll is the selector's core method: it calls select to collect ready events, then hands them to pollSelectionKeys for processing
void pollSelectionKeys(Set<SelectionKey> selectionKeys,
boolean isImmediatelyConnected,
long currentTimeNanos) {
.....
attemptRead(key, channel);
if (channel.ready() && key.isWritable() && !channel.maybeBeginClientReauthentication(
() -> channelStartTimeNanos != 0 ? channelStartTimeNanos : currentTimeNanos)) {
Send send;
try {
send = channel.write();
} catch (Exception e) {
sendFailed = true;
throw e;
}
if (send != null) {
this.completedSends.add(send);
this.sensors.recordBytesSent(channel.id(), send.size());
}
}
Here we only care about read and write events. A read event goes to attemptRead; a write event goes to the channel.write() analyzed above, and if the returned Send is non-null it is added to completedSends
private void attemptRead(SelectionKey key, KafkaChannel channel) throws IOException {
//if channel is ready and has bytes to read from socket or buffer, and has no
//previous receive(s) already staged or otherwise in progress then read from it
if (channel.ready() && (key.isReadable() || channel.hasBytesBuffered()) && !hasStagedReceive(channel)
&& !explicitlyMutedChannels.contains(channel)) {
NetworkReceive networkReceive;
while ((networkReceive = channel.read()) != null) {
madeReadProgressLastPoll = true;
addToStagedReceives(channel, networkReceive);
}
if (channel.isMute()) {
outOfMemory = true; //channel has muted itself due to memory pressure.
} else {
madeReadProgressLastPoll = true;
}
}
}
attemptRead reads the available data in batches
while ((networkReceive = channel.read()) != null) {
channel.read() is called repeatedly until there is no more data to read, and each completed receive is passed to addToStagedReceives
private void addToStagedReceives(KafkaChannel channel, NetworkReceive receive) {
if (!stagedReceives.containsKey(channel))
stagedReceives.put(channel, new ArrayDeque<>());
Deque<NetworkReceive> deque = stagedReceives.get(channel);
deque.add(receive);
}
Each receive is queued in a Deque keyed by its KafkaChannel; at the end of poll, these staged receives are moved into completedReceives to await processing
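A minimal sketch of this per-channel staging, with plain strings standing in for NetworkReceive and the class name my own:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class StagedReceives {
    // one FIFO queue of receives per channel id, preserving arrival order
    private final Map<String, Deque<String>> staged = new HashMap<>();

    public void add(String channelId, String receive) {
        staged.computeIfAbsent(channelId, k -> new ArrayDeque<>()).add(receive);
    }

    // drains the oldest staged receive for a channel, or null if none remain
    public String pollOne(String channelId) {
        Deque<String> deque = staged.get(channelId);
        return deque == null ? null : deque.poll();
    }

    public static void main(String[] args) {
        StagedReceives s = new StagedReceives();
        s.add("node-1", "first");
        s.add("node-1", "second");
        System.out.println(s.pollOne("node-1"));
    }
}
```

Queuing per channel keeps responses from one broker in order even when several receives are drained in a single poll.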
public void send(Send send) {
//find the channel for this destination
String connectionId = send.destination();
KafkaChannel channel = openOrClosingChannelOrFail(connectionId);
if (closingChannels.containsKey(connectionId)) {
// ensure notification via `disconnected`, leave channel in the state in which closing was triggered
this.failedSends.add(connectionId);
} else {
try {
channel.setSend(send);
} catch (Exception e) {
// update the state for consistency, the channel will be discarded after `close`
channel.state(ChannelState.FAILED_SEND);
// ensure notification via `disconnected` when `failedSends` are processed in the next poll
this.failedSends.add(connectionId);
close(channel, CloseMode.DISCARD_NO_NOTIFY);
if (!(e instanceof CancelledKeyException)) {
log.error("Unexpected exception during send, closing connection {} and rethrowing exception {}",
connectionId, e);
throw e;
}
}
}
}
This is the selector's send method: it looks up the corresponding channel and calls its setSend
Summary
Selector.poll runs the event loop, while Selector.send submits an outgoing request; each channel can only have one request in flight at a time. For both read and write events, the network is polled until every Send has gone out and all readable data has been consumed.
Of course, Selector and KafkaChannel only move bytes. They maintain two collections for successfully sent and successfully received messages but do not process them here; from the client's point of view we still need to know whether a send succeeded and to handle the received data, and that part will be covered in detail later.