小菜鸡的探索:netty 的工作流程

0. netty工作流程

(图:netty 工作流程鱼形图)

(图:netty 流程图)

1. 启动服务

1.1. [thread-user] 创建主 reactor 事件循环线程组 bossGroup。bossGroup 用于监听 OP_ACCEPT 事件,接受连接并创建 SocketChannel。

  1. 创建 EventExecutor 数组,数组长度为 bossGroup 线程数(默认为物理机 CPU 核数 * 2,也可通过 -Dio.netty.eventLoopThreads 设置);
  2. 为 EventExecutor 数组创建数组元素 NioEventLoop;
  3. 每个 NioEventLoop 在构建实例对象时,会伴随 1 个多路复用器 selector 的创建;
  4. 多路复用器 selector 基于 rt.jar 包下的 SelectorProvider 创建,以支持不同平台(上篇文章已提到)。
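
按上述流程,bossGroup 的创建在代码上通常只有一行(线程数 1 仅为示意,不传线程数时按上面的默认规则计算):

// 创建主 Reactor 线程组:内部会按线程数创建 NioEventLoop 数组,
// 每个 NioEventLoop 构造时通过 SelectorProvider 打开自己的 Selector
EventLoopGroup bossGroup = new NioEventLoopGroup(1);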

1.2. [thread-user] 创建从reactor事件循环线程组 workerGroup

与1.1的流程一致,创建 workerGroup。

通过 ServerBootstrap#group 将主从 reactor 线程组绑定到 ServerBootstrap。

1.3. [thread-user] ServerBootstrap 的初始化

1.3.1. ServerBootstrap 创建 ServerSocketChannel 的对象工厂

ServerBootstrap#channel

/**
 * The {@link Class} which is used to create {@link Channel} instances from.
 * You either use this or {@link #channelFactory(io.netty.channel.ChannelFactory)} if your
 * {@link Channel} implementation has no no-args constructor.
 */
public B channel(Class<? extends C> channelClass) {
    return channelFactory(new ReflectiveChannelFactory<C>(
            ObjectUtil.checkNotNull(channelClass, "channelClass")
    ));
}

1.3.2. ServerBootstrap 记录要设置 ServerSocketChannel 的参数 ChannelOption

ServerBootstrap#option

/**
 * Allow to specify a {@link ChannelOption} which is used for the {@link Channel} instances once they got
 * created. Use a value of {@code null} to remove a previous set {@link ChannelOption}.
 */
public <T> B option(ChannelOption<T> option, T value) {
    ObjectUtil.checkNotNull(option, "option");
    if (value == null) {
        synchronized (options) {
            options.remove(option);
        }
    } else {
        synchronized (options) {
            options.put(option, value);
        }
    }
    return self();
}

1.3.3. ServerBootstrap 记录要设置 ServerSocketChannel 的处理器 ChannelHandler

ServerBootstrap#handler

/**
 * the {@link ChannelHandler} to use for serving the requests.
 */
public B handler(ChannelHandler handler) {
    this.handler = ObjectUtil.checkNotNull(handler, "handler");
    return self();
}

1.3.4. ServerBootstrap 记录要设置 SocketChannel 的参数 ChannelOption

ServerBootstrap#childOption

/**
 * Allow to specify a {@link ChannelOption} which is used for the {@link Channel} instances once they get created
 * (after the acceptor accepted the {@link Channel}). Use a value of {@code null} to remove a previous set
 * {@link ChannelOption}.
 */
public <T> ServerBootstrap childOption(ChannelOption<T> childOption, T value) {
    ObjectUtil.checkNotNull(childOption, "childOption");
    if (value == null) {
        synchronized (childOptions) {
            childOptions.remove(childOption);
        }
    } else {
        synchronized (childOptions) {
            childOptions.put(childOption, value);
        }
    }
    return this;
}

1.3.5. ServerBootstrap 记录要设置 SocketChannel 的 childHandler

ServerBootstrap#childHandler

/**
 * Set the {@link ChannelHandler} which is used to serve the request for the {@link Channel}'s.
 */
public ServerBootstrap childHandler(ChannelHandler childHandler) {
    this.childHandler = ObjectUtil.checkNotNull(childHandler, "childHandler");
    return this;
}
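
把 1.3 的各个步骤串起来,一段常见的配置代码大致如下(示意用,省略 import;参数取值与 EchoServerHandler 均为假设):

ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)                          // 1.2 绑定主从 reactor 线程组
 .channel(NioServerSocketChannel.class)                  // 1.3.1 ServerSocketChannel 对象工厂
 .option(ChannelOption.SO_BACKLOG, 1024)                 // 1.3.2 ServerSocketChannel 参数
 .handler(new LoggingHandler(LogLevel.INFO))             // 1.3.3 ServerSocketChannel 的 handler
 .childOption(ChannelOption.TCP_NODELAY, true)           // 1.3.4 SocketChannel 参数
 .childHandler(new ChannelInitializer<SocketChannel>() { // 1.3.5 SocketChannel 的 childHandler
     @Override
     protected void initChannel(SocketChannel ch) {
         ch.pipeline().addLast(new EchoServerHandler()); // 假设的业务 handler
     }
 });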

1.4. [thread-user] ServerSocketChannel 初始化

1.4.1. ServerSocketChannel 的创建

ServerSocketChannel 利用工厂模式 & 泛型 & 反射 创建实例。

使用 ReflectiveChannelFactory<ServerSocketChannel>#newChannel 创建实例。
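
ReflectiveChannelFactory 的核心逻辑可以简化理解为下面的示意(并非 netty 源码原文,省略 import,异常包装等细节从略):

// 简化示意:通过无参构造器反射创建 Channel 实例
public class ReflectiveChannelFactory<T extends Channel> implements ChannelFactory<T> {
    private final Constructor<? extends T> constructor;

    public ReflectiveChannelFactory(Class<? extends T> clazz) {
        try {
            // 要求 Channel 实现提供无参构造器
            this.constructor = clazz.getConstructor();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("Class " + clazz + " has no no-args constructor", e);
        }
    }

    @Override
    public T newChannel() {
        try {
            return constructor.newInstance();
        } catch (Throwable t) {
            throw new IllegalStateException("Unable to create Channel from " + constructor.getDeclaringClass(), t);
        }
    }
}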

1.4.2. ServerSocketChannel 初始化

  1. 设置 ServerBootstrap 初始化时设置的 ChannelOption;
  2. 设置 ServerBootstrap 初始化时设置的 Attribute 属性;
  3. 在 ServerSocketChannel 的 ChannelPipeline 中添加 ChannelInitializer A;

1.5. [thread-user ~ thread-boss] ServerSocketChannel 的注册

1.5.1. [thread-user] 从 bossGroup 中,分配1个 EventLoop 给 ServerSocketChannel

netty 的 Channel 会绑定 1 个 EventLoop 对象,且在整个生命周期内不会更换。

EventExecutorChooser#next

EventExecutorChooser 的实现:

  • PowerOfTwoEventExecutorChooser
  • GenericEventExecutorChooser
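
两种 chooser 的轮询策略可以简化理解为下面的示意(非源码原文,ChooserSketch 为假设的类名,省略 import):

// 简化示意:两种 next() 的实现思路
final class ChooserSketch {
    private final AtomicInteger idx = new AtomicInteger();

    // PowerOfTwoEventExecutorChooser:线程数为 2 的幂时,用位运算代替取模
    EventExecutor nextPowerOfTwo(EventExecutor[] executors) {
        return executors[idx.getAndIncrement() & (executors.length - 1)];
    }

    // GenericEventExecutorChooser:通用情况,取模后取绝对值
    EventExecutor nextGeneric(EventExecutor[] executors) {
        return executors[Math.abs(idx.getAndIncrement() % executors.length)];
    }
}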

1.5.2. [thread-boss] 将 ServerSocketChannel 注册到多路复用器 selector

io.netty.channel.nio.AbstractNioChannel#doRegister

@Override
protected void doRegister() throws Exception {
    boolean selected = false;
    for (;;) {
        try {
            // 注册到多路复用器上,注意此处的0(暂不设置感兴趣事件)
            selectionKey = javaChannel().register(eventLoop().unwrappedSelector(), 0, this);
            return;
        } catch (CancelledKeyException e) {
            // 省略代码...
        }
    }
}

1.5.3. [thread-boss] ServerSocketChannel 的 pipeline 上传播 callHandlerAdded 事件

在 pipeline 上传播 callHandlerAdded 后,pipeline 上的 Handler 会依次执行 ChannelHandler#handlerAdded。

当前 ServerSocketChannel pipeline 包含:

  • HeadContext
  • [1.4.2.] ChannelInitializer A
  • TailContext

ChannelInitializer A 的 ChannelInitializer#handlerAdded 的操作:

  • ChannelInitializer#initChannel
  • 从pipeline中移除当前 Handler(即[1.4.2] 添加的 ChannelInitializer A)

ChannelInitializer A 的 initChannel 操作包括:

  • 将 ServerBootstrap 的 handler 添加到 ServerSocketChannel 的 ChannelPipeline 的末尾
  • 异步添加 ServerBootstrapAcceptor ,ServerBootstrapAcceptor 用于处理 ServerSocketChannel 接收到的数据,即处理 SocketChannel 的初始化
p.addLast(new ChannelInitializer<Channel>() {
    @Override
    public void initChannel(final Channel ch) throws Exception {
        final ChannelPipeline pipeline = ch.pipeline();
        ChannelHandler handler = config.handler();
        if (handler != null) {
            pipeline.addLast(handler);
        }

        // 异步添加 ServerBootstrapAcceptor, 
        // ServerBootstrapAcceptor 负责接收客户端连接创建 SocketChannel 后,对 SocketChannel 的初始化工作。
        ch.eventLoop().execute(new Runnable() {
            @Override
            public void run() {
                pipeline.addLast(new ServerBootstrapAcceptor(
                        ch, currentChildGroup, currentChildHandler, currentChildOptions, currentChildAttrs));
            }
        });
    }
});

此时,pipeline中包含的handler依次为:

  • HeadContext
  • [1.3.3] ServerBootstrap 设置的 handler
  • ServerBootstrapAcceptor(异步添加)
  • TailContext

1.6. [thread-boss] ServerSocketChannel 绑定地址

io.netty.channel.socket.nio.NioServerSocketChannel#doBind

@Override
protected void doBind(SocketAddress localAddress) throws Exception {
    if (PlatformDependent.javaVersion() >= 7) {
        javaChannel().bind(localAddress, config.getBacklog());
    } else {
        javaChannel().socket().bind(localAddress, config.getBacklog());
    }
}

1.7. [thread-boss] ServerSocketChannel 的 pipeline 中传播 fireChannelActive 事件

HeadContext 在接收到 fireChannelActive 时,会依次调用:

  • io.netty.channel.DefaultChannelPipeline.HeadContext#read
  • io.netty.channel.Channel.Unsafe#beginRead
  • io.netty.channel.nio.AbstractNioChannel#doBeginRead
@Override
protected void doBeginRead() throws Exception {
    // Channel.read() or ChannelHandlerContext.read() was called
    final SelectionKey selectionKey = this.selectionKey;
    if (!selectionKey.isValid()) {
        return;
    }

    readPending = true;

    final int interestOps = selectionKey.interestOps();
    if ((interestOps & readInterestOp) == 0) {
        // 此处设置 ServerSocketChannel 感兴趣的事件 SelectionKey.OP_ACCEPT
        // 此时, ServerSocketChannel 可开始接受连接事件
        selectionKey.interestOps(interestOps | readInterestOp);
    }
}

readInterestOp 标注为 final,在 NioServerSocketChannel 构建时,确定感兴趣的事件。

/**
 * Create a new instance using the given {@link ServerSocketChannel}.
 */
public NioServerSocketChannel(ServerSocketChannel channel) {
    super(null, channel, SelectionKey.OP_ACCEPT);
    config = new NioServerSocketChannelConfig(this, javaChannel().socket());
}

2. 构建连接

2.1. [thread-boss] NioEventLoop 事件循环中,通过 Selector#select 轮询创建连接事件 OP_ACCEPT

2.2. [thread-boss] 从事件key绑定的 NioServerSocketChannel 创建 SocketChannel

io.netty.channel.socket.nio.NioServerSocketChannel#doReadMessages

@Override
protected int doReadMessages(List<Object> buf) throws Exception {
    // accept SocketChannel
    SocketChannel ch = SocketUtils.accept(javaChannel());

    try {
        if (ch != null) {
            // 创建 NioSocketChannel
            buf.add(new NioSocketChannel(this, ch));
            return 1;
        }
    } catch (Throwable t) {
        logger.warn("Failed to create a new channel from an accepted socket.", t);

        try {
            ch.close();
        } catch (Throwable t2) {
            logger.warn("Failed to close a socket.", t2);
        }
    }

    return 0;
}

2.3. [thread-boss] 在 NioServerSocketChannel 的 pipeline 中传播 SocketChannel

io.netty.channel.nio.AbstractNioMessageChannel.NioMessageUnsafe#read

@Override
public void read() {
        // 省略部分代码
        int size = readBuf.size();
        for (int i = 0; i < size; i ++) {
            readPending = false;
            // 传播接受的 SocketChannel
            pipeline.fireChannelRead(readBuf.get(i));
        }
        readBuf.clear();
        allocHandle.readComplete();
        // 传播 ChannelReadComplete 事件
        pipeline.fireChannelReadComplete();

        if (exception != null) {
            closed = closeOnReadError(exception);
            // 传播 ExceptionCaught 事件
            pipeline.fireExceptionCaught(exception);
        }
        // 省略部分代码
}

2.4. [thread-boss] 由 NioServerSocketChannel 的 pipeline 中的 ServerBootstrapAcceptor 对 SocketChannel 进行初始化

io.netty.bootstrap.ServerBootstrap.ServerBootstrapAcceptor#channelRead

@Override
@SuppressWarnings("unchecked")
public void channelRead(ChannelHandlerContext ctx, Object msg) {
    // ServerSocketChannel pipeline 传播的是 SocketChannel
    // 直接可强转
    final Channel child = (Channel) msg;

    // 添加 childHandler 到 SocketChannel 的pipeline
    child.pipeline().addLast(childHandler);
    // 设置 ChannelOption到 SocketChannel
    setChannelOptions(child, childOptions, logger);
    // 设置属性到 SocketChannel
    for (Entry<AttributeKey<?>, Object> e: childAttrs) {
        child.attr((AttributeKey<Object>) e.getKey()).set(e.getValue());
    }

    try {
        // 将 SocketChannel 注册到 childGroup(即workerGroup,从Reactor)
        childGroup.register(child).addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                if (!future.isSuccess()) {
                    forceClose(child, future.cause());
                }
            }
        });
    } catch (Throwable t) {
        forceClose(child, t);
    }
}

2.5. [thread-boss ~ thread-worker] SocketChannel 的注册流程

2.5.1. [thread-boss] 从 workerGroup 中,分配1个 EventLoop 给 SocketChannel

流程与 [1.5.1] 一致,区别是 EventLoopGroup 不同。

2.5.2. [thread-worker] 将 SocketChannel 注册到多路复用器 selector

流程与 [1.5.2] 一致,区别是 EventLoopGroup 不同。

此时,注册到 Selector 时,并未设置感兴趣的事件 (即0)。

2.5.3. [thread-worker] SocketChannel 的 pipeline 上传播 callHandlerAdded 事件

流程与 [1.5.3] 一致,触发 ChannelInitializer#initChannel,完善 SocketChannel 的 pipeline 的初始化。
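
结合 [1.3.5] 设置的 childHandler,此处被触发的 ChannelInitializer 通常形如下面的示意(编解码器与 MyBusinessHandler 均为假设):

new ChannelInitializer<SocketChannel>() {
    @Override
    protected void initChannel(SocketChannel ch) {
        ChannelPipeline p = ch.pipeline();
        p.addLast(new StringDecoder());      // 入站解码(示意)
        p.addLast(new StringEncoder());      // 出站编码(示意)
        p.addLast(new MyBusinessHandler());  // 假设的业务 handler
        // initChannel 执行完毕后,该 ChannelInitializer 会从 pipeline 中移除
    }
};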

2.5.4. [thread-worker] SocketChannel 的 pipeline 上传播 ChannelRegistered 事件

io.netty.channel.DefaultChannelPipeline#fireChannelRegistered

2.5.5. [thread-worker] SocketChannel 的 pipeline 上传播 ChannelActive 事件

io.netty.channel.DefaultChannelPipeline#fireChannelActive

HeadContext 在接收到 fireChannelActive 时,会依次调用:

  • io.netty.channel.DefaultChannelPipeline.HeadContext#read
  • io.netty.channel.Channel.Unsafe#beginRead
  • io.netty.channel.nio.AbstractNioChannel#doBeginRead
@Override
protected void doBeginRead() throws Exception {
    // Channel.read() or ChannelHandlerContext.read() was called
    final SelectionKey selectionKey = this.selectionKey;
    if (!selectionKey.isValid()) {
        return;
    }

    readPending = true;

    final int interestOps = selectionKey.interestOps();
    if ((interestOps & readInterestOp) == 0) {
        // 此处设置 SocketChannel 感兴趣的事件 SelectionKey#OP_READ
        // 此时, SocketChannel 可开始接受数据
        selectionKey.interestOps(interestOps | readInterestOp);
    }
}

readInterestOp 标注为 final,在 NioSocketChannel 构建时,确定感兴趣的事件。

protected AbstractNioByteChannel(Channel parent, SelectableChannel ch) {
    super(parent, ch, SelectionKey.OP_READ);
}

NioSocketChannel 继承 AbstractNioByteChannel。

3. 接收数据 [thread-worker]

在 workerGroup 中,每个 NioEventLoop 都会轮询 Selector 。当 Selector#selectedKeys 不为空时,则从 SocketChannel 中接收数据。

io.netty.channel.nio.AbstractNioByteChannel.NioByteUnsafe#read

    @Override
    public final void read() {
        final ChannelConfig config = config();
        if (shouldBreakReadReady(config)) {
            clearReadPending();
            return;
        }
        final ChannelPipeline pipeline = pipeline();
        final ByteBufAllocator allocator = config.getAllocator();
        // io.netty.channel.DefaultChannelConfig 中获取RecvByteBufAllocator
        // 默认实现自适应ByteBuff分配器 AdaptiveRecvByteBufAllocator
        final RecvByteBufAllocator.Handle allocHandle = recvBufAllocHandle();
        // 重置 allocHandle 的数据,如已读取的字节数、已读取多少次
        allocHandle.reset(config);

        ByteBuf byteBuf = null;
        boolean close = false;
        try {
            do {
                // 尝试分配合适大小的 ByteBuf,初始大小 1024
                byteBuf = allocHandle.allocate(allocator);
               // 读取数据到ByteBuf,并记录本次读取的字节数
               allocHandle.lastBytesRead(doReadBytes(byteBuf));
               // 读取的字节数为-1,代表 Socket 关闭
                if (allocHandle.lastBytesRead() <= 0) {
                    // nothing was read. release the buffer.
                    byteBuf.release();
                    byteBuf = null;
                    close = allocHandle.lastBytesRead() < 0;
                    if (close) {
                        // There is nothing left to read as we received an EOF.
                        readPending = false;
                    }
                    break;
                }

                // 累计读取次数
                allocHandle.incMessagesRead(1);
                readPending = false;
                // 在pipeline 上 fireChannelRead 已读取到的数据
                pipeline.fireChannelRead(byteBuf);
                byteBuf = null;
                // 是否继续本 SocketChannel 的读取数据操作
                // 默认读取次数上限 16
                // 若超过次数后,会让度给其他绑定在同一个EventLoop的 SocketChannel
                // 雨露均沾,结束本轮读取,等待下次 OP_READ
            } while (allocHandle.continueReading());

            // 用本次读事件循环总读取的字节数,计算下次预估分配 ByteBuf 的大小
            allocHandle.readComplete();
            // pipeline 上传播 ReadComplete 事件
            pipeline.fireChannelReadComplete();

            if (close) {
                closeOnRead(pipeline);
            }
        } catch (Throwable t) {
            handleReadException(pipeline, byteBuf, t, close, allocHandle);
        } finally {
            // Check if there is a readPending which was not processed yet.
            // This could be for two reasons:
            // * The user called Channel.read() or ChannelHandlerContext.read() in channelRead(...) method
            // * The user called Channel.read() or ChannelHandlerContext.read() in channelReadComplete(...) method
            //
            // See https://github.com/netty/netty/issues/2254
            if (!readPending && !config.isAutoRead()) {
                removeReadOp();
            }
        }
    }

自适应 ByteBuf 分配器 AdaptiveRecvByteBufAllocator 预估 ByteBuf 的扩缩容:

io.netty.channel.AdaptiveRecvByteBufAllocator.HandleImpl#record

/**
 * @param actualReadBytes 实际读取的字节数
 */
private void record(int actualReadBytes) {
    // SIZE_TABLE
    // [16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 256, 272, 288, 304, 320, 336, 352, 368, 384, 400, 416, 432, 448, 464, 480, 496, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608, 16777216, 33554432, 67108864, 134217728, 268435456, 536870912, 1073741824]
    
    // 实际读取的字节比预估的小,要缩小下次分配的ByteBuf
    if (actualReadBytes <= SIZE_TABLE[max(0, index - INDEX_DECREMENT - 1)]) {
        if (decreaseNow) {
            index = max(index - INDEX_DECREMENT, minIndex);
            nextReceiveBufferSize = SIZE_TABLE[index];
            decreaseNow = false;
        } else {
            decreaseNow = true;
        }
    // 实际读取的字节比预估的大,要扩大下次分配的ByteBuf
    } else if (actualReadBytes >= nextReceiveBufferSize) {
        index = min(index + INDEX_INCREMENT, maxIndex);
        nextReceiveBufferSize = SIZE_TABLE[index];
        decreaseNow = false;
    }
}
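
如果默认的自适应策略不符合业务特点,可以通过 ChannelOption 调整接收缓冲区分配器。下面是一个配置示意(b 为前文的 ServerBootstrap,参数取值仅为示例):

// 为 accept 出来的 SocketChannel 指定接收缓冲区分配策略
// 三个参数依次为最小、初始、最大容量(字节)
b.childOption(ChannelOption.RCVBUF_ALLOCATOR,
        new AdaptiveRecvByteBufAllocator(64, 1024, 65536));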

4. 处理数据

处理数据的基本流程:

  1. 读取到的数据,立刻在 pipeline 中传播 ChannelRead 事件;
  2. 读取到的数据在 pipeline 中传播时,从 HeadContext 开始依次经过各个 ChannelInboundHandler,经过层层解码(ByteToMessageDecoder、MessageToMessageDecoder),最终交到业务 Handler 处理业务;
  3. 单个 SocketChannel 读取完毕后,会立刻在 pipeline 中传播 ChannelReadComplete 事件;
  4. 业务 Handler#channelRead 在接收到 Message 后,进行业务处理。

一般来说,此处都应该提交到业务线程池处理,提交时需要关注多线程并发问题。因为同一个 EventLoop 被多个 SocketChannel 共享,业务处理若同时在此 EventLoop 内进行,假如某个操作耗时较长(如同步磁盘 IO),势必会影响其他 SocketChannel 的数据读取效率,进而造成延迟。

要解决该问题,可以使用 netty 提供的方案:向 ChannelPipeline 添加 Handler 时指定线程池:

io.netty.channel.ChannelPipeline#addLast(io.netty.util.concurrent.EventExecutorGroup, io.netty.channel.ChannelHandler...)

/**
 * Inserts {@link ChannelHandler}s at the last position of this pipeline.
 *
 * @param group     the {@link EventExecutorGroup} which will be used to execute the {@link ChannelHandler}s
 *                  methods.
 * @param handlers  the handlers to insert last
 *
 */
ChannelPipeline addLast(EventExecutorGroup group, ChannelHandler... handlers);
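
结合前文的 ChannelInitializer,使用方式大致如下(线程数与 MyBusinessHandler 均为示意):

// 为耗时的业务 handler 单独指定线程池,避免阻塞 NioEventLoop 线程
EventExecutorGroup businessGroup = new DefaultEventExecutorGroup(16);

// 在 ChannelInitializer#initChannel 中:
ch.pipeline().addLast(businessGroup, new MyBusinessHandler());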

5. 发送数据

发送数据有两种方式:

  • 写数据:io.netty.channel.AbstractChannelHandlerContext#write
  • 写数据并刷新 OutBoundBuffer:io.netty.channel.AbstractChannelHandlerContext#writeAndFlush

5.1. 发送数据时,会在 pipeline 上依次调用 io.netty.channel.ChannelOutboundHandler#write,方向为从 Tail 往 Head。数据在各个 ChannelOutboundHandler 间传播时,也会被层层 encode,最终编码为 ByteBuf(写出到 socket 前再转换为 ByteBuffer)

此处注意:

AbstractChannelHandlerContext#write:从当前 Handler 往 HeadContext 方向,寻找 ChannelOutboundHandler,终止于 HeadContext

AbstractChannelHandlerContext#pipeline#write:从 TailContext 往 HeadContext 方向,寻找 ChannelOutboundHandler,终止于 HeadContext
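
在业务 handler(如 ChannelInboundHandlerAdapter)中,两者的区别可以用下面的示意体会(msg 的编解码与释放从略):

@Override
public void channelRead(ChannelHandlerContext ctx, Object msg) {
    // 方式一:ctx.write —— 从当前 handler 往 HeadContext 方向寻找下一个 ChannelOutboundHandler
    ctx.write(msg);

    // 方式二:channel/pipeline 的 write —— 从 TailContext 开始,完整经过出站方向上的全部 ChannelOutboundHandler
    // ctx.channel().write(msg);  // 等价于 ctx.pipeline().write(msg)
}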

5.2. 数据到达 HeadContext 后,并非立刻发送出去,而是缓存到 io.netty.channel.ChannelOutboundBuffer

io.netty.channel.AbstractChannel.AbstractUnsafe#write

@Override
public final void write(Object msg, ChannelPromise promise) {
    assertEventLoop();

    ChannelOutboundBuffer outboundBuffer = this.outboundBuffer;
    if (outboundBuffer == null) {
        // If the outboundBuffer is null we know the channel was closed and so
        // need to fail the future right away. If it is not null the handling of the rest
        // will be done in flush0()
        // See https://github.com/netty/netty/issues/2362
        safeSetFailure(promise, newClosedChannelException(initialCloseCause));
        // release message now to prevent resource-leak
        ReferenceCountUtil.release(msg);
        return;
    }

    int size;
    try {
        // 过滤并转换消息,如将 heap buffer 转换为 direct buffer
        msg = filterOutboundMessage(msg);
        // 估计msg的大小
        size = pipeline.estimatorHandle().size(msg);
        if (size < 0) {
            size = 0;
        }
    } catch (Throwable t) {
        safeSetFailure(promise, t);
        ReferenceCountUtil.release(msg);
        return;
    }
    // 缓存到 outboundBuffer
    outboundBuffer.addMessage(msg, size, promise);
}

io.netty.channel.ChannelOutboundBuffer#addMessage

public void addMessage(Object msg, int size, ChannelPromise promise) {
    // 写入的数据,封装成Entry
    Entry entry = Entry.newInstance(msg, size, total(msg), promise);
    if (tailEntry == null) {
        flushedEntry = null;
    } else {
        // 添加到链表的最后一个结点
        Entry tail = tailEntry;
        tail.next = entry;
    }
    // 记录最后一个Entry
    tailEntry = entry;
    if (unflushedEntry == null) {
        unflushedEntry = entry;
    }

    // increment pending bytes after adding message to the unflushed arrays.
    // See https://github.com/netty/netty/issues/1619
    incrementPendingOutboundBytes(entry.pendingSize, false);
}

5.3. flush 数据

此处注意:

AbstractChannelHandlerContext#flush:从当前 Handler 往 HeadContext 方向,寻找 ChannelOutboundHandler,终止于 HeadContext

AbstractChannelHandlerContext#pipeline#flush:从 TailContext 往 HeadContext 方向,寻找 ChannelOutboundHandler,终止于 HeadContext

io.netty.channel.AbstractChannel.AbstractUnsafe#flush

@Override
public final void flush() {
    assertEventLoop();

    ChannelOutboundBuffer outboundBuffer = this.outboundBuffer;
    if (outboundBuffer == null) {
        return;
    }
    // 将 unflushedEntry 的数据投递给 flushedEntry
    outboundBuffer.addFlush();
    // ChannelOutboundBuffer#flushedEntry 写入到 SocketChannel
    flush0();
}

将 unflushedEntry 的数据投递给 flushedEntry:io.netty.channel.ChannelOutboundBuffer#addFlush

(图:ChannelOutboundBuffer)

将 flushedEntry 的数据写入 SocketChannel:io.netty.channel.socket.nio.NioSocketChannel#doWrite

@Override
protected void doWrite(ChannelOutboundBuffer in) throws Exception {
    SocketChannel ch = javaChannel();
    // 存在数据写出时,默认最多尝试16次,雨露均沾
    int writeSpinCount = config().getWriteSpinCount();
    do {
        // ChannelOutboundBuffer 为空,清空 OP_WRITE
        if (in.isEmpty()) {
            // All written so clear OP_WRITE
            clearOpWrite();
            // Directly return here so incompleteWrite(...) is not called.
            return;
        }

        // Ensure the pending writes are made of ByteBufs only.
        int maxBytesPerGatheringWrite = ((NioSocketChannelConfig) config).getMaxBytesPerGatheringWrite();
        // 单次写出数据,最多1024个ByteBuf,总大小不超过 maxBytesPerGatheringWrite
        ByteBuffer[] nioBuffers = in.nioBuffers(1024, maxBytesPerGatheringWrite);
        int nioBufferCnt = in.nioBufferCount();

        // Always us nioBuffers() to workaround data-corruption.
        // See https://github.com/netty/netty/issues/2761
        switch (nioBufferCnt) {
            case 0:
                // We have something else beside ByteBuffers to write so fallback to normal writes.
                writeSpinCount -= doWrite0(in);
                break;
            case 1: {
                // Only one ByteBuf so use non-gathering write
                // Zero length buffers are not added to nioBuffers by ChannelOutboundBuffer, so there is no need
                // to check if the total size of all the buffers is non-zero.
                ByteBuffer buffer = nioBuffers[0];
                int attemptedBytes = buffer.remaining();
                // 写出数据到SocketChannel
                final int localWrittenBytes = ch.write(buffer);
                if (localWrittenBytes <= 0) {
                    // SocketChannel 缓冲区已满,写不出
                    // 注册OP_WRITE,跳出循环,等待下次再写出
                    incompleteWrite(true);
                    return;
                }
                adjustMaxBytesPerGatheringWrite(attemptedBytes, localWrittenBytes, maxBytesPerGatheringWrite);
                // outBoundBuffer移除已写出的数据
                in.removeBytes(localWrittenBytes);
                --writeSpinCount;
                break;
            }
            default: {
                // Zero length buffers are not added to nioBuffers by ChannelOutboundBuffer, so there is no need
                // to check if the total size of all the buffers is non-zero.
                // We limit the max amount to int above so cast is safe
                long attemptedBytes = in.nioBufferSize();
                // 写出多个数据到SocketChannel
                final long localWrittenBytes = ch.write(nioBuffers, 0, nioBufferCnt);
                if (localWrittenBytes <= 0) {
                    // SocketChannel 缓冲区已满,写不出
                    // 注册OP_WRITE,跳出循环,等待下次再写出
                    incompleteWrite(true);
                    return;
                }
                // Casting to int is safe because we limit the total amount of data in the nioBuffers to int above.
                adjustMaxBytesPerGatheringWrite((int) attemptedBytes, (int) localWrittenBytes,
                        maxBytesPerGatheringWrite);
                in.removeBytes(localWrittenBytes);
                --writeSpinCount;
                break;
            }
        }
    } while (writeSpinCount > 0);
    // 循环 16 次后仍未写完:
    // 若 writeSpinCount < 0(socket 缓冲区已满,写不出去),注册 OP_WRITE,等待可写事件
    // 否则(16 次用完但 socket 仍可写),提交 flush task,稍后继续写
    incompleteWrite(writeSpinCount < 0);
}

6. 断开连接

6.1. 当连接发生中断时, SocketChannel 会触发 OP_READ 事件。

io.netty.channel.nio.AbstractNioByteChannel.NioByteUnsafe#read

    @Override
    public final void read() {
        final ChannelConfig config = config();
        if (shouldBreakReadReady(config)) {
            clearReadPending();
            return;
        }
        final ChannelPipeline pipeline = pipeline();
        final ByteBufAllocator allocator = config.getAllocator();
        final RecvByteBufAllocator.Handle allocHandle = recvBufAllocHandle();
        allocHandle.reset(config);

        ByteBuf byteBuf = null;
        boolean close = false;
        try {
            do {
                byteBuf = allocHandle.allocate(allocator);
                // 读取数据
                allocHandle.lastBytesRead(doReadBytes(byteBuf));
                if (allocHandle.lastBytesRead() <= 0) {
                    // nothing was read. release the buffer.
                    byteBuf.release();
                    byteBuf = null;
                    // 读取的 java.nio.channels.ReadableByteChannel#read 为负数,
                    // 代表SocketChannel 要关闭了
                    close = allocHandle.lastBytesRead() < 0;
                    if (close) {
                        // There is nothing left to read as we received an EOF.
                        readPending = false;
                    }
                    break;
                }

                allocHandle.incMessagesRead(1);
                readPending = false;
                pipeline.fireChannelRead(byteBuf);
                byteBuf = null;
            } while (allocHandle.continueReading());

            allocHandle.readComplete();
            pipeline.fireChannelReadComplete();

            // 关闭连接
            if (close) {
                closeOnRead(pipeline);
            }
        } catch (Throwable t) {
            handleReadException(pipeline, byteBuf, t, close, allocHandle);
        } finally {
            // Check if there is a readPending which was not processed yet.
            // This could be for two reasons:
            // * The user called Channel.read() or ChannelHandlerContext.read() in channelRead(...) method
            // * The user called Channel.read() or ChannelHandlerContext.read() in channelReadComplete(...) method
            //
            // See https://github.com/netty/netty/issues/2254
            if (!readPending && !config.isAutoRead()) {
                removeReadOp();
            }
        }
    }

6.2. 关闭 SocketChannel

io.netty.channel.nio.AbstractNioByteChannel.NioByteUnsafe#closeOnRead

private void closeOnRead(ChannelPipeline pipeline) {
    if (!isInputShutdown0()) {
        // 是否支持半关
        // io.netty.channel.ChannelOption#ALLOW_HALF_CLOSURE 可开启
        if (isAllowHalfClosure(config())) {
            // 关闭输入端
            shutdownInput();
           // 在 pipeline 传播  ChannelInputShutdownEvent
           pipeline.fireUserEventTriggered(ChannelInputShutdownEvent.INSTANCE);
        } else {
            // 关闭 SocketChannel
            close(voidPromise());
        }
    } else {
        inputClosedSeenErrorOnRead = true;
        pipeline.fireUserEventTriggered(ChannelInputShutdownReadComplete.INSTANCE);
    }
}

io.netty.channel.AbstractChannel.AbstractUnsafe#doClose0

private void doClose0(ChannelPromise promise) {
    try {
        // 关闭 SocketChannel
        doClose();
        // 设置 closeFuture 已关闭状态,唤醒等待 closeFuture 的线程
        closeFuture.setClosed();
        // 设置关闭操作 promise 成功执行,唤醒等待此 promise 的线程
        safeSetSuccess(promise);
    } catch (Throwable t) {
        closeFuture.setClosed();
        safeSetFailure(promise, t);
    }
}

io.netty.channel.socket.nio.NioSocketChannel#doClose

@Override
protected void doClose() throws Exception {
    super.doClose();
    // 关闭 Socket SocketChannel#close
    javaChannel().close();
}

6.3. SocketChannel 注册取消

io.netty.channel.AbstractChannel.AbstractUnsafe#deregister(io.netty.channel.ChannelPromise, boolean)

private void deregister(final ChannelPromise promise, final boolean fireChannelInactive) {
    // 省略部分代码...
    invokeLater(new Runnable() {
        @Override
        public void run() {
            try {
                // 取消注册
                doDeregister();
            } catch (Throwable t) {
                logger.warn("Unexpected exception occurred while deregistering a channel.", t);
            } finally {
                // 在 pipeline 中传播 ChannelInactive
                if (fireChannelInactive) {
                    pipeline.fireChannelInactive();
                }

                if (registered) {
                    registered = false;
                    // 在 pipeline 中传播 ChannelUnregistered
                    pipeline.fireChannelUnregistered();
                }
                safeSetSuccess(promise);
            }
        }
    });
}

io.netty.channel.nio.AbstractNioChannel#doDeregister

@Override
protected void doDeregister() throws Exception {
    eventLoop().cancel(selectionKey());
}

io.netty.channel.nio.NioEventLoop#cancel

void cancel(SelectionKey key) {
    // 从Selector中取消key
    key.cancel();
    // 累计取消的key数量
    cancelledKeys ++;
    // 当处理一批事件时,发现很多连接都断了(默认256),
    // 后面的事件很大可能都失效了,所以可尝试 select again。
    if (cancelledKeys >= CLEANUP_INTERVAL) {
        cancelledKeys = 0;
        needsToSelectAgain = true;
    }
}

7. 关闭服务

官方示例代码推荐的优雅关闭服务方式:

// 先关闭boss,期待不要有新的连接进来
bossGroup.shutdownGracefully();
// 再关闭worker,尽可能干完当前的事情
workerGroup.shutdownGracefully();

触发关闭操作时,会修改 NioEventLoop state 状态位。状态位用于指示当前 EventLoop 的运行状态。

state 状态有以下几种情况:

  • ST_NOT_STARTED:1,未启动;
  • ST_STARTED:2,已启动;
  • ST_SHUTTING_DOWN:3,关闭中;
  • ST_SHUTDOWN:4,已关闭;
  • ST_TERMINATED:5,服务已终止;

io.netty.util.concurrent.SingleThreadEventExecutor#shutdownGracefully

@Override
public Future<?> shutdownGracefully(/*静默期时长*/long quietPeriod, /*最大超时时长*/long timeout, TimeUnit unit) {
// ...

    // 关闭中,返回future
    if (isShuttingDown()) {
        return terminationFuture();
    }

    // 修改状态位 state 为 ST_SHUTTING_DOWN 关闭中状态
    boolean inEventLoop = inEventLoop();
    boolean wakeup;
    int oldState;
    for (;;) {
        // 关闭中,返回future
        if (isShuttingDown()) {
            return terminationFuture();
        }
        int newState;
        wakeup = true;
        oldState = state;
        if (inEventLoop) {
            newState = ST_SHUTTING_DOWN;
        } else {
            switch (oldState) {
                case ST_NOT_STARTED:
                case ST_STARTED: //2
                    newState = ST_SHUTTING_DOWN; //3
                    break;
                default:
                    newState = oldState;
                    wakeup = false;
            }
        }
        if (STATE_UPDATER.compareAndSet(this, oldState, newState)) {
            break;
        }
    }
    // 静默期时长
    gracefulShutdownQuietPeriod = unit.toNanos(quietPeriod);
    // 允许关闭的最长时长
    gracefulShutdownTimeout = unit.toNanos(timeout);

    // 若线程未启动,则启动线程
    if (ensureThreadStarted(oldState)) {
        return terminationFuture;
    }

    // 提交空任务到 EventLoop,以此唤醒 EventLoop 执行任务
    if (wakeup) {
        wakeup(inEventLoop);
    }

    return terminationFuture();
}

io.netty.channel.nio.NioEventLoop#run

@Override
protected void run() {
    for (;;) {
        //...
        // selector#select 多路复用器轮询是否有触发事件的key
        // processSelectedKey 处理触发事件的key
        // runAllTasks 处理任务
        
        
        try {
            // 状态处于关闭中
            if (isShuttingDown()) {
                // 关闭所有 SocketChannel
                // key#cancel
                closeAll();
                // 是否立刻关闭
                if (confirmShutdown()) {
                    return;
                }
            }
        } catch (Throwable t) {
            handleLoopException(t);
        }
    }
}

io.netty.util.concurrent.SingleThreadEventExecutor#confirmShutdown

protected boolean confirmShutdown() {
    if (!isShuttingDown()) {
        return false;
    }

    if (!inEventLoop()) {
        throw new IllegalStateException("must be invoked from an event loop");
    }

    // 取消定时任务
    cancelScheduledTasks();

    if (gracefulShutdownStartTime == 0) {
        gracefulShutdownStartTime = ScheduledFutureTask.nanoTime();
    }

    // 假若当前有task/hook,则执行,且不让关闭
    // 已经关闭了 SocketChannel,要先把当前的任务执行完毕
    if (runAllTasks() || runShutdownHooks()) {
        if (isShutdown()) {
            // Executor shut down - no new tasks anymore.
            return true;
        }

        // There were tasks in the queue. Wait a little bit more until no tasks are queued for the quiet period or
        // terminate if the quiet period is 0.
        // See https://github.com/netty/netty/issues/4241
        if (gracefulShutdownQuietPeriod == 0) {
            return true;
        }
        wakeup(true);
        return false;
    }

    final long nanoTime = ScheduledFutureTask.nanoTime();

    //如果已强制关闭(shutdownNow) 或者 超过最大允许时间,关闭,不再等待
    if (isShutdown() || nanoTime - gracefulShutdownStartTime > gracefulShutdownTimeout) {
        return true;
    }

    //静默期内,执行了任务,暂时不关闭,sleep 100ms,再检查下,进入下次循环。
    if (nanoTime - lastExecutionTime <= gracefulShutdownQuietPeriod) {
        // Check if any tasks were added to the queue every 100ms.
        // TODO: Change the behavior of takeTask() so that it returns on timeout.
        wakeup(true);
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            // Ignore
        }

        return false;
    }

    //静默期没有执行任务,关闭。
    // No tasks were added for last quiet period - hopefully safe to shut down.
    // (Hopefully because we really cannot make a guarantee that there will be no execute() calls by a user.)
    return true;
}

执行完剩余任务、确认可以关闭 EventLoop 后,再关闭 Selector。

io.netty.util.concurrent.SingleThreadEventExecutor#doStartThread

private void doStartThread() {
    assert thread == null;
    executor.execute(new Runnable() {
        @Override
        public void run() {
            // ...
            boolean success = false;
            updateLastExecutionTime();
            try {
                SingleThreadEventExecutor.this.run();
                success = true;
            } catch (Throwable t) {
                logger.warn("Unexpected exception from an event executor: ", t);
            } finally {
                // 修改状态位 ST_SHUTTING_DOWN
                for (;;) {
                    int oldState = state;
                    if (oldState >= ST_SHUTTING_DOWN || STATE_UPDATER.compareAndSet(
                            SingleThreadEventExecutor.this, oldState, ST_SHUTTING_DOWN)) {
                        break;
                    }
                }

                // Check if confirmShutdown() was called at the end of the loop.
                if (success && gracefulShutdownStartTime == 0) {
                    if (logger.isErrorEnabled()) {
                        logger.error("Buggy " + EventExecutor.class.getSimpleName() + " implementation; " +
                                SingleThreadEventExecutor.class.getSimpleName() + ".confirmShutdown() must " +
                                "be called before run() implementation terminates.");
                    }
                }

                try {
                    // Run all remaining tasks and shutdown hooks.
                    for (;;) {
                        if (confirmShutdown()) {
                            break;
                        }
                    }
                } finally {
                    try {
                        // 关闭selector
                        cleanup();
                    } finally {
                        // Lets remove all FastThreadLocals for the Thread as we are about to terminate and notify
                        // the future. The user may block on the future and once it unblocks the JVM may terminate
                        // and start unloading classes.
                        // See https://github.com/netty/netty/issues/6596.
                        FastThreadLocal.removeAll();
                        // 设置终止状态
                        STATE_UPDATER.set(SingleThreadEventExecutor.this, ST_TERMINATED);
                        threadLock.countDown();
                        if (logger.isWarnEnabled() && !taskQueue.isEmpty()) {
                            logger.warn("An event executor terminated with " +
                                    "non-empty task queue (" + taskQueue.size() + ')');
                        }
                        terminationFuture.setSuccess(null);
                    }
                }
            }
        }
    });
}

8. 总结

8.1. 启动服务要点

  • Selector 创建于 NioEventLoop 的构造函数内;
  • ServerSocketChannel 首次注册到 Selector 时,并非立刻监听 OP_ACCEPT,而是0;
  • ServerSocketChannel 监听 OP_ACCEPT 发生在 bind 到 Selector 后,fireChannelActive;
  • ChannelInitializer#initChannel 会在 pipeline#fireChannelRegistered 传播注册事件时触发,并在触发后,从pipeline中移除(用完即弃),可用作如登录授权之类的操作;

8.2. 构建连接要点

  • pipeline 是 SocketChannel 独占的,即每个 SocketChannel 的 pipeline 都非同一对象;
  • SocketChannel 的初始化和注册操作,是在 ServerSocketChannel 的 ChannelPipeline#fireChannelRead 传播过程中,由 pipeline 中的 ServerBootstrapAcceptor 完成;
  • 同样,SocketChannel 首次注册时感兴趣的事件是 0,而非 OP_READ;
  • SocketChannel 监听 OP_READ 发生在 fireChannelActive;

8.3. 接收数据要点

  • 读取数据时,自适应 ByteBuf 分配器 AdaptiveRecvByteBufAllocator 来分配数据的缓冲区,大胆扩容,小心缩容;
  • SocketChannel 一直存在待读取的数据时,会一直读下去吗?答案是不会,雨露均沾,默认读取次数上限为 16 次;
  • 读取到的数据,以 ByteBuf 立刻在 pipeline 中 fireChannelRead,并在本 SocketChannel 不再读取时,触发 fireChannelReadComplete ;
  • NioEventLoop 的线程被绑定在其上的所有 SocketChannel 共享,并且在生命周期内不会发生变更;

8.4. 数据处理要点

  • 接收到的数据 ByteBuf 会在 pipeline 上传播 channelRead;由于 SocketChannel 在生命周期内绑定的 EventLoop 不会发生变更,因此接收顺序可与客户端发送顺序保持一致;
  • pipeline 上的 Handler 也可指定处理线程池,此操作有可能会破坏数据管道顺序;
  • pipeline 本质上是一个双向链表,链表头是HeadContext,链表末尾是TailContext;
  • 接收数据时,会从 HeadContext 开始,依次触发 pipeline 上所有 ChannelInboundHandler 处理,可中途退出,不保证执行到 TailContext;

8.5. 发送数据要点

  • AbstractChannelHandlerContext#write 与 AbstractChannelHandlerContext#pipeline#write 是存在区别的:写回数据时,前者是从当前处理 Handler 到 HeadContext,而后者则是从 TailContext 到 HeadContext;
  • write 和 flush 之间存在 OutBoundBuffer,写回的数据并非立刻写入到 SocketChannel, 而是中间存在一个缓冲;
  • 写入数据到 SocketChannel 时,若多次写入(默认最多 16 次)仍未写完但 socket 仍可写,则提交一个 flush task 到 NioEventLoop,下次继续写;若 socket 缓冲区已满写不进去,则注册 OP_WRITE 事件;
  • OP_WRITE 事件被触发时,并不代表有数据要写,而是代表 SocketChannel 可以写入数据了;
  • 当写入 OutBoundBuffer(生产端)的速度高于客户端读取(消费端)的速度时,OutBoundBuffer 会出现数据堆积;堆积量达到高水位(默认 64KB,io.netty.channel.WriteBufferWaterMark#DEFAULT_HIGH_WATER_MARK,可通过 io.netty.channel.ChannelOption#WRITE_BUFFER_HIGH_WATER_MARK 设置)时,io.netty.channel.ChannelOutboundBuffer#unwritable 会被标记为不可写入。但此时其实仍然可以继续写入,极限情况会出现 OOM。因此我们在写入数据前,最好先判断是否可以写入(io.netty.channel.Channel#isWritable),由业务层去处理不能写入时的情况,见下方示例;
  • 当堆积数据降到低水位(默认 32KB,io.netty.channel.WriteBufferWaterMark#DEFAULT_LOW_WATER_MARK,可通过 io.netty.channel.ChannelOption#WRITE_BUFFER_LOW_WATER_MARK 设置)以下时,unwritable 标记被清除,SocketChannel 重新变为可写;
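
一个简单的防护示意(b、ch 分别为前文的 ServerBootstrap 与 Channel,水位取值与不可写时的处理方式均为示例):

// 配置高低水位线(字节,取值仅为示意)
b.childOption(ChannelOption.WRITE_BUFFER_WATER_MARK,
        new WriteBufferWaterMark(32 * 1024, 64 * 1024));

// 写出前先判断可写状态,不可写时交由业务层处理(丢弃、缓存、限流等)
if (ch.isWritable()) {
    ch.writeAndFlush(msg);
} else {
    handleBackpressure(msg); // 假设的背压处理方法
}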

8.6. 断开连接要点

  • 对端关闭连接时,会触发 OP_READ 事件,读取到的字节数为 -1,代表需要关闭 SocketChannel;
  • 数据读取进行时被强行关闭,会触发 IOException,从而关闭 SocketChannel;

8.7. 关闭服务要点

  • 先不接活了(state 设置为 ST_SHUTTING_DOWN,优先关闭 boss,再关闭 worker,关闭所有channel);
  • 尽量干完手头的活,不保证全部干完(gracefulShutdownTimeout 最长优雅关闭时长,gracefulShutdownQuietPeriod 优雅关闭静默期);