Java NIO Basics

This article summarizes my notes from the Heima (黑马) course on Bilibili and from related books.

NIO is read here as non-blocking IO.

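1 The three core NIO components

1.1 Channel and Buffer

A channel is somewhat similar to a stream, but it is a bidirectional conduit for reading and writing data: data can be read from a channel into a buffer, and data in a buffer can be written to a channel.

Common channels include:

FileChannel: file channel, used to read and write file data.
SocketChannel: socket channel, used to read and write data over a TCP connection.
ServerSocketChannel: server socket channel (listening channel), which listens for incoming TCP connection requests and creates a SocketChannel for each accepted connection.
DatagramChannel: datagram channel, used to read and write data over UDP.

A buffer holds the data being read or written. Common buffers are: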

ByteBuffer
- MappedByteBuffer
- DirectByteBuffer
- HeapByteBuffer
ShortBuffer
IntBuffer
LongBuffer
FloatBuffer
DoubleBuffer
CharBuffer

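1.2 Selector

IO multiplexing means that a single process or thread can monitor multiple file descriptors (including socket connections) at once. As soon as one or more of them become readable or writable, the monitoring process or thread can query those IO readiness events.

At the Java application level, a selector can be thought of as a listener and query tool for IO events. It lets one thread manage multiple channels and fetch the IO readiness events that occur on them. These channels work in non-blocking mode, so the thread never gets stuck on a single channel. This suits scenarios with a very large number of connections but low traffic.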

graph TD
subgraph selector flow
thread --> selector
selector --> c1(channel)
selector --> c2(channel)
selector --> c3(channel)
end

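Calling the selector's select() blocks until an IO readiness event occurs on one of the channels. When such events occur, select() returns and the thread handles the ready channels.

2 ByteBuffer

A buffer that holds raw bytes.

Example: reading the contents of a text file: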

public static void main(String[] args) {
    // Two ways to obtain a FileChannel: 1. FileInputStream or FileOutputStream 2. RandomAccessFile
    try (FileChannel fileChannel = new FileInputStream("data.txt").getChannel()) {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        int length = 0;
        while ((length = fileChannel.read(buffer)) != -1) {
            log.debug("bytes read: {}", length);
            buffer.flip(); // switch to read mode
            while (buffer.hasRemaining()) {
                byte b = buffer.get();
                log.debug("byte: {}", (char) b);
            }
            buffer.clear(); // switch back to write mode
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

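2.1 Basic steps for using a ByteBuffer

1. Create a ByteBuffer instance with allocate()
2. Write data into the buffer, for example by calling channel.read(buffer)
3. Call flip() to switch to read mode before reading
4. Read data from the buffer, for example by calling buffer.get()
5. Call clear() or compact() to switch back to write mode
6. Repeat steps 2 to 5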

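2.2 Key ByteBuffer properties

To track the read/write state and position, the Buffer class provides four key properties:
1. Capacity

capacity is the maximum number of data elements the buffer can hold.

2. Limit

limit is the upper bound on how much data can be written or read. Its exact meaning depends on the buffer's mode:

In write mode, limit is the maximum amount of data that can be written, and it equals capacity.

In read mode, limit equals the amount of data actually in the buffer, because the buffer is not necessarily full.

3. Position

The index of the next element to be read or written. The relative get() and put() methods update it automatically.

position starts at 0. Each value written to the buffer advances position by 1, so it always points at the next write location. Its maximum value is limit, at which point there is no room left to write.

When switching from write mode to read mode (by calling flip()), position is reset to 0 so that reading starts from the beginning.

4. Mark

A remembered position. Calling mark() sets mark = position; calling reset() sets position = mark.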

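Walk-through:

Initial state right after the ByteBuffer is created: position is 0 and limit equals capacity.

In write mode, position is the next write location and limit equals capacity. After 4 bytes are written, position is 4.

After flip(), position becomes 0 and limit becomes the read limit (4, the number of bytes written).

After those 4 bytes are read, position is 4 and equals limit.

After clear(), the buffer returns to its initial state.

There is also compact(), which keeps the unread data: it moves the unread portion to the front of the buffer and then switches to write mode.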
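The state changes described above can be traced with a few lines of code. The following is a minimal sketch, not taken from the course material; the class name BufferStateDemo and the helper state() are hypothetical names chosen for illustration, and the capacity of 10 is an arbitrary choice.

```java
import java.nio.ByteBuffer;

final class BufferStateDemo {
    // Formats the three indices so each step can be compared at a glance.
    static String state(ByteBuffer b) {
        return "pos=" + b.position() + " lim=" + b.limit() + " cap=" + b.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(10);
        System.out.println(state(buf));          // pos=0 lim=10 cap=10

        buf.put(new byte[]{'a', 'b', 'c', 'd'}); // write 4 bytes
        System.out.println(state(buf));          // pos=4 lim=10 cap=10

        buf.flip();                              // switch to read mode
        System.out.println(state(buf));          // pos=0 lim=4 cap=10

        buf.get();
        buf.get();                               // read 'a' and 'b'
        System.out.println(state(buf));          // pos=2 lim=4 cap=10

        buf.compact();                           // keep unread 'c' and 'd', back to write mode
        System.out.println(state(buf));          // pos=2 lim=10 cap=10

        buf.clear();                             // reset the indices; the bytes are not erased
        System.out.println(state(buf));          // pos=0 lim=10 cap=10
    }
}
```

Note that after compact() the position is 2 rather than 0: the two unread bytes now sit at the front of the buffer and the next write continues right after them.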

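2.3 Common ByteBuffer methods

Methods of the Buffer class:

2.3.1 Allocating space

Use allocate() to allocate space for a ByteBuffer and obtain the instance:

ByteBuffer buf = ByteBuffer.allocate(16);

2.3.2 Writing data into a buffer

Call the channel's read method
Call the buffer's own put method

2.3.3 Reading data from a buffer

Call the channel's write method
Call the buffer's own get method

get() advances the position (the read pointer). To read the same data again:

Call rewind() to reset position to 0
Or call get(int i) to fetch the byte at index i, which does not move the read pointer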

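2.3.4 mark() and reset()

mark() and reset() are used as a pair: Buffer.mark() saves the current position in the mark property so that this temporary location is remembered;

Buffer.reset() restores position to the saved mark.

Note: rewind() and flip() both discard the mark (as does clear())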
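The three ways of re-reading data (reset() back to a mark, the absolute get(int), and rewind()) can be compared side by side. This is a small sketch for illustration only; RereadDemo and demo() are hypothetical names, and the input "abcd" is an arbitrary choice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RereadDemo {
    // Collects every byte that is read, in order, so the effect of each call is visible.
    static String demo() {
        // wrap() returns a buffer that is ready for reading: position 0, limit 4
        ByteBuffer buf = ByteBuffer.wrap("abcd".getBytes(StandardCharsets.US_ASCII));
        StringBuilder out = new StringBuilder();

        out.append((char) buf.get());   // 'a', position 0 -> 1
        buf.mark();                     // remember position 1
        out.append((char) buf.get());   // 'b', position 1 -> 2
        out.append((char) buf.get());   // 'c', position 2 -> 3
        buf.reset();                    // position goes back to the mark (1)
        out.append((char) buf.get());   // 'b' again, position 1 -> 2

        out.append((char) buf.get(3));  // 'd' via absolute get; position stays at 2
        buf.rewind();                   // position 0, mark discarded
        out.append((char) buf.get());   // 'a' again
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());     // prints abcbda
    }
}
```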

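2.3.5 Converting between String and ByteBuffer

// debugAll is a debugging helper that is not shown in this article
// 1. String to ByteBuffer via put(); the buffer stays in write mode
ByteBuffer buffer1 = ByteBuffer.allocate(16);
buffer1.put("hello".getBytes(StandardCharsets.UTF_8));
debugAll(buffer1);

// 2. Charset.encode(); the returned buffer is already in read mode
ByteBuffer buffer2 = StandardCharsets.UTF_8.encode("hello");
debugAll(buffer2);

// 3. wrap(); the returned buffer is already in read mode
ByteBuffer buffer3 = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));
debugAll(buffer3);

// 4. ByteBuffer to String
String str1 = StandardCharsets.UTF_8.decode(buffer2).toString();
System.out.println(str1);

// buffer1 is still in write mode, so flip it before decoding
buffer1.flip();
String str2 = StandardCharsets.UTF_8.decode(buffer1).toString();
System.out.println(str2);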

2.4 Scattering reads with ByteBuffer

A scattering read reads data from one file channel into multiple ByteBuffers:

try (FileChannel fileChannel = new RandomAccessFile("words.txt", "rw").getChannel()) {
    ByteBuffer buffer1 = ByteBuffer.allocate(3);
    ByteBuffer buffer2 = ByteBuffer.allocate(3);
    ByteBuffer buffer3 = ByteBuffer.allocate(5);
    fileChannel.read(new ByteBuffer[]{buffer1,buffer2,buffer3});
    //......
} catch (IOException e) {
    throw new RuntimeException(e);
}

2.5 Gathering writes with ByteBuffer

A gathering write writes the contents of multiple ByteBuffers to one channel:

try (FileChannel fileChannel = new RandomAccessFile("word2.txt", "rw").getChannel()) {
    ByteBuffer b1 = StandardCharsets.UTF_8.encode("hello");
    ByteBuffer b2 = StandardCharsets.UTF_8.encode("world");
    ByteBuffer b3 = StandardCharsets.UTF_8.encode("你好");
    fileChannel.write(new ByteBuffer[]{b1, b2, b3});
} catch (IOException e) {
    e.printStackTrace();
}
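To see both operations work together, the sketch below writes three buffers with one gathering write and reads them back with one scattering read. It is an illustration under stated assumptions rather than course code: ScatterGatherDemo and roundTrip() are hypothetical names, the strings "one", "two" and "three" are arbitrary, a temporary file stands in for words.txt, and the channel is opened with the standard FileChannel.open API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ScatterGatherDemo {
    // Writes "one", "two", "three" with a gathering write, then reads them back
    // into buffers of 3, 3 and 5 bytes with a scattering read.
    static String[] roundTrip() {
        try {
            Path file = Files.createTempFile("scatter-gather", ".txt");
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer[] out = {
                    StandardCharsets.UTF_8.encode("one"),
                    StandardCharsets.UTF_8.encode("two"),
                    StandardCharsets.UTF_8.encode("three")
                };
                // Buffers are drained in order, so the last one empties last.
                while (out[2].hasRemaining()) {
                    ch.write(out);
                }

                ch.position(0); // go back to the start of the file before reading
                ByteBuffer[] in = {
                    ByteBuffer.allocate(3), ByteBuffer.allocate(3), ByteBuffer.allocate(5)
                };
                // Buffers are filled in order; stop when the last is full or at end of file.
                while (in[2].hasRemaining() && ch.read(in) != -1) {
                    // keep reading
                }

                String[] result = new String[in.length];
                for (int i = 0; i < in.length; i++) {
                    in[i].flip(); // switch each buffer to read mode before decoding
                    result[i] = StandardCharsets.UTF_8.decode(in[i]).toString();
                }
                return result;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", roundTrip())); // prints one,two,three
    }
}
```

The buffer sizes on the read side must match the field lengths in the file, which is why scattering reads fit fixed-layout data best.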

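2.6 Handling sticky packets and half packets

A sticky packet is several messages arriving merged into one read; a half packet is one message split across reads. One way to handle both is to parse messages by splitting on a delimiter: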

public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(1024);
    buffer.put("Hello,world\nI'm mimang\nho".getBytes());
    split(buffer);
    buffer.put("w are you?\n".getBytes());
    split(buffer);
}

private static void split(ByteBuffer buffer) {
    buffer.flip();
    for (int i = 0; i < buffer.limit(); i++) {
        // get(i) does not move position
        if (buffer.get(i) == '\n') {
            int length = i + 1 - buffer.position();
            ByteBuffer target = ByteBuffer.allocate(length);
            for (int j = 0; j < length; j++) {
                target.put(buffer.get());
            }
            debugAll(target);
        }
    }
    buffer.compact();
}

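3 File programming

3.1 FileChannel

⚠️ Note:

FileChannel only works in blocking mode.

Obtaining a FileChannel

The approach used here is to obtain a FileChannel from a FileInputStream, FileOutputStream, or RandomAccessFile, each of which has a getChannel method (since Java 7, FileChannel.open(Path, OpenOption...) can also open one directly)

A channel obtained from a FileInputStream is read-only
A channel obtained from a FileOutputStream is write-only
For a RandomAccessFile, whether the channel can read and write depends on the mode passed when the RandomAccessFile was constructed

Reading data

Reads data from the channel to fill the ByteBuffer. The return value is the number of bytes read; -1 means the end of the file has been reached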

int readBytes = channel.read(buffer);

Writing data

while(buffer.hasRemaining()) {
    channel.write(buffer);
}

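channel.write is called in a while loop because a single write call is not guaranteed to write the entire contents of the buffer to the channel

Closing the channel

A channel must be closed after use. Calling close on the FileInputStream, FileOutputStream, or RandomAccessFile closes the channel indirectly; you can also call the channel's own close method.

Size

Use the size method to get the size of the file

Forcing a flush to disk

For performance reasons, the operating system does not immediately persist data to disk each time a buffer is written to the channel. To make sure the data reaches the disk right away, call FileChannel's force() method after writing.

// Force the data (true: and the file metadata) to disk
channel.force(true);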
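The individual operations above (write loop, force, size, read, close) can be combined into one round trip. The following is a sketch under assumptions, not code from the course: FileChannelBasics and writeThenRead() are hypothetical names, a temporary file is used so nothing on disk is overwritten, and the channel is opened with the standard FileChannel.open API instead of a stream's getChannel().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FileChannelBasics {
    // Writes the text to a temporary file, forces it to disk, and reads it back.
    static String writeThenRead(String text) {
        try {
            Path file = Files.createTempFile("filechannel-basics", ".txt");
            // try-with-resources closes the channel even if an exception is thrown
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer out = StandardCharsets.UTF_8.encode(text);
                while (out.hasRemaining()) {
                    ch.write(out);          // one call may write only part of the buffer
                }
                ch.force(true);             // flush content and metadata to the device

                ByteBuffer in = ByteBuffer.allocate((int) ch.size());
                ch.position(0);             // rewind the channel before reading
                while (in.hasRemaining() && ch.read(in) != -1) {
                    // keep reading until the buffer is full or end of file
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead("hello nio")); // prints hello nio
    }
}
```

Sizing the read buffer from ch.size() is only reasonable for small files like this one; larger files are better read in fixed-size chunks as in the earlier example.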

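3.2 Copying files with FileChannel

try (FileChannel from = new FileInputStream("data.txt").getChannel();
     FileChannel to = new FileOutputStream("out.txt").getChannel()) {
    // Efficient: uses the operating system's zero-copy support under the hood; a single call transfers at most about 2 GB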
    from.transferTo(0, from.size(), to);
} catch (IOException e) {
    e.printStackTrace();
}

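Transferring files larger than 2 GB:

try (FileChannel from = new FileInputStream("data.txt").getChannel();
     FileChannel to = new FileOutputStream("out.txt").getChannel()) {
    // Efficient: uses the operating system's zero-copy support; loop because one call is capped at about 2 GB
    long size = from.size();
    for (long left = size; left > 0; ) {
        // left is the number of bytes not yet transferred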
        left -= from.transferTo((size - left), left, to);
    }
} catch (IOException e) {
    e.printStackTrace();
}

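4 Network programming

4.1 Blocking and non-blocking

Blocking

In blocking mode, the relevant methods suspend the thread
- ServerSocketChannel.accept suspends the thread while no connection has been established
- SocketChannel.read suspends the thread while there is no data to read
- Blocking simply means the thread is suspended. It uses no CPU while suspended, but it is effectively idle
With a single thread, blocking methods interfere with one another and the program can hardly work properly, so multiple threads are needed
With multiple threads there are new problems:
- Too many connections can cause an OOM, and too many threads degrade performance through frequent context switches
- A thread pool reduces the number of threads and context switches, but it treats the symptom rather than the cause: if many connections are established and then stay inactive for a long time, they block every thread in the pool. This approach therefore suits short-lived connections, not long-lived ones

Non-blocking

In non-blocking mode, the relevant methods do not suspend the thread
- ServerSocketChannel.accept returns null when no connection has been established, and the thread keeps running
- SocketChannel.read returns 0 when there is no data to read; the thread is not blocked and can go on to read from another SocketChannel or call ServerSocketChannel.accept
- When writing, the thread only waits for the data to be written into the channel, not for the channel to send it over the network
However, in non-blocking mode the thread keeps running even when there are no connections and no readable data, so the CPU spins and resources are wasted
While data is actually being copied, the thread is still blocked (this is what AIO improves on)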
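The non-blocking accept behaviour can be observed directly. The sketch below is an illustration, not course code: NonBlockingAcceptDemo and acceptReturnsNullWhenIdle() are hypothetical names, and the listener binds to port 0 on the loopback address (so the operating system picks a free port) instead of the fixed port used elsewhere in this article.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class NonBlockingAcceptDemo {
    // Returns true if accept() came back immediately with null on an idle listener.
    static boolean acceptReturnsNullWhenIdle() {
        try (ServerSocketChannel ssc = ServerSocketChannel.open()) {
            ssc.configureBlocking(false);   // switch the listener to non-blocking mode
            ssc.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            // No client has connected, so a blocking accept() would wait here forever.
            SocketChannel sc = ssc.accept();
            return sc == null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptReturnsNullWhenIdle());
    }
}
```

A server built only on this behaviour would have to call accept() and read() in a tight loop, which is the CPU-spinning problem described above.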

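IO multiplexing

A single thread working with a Selector can monitor the IO readiness events of multiple channels. This is called multiplexing

Multiplexing applies only to network IO; ordinary file IO cannot use it
In non-blocking mode without a Selector, the thread spends most of its time doing useless work, whereas a Selector ensures that
- the thread connects only when there is a connection event
- the thread reads only when there is a readable event
- the thread writes only when there is a writable event
  - Limited by network throughput, a channel is not always writable; once it becomes writable, the Selector's writable event fires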

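4.2 Selector

One thread drives one selector, and one selector can monitor events on multiple channels. The thread acts only when an event occurs, which avoids the useless work done in plain non-blocking mode;

A single thread can handle all the channels, which greatly reduces the cost of context switching between threads;

Creating

Selector selector = Selector.open();

Registering channel events

A channel is associated with a selector by registering it

channel.configureBlocking(false);
SelectionKey key = channel.register(selector, interestOps); // interestOps: the events to watch

The channel must be in non-blocking mode
FileChannel has no non-blocking mode, so it cannot be used with a selector
The event types that can be registered are
- Readable: SelectionKey.OP_READ
- Writable: SelectionKey.OP_WRITE
- Connect: SelectionKey.OP_CONNECT
- Accept: SelectionKey.OP_ACCEPT

To have the selector monitor several events on a channel, combine them with the bitwise OR operator

// Combine several events with bitwise OR
int interestOps = SelectionKey.OP_READ | SelectionKey.OP_WRITE;

Listening for channel events

Any of the following three methods can be used to check for events. The return value is the number of channels on which an event the selector is interested in occurred

Method 1: block until a registered event occurs

int count = selector.select();

Method 2: block until a registered event occurs or the timeout expires (in ms)

int count = selector.select(timeoutMillis); // long timeout in milliseconds

Method 3: never block; return immediately whether or not there are events, and check the return value yourself

int count = selector.selectNow();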

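💡 When select does not block

When an event occurs

A client initiates a connection request, which triggers an accept event

A client sends data, or closes normally or abnormally, each of which triggers a read event; in addition, if the data sent is larger than the buffer, multiple read events are triggered

A channel is writable, which triggers a write event

When the NIO bug on Linux occurs

When selector.wakeup() is called

When selector.close() is called

When the selector's thread is interrupted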
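The difference between "nothing ready" and "an accept event is pending" can be checked with selectNow() and a timed select(). The sketch below is only an illustration: SelectNowDemo and readyCounts() are hypothetical names, the listener uses port 0 on the loopback address, and the 5000 ms timeout is an arbitrary safety margin so the example cannot hang.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class SelectNowDemo {
    // Returns {ready count before any client connects, ready count after one connects}.
    static int[] readyCounts() {
        try (Selector selector = Selector.open();
             ServerSocketChannel ssc = ServerSocketChannel.open()) {
            ssc.configureBlocking(false);
            ssc.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            ssc.register(selector, SelectionKey.OP_ACCEPT);

            // No connection is pending, so selectNow() returns 0 without blocking.
            int before = selector.selectNow();

            // A blocking client connect; the connection waits in the listener's backlog.
            try (SocketChannel client = SocketChannel.open(ssc.getLocalAddress())) {
                // The pending connection makes the listener's key ready for OP_ACCEPT.
                int after = selector.select(5000);
                return new int[]{before, after};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = readyCounts();
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```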

4.3 Handling Accept and Read events

public static void main(String[] args) throws IOException {
    // 1. Create the Selector
    Selector selector = Selector.open();
    // 2. Create the ServerSocketChannel
    ServerSocketChannel ssc = ServerSocketChannel.open();
    ssc.configureBlocking(false); // non-blocking mode
    ssc.bind(new InetSocketAddress(8080)); // listening port
    // 3. Register the channel's accept event with the selector
    SelectionKey sscKey = ssc.register(selector, SelectionKey.OP_ACCEPT);
    log.debug("sscKey:{}", sscKey);
    // 4. Wait for IO readiness events: select() blocks while nothing is ready and returns when events arrive
    while (selector.select() > 0) {
        // Get the set of selected keys
        Set<SelectionKey> keySet = selector.selectedKeys();
        Iterator<SelectionKey> iterator = keySet.iterator();
        while (iterator.hasNext()) {
            SelectionKey key = iterator.next();
            // Remove the key from the selected-key set ourselves, otherwise it is handled again next time
            iterator.remove();
            log.debug("key:{}", key);
            if (key.isAcceptable()) {
                handleAccept(selector, key);
            } else if (key.isReadable()) {
                handleRead(key);
            }
        }
    }
}

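/**
 * Handles an Accept event.
 * @param selector the selector to register the accepted channel with
 * @param key the key of the ServerSocketChannel that is ready to accept
 * @throws IOException if accepting or registering fails
 */
private static void handleAccept(Selector selector, SelectionKey key) throws IOException {
    ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
    SocketChannel sc = ssc.accept();
    if (sc == null) {
        return; // non-blocking accept returns null when no connection is pending
    }
    sc.configureBlocking(false); // non-blocking
    sc.register(selector, SelectionKey.OP_READ); // watch for read readiness
}

/**
 * Handles a Read event.
 * @param key the key of the SocketChannel that is ready to read
 */
private static void handleRead(SelectionKey key) {
    try {
        SocketChannel channel = (SocketChannel) key.channel();
        ByteBuffer buffer = ByteBuffer.allocate(16);
        int read = channel.read(buffer); // returns -1 when the client disconnects normally
        if (read == -1) {
            key.cancel();
            channel.close(); // release the socket as well
        } else {
            buffer.flip();
            debugRead(buffer);
        }
    } catch (IOException e) {
        e.printStackTrace();
        key.cancel(); // the client disconnected abnormally: cancel the associated key
    }
}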

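Why call iterator.remove()

After an event occurs, select puts the corresponding key into the selectedKeys set but never removes it after the event has been handled, so we have to remove it in our own code.

What cancel does

cancel() deregisters the channel from the selector: the key is removed from the keys set (at the next selection operation) and its events are no longer monitored

Handling message boundaries

One approach is a fixed message length: every packet has the same size and the server reads the agreed length. The drawback is wasted bandwidth
Another approach is to split on a delimiter. The drawback is low efficiency
A third is a custom Head-Content protocol: the number of bytes (length) is sent ahead of the actual data, which makes it easy to determine the size of each message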
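The Head-Content idea can be sketched with a plain ByteBuffer. The article does not fix a header format, so the 4-byte big-endian length header used here is an assumption, as are the names LengthPrefixedFrames, encode() and decode(). The sketch also leaves out validation of the length field, which a real server would need.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LengthPrefixedFrames {
    // Encodes one message as [4-byte length][UTF-8 payload] (assumed header format).
    static byte[] encode(String msg) {
        byte[] body = msg.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }

    // Extracts every complete frame from a buffer that is in write mode.
    // An incomplete tail stays in the buffer, which is left in write mode again.
    static List<String> decode(ByteBuffer buffer) {
        List<String> messages = new ArrayList<>();
        buffer.flip();                                     // switch to read mode
        while (buffer.remaining() >= 4) {
            int length = buffer.getInt(buffer.position()); // peek; position is unchanged
            if (buffer.remaining() < 4 + length) {
                break;                                     // half packet: wait for more data
            }
            buffer.getInt();                               // consume the header
            byte[] body = new byte[length];
            buffer.get(body);                              // consume the payload
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        buffer.compact();                                  // keep the unread tail, back to write mode
        return messages;
    }
}
```

A short usage example, with a second message that arrives in two pieces:

```java
ByteBuffer buf = ByteBuffer.allocate(64);
byte[] second = LengthPrefixedFrames.encode("world!");
buf.put(LengthPrefixedFrames.encode("hello")).put(second, 0, 5);
System.out.println(LengthPrefixedFrames.decode(buf)); // [hello]
buf.put(second, 5, second.length - 5);
System.out.println(LengthPrefixedFrames.decode(buf)); // [world!]
```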

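Sizing the ByteBuffer

Each channel has to keep track of messages that may have been split, and a ByteBuffer cannot be shared by multiple channels, so each channel needs its own ByteBuffer
The ByteBuffer must not be too large. At 1 MB per ByteBuffer, for example, a million connections would need 1 TB of memory, so the ByteBuffer should be resizable
- One approach is to start with a small buffer, for example 4 KB, and when it turns out to be too small, allocate an 8 KB buffer and copy the 4 KB buffer's contents into it. The advantage is that the message stays contiguous and easy to process; the drawback is the cost of copying
- Another approach is to build the buffer from multiple arrays: when one array is full, the overflow goes into a new array. Unlike the first approach, the message is not stored contiguously and parsing is more complex; the advantage is that it avoids the cost of copying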
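The first approach (allocate a larger buffer and copy) takes only a few lines. This is a minimal sketch with hypothetical names (GrowableBuffer, grow()); doubling the capacity is one possible growth policy, chosen here because it matches the 4 KB to 8 KB example.

```java
import java.nio.ByteBuffer;

final class GrowableBuffer {
    // Takes a buffer in write mode and returns a new buffer, also in write mode,
    // with twice the capacity and the same bytes already written to it.
    static ByteBuffer grow(ByteBuffer full) {
        ByteBuffer bigger = ByteBuffer.allocate(full.capacity() * 2);
        full.flip();        // read mode: expose the bytes written so far
        bigger.put(full);   // copy them; bigger's position ends right after the copied data
        return bigger;
    }
}
```

In a selector loop, the usual pattern is to keep the per-channel buffer as the key's attachment and, when a read leaves no complete message in a full buffer, replace it with key.attach(GrowableBuffer.grow(buffer)).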

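4.4 Handling Write events

In non-blocking mode there is no guarantee that all the data in the buffer is written to the channel in one call, so the return value of write (the number of bytes actually written) has to be tracked
If the selector listened for writable events on every channel, each channel would need a key to track its buffer, which would use too much memory. Hence a two-phase strategy:
- Only when the message handler first writes a message is the channel registered with the selector for write events
- The selector then checks the channel for writable events; once all the data has been written, the write registration is cancelled
- If it is not cancelled, a write event fires every time the channel is writable

Example: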

try (Selector selector = Selector.open();
     ServerSocketChannel ssc = ServerSocketChannel.open()) {
  ssc.configureBlocking(false);
  ssc.bind(new InetSocketAddress(8080));
  ssc.register(selector, SelectionKey.OP_ACCEPT);
  while (selector.select() > 0) {
    Set<SelectionKey> keySet = selector.selectedKeys();
    Iterator<SelectionKey> iterator = keySet.iterator();
    while (iterator.hasNext()) {
      SelectionKey key = iterator.next();
      iterator.remove();
      if (key.isAcceptable()) {
        SocketChannel sc = ssc.accept();
        sc.configureBlocking(false);
        SelectionKey scKey = sc.register(selector, SelectionKey.OP_READ);
        StringBuilder sb = new StringBuilder();
        // 1. Send a large amount of data to the client
        for (int i = 0; i < 3000000; i++) {
          sb.append("a");
        }
        ByteBuffer buffer = Charset.defaultCharset().encode(sb.toString());
        // 2. The return value is the number of bytes actually written
        int write = sc.write(buffer);
        log.debug("write bytes:{} ", write);
        if (buffer.hasRemaining()) {
          // Add write interest on the existing key (OP_READ is kept)
          scKey.interestOps(scKey.interestOps() | SelectionKey.OP_WRITE);
          // Attach the unwritten data to scKey
          scKey.attach(buffer);
        }
      } else if (key.isWritable()) {
        ByteBuffer buffer = (ByteBuffer) key.attachment();
        SocketChannel sc = (SocketChannel) key.channel();
        int write = sc.write(buffer);
        log.debug("write bytes:{} ", write);
        if (!buffer.hasRemaining()) {
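          // Stop listening for write events; & ~ clears the bit without toggling it back on
          key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
          // Drop the attached buffer so it can be garbage-collected
          key.attach(null);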
        }
      }
    }
  }
}

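💡 Why cancel write interest once the data has been written

Whenever the socket send buffer has room, the channel counts as writable, so this event fires very frequently. Write interest should therefore be registered only when the socket buffer cannot take all the data, and cancelled once the data has been written.

4.5 Multithreading optimization

So far a single thread has handled Accept and all other IO readiness events, which does not make full use of a multi-core CPU

Improvement: split the selectors into two groups

One thread with one selector dedicated to accept events
As many threads as there are CPU cores, each with its own selector, handling the other events in turn

Example code: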

public class MultiThreadServer {
    public static void main(String[] args) throws IOException {
        Selector boss = Selector.open();
        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.configureBlocking(false);
        ssc.bind(new InetSocketAddress(8080));
        SelectionKey sscKey = ssc.register(boss, 0);
        sscKey.interestOps(SelectionKey.OP_ACCEPT);
        // 1. Create a fixed number of workers, one per available processor
        Worker[] workers = new Worker[Runtime.getRuntime().availableProcessors()];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Worker("worker-" + i);
        }
        AtomicInteger index = new AtomicInteger();
        while (boss.select() > 0) {
            Set<SelectionKey> keySet = boss.selectedKeys();
            Iterator<SelectionKey> iterator = keySet.iterator();
            while (iterator.hasNext()) {
                SelectionKey key = iterator.next();
                iterator.remove();
                if (key.isAcceptable()) {
                    SocketChannel sc = ssc.accept();
                    sc.configureBlocking(false);
                    log.debug("connected...{}", sc.getRemoteAddress());
                    // 2. Hand the channel to a worker's selector, chosen round-robin
                    log.debug("before register...{}", sc.getRemoteAddress());
                    workers[Math.floorMod(index.getAndIncrement(), workers.length)].register(sc); // starts the worker and opens its selector on first use
                    log.debug("after register...{}", sc.getRemoteAddress());
                }
            }
        }
    }

    static class Worker implements Runnable {
        private Thread thread;
        private Selector selector;
        private String name;
        private volatile boolean start = false;
        private final ConcurrentLinkedQueue<Runnable> queue = new ConcurrentLinkedQueue<>();

        public Worker(String name) {
            this.name = name;
        }

        public void register(SocketChannel sc) throws IOException {
            if (!start) {
                thread = new Thread(this, name);
                selector = Selector.open();
                thread.start();
                start = true;
            }
            // Queue the registration so that it runs on the worker thread
            queue.add(() -> {
                try {
                    sc.register(selector, SelectionKey.OP_READ);
                } catch (ClosedChannelException e) {
                    e.printStackTrace();
                }
            });
            selector.wakeup(); // wake up the blocked select() call
        }

        @Override
        public void run() {
            while (true) {
                try {
                    selector.select(); // blocks until an event occurs or wakeup() is called
                    Runnable task;
                    while ((task = queue.poll()) != null) {
                        task.run(); // register the channel for Read events
                    }
                    Set<SelectionKey> keySet = selector.selectedKeys();
                    Iterator<SelectionKey> iterator = keySet.iterator();
                    while (iterator.hasNext()) {
                        SelectionKey key = iterator.next();
                        iterator.remove();
                        if (key.isReadable()) {
                            ByteBuffer buffer = ByteBuffer.allocate(16);
                            SocketChannel channel = (SocketChannel) key.channel();
                            log.debug("read...{}", channel.getRemoteAddress());
                            if (channel.read(buffer) == -1) {
                                channel.close(); // peer closed: closing the channel also cancels its key
                                continue;
                            }
                            buffer.flip();
                            debugRead(buffer);
                        }
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }
}

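5 NIO vs BIO

A stream does not buffer data automatically, whereas a channel uses the send and receive buffers provided by the operating system (a lower level)
A stream supports only blocking APIs, whereas a channel supports both blocking and non-blocking APIs, and network channels can be combined with a selector for multiplexing
Both are full-duplex, meaning reads and writes can proceed at the same time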