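I. NIO Buffer
- In NIO, all reads and writes go through a buffer
- A Buffer is essentially a wrapper around an array of bytes or of another primitive type
1. Key Buffer fields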
// Invariants: mark <= position <= limit <= capacity
private int mark = -1;
private int position = 0;
private int limit;
private int capacity;
// Used only by direct buffers
// NOTE: hoisted here for speed in JNI GetDirectBufferAddress
long address;
1.1 mark
Records the current position. Calling reset() later moves position back to the marked value.
/**
* Sets this buffer's mark at its position.
*
* @return This buffer
*/
public final Buffer mark() {
mark = position;
return this;
}
/**
* Resets this buffer's position to the previously-marked position.
*
* <p> Invoking this method neither changes nor discards the mark's
* value. </p>
*
* @return This buffer
*
* @throws InvalidMarkException
* If the mark has not been set
*/
public final Buffer reset() {
int m = mark;
if (m < 0)
throw new InvalidMarkException();
position = m;
return this;
}
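The mark/reset pair above can be exercised with a small example. This is a minimal sketch using only java.nio; the class name MarkResetDemo and its method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.InvalidMarkException;

final class MarkResetDemo {
    /** Reads two bytes, marks, reads two more, then resets; returns the position after reset(). */
    static int positionAfterReset() {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {10, 20, 30, 40, 50});
        buf.get();   // position 0 -> 1
        buf.get();   // position 1 -> 2
        buf.mark();  // mark = 2
        buf.get();   // position 2 -> 3
        buf.get();   // position 3 -> 4
        buf.reset(); // position = mark = 2
        return buf.position();
    }

    /** Returns true if reset() without a prior mark() throws InvalidMarkException. */
    static boolean resetWithoutMarkThrows() {
        ByteBuffer buf = ByteBuffer.allocate(4);
        try {
            buf.reset(); // mark is still -1 here
            return false;
        } catch (InvalidMarkException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(positionAfterReset());
        System.out.println(resetWithoutMarkThrows());
    }
}
```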
1.2 position
The index of the next element to be read or written
1.3 limit
- Marks the first index that must not be read or written
- In write mode, limit equals capacity
- In read mode (after flip()), limit is the index just past the last byte written
1.4 capacity
The maximum number of elements the buffer can hold
1.5 address
Used only by direct buffers; holds the address of the off-heap memory region
2. Key Buffer APIs
// Resets the indices so the buffer is ready for the next channel read (the data itself is not erased)
public final Buffer clear() {
position = 0;
limit = capacity;
mark = -1;
return this;
}
// Switches from writing to reading: limit is set to the current position, then position goes back to 0
public final Buffer flip() {
limit = position;
position = 0;
mark = -1;
return this;
}
// Unlike flip, rewind leaves limit unchanged
public final Buffer rewind() {
position = 0;
mark = -1;
return this;
}
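To see how the three methods above move position and limit, the following sketch records both values after each step. It uses only java.nio; the class name FlipClearRewindDemo is hypothetical.

```java
import java.nio.ByteBuffer;

final class FlipClearRewindDemo {
    /** Returns {position, limit} snapshots taken after write, flip, rewind, and clear. */
    static int[][] trace() {
        ByteBuffer buf = ByteBuffer.allocate(8); // position=0, limit=8, capacity=8
        int[][] out = new int[4][];

        buf.put((byte) 1).put((byte) 2).put((byte) 3);
        out[0] = new int[] {buf.position(), buf.limit()}; // after writing 3 bytes

        buf.flip(); // limit = old position, position = 0
        out[1] = new int[] {buf.position(), buf.limit()};

        buf.get();
        buf.get();
        buf.rewind(); // position = 0, limit untouched
        out[2] = new int[] {buf.position(), buf.limit()};

        buf.clear(); // position = 0, limit = capacity; the bytes stay in place
        out[3] = new int[] {buf.position(), buf.limit()};
        return out;
    }

    public static void main(String[] args) {
        for (int[] snapshot : trace()) {
            System.out.println("position=" + snapshot[0] + " limit=" + snapshot[1]);
        }
    }
}
```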
3. Key Buffer implementations
3.1 ByteBuffer
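- A heap ByteBuffer is backed by a byte array (the hb field below)
- slice shares content with the original Buffer: a change made through either buffer is visible in the other, because both sit on the same underlying array
- Both duplicate and slice are effectively shallow copies: new index state over the same content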
public abstract class ByteBuffer
extends Buffer
implements Comparable<ByteBuffer>
{
final byte[] hb; // Non-null only for heap buffers
final int offset;
boolean isReadOnly; // Valid only for heap buffers
// Creates a ByteBuffer backed by direct (off-heap) memory
public static ByteBuffer allocateDirect(int capacity) {
return new DirectByteBuffer(capacity);
}
// Creates a ByteBuffer backed by heap memory
public static ByteBuffer allocate(int capacity) {
if (capacity < 0)
throw new IllegalArgumentException();
return new HeapByteBuffer(capacity, capacity);
}
/**
* Creates a new byte buffer whose content is a shared subsequence of
* this buffer's content.
*
* <p> The content of the new buffer will start at this buffer's current
* position. Changes to this buffer's content will be visible in the new
* buffer, and vice versa; the two buffers' position, limit, and mark
* values will be independent.
*
* <p> The new buffer's position will be zero, its capacity and its limit
* will be the number of bytes remaining in this buffer, and its mark
* will be undefined. The new buffer will be direct if, and only if, this
* buffer is direct, and it will be read-only if, and only if, this buffer
* is read-only. </p>
*
* @return The new byte buffer
*/
public abstract ByteBuffer slice();
/**
* Creates a new byte buffer that shares this buffer's content.
*
* <p> The content of the new buffer will be that of this buffer. Changes
* to this buffer's content will be visible in the new buffer, and vice
* versa; the two buffers' position, limit, and mark values will be
* independent.
*
* <p> The new buffer's capacity, limit, position, and mark values will be
* identical to those of this buffer. The new buffer will be direct if,
* and only if, this buffer is direct, and it will be read-only if, and
* only if, this buffer is read-only. </p>
*
* @return The new byte buffer
*/
public abstract ByteBuffer duplicate();
/**
* Relative <i>get</i> method. Reads the byte at this buffer's
* current position, and then increments the position.
*/
public abstract byte get();
/**
* <p> Writes the given byte into this buffer at the current
* position, and then increments the position. </p>
*/
public abstract ByteBuffer put(byte b);
}
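The shared-content, independent-index behavior described in the slice and duplicate Javadoc above can be checked directly. A minimal sketch with java.nio only; SliceDuplicateDemo and its method names are illustrative.

```java
import java.nio.ByteBuffer;

final class SliceDuplicateDemo {
    /** Builds an 8-byte buffer holding 0..7 with position=2 and limit=6. */
    static ByteBuffer original() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        for (byte b = 0; b < 8; b++) {
            buf.put(b);
        }
        buf.position(2);
        buf.limit(6);
        return buf;
    }

    /** Returns {slice capacity, buf[2], dup[2], buf position, dup position}. */
    static int[] run() {
        ByteBuffer buf = original();
        ByteBuffer slice = buf.slice();   // covers buf[2..6): position=0, limit=capacity=4
        ByteBuffer dup = buf.duplicate(); // same position, limit, and capacity as buf

        slice.put(0, (byte) 99); // absolute put; index 0 of the slice is index 2 of buf
        dup.position(4);         // moves only the duplicate's position

        return new int[] {
            slice.capacity(), buf.get(2), dup.get(2), buf.position(), dup.position()
        };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```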
3.2 IntBuffer
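- IntBuffer is the int counterpart; a heap IntBuffer is backed by an int array
- Besides IntBuffer there are LongBuffer, FloatBuffer, and so on
- These are all abstract classes; the concrete behavior lives in the heap-based and direct-memory-based implementations below
3.3 DirectByteBuffer and HeapByteBuffer
- Both HeapByteBuffer and DirectByteBuffer work by manipulating the index fields inherited from Buffer
- HeapByteBuffer reads and writes array elements by index, while DirectByteBuffer uses Unsafe to read and write memory addresses directly
- The get method serves as the example below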
3.3.1 HeapByteBuffer.get
final byte[] hb; // Non-null only for heap buffers
protected int ix(int i) {
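// offset is 0 unless the buffer is a slice or wraps a sub-range of an array, so this normally resolves to the array element at index i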
return i + offset;
}
public byte get() {
return hb[ix(nextGetIndex())];
}
final int nextGetIndex() { // package-private
if (position >= limit)
throw new BufferUnderflowException();
return position++;
}
3.3.2 DirectByteBuffer.get
// Computes the native memory address of element i
private long ix(int i) {
return address + ((long)i << 0);
}
public byte get() {
// For background on Unsafe, see https://tech.meituan.com/2019/02/14/talk-about-java-magic-class-unsafe.html
return ((unsafe.getByte(ix(nextGetIndex()))));
}
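From the caller's side the two implementations look identical; only the backing storage differs. The sketch below allocates one buffer of each kind and runs the same put/flip/get sequence on both. HeapVsDirectDemo is a hypothetical name, and the note on hasArray() reflects OpenJDK behavior.

```java
import java.nio.ByteBuffer;

final class HeapVsDirectDemo {
    /** Returns {heap.isDirect, heap.hasArray, direct.isDirect}. */
    static boolean[] flags() {
        ByteBuffer heap = ByteBuffer.allocate(16);         // HeapByteBuffer, backed by a byte[]
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // DirectByteBuffer, off-heap memory
        return new boolean[] {heap.isDirect(), heap.hasArray(), direct.isDirect()};
    }

    /** Writes one byte, flips, and reads it back; works the same for heap and direct buffers. */
    static byte roundTrip(ByteBuffer buf) {
        buf.put((byte) 42);
        buf.flip();
        return buf.get();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(flags()));
        System.out.println(roundTrip(ByteBuffer.allocate(4)));
        System.out.println(roundTrip(ByteBuffer.allocateDirect(4)));
        // On OpenJDK a direct buffer reports no accessible backing array.
        System.out.println(ByteBuffer.allocateDirect(4).hasArray());
    }
}
```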
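II. Netty ByteBuf
1. Key ByteBuf fields
int readerIndex; // read index
int writerIndex; // write index
private int markedReaderIndex; // read index saved by markReaderIndex()
private int markedWriterIndex; // write index saved by markWriterIndex()
private int maxCapacity; // maximum capacity
- Separate read and write indices make the buffer more convenient to work with
- The diagram below shows how the indices relate to each other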
+-------------------+------------------+------------------+
| discardable bytes | readable bytes | writable bytes |
| | (CONTENT) | |
+-------------------+------------------+------------------+
| | | |
0 <= readerIndex <= writerIndex <= capacity
2. Key ByteBuf APIs
// Reads one byte at readerIndex and advances readerIndex by 1
@Override
public byte readByte() {
checkReadableBytes0(1);
int i = readerIndex;
byte b = _getByte(i);
readerIndex = i + 1;
return b;
}
// Reads 4 bytes starting at readerIndex as an int and advances readerIndex by 4
@Override
public int readInt() {
checkReadableBytes0(4);
int v = _getInt(readerIndex);
readerIndex += 4;
return v;
}
// Writes src.length bytes starting at writerIndex and advances writerIndex by the same amount
@Override
public ByteBuf writeBytes(byte[] src) {
writeBytes(src, 0, src.length);
return this;
}
// Returns the number of readable bytes
@Override
public int readableBytes() {
return writerIndex - readerIndex;
}
// Returns the number of writable bytes
@Override
public int writableBytes() {
return capacity() - writerIndex;
}
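The effect of keeping two indices can be illustrated without Netty on the classpath. The class below is a deliberately tiny stand-in, not Netty's implementation: TwoIndexBuffer and all of its members are hypothetical, and it omits capacity growth, marks, and reference counting. It only shows that reads and writes can interleave with no flip() in between.

```java
final class TwoIndexBuffer {
    private final byte[] data;
    private int readerIndex; // next index to read
    private int writerIndex; // next index to write

    TwoIndexBuffer(int capacity) {
        data = new byte[capacity];
    }

    /** Stores one byte at writerIndex and advances it. */
    TwoIndexBuffer writeByte(int value) {
        if (writerIndex >= data.length) {
            throw new IndexOutOfBoundsException("buffer full");
        }
        data[writerIndex++] = (byte) value;
        return this;
    }

    /** Returns the byte at readerIndex and advances it. */
    byte readByte() {
        if (readerIndex >= writerIndex) {
            throw new IndexOutOfBoundsException("nothing to read");
        }
        return data[readerIndex++];
    }

    int readableBytes() {
        return writerIndex - readerIndex;
    }

    int writableBytes() {
        return data.length - writerIndex;
    }

    public static void main(String[] args) {
        TwoIndexBuffer buf = new TwoIndexBuffer(4);
        buf.writeByte(1).writeByte(2);
        System.out.println(buf.readByte()); // read without flipping
        buf.writeByte(3);                   // keep writing after a read
        System.out.println(buf.readableBytes() + " readable, " + buf.writableBytes() + " writable");
    }
}
```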
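3. Key ByteBuf implementations
ByteBuf has many subclasses, which can be grouped roughly along three dimensions:
- Pooled and Unpooled: a pooled ByteBuf is carved out of a pre-allocated memory region as one contiguous chunk, much like a thread pool or connection pool; Netty provides both pooled and unpooled ByteBufs
- Unsafe and non-Unsafe: Unsafe is a low-level JDK class (sun.misc.Unsafe) that can obtain memory addresses and read or write memory at those addresses directly; the Unsafe variants use it for reads and writes
- Direct and Heap: Direct means off-heap memory, allocated through low-level JDK APIs outside the JVM heap and released explicitly; Heap means the buffer is allocated on the JVM heap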
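4. Zero-copy in ByteBuf
4.1 Zero-copy in the traditional sense
Zero-copy in the traditional sense happens at the operating-system level: it refers to techniques that avoid copying data back and forth between user space and kernel space
4.1.1 How data is read and written
- The kernel reads the data from disk into a kernel buffer
- The CPU copies the data from the kernel buffer into the application buffer
- To write the data out, the CPU copies it from the application buffer back into a kernel buffer
- The kernel buffer is then flushed to disk (or, for a socket, the data is copied from the kernel socket buffer to the NIC buffer)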
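4.1.2 Solution
- Java's FileChannel.transferTo avoids the two CPU copies through user space described above; on Linux it is typically implemented with the sendfile system call. Memory mapping (mmap, exposed in Java as FileChannel.map) is a separate technique with the same goal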
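A minimal sketch of FileChannel.transferTo follows. It copies between two temporary files so that it runs anywhere; whether the operating system's fast path is actually taken depends on the platform and on the channel types involved. TransferToDemo and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TransferToDemo {
    /** Transfers the whole source file to the target channel; returns the bytes transferred. */
    static long copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long done = 0;
            // transferTo may move fewer bytes than requested, so loop until finished.
            while (done < size) {
                long n = in.transferTo(done, size - done, out);
                if (n <= 0) {
                    break;
                }
                done += n;
            }
            return done;
        }
    }

    /** Writes text to a temp file, copies it with transferTo, and returns the copied content. */
    static String roundTrip(String text) {
        try {
            Path src = Files.createTempFile("transfer-src", ".txt");
            Path dst = Files.createTempFile("transfer-dst", ".txt");
            try {
                Files.write(src, text.getBytes(StandardCharsets.UTF_8));
                copy(src, dst);
                return new String(Files.readAllBytes(dst), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello zero-copy"));
    }
}
```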
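4.2 Zero-copy in Netty
- Netty's zero-copy lives entirely in user space; the term here mostly means avoiding unnecessary copies when manipulating data in user space
- It shows up mainly in the following forms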
4.2.1 CompositeByteBuf
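- A composite ByteBuf whose components can be a mix of direct and heap ByteBufs
- To merge two buffers with plain NIO, you allocate a new buffer of size header.size + body.size and copy both into it
- Netty's CompositeByteBuf combines the two ByteBufs logically: each component still points at its original memory, so no copy is made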
ByteBuf header = ...
ByteBuf body = ...
CompositeByteBuf compositeByteBuf = Unpooled.compositeBuffer();
compositeByteBuf.addComponents(true, header, body);
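For contrast, the copy-based merge that plain java.nio requires can be sketched as below; this is the copy that CompositeByteBuf avoids. CopyMergeDemo and its method names are made up for illustration.

```java
import java.nio.ByteBuffer;

final class CopyMergeDemo {
    /** Allocates a new buffer and copies the remaining bytes of header and body into it. */
    static ByteBuffer merge(ByteBuffer header, ByteBuffer body) {
        ByteBuffer merged = ByteBuffer.allocate(header.remaining() + body.remaining());
        merged.put(header.duplicate()); // duplicate so the caller's position is left alone
        merged.put(body.duplicate());
        merged.flip();                  // ready for reading
        return merged;
    }

    /** Returns true if changing the source array after the merge leaves the merged copy untouched. */
    static boolean mergedIsIndependentCopy() {
        byte[] headerBytes = {1, 2};
        ByteBuffer merged = merge(ByteBuffer.wrap(headerBytes), ByteBuffer.wrap(new byte[] {3, 4, 5}));
        headerBytes[0] = 99;
        return merged.get(0) == 1;
    }

    public static void main(String[] args) {
        ByteBuffer merged = merge(ByteBuffer.wrap(new byte[] {1, 2}), ByteBuffer.wrap(new byte[] {3, 4, 5}));
        System.out.println(merged.remaining());
        System.out.println(mergedIsIndependentCopy());
    }
}
```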
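4.2.2 Zero-copy through wrapping
To turn a byte array into a ByteBuf, the usual approach is to allocate a new buffer and copy the array into it. Netty's Unpooled.wrappedBuffer instead wraps the existing array directly: the ByteBuf shares the array, so no copy takes place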
byte[] bytes = ...
ByteBuf byteBuf = Unpooled.wrappedBuffer(bytes);
4.2.3 Zero-copy through slice
slice splits a ByteBuf into views that share the underlying byte array; no data is copied
ByteBuf byteBuf = ...
ByteBuf header = byteBuf.slice(0, 5);
ByteBuf body = byteBuf.slice(5, 10);
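5. Differences between the NIO and Netty buffers
- ByteBuf keeps separate read and write indices, whereas ByteBuffer has a single position; ByteBuf therefore needs no flip() and its API is simpler to use
- ByteBuf supports composite buffers (CompositeByteBuf), whereas a ByteBuffer is backed by a single array or memory region
- A ByteBuf grows automatically up to maxCapacity; a ByteBuffer's capacity is fixed
- ByteBuf supports pooling and reference counting, so a buffer is not reclaimed while it is still in use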