一、ServerBootstrap startup flow
- Logic diagram
- Core members
- A NioEventLoopGroup contains 1..n NioEventLoops
- NioEventLoop extends SingleThreadEventLoop: a single-threaded executor that wraps a Selector. It maintains several queues internally (taskQueue > scheduledTaskQueue > tailTasks); scheduled tasks in scheduledTaskQueue that become due are appended to the tail of taskQueue.
- Queue: NioEventLoop uses an MPSC queue (multi-producer, single-consumer); SingleThreadEventLoop defaults to a LinkedBlockingQueue
- Core method run(): this is where channel I/O events are polled and tasks are processed
- Although multiple threads are used, every Netty channel is bound to one thread and is only ever operated on by that fixed thread; this avoids cross-thread access, context switching, and the usual multithreading problems.
- Code execution flow
ServerBootstrap
1. group(bossGroup, workerGroup)
2. channel(NioServerSocketChannel.class)
//bound to the boss (server) channel
3. handler(new LoggingHandler(LogLevel.INFO))
//bound to the worker (child) channels
4. childHandler(new ChannelInitializer<SocketChannel>() {
@Override
//When is this triggered? After the channel is registered, inside pipeline.invokeHandlerAddedIfNeeded()
public void initChannel(SocketChannel ch) {
ch.pipeline().addLast(handler);
}
})
5. b.bind(PORT).sync()
//AbstractBootstrap
1. doBind(ObjectUtil.checkNotNull(localAddress, "localAddress"))
1. initAndRegister()
1. channelFactory.newChannel()
//ServerBootstrap
2. init(channel)
1. setChannelOptions(channel, newOptionsArray(), logger)
2. setAttributes(channel, newAttributesArray())
3. currentChildGroup/currentChildHandler/currentChildOptions/currentChildAttrs
//DefaultChannelPipeline
4. p.addLast(new ChannelInitializer<Channel>() {
@Override
public void initChannel(final Channel ch) {
final ChannelPipeline pipeline = ch.pipeline();
ChannelHandler handler = config.handler();
if (handler != null) {
pipeline.addLast(handler);
}
ch.eventLoop().execute(new Runnable() {
@Override
public void run() {
pipeline.addLast(new ServerBootstrapAcceptor(
ch, currentChildGroup, currentChildHandler, currentChildOptions, currentChildAttrs));
}
});
}
})
// p.addLast: the logic for adding a ChannelHandler to the pipeline
1. newContext(group, filterName(name, handler), handler)
2. addLast0(newCtx)
//if the channel is not yet registered, queue the callback first; once registration succeeds the handler is added to the pipeline
3. if !registered
1. newCtx.setAddPending()
2. callHandlerCallbackLater(newCtx, true)
1. new PendingHandlerAddedTask(ctx)
2. pendingHandlerCallbackHead = task
3. return
4. executor = newCtx.executor()
//if the handler is bound to an executor other than the current thread, submit the callback to that executor
5. if !executor.inEventLoop()
1. callHandlerAddedInEventLoop(newCtx, executor)
1. newCtx.setAddPending()
2. executor.execute(() ->
callHandlerAdded0(newCtx))
2. return
//add the ChannelHandler to the pipeline chain
6. callHandlerAdded0(newCtx)
//MultithreadEventLoopGroup: register the channel with a Selector
3. config().group().register(channel)
1. next().register(channel)
//SingleThreadEventLoop
1. register(final ChannelPromise promise)
//AbstractChannel
2. promise.channel().unsafe().register(this, promise)
//this is where the eventLoop thread is actually started
1. eventLoop.execute(() -> register0(promise))
//AbstractNioChannel
1. doRegister()
//register the Java channel with the Selector, with the Netty channel attached as the registration attachment
1. javaChannel().register(eventLoop().unwrappedSelector(), 0, this)
2. pipeline.invokeHandlerAddedIfNeeded()
1. callHandlerAddedForAllHandlers()
1. PendingHandlerCallback task = pendingHandlerCallbackHead
2. while (task != null)
3. task.execute()
1. callHandlerAdded0(ctx)
//AbstractChannelHandlerContext
1. ctx.callHandlerAdded()
//ChannelInitializer
1. handler().handlerAdded(this)
1. ctx.channel().isRegistered()
//the user-overridden method of ChannelInitializer
2. initChannel(ctx)
//actually runs the user-overridden initChannel() method
1. initChannel((C) ctx.channel())
//remove the outer ChannelInitializer that wrapped the handlers
3. removeState(ctx)
4. task = task.next
3. pipeline.fireChannelRegistered()
// after the channel is registered with the boss thread's Selector, bind the channel to the port
2. doBind0(regFuture, channel, localAddress, promise)
channel.eventLoop().execute(new Runnable() {
@Override
public void run() {
if (regFuture.isSuccess()) {
channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
} else {
promise.setFailure(regFuture.cause());
}
}
});
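To tie the trace above together, here is a minimal, runnable sketch of the bootstrap code being traced (the inline echo handler and the port are illustrative additions, not taken from the source above); calling bind() is what drives the initAndRegister() and doBind0() path just shown.
```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;

public final class EchoServer {
    public static void main(String[] args) throws Exception {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);   // accepts new connections
        EventLoopGroup workerGroup = new NioEventLoopGroup();  // handles I/O of accepted channels
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
             .channel(NioServerSocketChannel.class)
             .handler(new LoggingHandler(LogLevel.INFO))             // on the server (boss) channel
             .childHandler(new ChannelInitializer<SocketChannel>() { // on each accepted channel
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     // echo back whatever arrives; writeAndFlush takes over the ByteBuf's refCnt
                     ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                         @Override
                         public void channelRead(ChannelHandlerContext ctx, Object msg) {
                             ctx.writeAndFlush(msg);
                         }
                     });
                 }
             });
            ChannelFuture f = b.bind(8007).sync(); // initAndRegister() + doBind0() as traced above
            f.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}
```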
二、Event dispatch and task polling
NioEventLoop
run()
for (;;)
selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())
processSelectedKeys()
processSelectedKeysPlain(selector.selectedKeys())
selectedKeys.iterator()
i.next()
k.attachment()
i.remove()
a instanceof AbstractNioChannel
processSelectedKey(k, (AbstractNioChannel) a)
ch.unsafe()
k.readyOps()
(readyOps & SelectionKey.OP_CONNECT) != 0 //OP_CONNECT: connection established
k.interestOps()
ops &= ~SelectionKey.OP_CONNECT
k.interestOps(ops)
unsafe.finishConnect()
(readyOps & SelectionKey.OP_WRITE) != 0 //OP_WRITE: writable
ch.unsafe().forceFlush() //write event: flush the data buffered in ChannelOutboundBuffer to the socket
(readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0 //OP_READ / OP_ACCEPT: readable / acceptable
unsafe.read() //read event: read the socket data into a buffer sized by RecvByteBufAllocator, then fire the corresponding pipeline event chain
runAllTasks(ioTime * (100 - ioRatio) / ioRatio) //asynchronous tasks, i.e. tasks submitted via EventLoop.execute()
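To illustrate the task side of run(): plain tasks submitted from any thread land in taskQueue and are drained by runAllTasks(...), while scheduled tasks sit in scheduledTaskQueue until they become due. A hedged sketch (the handler name, message, and delay are made up for illustration):
```java
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import java.util.concurrent.TimeUnit;

public class TaskSubmissionHandler extends ChannelInboundHandlerAdapter {
    @Override
    public void channelActive(ChannelHandlerContext ctx) {
        // Plain task: appended to the EventLoop's taskQueue and run by runAllTasks(...)
        ctx.channel().eventLoop().execute(() ->
                System.out.println("runs on " + Thread.currentThread().getName()));

        // Scheduled task: parked in scheduledTaskQueue, moved to taskQueue once its deadline passes
        ctx.channel().eventLoop().schedule(
                () -> ctx.writeAndFlush(ctx.alloc().buffer().writeBytes("ping".getBytes())),
                5, TimeUnit.SECONDS);

        // All channel operations are marshalled onto the channel's own EventLoop:
        if (ctx.channel().eventLoop().inEventLoop()) {
            // already on the channel's thread, so pipeline callbacks run directly
        }
        ctx.fireChannelActive();
    }
}
```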
三、Netty network data read/write flow
1. RecvByteBufAllocator (sizes the read buffers) and ChannelOutboundBuffer (the write buffer; its high/low water marks provide backpressure by toggling whether the channel is writable, see the sketch at the end of this section). Both make use of object pooling.
2. Read: // unsafe.read() in NioEventLoop
the socket channel fires a readable event -----> network data is read into a buffer sized by RecvByteBufAllocator -----> pipeline.fireChannelRead(byteBuf) -----> pipeline.fireChannelReadComplete()
1. Accepting a new connection (boss thread)
```
AbstractNioMessageChannel.NioMessageUnsafe //the boss thread accepts new connections and registers them with worker threads
readBuf = new ArrayList<Object>()
read()
config()
pipeline()
unsafe().recvBufAllocHandle()
do {
int localRead = doReadMessages(readBuf);
buf.add(new NioSocketChannel(this, ch)) //NioServerSocketChannel
if (localRead == 0) {
break;
}
if (localRead < 0) {
closed = true;
break;
}
allocHandle.incMessagesRead(localRead);
} while (allocHandle.continueReading());
int size = readBuf.size();
for (int i = 0; i < size; i++) {
readPending = false;
pipeline.fireChannelRead(readBuf.get(i)); //eventually triggers ServerBootstrapAcceptor.channelRead
channelRead(ChannelHandlerContext ctx, Object msg) //ServerBootstrapAcceptor
childGroup.register(child).addListener(() -> {...}) //register the new connection with a worker thread's Selector
}
readBuf.clear();
allocHandle.readComplete();
pipeline.fireChannelReadComplete();
```
2. Reading data from an established channel (worker thread)
```
AbstractNioByteChannel.NioByteUnsafe //reads data after the connection has been established
read()
config()
pipeline()
config.getAllocator()
recvBufAllocHandle()
allocHandle.reset(config)
do {
byteBuf = allocHandle.allocate(allocator);
allocHandle.lastBytesRead(doReadBytes(byteBuf));
if (allocHandle.lastBytesRead() <= 0) {
// nothing was read. release the buffer.
byteBuf.release();
byteBuf = null;
close = allocHandle.lastBytesRead() < 0;
if (close) {
// There is nothing left to read as we received an EOF.
readPending = false;
}
break;
}
allocHandle.incMessagesRead(1);
readPending = false;
pipeline.fireChannelRead(byteBuf);
byteBuf = null;
} while (allocHandle.continueReading());
allocHandle.readComplete();
pipeline.fireChannelReadComplete();
```
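What the worker-side read loop above hands down the pipeline is the ByteBuf passed to pipeline.fireChannelRead. A hedged sketch of the receiving side (the handler name is illustrative); whichever handler consumes the buffer last is responsible for releasing it:
```java
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.util.ReferenceCountUtil;

public class InboundByteHandler extends ChannelInboundHandlerAdapter {
    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        ByteBuf buf = (ByteBuf) msg; // the buffer that was allocated via RecvByteBufAllocator
        try {
            byte[] bytes = new byte[buf.readableBytes()];
            buf.readBytes(bytes);
            // ... process bytes ...
        } finally {
            ReferenceCountUtil.release(msg); // the last handler must release pooled buffers
        }
    }

    @Override
    public void channelReadComplete(ChannelHandlerContext ctx) {
        ctx.flush(); // fired once per read loop, after the do/while above finishes
    }
}
```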
3. Write // ch.unsafe().forceFlush() in NioEventLoop
1. Triggered by a socket OP_WRITE event: the channel becomes writable -----> data previously queued in ChannelOutboundBuffer is written to the socket channel ------> doWriteBytes(ByteBuf buf) -----> buf.readBytes(javaChannel(), expectedWrittenBytes)
```
AbstractNioChannel
forceFlush()
super.flush0() // AbstractChannel
doWrite(outboundBuffer) // AbstractNioByteChannel
int writeSpinCount = config().getWriteSpinCount()
do {
Object msg = in.current();
if (msg == null) {
// Wrote all messages.
clearOpWrite();
// Directly return here so incompleteWrite(...) is not called.
return;
}
writeSpinCount -= doWriteInternal(in, msg)
doWriteBytes(buf)
buf.readBytes(javaChannel(), expectedWrittenBytes) //NioSocketChannel
} while (writeSpinCount > 0);
incompleteWrite(writeSpinCount < 0)
```
2. Triggered actively by a user flush (the socket does not have to wait for an OP_WRITE event; it can be written to directly as long as the underlying send buffer has room)
```
AbstractChannelHandlerContext
write(Object msg)
write(final Object msg, final ChannelPromise promise)
write(Object msg, boolean flush, ChannelPromise promise)
//if the current thread is the event loop thread, write directly
if (executor.inEventLoop())
if (flush)
next.invokeWriteAndFlush(m, promise)
next.invokeWrite(m, promise)
invokeWrite0(msg, promise)
((ChannelOutboundHandler) handler()).write(this, msg, promise)
unsafe.write(msg, promise) //HeadContext: writes into ChannelOutboundBuffer
write(Object msg, ChannelPromise promise) //AbstractChannel
filterOutboundMessage(msg)
pipeline.estimatorHandle().size(msg)
outboundBuffer.addMessage(msg, size, promise) //where the message finally lands; at this point it is only in ChannelOutboundBuffer, not yet in the socket
flush()
outboundBuffer.addFlush()
flush0() //only here is the data in outboundBuffer actually written to the socket send buffer
doWrite(outboundBuffer) // AbstractNioByteChannel
int writeSpinCount = config().getWriteSpinCount()
do {
Object msg = in.current()
if (msg == null) {
// Wrote all messages.
clearOpWrite();
// Directly return here so incompleteWrite(...) is not called.
return;
}
writeSpinCount -= doWriteInternal(in, msg)
doWriteBytes(buf)
buf.readBytes(javaChannel(), expectedWrittenBytes) //NioSocketChannel
} while (writeSpinCount > 0);
incompleteWrite(writeSpinCount < 0)
//not on the event loop thread: wrap the write in a WriteTask and execute it asynchronously
WriteTask.newInstance(next, m, promise, flush)
safeExecute(executor, task, promise, m, !flush)
if (lazy && executor instanceof AbstractEventExecutor)
((AbstractEventExecutor) executor).lazyExecute(runnable)
executor.execute(runnable)
```
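The backpressure mentioned in item 1 of this section is driven by ChannelOutboundBuffer's water marks: once pending outbound bytes exceed the high water mark, isWritable() turns false and channelWritabilityChanged is fired. A hedged configuration sketch (the 32 KB / 64 KB values and class names are illustrative):
```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;

public class BackpressureExample {
    public static void configure(ServerBootstrap b) {
        // Writability flips to false above 64 KB of pending outbound data
        // and back to true once it drains below 32 KB (example values).
        b.childOption(ChannelOption.WRITE_BUFFER_WATER_MARK,
                new WriteBufferWaterMark(32 * 1024, 64 * 1024));
    }

    // A handler that stops producing while the outbound buffer is over the high water mark.
    public static class RespectWritability extends ChannelInboundHandlerAdapter {
        @Override
        public void channelWritabilityChanged(ChannelHandlerContext ctx) {
            if (ctx.channel().isWritable()) {
                // resume writing, e.g. re-enable reads from the data source
            } else {
                // pause writing until ChannelOutboundBuffer drains below the low water mark
            }
            ctx.fireChannelWritabilityChanged();
        }
    }
}
```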
四、Netty codecs
1. inbound/outbound
inbound: HeadContext-->...-->TailContext
outbound: TailContext-->...--->HeadContext
Note: inbound/outbound is not defined by the direction of the I/O but by the source that triggers the event. Events triggered by a request the application itself initiates are outbound; events triggered from the outside are inbound.
inbound (the fireXXX family):
- channelActive / channelInactive
- channelRead
- channelReadComplete
- channelRegistered / channelUnregistered
- channelWritabilityChanged
- exceptionCaught
- userEventTriggered
outbound:
- bind
- close
- connect
- deregister
- disconnect
- flush
- read
- write
2. Decoders
1. A complete inbound read flow; our decoders sit after HeadContext and before the business handlers
NioEventLoop.run-->unsafe.read-->pipeline.fireChannelRead-->AbstractChannelHandlerContext.invokeChannelRead(head, msg)-->HeadContext.channelRead-->....自定义inboundHandler-->TailContext.channelRead
2. Netty's base decoder abstractions
1. ByteToMessageDecoder: turns raw bytes into an object
- FixedLengthFrameDecoder
- DelimiterBasedFrameDecoder
- LineBasedFrameDecoder
- LengthFieldBasedFrameDecoder
ByteToMessageDecoder extends ChannelInboundHandlerAdapter
channelRead(ChannelHandlerContext ctx, Object msg)
out = CodecOutputList.newInstance() //output list that buffers what is decoded in this pass; obtained from a thread-bound pool
msg instanceof ByteBuf
cumulator.cumulate(ctx.alloc(),first ? Unpooled.EMPTY_BUFFER : cumulation, (ByteBuf) msg)
callDecode(ctx, cumulation, out) //drives the decode loop; mostly validation logic around the actual decode call
while (in.isReadable()) {
int outSize = out.size(); //if there are leftover decoded messages, fire them down the pipeline first
if (outSize > 0) {
fireChannelRead(ctx, out, outSize);
out.clear();
if (ctx.isRemoved()) {
break;
}
outSize = 0;
}
int oldInputLength = in.readableBytes();
decodeRemovalReentryProtection(ctx, in, out); //decoder entry point, calls the decode method
decode(ctx, in, out) //implemented by the concrete decoder
if (ctx.isRemoved()) {
break;
}
if (outSize == out.size()) {
if (oldInputLength == in.readableBytes()) {
break;
} else {
continue;
}
}
if (oldInputLength == in.readableBytes()) {
throw new DecoderException(
StringUtil.simpleClassName(getClass()) +
".decode() did not read anything but decoded a message.");
}
if (isSingleDecode()) {
break;
}
finally
out.recycle() //decoding finished, recycle the output list
2. MessageToMessageDecoder: converts one object into another object; this decoder must sit after a ByteToMessageDecoder
3. Encoders
1. Write flow
- User-initiated write: ctx.flush-->ctx.handler.write-->next.ctx....-->HeadContext.write-->AbstractChannel.unsafe.flush0-->ChannelOutboundBuffer
- Event-triggered write: unsafe.forceFlush-->AbstractChannel.unsafe.flush0-->ChannelOutboundBuffer-->socket buffer
2. Base encoder abstractions
- MessageToMessageEncoder // encodes one object into another object, e.g. IntegerToStringEncoder
- MessageToByteEncoder //encodes an object into bytes, the final form on the wire; it runs after any MessageToMessageEncoder
write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise)
acceptOutboundMessage(msg)
I cast = (I) msg;
buf = allocateBuffer(ctx, cast, preferDirect)
encode(ctx, cast, buf) //subclasses implement the actual object-to-bytes encoding
ctx.write(buf, promise)
4. Combined codecs
ByteToMessageCodec extends ChannelDuplexHandler: the base abstraction that supports both decoding and encoding
Just override the corresponding encode/decode methods, as in the sketch below
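A hedged sketch of such a codec: a ByteToMessageCodec that encodes an Integer as four bytes and decodes it back (the class name is illustrative):
```java
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageCodec;
import java.util.List;

public class IntCodec extends ByteToMessageCodec<Integer> {
    @Override
    protected void encode(ChannelHandlerContext ctx, Integer msg, ByteBuf out) {
        out.writeInt(msg); // object -> bytes (outbound direction)
    }

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
        // bytes -> object (inbound direction); partial frames stay in the cumulation buffer
        if (in.readableBytes() >= 4) {
            out.add(in.readInt());
        }
    }
}
```
It would typically be installed ahead of the business handlers, e.g. ch.pipeline().addLast(new IntCodec(), businessHandler).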
五、FastThreadLocal
1. JDK ThreadLocal
The JDK implementation does not scale well when many ThreadLocals are in play:
- Thread.ThreadLocalMap stores entries in an Entry[] array. A ThreadLocal is located by threadLocal.hashCode() modulo the array length, and hash collisions are resolved by linear probing to the next free slot, so lookups degrade when there are many ThreadLocals.
- When the Entry[] has to grow, the JDK rehash re-hashes every element against the new array length and again linear-probes to the next free slot on collision.
2. Netty FastThreadLocal
- Netty implements FastThreadLocal, FastThreadLocalThread, and InternalThreadLocalMap itself; all three must be used together to get the benefit. Used alone, FastThreadLocal degrades to the JDK behaviour.
- InternalThreadLocalMap stores thread-local values in a pre-sized, pre-filled Object[] array.
Each FastThreadLocal obtains an index from an incrementing counter when it is created (well suited to long-lived threads) and uses that index directly as the subscript into InternalThreadLocalMap's Object[], giving constant-time access; the index never changes, even when the array grows. A usage sketch follows.
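A hedged usage sketch: when the thread is created through Netty's DefaultThreadFactory it is a FastThreadLocalThread, so FastThreadLocal.get() takes the indexed InternalThreadLocalMap fast path; on an ordinary Thread it falls back to a slower JDK-ThreadLocal-backed map.
```java
import io.netty.util.concurrent.DefaultThreadFactory;
import io.netty.util.concurrent.FastThreadLocal;

public class FastThreadLocalExample {
    // Each FastThreadLocal grabs a fixed index at construction time;
    // that index is used directly as the slot in InternalThreadLocalMap's Object[].
    private static final FastThreadLocal<StringBuilder> BUILDER = new FastThreadLocal<StringBuilder>() {
        @Override
        protected StringBuilder initialValue() {
            return new StringBuilder(256);
        }
    };

    public static void main(String[] args) {
        // DefaultThreadFactory creates FastThreadLocalThread instances.
        new DefaultThreadFactory("demo").newThread(() -> {
            StringBuilder sb = BUILDER.get(); // constant-time array access, no hashing or probing
            sb.append("hello");
            System.out.println(sb);
            BUILDER.remove();                 // clean up the slot when done
        }).start();
    }
}
```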
六、Memory pool (jemalloc4)
1. Logic diagram
2. Memory pool components: PoolArena, PoolChunk, PoolSubpage, SizeClasses
- SizeClasses: a utility class that defines the various size classes of memory blocks and assigns each one a size ID (sizeIdx); classes that are a whole multiple of the page size also get a page ID (pageIdx). From chunkSize and pageSize, SizeClasses generates a table (sizeIdx, isMultiPageSize, size) for fast lookup, which can also be viewed as two tables: (sizeIdx, size) and (pageIdx, size). Netty splits sizes into three categories: small, normal, and huge. With the defaults pageSize = 8K and chunkSize = 16M, small covers (0, 28K], normal covers (28K, 16M], and huge covers (16M, ∞).
- PoolArena: the entry point through which a thread allocates memory. A thread is bound to exactly one PoolArena; threads are assigned to arenas round-robin when they first allocate. One PoolArena may serve N threads, but each thread uses only one arena, which limits how many threads contend on any single arena. A PoolArena is organized as several PoolChunkLists (each a collection of PoolChunks) keyed by usage: qInit(0-25) -> q000(1-50) -> q025(25-75) -> q050(50-100) -> q075(75-100) -> q100(100-∞); the usage ranges overlap so chunks do not bounce back and forth between lists.
- PoolChunk consists of multiple pages. chunkSize defaults to 16M and pageSize to 8K, so a PoolChunk is made up of 2048 pages by default.
- PoolSubpage is the unit for small-sized blocks and is kept in linked lists, one list per small size class. A PoolSubpage's size is the least common multiple of the small size it serves and pageSize, i.e. it is aligned to both pageSize and the element size; internally a long[] bitmap tracks which elements are in use.
3. A memory pool is, in essence, the management of one large array; the key is to reduce fragmentation and to allocate and free efficiently and safely.
4. Defining the size classes: SizeClasses
- SizeClasses(int pageSize, int pageShifts, int chunkSize, int directMemoryCacheAlignment)
this.pageSize = pageSize;
this.pageShifts = pageShifts;
this.chunkSize = chunkSize;
this.directMemoryCacheAlignment = directMemoryCacheAlignment;
int group = log2(chunkSize) + 1 - LOG2_QUANTUM; //number of size-class groups
//generate the size-class table
//[index, log2Group, log2Delta, nDelta, isMultiPageSize, isSubPage, log2DeltaLookup]
sizeClasses = new short[group << LOG2_SIZE_CLASS_GROUP][7];
nSizes = sizeClasses(); //number of size classes; also populates the sizeClasses table
int normalMaxSize = -1;
int index = 0;
int size = 0;
int log2Group = LOG2_QUANTUM; //log2Group of the first group is fixed at 4
int log2Delta = LOG2_QUANTUM; //log2Delta of the first group is fixed at 4
//ndeltaLimit: how many sizes per group, fixed at 4
int ndeltaLimit = 1 << LOG2_SIZE_CLASS_GROUP; // 4
//First small group, nDelta start at 0.
//first size class is 1 << LOG2_QUANTUM
//the first size is 1 << 4 = 16
int nDelta = 0;
// the first group is special-cased: nDelta starts at 0 and the first size class is 16B
while (nDelta < ndeltaLimit) {
//sizeClass() computes one concrete size class; the size follows the formula:
//size = (1 << log2Group) + (1 << log2Delta) * nDelta
//it also fills in isSubpage/isMultiPageSize/nPSizes/nSubpages/smallMaxSizeIdx/lookupMaxSize etc.
size = sizeClass(index++, log2Group, log2Delta, nDelta++);
}
log2Group += LOG2_SIZE_CLASS_GROUP; //log2Group of the second group starts at 6
//All remaining groups, nDelta start at 1.
//for the remaining groups nDelta starts at 1
while (size < chunkSize) {
nDelta = 1;
while (nDelta <= ndeltaLimit && size < chunkSize) {
size = sizeClass(index++, log2Group, log2Delta, nDelta++);
normalMaxSize = size;
}
//from the third group onward, log2Group and log2Delta each increase by 1 per group
log2Group++;
log2Delta++;
}
//generate lookup table
sizeIdx2sizeTab = new int[nSizes]; //lookup table: sizeIdx -> size
pageIdx2sizeTab = new int[nPSizes]; //lookup table: pageIdx -> size, only for isMultiPageSize blocks
idx2SizeTab(sizeIdx2sizeTab, pageIdx2sizeTab);
size2idxTab = new int[lookupMaxSize >> LOG2_QUANTUM]; //lookup table: size -> sizeIdx
size2idxTab(size2idxTab);
- The table generated by SizeClasses (size is in bytes, usize is the same size in readable units):
| index | log2Group | log2Delta | nDelta | isMultiPageSize | isSubPage | log2DeltaLookup | size | usize |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 4 | 4 | 0 | 0 | 1 | 4 | 16 | |
| 1 | 4 | 4 | 1 | 0 | 1 | 4 | 32 | |
| 2 | 4 | 4 | 2 | 0 | 1 | 4 | 48 | |
| 3 | 4 | 4 | 3 | 0 | 1 | 4 | 64 | |
| 4 | 6 | 4 | 1 | 0 | 1 | 4 | 80 | |
| 5 | 6 | 4 | 2 | 0 | 1 | 4 | 96 | |
| 6 | 6 | 4 | 3 | 0 | 1 | 4 | 112 | |
| 7 | 6 | 4 | 4 | 0 | 1 | 4 | 128 | |
| 8 | 7 | 5 | 1 | 0 | 1 | 5 | 160 | |
| 9 | 7 | 5 | 2 | 0 | 1 | 5 | 192 | |
| 10 | 7 | 5 | 3 | 0 | 1 | 5 | 224 | |
| 11 | 7 | 5 | 4 | 0 | 1 | 5 | 256 | |
| 12 | 8 | 6 | 1 | 0 | 1 | 6 | 320 | |
| 13 | 8 | 6 | 2 | 0 | 1 | 6 | 384 | |
| 14 | 8 | 6 | 3 | 0 | 1 | 6 | 448 | |
| 15 | 8 | 6 | 4 | 0 | 1 | 6 | 512 | |
| 16 | 9 | 7 | 1 | 0 | 1 | 7 | 640 | |
| 17 | 9 | 7 | 2 | 0 | 1 | 7 | 768 | |
| 18 | 9 | 7 | 3 | 0 | 1 | 7 | 896 | |
| 19 | 9 | 7 | 4 | 0 | 1 | 7 | 1024 | 1K |
| 20 | 10 | 8 | 1 | 0 | 1 | 8 | 1280 | 1.25K |
| 21 | 10 | 8 | 2 | 0 | 1 | 8 | 1536 | 1.5K |
| 22 | 10 | 8 | 3 | 0 | 1 | 8 | 1792 | 1.75K |
| 23 | 10 | 8 | 4 | 0 | 1 | 8 | 2048 | 2K |
| 24 | 11 | 9 | 1 | 0 | 1 | 9 | 2560 | 2.5K |
| 25 | 11 | 9 | 2 | 0 | 1 | 9 | 3072 | 3K |
| 26 | 11 | 9 | 3 | 0 | 1 | 9 | 3584 | 3.5K |
| 27 | 11 | 9 | 4 | 0 | 1 | 9 | 4096 | 4K |
| 28 | 12 | 10 | 1 | 0 | 1 | 0 | 5120 | 5K |
| 29 | 12 | 10 | 2 | 0 | 1 | 0 | 6144 | 6K |
| 30 | 12 | 10 | 3 | 0 | 1 | 0 | 7168 | 7K |
| 31 | 12 | 10 | 4 | 1 | 1 | 0 | 8192 | 8K |
| 32 | 13 | 11 | 1 | 0 | 1 | 0 | 10240 | 10K |
| 33 | 13 | 11 | 2 | 0 | 1 | 0 | 12288 | 12K |
| 34 | 13 | 11 | 3 | 0 | 1 | 0 | 14336 | 14K |
| 35 | 13 | 11 | 4 | 1 | 1 | 0 | 16384 | 16K |
| 36 | 14 | 12 | 1 | 0 | 1 | 0 | 20480 | 20K |
| 37 | 14 | 12 | 2 | 1 | 1 | 0 | 24576 | 24K |
| 38 | 14 | 12 | 3 | 0 | 1 | 0 | 28672 | 28K |
| 39 | 14 | 12 | 4 | 1 | 0 | 0 | 32768 | 32K |
| 40 | 15 | 13 | 1 | 1 | 0 | 0 | 40960 | 40K |
| 41 | 15 | 13 | 2 | 1 | 0 | 0 | 49152 | 48K |
| 42 | 15 | 13 | 3 | 1 | 0 | 0 | 57344 | 56K |
| 43 | 15 | 13 | 4 | 1 | 0 | 0 | 65536 | 64K |
| 44 | 16 | 14 | 1 | 1 | 0 | 0 | 81920 | 80K |
| 45 | 16 | 14 | 2 | 1 | 0 | 0 | 98304 | 96K |
| 46 | 16 | 14 | 3 | 1 | 0 | 0 | 114688 | 112K |
| 47 | 16 | 14 | 4 | 1 | 0 | 0 | 131072 | 128K |
| 48 | 17 | 15 | 1 | 1 | 0 | 0 | 163840 | 160K |
| 49 | 17 | 15 | 2 | 1 | 0 | 0 | 196608 | 192K |
| 50 | 17 | 15 | 3 | 1 | 0 | 0 | 229376 | 224K |
| 51 | 17 | 15 | 4 | 1 | 0 | 0 | 262144 | 256K |
| 52 | 18 | 16 | 1 | 1 | 0 | 0 | 327680 | 320K |
| 53 | 18 | 16 | 2 | 1 | 0 | 0 | 393216 | 384K |
| 54 | 18 | 16 | 3 | 1 | 0 | 0 | 458752 | 448K |
| 55 | 18 | 16 | 4 | 1 | 0 | 0 | 524288 | 512K |
| 56 | 19 | 17 | 1 | 1 | 0 | 0 | 655360 | 640K |
| 57 | 19 | 17 | 2 | 1 | 0 | 0 | 786432 | 768K |
| 58 | 19 | 17 | 3 | 1 | 0 | 0 | 917504 | 896K |
| 59 | 19 | 17 | 4 | 1 | 0 | 0 | 1048576 | 1M |
| 60 | 20 | 18 | 1 | 1 | 0 | 0 | 1310720 | 1.25M |
| 61 | 20 | 18 | 2 | 1 | 0 | 0 | 1572864 | 1.5M |
| 62 | 20 | 18 | 3 | 1 | 0 | 0 | 1835008 | 1.75M |
| 63 | 20 | 18 | 4 | 1 | 0 | 0 | 2097152 | 2M |
| 64 | 21 | 19 | 1 | 1 | 0 | 0 | 2621440 | 2.5M |
| 65 | 21 | 19 | 2 | 1 | 0 | 0 | 3145728 | 3M |
| 66 | 21 | 19 | 3 | 1 | 0 | 0 | 3670016 | 3.5M |
| 67 | 21 | 19 | 4 | 1 | 0 | 0 | 4194304 | 4M |
| 68 | 22 | 20 | 1 | 1 | 0 | 0 | 5242880 | 5M |
| 69 | 22 | 20 | 2 | 1 | 0 | 0 | 6291456 | 6M |
| 70 | 22 | 20 | 3 | 1 | 0 | 0 | 7340032 | 7M |
| 71 | 22 | 20 | 4 | 1 | 0 | 0 | 8388608 | 8M |
| 72 | 23 | 21 | 1 | 1 | 0 | 0 | 10485760 | 10M |
| 73 | 23 | 21 | 2 | 1 | 0 | 0 | 12582912 | 12M |
| 74 | 23 | 21 | 3 | 1 | 0 | 0 | 14680064 | 14M |
| 75 | 23 | 21 | 4 | 1 | 0 | 0 | 16777216 | 16M |
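As a quick sanity check of the table, plugging a few rows into the formula size = (1 << log2Group) + (1 << log2Delta) * nDelta reproduces the size column (a small verification sketch, not Netty code):
```java
public class SizeClassFormula {
    // size = (1 << log2Group) + nDelta * (1 << log2Delta)
    static int size(int log2Group, int log2Delta, int nDelta) {
        return (1 << log2Group) + nDelta * (1 << log2Delta);
    }

    public static void main(String[] args) {
        System.out.println(size(4, 4, 0));    // index 0  -> 16
        System.out.println(size(6, 4, 3));    // index 6  -> 112
        System.out.println(size(13, 11, 2));  // index 33 -> 12288 (12K)
        System.out.println(size(23, 21, 4));  // index 75 -> 16777216 (16M = chunkSize)
    }
}
```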
5. Memory allocation
- PoolArena //initialization
numSmallSubpagePools = nSubpages;
smallSubpagePools = newSubpagePoolArray(numSmallSubpagePools); //create a head node for each small size class's PoolSubpage list, the arena-wide entry point; the array index is the sizeIdx (i.e. the index into sizeIdx2sizeTab).
for (int i = 0; i < smallSubpagePools.length; i ++) {
smallSubpagePools[i] = newSubpagePoolHead();
}
q100 = new PoolChunkList<T>(this, null, 100, Integer.MAX_VALUE, chunkSize);
q075 = new PoolChunkList<T>(this, q100, 75, 100, chunkSize);
q050 = new PoolChunkList<T>(this, q075, 50, 100, chunkSize);
q025 = new PoolChunkList<T>(this, q050, 25, 75, chunkSize);
q000 = new PoolChunkList<T>(this, q025, 1, 50, chunkSize);
qInit = new PoolChunkList<T>(this, q000, Integer.MIN_VALUE, 25, chunkSize);
q100.prevList(q075);
q075.prevList(q050);
q050.prevList(q025);
q025.prevList(q000);
q000.prevList(null);
qInit.prevList(qInit);
- PoolArena.allocate(PoolThreadCache cache, int reqCapacity, int maxCapacity) //allocation entry point (a user-level usage sketch follows at the end of this allocation section)
newByteBuf(maxCapacity) //obtain a PooledByteBuf from the object pool
allocate(cache, buf, reqCapacity) //allocate the memory
sizeIdx = size2SizeIdx(reqCapacity) //size normalization: map the requested capacity to one of the sizeIdx entries in SizeClasses
if sizeIdx <= smallMaxSizeIdx //small allocation (0, 28K]
tcacheAllocateSmall(cache, buf, reqCapacity, sizeIdx)
else if sizeIdx < nSizes //normal allocation (28K, 16M]
tcacheAllocateNormal(cache, buf, reqCapacity, sizeIdx)
else //anything larger than chunkSize (16M) is huge and bypasses the pool
allocateHuge(buf, normCapacity) //(16M, ∞)
- Small allocation flow
tcacheAllocateSmall(cache, buf, reqCapacity, sizeIdx)
if cache.allocateSmall(this, buf, reqCapacity, sizeIdx) //if the thread-local cache holds a block of this size class, serve it from the cache
return
head = smallSubpagePools[sizeIdx]
s = head.next
needsNormalAllocation = s == head
if !needsNormalAllocation //no normal allocation needed: smallSubpagePools already has a usable subpage
handle = s.allocate() //allocate from the PoolSubpage
if numAvail == 0 || !doNotDestroy //is the current PoolSubpage still usable?
return -1
bitmapIdx = getNextAvail() //scan the bitmap array for a free bit
findNextAvail()
findNextAvail0(i, bits)
final int maxNumElems = this.maxNumElems;
//the bit index at which long i starts
final int baseVal = i << 6;
//iterate from the lowest bit; a set bit means that slot is already allocated
for (int j = 0; j < 64; j ++) {
//(bits & 1) == 0: check whether the lowest bit is 0 (free); if so return val.
if ((bits & 1) == 0) {
//val = (i << 6) | j, i.e. i * 64 + j: the index of this bit within the whole bitmap,
//i.e. baseVal + j
int val = baseVal | j;
if (val < maxNumElems) {
return val;
} else {
break;
}
}
//shift right by one bit and examine the next bit
bits >>>= 1;
}
q = bitmapIdx >>> 6 //index of the long within the bitmap array
r = bitmapIdx & 63 //position of bitmapIdx within that long
bitmap[q] |= 1L << r //mark bit (1 << r) of that long as used
if -- numAvail == 0
removeFromPool() //a PoolSubpage with no free space left is removed from the arena's subpage list (the counterpart of addToPool(head))
toHandle(bitmapIdx) //build the 64-bit long handle that identifies the occupied block
pages = runSize >> pageShifts
(long) runOffset << RUN_OFFSET_SHIFT
| (long) pages << SIZE_SHIFT
| 1L << IS_USED_SHIFT
| 1L << IS_SUBPAGE_SHIFT
| bitmapIdx
//meaning of the 64-bit handle: 15 bits (runOffset, the page offset) + 15 bits (number of pages occupied) + 1 bit (isUsed) + 1 bit (isSubpage) + 32 bits (bitmapIdx, when it is a subpage); see the decoding sketch after this allocation flow
//splitting the 16M chunk into 8K pages gives 2048 pages, so a PoolChunk is effectively an array of 2048 elements of 8K each. runOffset in the handle is the array index and pages is how many elements are occupied, so offset plus pages pins down exactly where the allocated memory starts and ends.
s.chunk.initBufWithSubpage(buf, null, handle, reqCapacity, cache)
if needsNormalAllocation
allocateNormal(buf, reqCapacity, sizeIdx, cache) //no usable PoolSubpage: do a normal allocation to create a PoolSubpage of this size class
incSmallAllocation() //bump the small-allocation counter
- Normal allocation flow
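The decoding sketch referenced above: given the 15 + 15 + 1 + 1 + 32 bit layout just described, the handle fields can be unpacked as follows (the shift constants are assumed to mirror PoolChunk's, derived from that layout):
```java
public class HandleLayout {
    // Bit layout described above: runOffset(15) | pages(15) | isUsed(1) | isSubpage(1) | bitmapIdx(32).
    static final int IS_SUBPAGE_SHIFT = 32;
    static final int IS_USED_SHIFT    = 33;
    static final int SIZE_SHIFT       = 34;
    static final int RUN_OFFSET_SHIFT = 49;

    static int runOffset(long handle)     { return (int) (handle >>> RUN_OFFSET_SHIFT); }
    static int runPages(long handle)      { return (int) (handle >>> SIZE_SHIFT & 0x7fff); }
    static boolean isUsed(long handle)    { return (handle >>> IS_USED_SHIFT & 1) == 1; }
    static boolean isSubpage(long handle) { return (handle >>> IS_SUBPAGE_SHIFT & 1) == 1; }
    static int bitmapIdx(long handle)     { return (int) handle; }

    public static void main(String[] args) {
        // A subpage handle for a run starting at page 3, spanning 1 page, bitmapIdx 5:
        long handle = (long) 3 << RUN_OFFSET_SHIFT
                    | (long) 1 << SIZE_SHIFT
                    | 1L << IS_USED_SHIFT
                    | 1L << IS_SUBPAGE_SHIFT
                    | 5L;
        System.out.println(runOffset(handle) + " " + runPages(handle) + " "
                + isUsed(handle) + " " + isSubpage(handle) + " " + bitmapIdx(handle));
        // prints: 3 1 true true 5
    }
}
```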
tcacheAllocateNormal(cache, buf, reqCapacity, sizeIdx)
1. if (cache.allocateNormal(this, buf, reqCapacity, sizeIdx)) //try the thread-local cache first
return
2. allocateNormal(buf, reqCapacity, sizeIdx, cache)
1. if qInit|q000|q025.... //try to allocate from the PoolChunkList chain
return
2. c = newChunk(pageSize, nPSizes, pageShifts, chunkSize) //when neither the thread cache nor the chunk lists can serve the request, create a new chunk and add it to the chunk lists
1. new PoolChunk(...)
1. unpooled = false //false means this chunk belongs to the pool
2. this.arena = arena //owning arena
3. this.memory = memory //the backing memory (HeapByteBuffer/DirectByteBuffer)
4. this.pageSize = pageSize //page size: 8192B = 8K
5. this.pageShifts = pageShifts //13, since 2^13 = 8192
this.chunkSize = chunkSize //default 16M = 16*1024*1024B = 16,777,216B
6. this.offset = offset
7. freeBytes = chunkSize
//maxPageIdx = 40. runsAvail is the array PoolChunk uses to manage free memory; its length equals the number of page-size classes.
//Each slot of the array is associated with a queue whose elements are free runs spanning the same number of pages;
//the page count of the runs in a queue corresponds to the slot's index.
//So when part of a run is used, the remainder moves to a different slot of the array.
8. runsAvail = newRunsAvailqueueArray(maxPageIdx)
9. runsAvailMap = new LongLongHashMap(-1)
10. subpages = new PoolSubpage[chunkSize >> pageShifts] //16,777,216 >> 13 = 2048
//insert initial run, offset = 0, pages = chunkSize / pageSize 16,777,216 / 8192=2048
11. int pages = chunkSize >> pageShifts
12. long initHandle = (long) pages << SIZE_SHIFT //initially offset = 0, pages = 2048
13. insertAvailRun(0, pages, initHandle)
14. cachedNioBuffers = new ArrayDeque<ByteBuffer>(8)
3. c.allocate(buf, reqCapacity, sizeIdx, threadCache)
1. if sizeIdx <= arena.smallMaxSizeIdx //small (PoolSubpage) allocation
1. handle = allocateSubpage(sizeIdx)
1. head = arena.findSubpagePoolHead(sizeIdx) //get the head node of this size class's PoolSubpage list
2. runSize = calculateRunSize(sizeIdx) //the run size is the least common multiple of pageSize and elemSize
3. runHandle = allocateRun(runSize) //allocate the run
4. runOffset = runOffset(runHandle) //offset of the run within the PoolChunk
5. elemSize = arena.sizeIdx2size(sizeIdx)
6. subpage = new PoolSubpage<T>(head, this, pageShifts, runOffset,
runSize(pageShifts, runHandle), elemSize) //initialize the PoolSubpage
7. subpages[runOffset] = subpage
8. subpage.allocate() //allocate one element from the PoolSubpage
2. else
1. runSize = arena.sizeIdx2size(sizeIdx)
2. handle = allocateRun(runSize)
1. pages = runSize >> pageShifts
2. pageIdx = arena.pages2pageIdx(pages)
3. queueIdx = runFirstBestFit(pageIdx) //first-fit: find the first queue in runsAvail that can satisfy the request
4. LongPriorityQueue queue = runsAvail[queueIdx]
5. handle = queue.poll()
6. removeAvailRun(queue, handle)
7. if handle != -1
1. handle = splitLargeRun(handle, pages) //split the run by pages and put the surplus back into runsAvail
8. freeBytes -= runSize(pageShifts, handle) //remaining free bytes
3. nioBuffer = cachedNioBuffers != null? cachedNioBuffers.pollLast() : null
4. initBuf(buf, nioBuffer, handle, reqCapacity, cache) //initialize buf with the allocated block; there is no memory copy, buf's indices are simply pointed at the region the handle describes
4. qInit.add(c) //add the new chunk to the arena's qInit chunk list
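The usage sketch referenced at the start of this allocation section: from the application's point of view the whole path above sits behind PooledByteBufAllocator. The capacities below are illustrative picks that, with the default pageSize/chunkSize, fall into the small, normal, and huge categories; every pooled buffer must be released so the memory can return to the thread cache or the arena.
```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

public class PooledAllocationExample {
    public static void main(String[] args) {
        PooledByteBufAllocator alloc = PooledByteBufAllocator.DEFAULT;

        ByteBuf small  = alloc.directBuffer(2 * 1024);         // 2K  -> small (served by a PoolSubpage)
        ByteBuf normal = alloc.directBuffer(64 * 1024);        // 64K -> normal (a run of pages in a PoolChunk)
        ByteBuf huge   = alloc.directBuffer(32 * 1024 * 1024); // 32M -> huge (unpooled, bypasses the arena)

        // ... use the buffers ...

        small.release();   // goes back to the thread-local cache or the arena
        normal.release();
        huge.release();    // huge buffers are simply freed
    }
}
```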
6. Memory release
- PoolArena.free(PoolChunk<T> chunk, ByteBuffer nioBuffer, long handle, int normCapacity, PoolThreadCache cache) //release entry point
1. sizeClass = sizeClass(handle)
2. if cache != null && cache.add(this, chunk, nioBuffer, handle, normCapacity, sizeClass) return //first try to put the block into the thread-local cache
3. freeChunk(chunk, handle, normCapacity, sizeClass, nioBuffer, false)
1. chunk.parent.free(chunk, handle, normCapacity, nioBuffer)
1. chunk.free(handle, normCapacity, nioBuffer) //return small/normal memory to the chunk
2. if chunk.freeBytes > freeMaxThreshold //free memory exceeds the list's maximum threshold
1. remove(chunk) //remove the PoolChunk from the current chunk list
2. move0(chunk) //move the PoolChunk to the previous chunk list
1. prevList.move(chunk)
- Returning small memory
chunk.free(handle, normCapacity, nioBuffer)
1. if isSubpage(handle)
1. sizeIdx = arena.size2SizeIdx(normCapacity)
2. head = arena.findSubpagePoolHead(sizeIdx)
3. sIdx = runOffset(handle)
4. subpage = subpages[sIdx]
5. if subpage.free(head, bitmapIdx(handle)) return //entry point for returning memory to a subpage
1. q = bitmapIdx >>> 6
2. r = bitmapIdx & 63
3. bitmap[q] ^= 1L << r
4. setNextAvail(bitmapIdx)
5. if numAvail ++ == 0
1. addToPool(head) //add back to the subpage list
6. subpages[sIdx] = null //subpage no longer in use: remove it from subpages
//then fall through to the normal release logic below
- Returning normal memory
chunk.free(handle, normCapacity, nioBuffer)
1. if isSubpage(handle) //handled by the small release logic above
...
2. pages = runPages(handle)
3. finalRun = collapseRuns(handle) //merge the free runs before and after this offset into one larger run
collapseNext(collapsePast(handle))
4. finalRun &= ~(1L << IS_USED_SHIFT) //mark as unused
5. finalRun &= ~(1L << IS_SUBPAGE_SHIFT) //mark as not a subpage
6. insertAvailRun(runOffset(finalRun), runPages(finalRun), finalRun) //put the freed run back into runsAvail
7. freeBytes += pages << pageShifts //update the free byte count
8. cachedNioBuffers.offer(nioBuffer) //cache the nioBuffer for reuse
七、Object pool (Recycler)
1. Logic diagram
2. Core members
- FastThreadLocal<Stack<T>>: each thread is bound to its own local Stack
- FastThreadLocal<Map<Stack<?>, WeakOrderQueue>>: each thread is bound to a Map used to store and return objects that belong to another thread's Stack
- Stack: where the object pool actually keeps objects; internally a DefaultHandle[] array plus a linked list of WeakOrderQueues used to reclaim objects that drifted to other threads
- WeakOrderQueue: stores data in a chain of Links
- Handle: default implementation DefaultHandle; ties together the stack, the value, and the place where the object is recycled
- Link: stores data in a DefaultHandle[]
3. Obtaining an object
Recycler.get()
1. Stack<T> stack = threadLocal.get() //the Stack bound to the current thread
2. DefaultHandle<T> handle = stack.pop()
1. size = this.size
2. if size == 0
1. scavenge() //pull objects from the Stack's WeakOrderQueue chain back into the Stack
3. size --
4. DefaultHandle ret = elements[size]
5. elements[size] = null
6. this.size = size
7. if ret.lastRecycledId != ret.recycleId
throw new IllegalStateException("recycled multiple times")
8. ret.recycleId = 0
9. ret.lastRecycledId = 0
3. if handle == null
1. handle = stack.newHandle()
1. new DefaultHandle<T>(stack)
2. handle.value = newObject(handle) //create the object and wrap it with the DefaultHandle
4. return (T) handle.value
4. Returning an object
DefaultHandle.recycle(Object object) //the DefaultHandle bound to object
1. stack = this.stack //the stack associated with this handle
2. stack.push(this)
1. currentThread = Thread.currentThread()
2. if threadRef.get() == currentThread //the current thread owns the stack
1. pushNow(item)
1. if item.recycleId != 0 || !item.compareAndSetLastRecycledId(0, OWN_THREAD_ID)
throw new IllegalStateException("recycled already")
2. item.recycleId = OWN_THREAD_ID
3. size = this.size
4. if size >= maxCapacity || dropHandle(item) //drop the object if the stack is full or it fails the recycle sampling (handleRecycleCount < interval)
return
5. if size == elements.length
1. elements = Arrays.copyOf(elements, min(size << 1, maxCapacity)) //grow the array
6. elements[size] = item
7. this.size = size + 1
3. else //the current thread does not own the stack: store the object cross-thread via FastThreadLocal<Map<Stack<?>, WeakOrderQueue>>
1. pushLater(item, currentThread)
1. Map<Stack<?>, WeakOrderQueue> delayedRecycled = DELAYED_RECYCLED.get()
2. WeakOrderQueue queue = delayedRecycled.get(this) //get the WeakOrderQueue for this stack
3. if queue == null
1. if delayedRecycled.size() >= maxDelayedQueues //too many delayed queues already: map this stack to the dummy queue
1. delayedRecycled.put(this, WeakOrderQueue.DUMMY)
2. return
2. if (queue = newWeakOrderQueue(thread)) == null //the core is newWeakOrderQueue: record the foreign Stack and its WeakOrderQueue in FastThreadLocal<Map<Stack<?>, WeakOrderQueue>>
1. return
3. delayedRecycled.put(this, queue)
4. else if queue == WeakOrderQueue.DUMMY
1. return
5. queue.add(item)
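A hedged sketch of using Recycler directly, mirroring the get/recycle flow above (the pooled class is made up): newObject(...) receives the Handle, and the object returns itself to the pool via handle.recycle(this).
```java
import io.netty.util.Recycler;

public class RecyclerExample {
    static final class PooledObject {
        private final Recycler.Handle<PooledObject> handle;
        String payload;

        private PooledObject(Recycler.Handle<PooledObject> handle) {
            this.handle = handle;
        }

        void recycle() {
            payload = null;          // clear state before returning to the pool
            handle.recycle(this);    // Stack.push / pushLater, as traced above
        }
    }

    private static final Recycler<PooledObject> RECYCLER = new Recycler<PooledObject>() {
        @Override
        protected PooledObject newObject(Handle<PooledObject> handle) {
            return new PooledObject(handle); // only called when the Stack has nothing to hand out
        }
    };

    public static void main(String[] args) {
        PooledObject obj = RECYCLER.get(); // Stack.pop, may trigger scavenge()
        obj.payload = "hello";
        obj.recycle();
        // usually true on the same thread (subject to the recycle sampling interval):
        System.out.println(RECYCLER.get() == obj);
    }
}
```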
Source: netty (some test cases have been disabled; the project can be imported into IDEA and run directly)