Preface
Continuing from the previous article: Hollow's in-memory implementation and its encoding logic are closely intertwined. I originally planned to cover both in a single article, but length limits forced a split into two. The previous article, 【Netflix Hollow系列】深入Hollow的内存实现以及ByteBuffer的应用, covered Hollow's memory-pool implementation and its use of ByteBuffer in detail. This article continues with Hollow's data-encoding logic. Encoding further optimizes memory usage, reducing the footprint and improving access speed, which is the ultimate performance goal. @空歌白石
BlobByteBuffer
In the previous article 【Netflix Hollow系列】深入Hollow的内存实现以及ByteBuffer的应用, we saw that the encoded types rely heavily on BlobByteBuffer for all in-memory data storage; it is the lowest-level implementation of the SHARED_MEMORY_LAZY memory mode, and can be regarded as Hollow's wrapper around the JDK's ByteBuffer.
BlobByteBuffer acts as a bridge between large files such as blobs and MappedByteBuffer, because a single MappedByteBuffer can only map an int-sized region (at most 2^31 - 1 bytes).
Note: JDK 14 introduces an improved (incubating) API for accessing foreign memory, intended to replace uses of MappedByteBuffer.
BlobByteBuffer is not thread safe, but sharing the underlying byte buffers for parallel reads is safe. The largest supported blob size is roughly 2^61 bytes (~2 exabytes); other limits in Hollow, or practical limits, are likely to be reached before this one.
Constructor and fields
public static final int MAX_SINGLE_BUFFER_CAPACITY = 1 << 30; // largest, positive power-of-two int
private final ByteBuffer[] spine; // array of MappedByteBuffers
private final long capacity; // in bytes
private final int shift;
private final int mask;
private long position; // within index 0 to capacity-1 in the underlying ByteBuffer
private BlobByteBuffer(long capacity, int shift, int mask, ByteBuffer[] spine) {
this(capacity, shift, mask, spine, 0);
}
private BlobByteBuffer(long capacity, int shift, int mask, ByteBuffer[] spine, long position) {
if (!spine[0].order().equals(ByteOrder.BIG_ENDIAN)) {
throw new UnsupportedOperationException("Little endian memory layout is not supported");
}
this.capacity = capacity;
this.shift = shift;
this.mask = mask;
this.position = position;
// The following assignment is purposefully placed *after* the population of all segments (this method is called
// after mmap). The final assignment after the initialization of the array of MappedByteBuffers guarantees that
// no thread will see any of the array elements before assignment.
this.spine = spine;
}
ByteOrder.BIG_ENDIAN
Hollow's ByteBuffer only supports ByteOrder.BIG_ENDIAN, which is also Java's default byte order.
ByteOrder.nativeOrder() returns the byte order of the hardware the JVM runs on; using the hardware's native byte order can make buffer operations more efficient.
Storage structure
The figure below shows BlobByteBuffer's two-level storage structure, composed of the ByteBuffer[] spine and the byte[] contents of each buffer.
mmapBlob
The mmapBlob method uses MappedByteBuffer to read a File via memory mapping.
public static BlobByteBuffer mmapBlob(FileChannel channel, int singleBufferCapacity) throws IOException {
long size = channel.size();
if (size == 0) {
throw new IllegalStateException("File to be mmap-ed has no data");
}
if ((singleBufferCapacity & (singleBufferCapacity - 1)) != 0) { // should be a power of 2
throw new IllegalArgumentException("singleBufferCapacity must be a power of 2");
}
// split into bufferCount buffers whose int capacity is a power of 2
final int bufferCapacity = size > (long) singleBufferCapacity
? singleBufferCapacity
// 空歌白石: highestOneBit returns the largest power of two less than or equal to its argument.
: Integer.highestOneBit((int) size);
long bufferCount = size % bufferCapacity == 0
? size / (long)bufferCapacity
: (size / (long)bufferCapacity) + 1;
if (bufferCount > Integer.MAX_VALUE)
throw new IllegalArgumentException("file too large; size=" + size);
// 空歌白石: compute the number of bits occupied (log2)
int shift = 31 - Integer.numberOfLeadingZeros(bufferCapacity); // log2
// 空歌白石: compute the mask, 2^shift - 1
int mask = (1 << shift) - 1;
// 空歌白石: allocate the MappedByteBuffer spine according to bufferCount.
ByteBuffer[] spine = new MappedByteBuffer[(int)bufferCount];
for (int i = 0; i < bufferCount; i++) {
long pos = (long)i * bufferCapacity;
int cap = i == (bufferCount - 1)
? (int)(size - pos)
: bufferCapacity;
// 空歌白石: map cap bytes starting at pos into a ByteBuffer.
ByteBuffer buffer = channel.map(READ_ONLY, pos, cap);
/*
* if (!((MappedByteBuffer) buffer).isLoaded()) // TODO(timt): make pre-fetching configurable
* ((MappedByteBuffer) buffer).load();
*/
// 空歌白石: store each segment's buffer into its slot in the spine.
spine[i] = buffer;
}
return new BlobByteBuffer(size, shift, mask, spine);
}
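To make the segment arithmetic concrete, here is a small standalone sketch (the blob size and index values are hypothetical, and the class name is mine) of how mmapBlob's bufferCount, shift, and mask decompose a global byte index into a spine index and an offset:

```java
public class MmapMathDemo {
    static final int MAX_SINGLE_BUFFER_CAPACITY = 1 << 30;

    // number of fixed-size segments needed to cover `size` bytes
    static long bufferCount(long size, int bufferCapacity) {
        return size % bufferCapacity == 0 ? size / bufferCapacity
                                          : size / bufferCapacity + 1;
    }

    public static void main(String[] args) {
        long size = 3_000_000_000L;                  // hypothetical ~2.8 GiB blob
        int bufferCapacity = MAX_SINGLE_BUFFER_CAPACITY;
        int shift = 31 - Integer.numberOfLeadingZeros(bufferCapacity); // log2 = 30
        long mask = (1L << shift) - 1;
        System.out.println(bufferCount(size, bufferCapacity)); // 3 segments
        // a global byte index decomposes into (spine index, offset in segment)
        long index = 2_500_000_000L;
        System.out.println(index >>> shift); // segment 2
        System.out.println(index & mask);    // offset 352516352
    }
}
```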
numberOfLeadingZeros
numberOfLeadingZeros returns the number of zero bits preceding the highest-order one bit of an int, the sign bit included; if i is negative the sign bit is 1, so the method returns 0. For example, 10 in binary is 00000000 00000000 00000000 00001010, and a Java int is 32 bits long, so the method returns 28.
highestOneBit
Integer.highestOneBit returns an int with only the highest one bit of its argument set; for a positive input this is the largest power of two less than or equal to the number.
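The two JDK methods can be checked directly; a minimal sketch:

```java
public class BitOpsDemo {
    public static void main(String[] args) {
        // 10 = 0b1010: a 32-bit int has 28 zero bits before its highest one bit
        System.out.println(Integer.numberOfLeadingZeros(10)); // 28
        // the highest one bit of 10 is 8, the largest power of two <= 10
        System.out.println(Integer.highestOneBit(10)); // 8
        // combining the two, as mmapBlob does, yields log2 of a power-of-two capacity
        System.out.println(31 - Integer.numberOfLeadingZeros(1 << 30)); // 30
    }
}
```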
get
BlobByteBuffer provides two get methods, getByte and getLong, which can be regarded as the entry points for reading the ByteBuffer.
Given the two-level storage structure described above, getByte is easy to follow.
/**
* Reads the byte at the given index.
* @param index byte index (from offset 0 in the backing BlobByteBuffer) at which to read byte value
* @return byte at the given index
* @throws IndexOutOfBoundsException if index out of bounds of the backing buffer
*/
public byte getByte(long index) throws BufferUnderflowException {
if (index < capacity) {
int spineIndex = (int)(index >>> (shift));
int bufferIndex = (int)(index & mask);
return spine[spineIndex].get(bufferIndex);
}
else {
assert(index < capacity + Long.BYTES);
// this situation occurs when read for bits near the end of the buffer requires reading a long value that
// extends past the buffer capacity by up to Long.BYTES bytes. To handle this case,
// return 0 for (index >= capacity - Long.BYTES && index < capacity )
// these zero bytes will be discarded anyway when the returned long value is shifted to get the queried bits
return (byte) 0;
}
}
Now for the getLong method, which deserves a closer look. getLong returns the long value starting at the given byte index. It depends only on its startByteIndex argument and no mutable state, so it is thread safe.
bigEndian computes, given big-endian byte order, the position in the buffer corresponding to a given byte index. Java NIO DirectByteBuffers default to ByteOrder.BIG_ENDIAN, and big-endianness is validated in BlobByteBuffer's constructor, as mentioned earlier.
/**
* Return the long value starting from given byte index. This method is thread safe.
* @param startByteIndex byte index (from offset 0 in the backing BlobByteBuffer) at which to start reading long value
* @return long value
*/
public long getLong(long startByteIndex) throws BufferUnderflowException {
// 空歌白石: offset of startByteIndex within its 8-byte aligned word
int alignmentOffset = (int)(startByteIndex - this.position()) % Long.BYTES;
// 空歌白石: boundary of this long read, i.e. the next 8-byte aligned position
long nextAlignedPos = startByteIndex - alignmentOffset + Long.BYTES;
// 空歌白石: Long.BYTES is always 8
byte[] bytes = new byte[Long.BYTES];
for (int i = 0; i < Long.BYTES; i ++ ) {
bytes[i] = getByte(bigEndian(startByteIndex + i, nextAlignedPos));
}
// 空歌白石: shift the 8 byte values into a single long.
return ((((long) (bytes[7] )) << 56) |
(((long) (bytes[6] & 0xff)) << 48) |
(((long) (bytes[5] & 0xff)) << 40) |
(((long) (bytes[4] & 0xff)) << 32) |
(((long) (bytes[3] & 0xff)) << 24) |
(((long) (bytes[2] & 0xff)) << 16) |
(((long) (bytes[1] & 0xff)) << 8) |
(((long) (bytes[0] & 0xff)) ));
}
/**
* Given big-endian byte order, returns the position into the buffer for a given byte index. Java nio DirectByteBuffers
* are by default big-endian. Big-endianness is validated in the constructor.
* @param index byte index
* @param boundary index of the next 8-byte aligned byte
* @return position in buffer
*/
private long bigEndian(long index, long boundary) {
long result;
if (index < boundary) {
result = (boundary - Long.BYTES) + (boundary - index) - 1;
} else {
result = boundary + (boundary + Long.BYTES - index) - 1;
}
return result;
}
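To see what bigEndian actually computes, the sketch below copies its arithmetic into a standalone class (the class name is mine) and prints the mapping for the first aligned 8-byte word: logical indices 0..7 land on physical positions 7..0, i.e. the byte order within each word is reversed:

```java
public class BigEndianIndexDemo {
    // Mirrors BlobByteBuffer.bigEndian: maps a logical byte index to the
    // physical position inside its 8-byte aligned word.
    static long bigEndian(long index, long boundary) {
        if (index < boundary) {
            return (boundary - Long.BYTES) + (boundary - index) - 1;
        }
        return boundary + (boundary + Long.BYTES - index) - 1;
    }

    public static void main(String[] args) {
        long boundary = 8; // the next 8-byte aligned position after index 0
        for (long i = 0; i < 8; i++) {
            // indices 0..7 map to positions 7..0: bytes flipped within the word
            System.out.println(i + " -> " + bigEndian(i, boundary));
        }
    }
}
```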
VarInt
VarInt implements variable-length integer encoding and decoding, solving the space wasted by fixed-width storage when an integer's absolute value is small.
write
Variable-length writes come in three forms: writing null, writing an int, and writing a long. The data can be written into a ByteDataArray's byte array, or directly into an OutputStream or a byte[].
How does VarInt write data? The core idea is to switch integers from fixed-width to variable-width storage: if a value fits in 1 byte, never spend 2 or more on it.
A concrete example: the int value 7 is 00000000 00000000 00000000 00000111 in binary. An int is 32 bits, i.e. 4 bytes, and here the top 3 bytes are pure waste. If encoding shrinks 7 from 4 bytes down to 1, the memory footprint drops by 75%, which is considerable.
The principle is not complicated: split the int into 7-bit groups and use the highest bit of each byte as a flag that indicates whether the next byte still belongs to this value. A 1 means more bytes of the current value follow; a 0 marks the final byte of the value.
Original value
00000000 00000000 00000011 10111011
Split into 7-bit groups
0000 0000000 0000000 0000111 0111011
VarInt encoding
10000111 00111011
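As a standalone sketch (the class name and helpers are mine, not Hollow's API), the scheme can be implemented and round-tripped like this; it follows the same high-group-first layout as Hollow's VarInt.writeVInt:

```java
import java.io.ByteArrayOutputStream;

public class VarIntSketch {
    // Hollow-style varint: high-order 7-bit groups first; the high bit of
    // every byte except the last is set as a continuation flag.
    static byte[] writeVInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value > 0x0FFFFFFF || value < 0) out.write(0x80 | (value >>> 28));
        if (value > 0x1FFFFF || value < 0) out.write(0x80 | ((value >>> 21) & 0x7F));
        if (value > 0x3FFF || value < 0) out.write(0x80 | ((value >>> 14) & 0x7F));
        if (value > 0x7F || value < 0) out.write(0x80 | ((value >>> 7) & 0x7F));
        out.write(value & 0x7F);
        return out.toByteArray();
    }

    static int readVInt(byte[] data) {
        int value = data[0] & 0x7F;
        int pos = 0;
        while ((data[pos] & 0x80) != 0) {
            value = (value << 7) | (data[++pos] & 0x7F);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] encoded = writeVInt(955); // 0b11_1011_1011
        // two bytes: 10000111 00111011
        System.out.printf("%8s %8s%n",
                Integer.toBinaryString(encoded[0] & 0xFF),
                Integer.toBinaryString(encoded[1] & 0xFF));
        System.out.println(readVInt(encoded)); // 955
    }
}
```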
Some hexadecimal constants appear in the source below; this table gives their binary and decimal equivalents.
| Hex | Binary | Decimal |
|---|---|---|
| 0xFFFFFFFFFFFFFF | 00000000 11111111 11111111 11111111 11111111 11111111 11111111 11111111 | 72057594037927935 |
| 0x1FFFFFFFFFFFF | 00000000 00000001 11111111 11111111 11111111 11111111 11111111 11111111 | 562949953421311 |
| 0x3FFFFFFFFFF | 00000000 00000000 00000011 11111111 11111111 11111111 11111111 11111111 | 4398046511103 |
| 0x7FFFFFFFF | 00000000 00000000 00000000 00000111 11111111 11111111 11111111 11111111 | 34359738367 |
| 0x0FFFFFFF (= 0xFFFFFFF) | 00001111 11111111 11111111 11111111 | 268435455 |
| 0x1FFFFF | 00000000 00011111 11111111 11111111 | 2097151 |
| 0x3FFF | 00000000 00000000 00111111 11111111 | 16383 |
| 0x7F | 00000000 00000000 00000000 01111111 | 127 |
| 0x80 | 00000000 00000000 00000000 10000000 | 128 |
| 0x4000 | 00000000 00000000 01000000 00000000 | 16384 |
| 0x200000 | 00000000 00100000 00000000 00000000 | 2097152 |
| 0x10000000 | 00010000 00000000 00000000 00000000 | 268435456 |
| 0x800000000 | 00000000 00000000 00000000 00001000 00000000 00000000 00000000 00000000 | 34359738368 |
| 0x40000000000 | 00000000 00000000 00000100 00000000 00000000 00000000 00000000 00000000 | 4398046511104 |
| 0x2000000000000 | 00000000 00000010 00000000 00000000 00000000 00000000 00000000 00000000 | 562949953421312 |
| 0x100000000000000 | 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 | 72057594037927936 |
Online base-conversion tool: tool.oschina.net/hexconvert/
write variable null
Writes a variable-length NULL into the provided ByteDataArray. Hollow uses 0x80 to represent null in the byte array, occupying a single byte. 0x80 is 128 in decimal, 1000 0000 in binary.
public static void writeVNull(ByteDataArray buf) {
buf.write((byte)0x80);
return;
}
write variable int
writeVInt has three overloads, writing an int value into a ByteDataArray, an OutputStream, or a byte[] at a given starting position. The byte[] overload returns the position just past the written data, so that subsequent writes can continue from there.
public static void writeVInt(ByteDataArray buf, int value) {
if(value > 0x0FFFFFFF || value < 0) buf.write((byte)(0x80 | ((value >>> 28))));
if(value > 0x1FFFFF || value < 0) buf.write((byte)(0x80 | ((value >>> 21) & 0x7F)));
if(value > 0x3FFF || value < 0) buf.write((byte)(0x80 | ((value >>> 14) & 0x7F)));
if(value > 0x7F || value < 0) buf.write((byte)(0x80 | ((value >>> 7) & 0x7F)));
buf.write((byte)(value & 0x7F));
}
public static void writeVInt(OutputStream out, int value) throws IOException {
if(value > 0x0FFFFFFF || value < 0) out.write((byte)(0x80 | ((value >>> 28))));
if(value > 0x1FFFFF || value < 0) out.write((byte)(0x80 | ((value >>> 21) & 0x7F)));
if(value > 0x3FFF || value < 0) out.write((byte)(0x80 | ((value >>> 14) & 0x7F)));
if(value > 0x7F || value < 0) out.write((byte)(0x80 | ((value >>> 7) & 0x7F)));
out.write((byte)(value & 0x7F));
}
public static int writeVInt(byte data[], int pos, int value) {
if(value > 0x0FFFFFFF || value < 0) data[pos++] = ((byte)(0x80 | ((value >>> 28))));
if(value > 0x1FFFFF || value < 0) data[pos++] = ((byte)(0x80 | ((value >>> 21) & 0x7F)));
if(value > 0x3FFF || value < 0) data[pos++] = ((byte)(0x80 | ((value >>> 14) & 0x7F)));
if(value > 0x7F || value < 0) data[pos++] = ((byte)(0x80 | ((value >>> 7) & 0x7F)));
data[pos++] = (byte)(value & 0x7F);
return pos;
}
write variable long
writeVLong has two overloads, writing a long value into either a ByteDataArray or an OutputStream; unlike writeVInt, no byte[] overload is provided.
public static void writeVLong(ByteDataArray buf, long value) {
if(value < 0) buf.write((byte)0x81);
if(value > 0xFFFFFFFFFFFFFFL || value < 0) buf.write((byte)(0x80 | ((value >>> 56) & 0x7FL)));
if(value > 0x1FFFFFFFFFFFFL || value < 0) buf.write((byte)(0x80 | ((value >>> 49) & 0x7FL)));
if(value > 0x3FFFFFFFFFFL || value < 0) buf.write((byte)(0x80 | ((value >>> 42) & 0x7FL)));
if(value > 0x7FFFFFFFFL || value < 0) buf.write((byte)(0x80 | ((value >>> 35) & 0x7FL)));
if(value > 0xFFFFFFFL || value < 0) buf.write((byte)(0x80 | ((value >>> 28) & 0x7FL)));
if(value > 0x1FFFFFL || value < 0) buf.write((byte)(0x80 | ((value >>> 21) & 0x7FL)));
if(value > 0x3FFFL || value < 0) buf.write((byte)(0x80 | ((value >>> 14) & 0x7FL)));
if(value > 0x7FL || value < 0) buf.write((byte)(0x80 | ((value >>> 7) & 0x7FL)));
buf.write((byte)(value & 0x7FL));
}
public static void writeVLong(OutputStream out, long value) throws IOException {
if(value < 0) out.write((byte)0x81);
if(value > 0xFFFFFFFFFFFFFFL || value < 0) out.write((byte)(0x80 | ((value >>> 56) & 0x7FL)));
if(value > 0x1FFFFFFFFFFFFL || value < 0) out.write((byte)(0x80 | ((value >>> 49) & 0x7FL)));
if(value > 0x3FFFFFFFFFFL || value < 0) out.write((byte)(0x80 | ((value >>> 42) & 0x7FL)));
if(value > 0x7FFFFFFFFL || value < 0) out.write((byte)(0x80 | ((value >>> 35) & 0x7FL)));
if(value > 0xFFFFFFFL || value < 0) out.write((byte)(0x80 | ((value >>> 28) & 0x7FL)));
if(value > 0x1FFFFFL || value < 0) out.write((byte)(0x80 | ((value >>> 21) & 0x7FL)));
if(value > 0x3FFFL || value < 0) out.write((byte)(0x80 | ((value >>> 14) & 0x7FL)));
if(value > 0x7FL || value < 0) out.write((byte)(0x80 | ((value >>> 7) & 0x7FL)));
out.write((byte)(value & 0x7FL));
}
read
read is the inverse of write. Besides the byte[] or InputStream holding the data, read also needs the concrete position to start from.
read variable null
Reads the byte at the given position and checks whether it equals 0x80. The return value is a boolean indicating whether the value is null, not null itself.
public static boolean readVNull(ByteData arr, long position) {
return arr.get(position) == (byte)0x80;
}
read variable int
readVInt has three overloads: one reads the value directly at a given position in a ByteData's byte array; the other two read bytes from an InputStream or a HollowBlobInput via their read methods.
public static int readVInt(ByteData arr, long position) {
byte b = arr.get(position++);
// 空歌白石: check whether the byte read is the null marker; nulls are not allowed inside a varint.
if(b == (byte) 0x80)
throw new RuntimeException("Attempting to read null value as int");
int value = b & 0x7F;
while ((b & 0x80) != 0) {
b = arr.get(position++);
// 空歌白石: shift left by 7 bits
value <<= 7;
value |= (b & 0x7F);
}
return value;
}
public static int readVInt(InputStream in) throws IOException {
byte b = readByteSafely(in);
if(b == (byte) 0x80)
throw new RuntimeException("Attempting to read null value as int");
int value = b & 0x7F;
while ((b & 0x80) != 0) {
b = readByteSafely(in);
value <<= 7;
value |= (b & 0x7F);
}
return value;
}
public static int readVInt(HollowBlobInput in) throws IOException {
byte b = readByteSafely(in);
if(b == (byte) 0x80)
throw new RuntimeException("Attempting to read null value as int");
int value = b & 0x7F;
while ((b & 0x80) != 0) {
b = readByteSafely(in);
value <<= 7;
value |= (b & 0x7F);
}
return value;
}
read variable long
readVLong has three overloads: one reads the long directly at a given position in a ByteData's byte array; the other two read bytes from an InputStream or a HollowBlobInput via their read methods.
public static long readVLong(ByteData arr, long position) {
byte b = arr.get(position++);
if(b == (byte) 0x80)
throw new RuntimeException("Attempting to read null value as long");
long value = b & 0x7F;
while ((b & 0x80) != 0) {
b = arr.get(position++);
value <<= 7;
value |= (b & 0x7F);
}
return value;
}
public static long readVLong(InputStream in) throws IOException {
byte b = readByteSafely(in);
if(b == (byte) 0x80)
throw new RuntimeException("Attempting to read null value as long");
long value = b & 0x7F;
while ((b & 0x80) != 0) {
b = readByteSafely(in);
value <<= 7;
value |= (b & 0x7F);
}
return value;
}
public static long readVLong(HollowBlobInput in) throws IOException {
byte b = readByteSafely(in);
if (b == (byte) 0x80)
throw new RuntimeException("Attempting to read null value as long");
long value = b & 0x7F;
while ((b & 0x80) != 0) {
b = readByteSafely(in);
value <<= 7;
value |= (b & 0x7F);
}
return value;
}
readByteSafely
InputStream's read method reads the next byte of data from the input stream. The byte value is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, -1 is returned. The method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
public static byte readByteSafely(InputStream is) throws IOException {
int i = is.read();
if (i == -1) {
throw new EOFException("Unexpected end of VarInt record");
}
return (byte)i;
}
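The semantics readByteSafely relies on can be demonstrated with a ByteArrayInputStream (the demo class is mine):

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadByteDemo {
    // Mirrors VarInt.readByteSafely: narrow read()'s int result back to a byte,
    // failing loudly on end-of-stream instead of returning -1.
    static byte readByteSafely(InputStream is) throws IOException {
        int i = is.read();
        if (i == -1) {
            throw new EOFException("Unexpected end of VarInt record");
        }
        return (byte) i;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[]{(byte) 0x80});
        byte b = readByteSafely(in);   // read() reported 128, an unsigned int
        System.out.println(b);         // -128 once narrowed back to a signed byte
        System.out.println(in.read()); // -1: end of stream reached
    }
}
```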
HollowBlobInput's read dispatches on the type of the underlying input, which is either a RandomAccessFile or a DataInputStream.
DataInputStream is a concrete implementation of InputStream, so its read behaves exactly as described above. RandomAccessFile reads and writes against a file: its read returns one byte from the file as an integer in the range 0 to 255 (0x00-0xff), blocking if no input is yet available. Although RandomAccessFile is not a subclass of InputStream, this method behaves in exactly the same way as InputStream.read().
public static byte readByteSafely(HollowBlobInput in) throws IOException {
int i = in.read();
if (i == -1) {
throw new EOFException("Unexpected end of VarInt record");
}
return (byte)i;
}
next
Determines the size, in bytes, of the variable-length long in the provided ByteData, starting from the specified position.
public static int nextVLongSize(ByteData arr, long position) {
byte b = arr.get(position++);
// 空歌白石: a null occupies only one byte.
if(b == (byte) 0x80)
return 1;
int length = 1;
while((b & 0x80) != 0) {
b = arr.get(position++);
length++;
}
return length;
}
size & count
The two sizeOf overloads compute the size, in bytes, of an int or a long value once it is encoded as a variable-length integer.
public static int sizeOfVInt(int value) {
if(value < 0)
return 5;
if(value < 0x80)
return 1;
if(value < 0x4000)
return 2;
if(value < 0x200000)
return 3;
if(value < 0x10000000)
return 4;
return 5;
}
public static int sizeOfVLong(long value) {
if(value < 0L)
return 10;
if(value < 0x80L)
return 1;
if(value < 0x4000L)
return 2;
if(value < 0x200000L)
return 3;
if(value < 0x10000000L)
return 4;
if(value < 0x800000000L)
return 5;
if(value < 0x40000000000L)
return 6;
if(value < 0x2000000000000L)
return 7;
if(value < 0x100000000000000L)
return 8;
return 9;
}
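A quick check of the int variant (logic copied from the listing above into a standalone class):

```java
public class VarIntSizeDemo {
    // Size in bytes of an int encoded as a variable-length integer,
    // as in VarInt.sizeOfVInt: 7 payload bits per byte.
    static int sizeOfVInt(int value) {
        if (value < 0) return 5;
        if (value < 0x80) return 1;
        if (value < 0x4000) return 2;
        if (value < 0x200000) return 3;
        if (value < 0x10000000) return 4;
        return 5;
    }

    public static void main(String[] args) {
        System.out.println(sizeOfVInt(7));   // 1 byte
        System.out.println(sizeOfVInt(300)); // 2 bytes
        System.out.println(sizeOfVInt(-1));  // 5 bytes: the sign bit forces all groups
    }
}
```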
countVarIntsInRange counts how many variable-length integers are encoded in the provided ByteData within the specified range.
public static int countVarIntsInRange(ByteData byteData, long fieldPosition, int length) {
int numInts = 0;
boolean insideInt = false;
for(int i=0;i<length;i++) {
byte b = byteData.get(fieldPosition + i);
if((b & 0x80) == 0) {
numInts++;
insideInt = false;
} else if(!insideInt && b == (byte)0x80) {
numInts++;
} else {
insideInt = true;
}
}
return numInts;
}
The size and count methods are typically used to compute exactly how many bytes a field occupies, so that segments can be sized correctly when they are initialized or requested from the recycler.
IOUtils
IOUtils is a typical consumer of VarInt; we use it to illustrate further how VarInt is applied.
package com.netflix.hollow.core.util;
import com.netflix.hollow.core.memory.encoding.VarInt;
import com.netflix.hollow.core.read.HollowBlobInput;
import java.io.DataOutputStream;
import java.io.IOException;
public class IOUtils {
public static void copyBytes(HollowBlobInput in, DataOutputStream[] os, long numBytes) throws IOException {
byte buf[] = new byte[4096];
while(numBytes > 0) {
int numBytesToRead = 4096;
if(numBytes < 4096)
numBytesToRead = (int)numBytes;
int bytesRead = in.read(buf, 0, numBytesToRead);
for(int i=0;i<os.length;i++) {
os[i].write(buf, 0, bytesRead);
}
numBytes -= bytesRead;
}
}
public static void copySegmentedLongArray(HollowBlobInput in, DataOutputStream[] os) throws IOException {
long numLongsToWrite = VarInt.readVLong(in);
for(int i=0;i<os.length;i++)
VarInt.writeVLong(os[i], numLongsToWrite);
copyBytes(in, os, numLongsToWrite * 8);
}
public static int copyVInt(HollowBlobInput in, DataOutputStream[] os) throws IOException {
int value = VarInt.readVInt(in);
for(int i=0;i<os.length;i++)
VarInt.writeVInt(os[i], value);
return value;
}
public static long copyVLong(HollowBlobInput in, DataOutputStream[] os) throws IOException {
long value = VarInt.readVLong(in);
for(int i=0;i<os.length;i++)
VarInt.writeVLong(os[i], value);
return value;
}
}
VarInt summary
VarInt eliminates the space wasted by fixed-width storage of integers with small absolute values, but it has a weakness of its own: for large values it costs more than storing the raw binary. Because the top bit of every byte serves as a continuation flag, a value that fits in 4 bytes may take 5, and one that fits in 8 bytes may take 10. Part of the problem, namely small-magnitude negative numbers, is addressed by the ZigZag encoding introduced below, but the overhead for genuinely large values cannot be avoided entirely, so apply VarInt with care.
ZigZag
ZigZag encoding solves the problem that negative numbers with small absolute values become expensive after varint encoding. Take -11 as an example:
Sign-magnitude: 10000000 00000000 00000000 00001011
Ones' complement: 11111111 11111111 11111111 11110100
Two's complement: 11111111 11111111 11111111 11110101
VarInt encoding: 10001111 11111111 11111111 11111111 01110101
Clearly, a negative number with a small absolute value has too many leading 1s after varint encoding to compress well, costing even more space than plain binary. How does ZigZag solve this? Its idea is to map negative numbers to positive ones, turning the leading 1s into leading 0s that varint can compress.
The ZigZag algorithm can be summarized as:
- For every value: move the sign bit to the end, shifting the value bits forward
- For negative values: keep the sign bit, invert the value bits
public class ZigZag {
public static long encodeLong(long l) {
return (l << 1) ^ (l >> 63);
}
public static long decodeLong(long l) {
return (l >>> 1) ^ ((l << 63) >> 63);
}
public static int encodeInt(int i) {
return (i << 1) ^ (i >> 31);
}
public static int decodeInt(int i) {
return (i >>> 1) ^ ((i << 31) >> 31);
}
}
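A minimal usage sketch of the two int methods (copied into a standalone demo class) confirms the mapping: small-magnitude negatives become small non-negative codes interleaved with the positives:

```java
public class ZigZagDemo {
    // Same formulas as ZigZag.encodeInt / decodeInt above.
    static int encodeInt(int i) { return (i << 1) ^ (i >> 31); }
    static int decodeInt(int i) { return (i >>> 1) ^ ((i << 31) >> 31); }

    public static void main(String[] args) {
        System.out.println(encodeInt(-11)); // 21
        System.out.println(encodeInt(11));  // 22
        System.out.println(decodeInt(21));  // -11: decode inverts encode
    }
}
```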
Let's walk through a concrete example:
Negative number (-11)
Two's complement: 11111111 11111111 11111111 11110101
Sign bit moved to the end, value bits shifted forward: 11111111 11111111 11111111 11101011
Sign bit unchanged, value bits inverted (21): 00000000 00000000 00000000 00010101
Positive number (11)
Two's complement: 00000000 00000000 00000000 00001011
Sign bit moved to the end, value bits shifted forward (22): 00000000 00000000 00000000 00010110
In Hollow, ZigZag encoding is used mainly for fields of type com.netflix.hollow.core.schema.HollowObjectSchema.FieldType.INT and com.netflix.hollow.core.schema.HollowObjectSchema.FieldType.LONG, allowing smaller absolute values to be encoded in fewer bits.
For a more detailed analysis, see my earlier article 一种编码方式:ZigZag.
ByteDataArray
ByteDataArray wraps part of SegmentedByteArray's API: it writes data into a SegmentedByteArray while tracking the write position within it.
When data is encoded with the VarInt algorithm, the bytes to encode come from a ByteDataArray. The class is not complicated, so only its fields and constructors are shown here.
private final SegmentedByteArray buf;
private long position;
public ByteDataArray() {
this(WastefulRecycler.DEFAULT_INSTANCE);
}
public ByteDataArray(ArraySegmentRecycler memoryRecycler) {
buf = new SegmentedByteArray(memoryRecycler);
}
ByteArrayOrdinalMap
ByteArrayOrdinalMap manages and maintains the mapping between byte sequences and ordinals.
It can be viewed as a hash table (Map): an AtomicLongArray named pointersAndOrdinals stores the keys, and a ByteDataArray stores the values. Each key has two components:
the high 29 bits hold the ordinal, and the low 35 bits hold a pointer to the start of the byte sequence in the ByteDataArray; each byte sequence is preceded by a variable-length integer (see VarInt) giving its length.
Ordinal: literally a sequence number; in Hollow it can be understood as the identifier of a specific record.
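The key layout can be sketched as follows (constants mirror the listing below; the example ordinal and pointer values are arbitrary):

```java
public class KeyPackingDemo {
    static final int BITS_PER_ORDINAL = 29;
    static final int BITS_PER_POINTER = Long.SIZE - BITS_PER_ORDINAL; // 35
    static final long POINTER_MASK = (1L << BITS_PER_POINTER) - 1;

    // pack ordinal into the high 29 bits, pointer into the low 35 bits
    static long pack(int ordinal, long pointer) {
        return ((long) ordinal << BITS_PER_POINTER) | pointer;
    }

    public static void main(String[] args) {
        long key = pack(42, 123_456_789L);
        System.out.println(key >>> BITS_PER_POINTER); // 42: the ordinal
        System.out.println(key & POINTER_MASK);       // 123456789: the pointer
    }
}
```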
Constructor and fields
private static final long EMPTY_BUCKET_VALUE = -1L;
private static final int BITS_PER_ORDINAL = 29;
private static final int BITS_PER_POINTER = Long.SIZE - BITS_PER_ORDINAL;
private static final long POINTER_MASK = (1L << BITS_PER_POINTER) - 1;
private static final long ORDINAL_MASK = (1L << BITS_PER_ORDINAL) - 1;
private static final long MAX_BYTE_DATA_LENGTH = 1L << BITS_PER_POINTER;
/// Thread safety: We need volatile access semantics to the individual elements in the
/// pointersAndOrdinals array.
/// Ordinal is the high 29 bits. Pointer to byte data is the low 35 bits.
/// In addition need volatile access to the reference when resize occurs
private volatile AtomicLongArray pointersAndOrdinals;
private final ByteDataArray byteData;
private final FreeOrdinalTracker freeOrdinalTracker;
// 空歌白石: size of the map
private int size;
// 空歌白石: load-factor threshold; when the number of entries exceeds this value, the map grows
private int sizeBeforeGrow;
private BitSet unusedPreviousOrdinals;
// 空歌白石: array mapping each Ordinal to its Pointer so they can be looked up easily when writing the blob stream.
private long[] pointersByOrdinal;
/**
* 空歌白石: initializes a map with 256 buckets and a 70% load factor.
*/
public ByteArrayOrdinalMap() {
this(256);
}
/**
* 空歌白石: creates a byte array ordinal map whose initial capacity is the given size rounded up to the nearest power of two, with a 70% load factor.
*/
public ByteArrayOrdinalMap(int size) {
size = bucketSize(size);
this.freeOrdinalTracker = new FreeOrdinalTracker();
this.byteData = new ByteDataArray(WastefulRecycler.DEFAULT_INSTANCE);
this.pointersAndOrdinals = emptyKeyArray(size);
this.sizeBeforeGrow = (int) (((float) size) * 0.7); /// 70% load factor
this.size = 0;
}
// 空歌白石: round up to the nearest power of two
private static int bucketSize(int x) {
// See Hackers Delight Fig. 3-3
x = x - 1;
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >> 16);
return (x < 256) ? 256 : (x >= 1 << 30) ? 1 << 30 : x + 1;
}
/**
* 空歌白石: creates an AtomicLongArray of the given size with every element initialized to EMPTY_BUCKET_VALUE, i.e. -1.
*/
private AtomicLongArray emptyKeyArray(int size) {
AtomicLongArray arr = new AtomicLongArray(size);
// Volatile store not required, could use plain store
// See VarHandles for JDK >= 9
for (int i = 0; i < arr.length(); i++) {
arr.lazySet(i, EMPTY_BUCKET_VALUE);
}
return arr;
}
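The bucketSize rounding behavior can be exercised in isolation (logic copied from the listing above into a standalone class):

```java
public class BucketSizeDemo {
    // Round up to the nearest power of two, clamped to the range [256, 2^30].
    static int bucketSize(int x) {
        // See Hackers Delight Fig. 3-3: smear the highest one bit downward
        x = x - 1;
        x |= x >> 1; x |= x >> 2; x |= x >> 4; x |= x >> 8; x |= x >> 16;
        return (x < 256) ? 256 : (x >= 1 << 30) ? 1 << 30 : x + 1;
    }

    public static void main(String[] args) {
        System.out.println(bucketSize(100));  // 256: clamped to the minimum
        System.out.println(bucketSize(1000)); // 1024: next power of two
    }
}
```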
put
The put method stores a serialized byte[] under an ordinal that has already been assigned. Note:
- this method is not thread safe
- this method does not update the FreeOrdinalTracker's free ordinals
public void put(ByteDataArray serializedRepresentation, int ordinal) {
if (ordinal < 0 || ordinal > ORDINAL_MASK) {
throw new IllegalArgumentException(String.format(
"The given ordinal %s is out of bounds and not within the closed interval [0, %s]",
ordinal, ORDINAL_MASK));
}
if (size > sizeBeforeGrow) {
growKeyArray();
}
int hash = HashCodes.hashCode(serializedRepresentation);
AtomicLongArray pao = pointersAndOrdinals;
int modBitmask = pao.length() - 1;
int bucket = hash & modBitmask;
long key = pao.get(bucket);
while (key != EMPTY_BUCKET_VALUE) {
bucket = (bucket + 1) & modBitmask;
key = pao.get(bucket);
}
long pointer = byteData.length();
VarInt.writeVInt(byteData, (int) serializedRepresentation.length());
serializedRepresentation.copyTo(byteData);
if (byteData.length() > MAX_BYTE_DATA_LENGTH) {
throw new IllegalStateException(String.format(
"The number of bytes for the serialized representations, %s, is too large and is greater than the maximum of %s bytes",
byteData.length(), MAX_BYTE_DATA_LENGTH));
}
key = ((long) ordinal << BITS_PER_POINTER) | pointer;
size++;
pao.set(bucket, key);
}
get ordinal
The get method returns the ordinal for a previously added byte sequence, or -1 if the sequence has not been added to the map. It is intended for the client-side heap-safe double snapshot load.
/**
* Returns the ordinal for a previously added byte sequence. If this byte sequence has not been added to the map, then -1 is returned.<p>
* <p>
* This is intended for use in the client-side heap-safe double snapshot load.
*
* @param serializedRepresentation the serialized representation
* @return The ordinal for this serialized representation, or -1.
*/
public int get(ByteDataArray serializedRepresentation) {
return get(serializedRepresentation, HashCodes.hashCode(serializedRepresentation));
}
private int get(ByteDataArray serializedRepresentation, int hash) {
AtomicLongArray pao = pointersAndOrdinals;
int modBitmask = pao.length() - 1;
// 空歌白石: hash AND mask yields the concrete bucket
int bucket = hash & modBitmask;
long key = pao.get(bucket);
// Linear probing to resolve collisions
// Given the load factor it is guaranteed that the loop will terminate
// as there will be at least one empty bucket
// To ensure this is the case it is important that pointersAndOrdinals
// is read into a local variable and thereafter used, otherwise a concurrent
// size increase may break this invariant
while (key != EMPTY_BUCKET_VALUE) {
if (compare(serializedRepresentation, key)) {
return (int) (key >>> BITS_PER_POINTER);
}
bucket = (bucket + 1) & modBitmask;
key = pao.get(bucket);
}
return -1;
}
ByteDataArray to Ordinal
The getOrAssignOrdinal method maps a ByteDataArray to its corresponding ordinal, assigning a new one if the byte sequence has not been seen before.
public int getOrAssignOrdinal(ByteDataArray serializedRepresentation) {
return getOrAssignOrdinal(serializedRepresentation, -1);
}
/**
* Adds a sequence of bytes to this map. If the sequence of bytes has previously been added
* to this map then its assigned ordinal is returned.
* If the sequence of bytes has not been added to this map then a new ordinal is assigned
* and returned.
* <p>
* This operation is thread-safe.
*
* @param serializedRepresentation the sequence of bytes
* @param preferredOrdinal the preferred ordinal to assign, if not already assigned to
* another sequence of bytes and the given sequence of bytes has not previously been added
* @return the assigned ordinal
*/
public int getOrAssignOrdinal(ByteDataArray serializedRepresentation, int preferredOrdinal) {
int hash = HashCodes.hashCode(serializedRepresentation);
int ordinal = get(serializedRepresentation, hash);
return ordinal != -1 ? ordinal : assignOrdinal(serializedRepresentation, hash, preferredOrdinal);
}
/// acquire the lock before writing.
private synchronized int assignOrdinal(ByteDataArray serializedRepresentation, int hash, int preferredOrdinal) {
if (preferredOrdinal < -1 || preferredOrdinal > ORDINAL_MASK) {
throw new IllegalArgumentException(String.format(
"The given preferred ordinal %s is out of bounds and not within the closed interval [-1, %s]",
preferredOrdinal, ORDINAL_MASK));
}
if (size > sizeBeforeGrow) {
growKeyArray();
}
/// check to make sure that after acquiring the lock, the element still does not exist.
/// this operation is akin to double-checked locking which is 'fixed' with the JSR 133 memory model in JVM >= 1.5.
/// Note that this also requires pointersAndOrdinals be volatile so resizes are also visible
AtomicLongArray pao = pointersAndOrdinals;
int modBitmask = pao.length() - 1;
int bucket = hash & modBitmask;
long key = pao.get(bucket);
while (key != EMPTY_BUCKET_VALUE) {
if (compare(serializedRepresentation, key)) {
return (int) (key >>> BITS_PER_POINTER);
}
bucket = (bucket + 1) & modBitmask;
key = pao.get(bucket);
}
/// the ordinal for this object still does not exist in the list, even after the lock has been acquired.
/// it is up to this thread to add it at the current bucket position.
int ordinal = findFreeOrdinal(preferredOrdinal);
if (ordinal > ORDINAL_MASK) {
throw new IllegalStateException(String.format(
"Ordinal cannot be assigned. The to be assigned ordinal, %s, is greater than the maximum supported ordinal value of %s",
ordinal, ORDINAL_MASK));
}
long pointer = byteData.length();
VarInt.writeVInt(byteData, (int) serializedRepresentation.length());
/// Copying might cause a resize to the segmented array held by byteData
/// A reading thread may observe a null value for a segment during the creation
/// of a new segments array (see SegmentedByteArray.ensureCapacity).
serializedRepresentation.copyTo(byteData);
if (byteData.length() > MAX_BYTE_DATA_LENGTH) {
throw new IllegalStateException(String.format(
"The number of bytes for the serialized representations, %s, is too large and is greater than the maximum of %s bytes",
byteData.length(), MAX_BYTE_DATA_LENGTH));
}
key = ((long) ordinal << BITS_PER_POINTER) | pointer;
size++;
/// this set on the AtomicLongArray has volatile semantics (i.e. behaves like a monitor release).
/// Any other thread reading this element in the AtomicLongArray will have visibility to all memory writes this thread has made up to this point.
/// This means the entire byte sequence is guaranteed to be visible to any thread which reads the pointer to that data.
pao.set(bucket, key);
return ordinal;
}
/**
* If the preferredOrdinal has not already been used, mark it and use it. Otherwise,
* delegate to the FreeOrdinalTracker.
*/
private int findFreeOrdinal(int preferredOrdinal) {
if (preferredOrdinal != -1 && unusedPreviousOrdinals.get(preferredOrdinal)) {
unusedPreviousOrdinals.clear(preferredOrdinal);
return preferredOrdinal;
}
return freeOrdinalTracker.getFreeOrdinal();
}
recalculateFreeOrdinals
recalculateFreeOrdinals is relatively involved. First, consider how it is used: mapOrdinal maps a HollowWriteRecord to an ordinal, after which recalculateFreeOrdinals is called to recompute the free ordinals.
protected final ByteArrayOrdinalMap ordinalMap;
public void mapOrdinal(HollowWriteRecord rec, int newOrdinal, boolean markPreviousCycle, boolean markCurrentCycle) {
if(!ordinalMap.isReadyForAddingObjects())
throw new RuntimeException("The HollowWriteStateEngine is not ready to add more Objects. Did you remember to call stateEngine.prepareForNextCycle()?");
ByteDataArray scratch = scratch();
rec.writeDataTo(scratch);
ordinalMap.put(scratch, newOrdinal);
if(markPreviousCycle)
previousCyclePopulated.set(newOrdinal);
if(markCurrentCycle)
currentCyclePopulated.set(newOrdinal);
scratch.reset();
}
/**
* Correct the free ordinal list after using mapOrdinal()
*/
public void recalculateFreeOrdinals() {
ordinalMap.recalculateFreeOrdinals();
}
It relies on FreeOrdinalTracker to return unused ordinals to the pool so that the ordinal space, and hence memory usage, stays compact.
public void recalculateFreeOrdinals() {
BitSet populatedOrdinals = new BitSet();
AtomicLongArray pao = pointersAndOrdinals;
for (int i = 0; i < pao.length(); i++) {
long key = pao.get(i);
if (key != EMPTY_BUCKET_VALUE) {
int ordinal = (int) (key >>> BITS_PER_POINTER);
populatedOrdinals.set(ordinal);
}
}
recalculateFreeOrdinals(populatedOrdinals);
}
private void recalculateFreeOrdinals(BitSet populatedOrdinals) {
freeOrdinalTracker.reset();
int length = populatedOrdinals.length();
int ordinal = populatedOrdinals.nextClearBit(0);
while (ordinal < length) {
freeOrdinalTracker.returnOrdinalToPool(ordinal);
ordinal = populatedOrdinals.nextClearBit(ordinal + 1);
}
freeOrdinalTracker.setNextEmptyOrdinal(length);
}
maxOrdinal
Computes the largest ordinal present in the current pointersAndOrdinals array.
public int maxOrdinal() {
int maxOrdinal = -1;
AtomicLongArray pao = pointersAndOrdinals;
for (int i = 0; i < pao.length(); i++) {
long key = pao.get(i);
if (key != EMPTY_BUCKET_VALUE) {
int ordinal = (int) (key >>> BITS_PER_POINTER);
if (ordinal > maxOrdinal) {
maxOrdinal = ordinal;
}
}
}
return maxOrdinal;
}
prepareForWrite
prepareForWrite builds an array mapping each ordinal to its pointer, so that they can be looked up easily when writing the blob stream.
public void prepareForWrite() {
int maxOrdinal = 0;
AtomicLongArray pao = pointersAndOrdinals;
for (int i = 0; i < pao.length(); i++) {
long key = pao.get(i);
if (key != EMPTY_BUCKET_VALUE) {
int ordinal = (int) (key >>> BITS_PER_POINTER);
if (ordinal > maxOrdinal) {
maxOrdinal = ordinal;
}
}
}
long[] pbo = new long[maxOrdinal + 1];
// 空歌白石: fill the new array entirely with -1
Arrays.fill(pbo, -1);
for (int i = 0; i < pao.length(); i++) {
long key = pao.get(i);
if (key != EMPTY_BUCKET_VALUE) {
int ordinal = (int) (key >>> BITS_PER_POINTER);
pbo[ordinal] = key & POINTER_MASK;
}
}
pointersByOrdinal = pbo;
}
compact
compact, as its name suggests, makes the most of the memory pool by reclaiming freed space so that memory usage stays tight. It reclaims space in the byte array that was used in the previous cycle but is not referenced in this cycle. This is achieved by shifting all used byte sequences down in the byte array, then updating the key array to reflect the new pointers and to exclude the removed entries. This is also where unused ordinals are returned to the pool.
/**
* Reclaim space in the byte array used in the previous cycle, but not referenced in this cycle.<p>
* <p>
* This is achieved by shifting all used byte sequences down in the byte array, then updating
* the key array to reflect the new pointers and exclude the removed entries. This is also where ordinals
* which are unused are returned to the pool.<p>
*
* @param usedOrdinals a bit set representing the ordinals which are currently referenced by any image.
*/
public void compact(ThreadSafeBitSet usedOrdinals, int numShards, boolean focusHoleFillInFewestShards) {
long[] populatedReverseKeys = new long[size];
int counter = 0;
AtomicLongArray pao = pointersAndOrdinals;
for (int i = 0; i < pao.length(); i++) {
long key = pao.get(i);
if (key != EMPTY_BUCKET_VALUE) {
populatedReverseKeys[counter++] = key << BITS_PER_ORDINAL | key >>> BITS_PER_POINTER;
}
}
Arrays.sort(populatedReverseKeys);
SegmentedByteArray arr = byteData.getUnderlyingArray();
long currentCopyPointer = 0;
for (int i = 0; i < populatedReverseKeys.length; i++) {
int ordinal = (int) (populatedReverseKeys[i] & ORDINAL_MASK);
if (usedOrdinals.get(ordinal)) {
long pointer = populatedReverseKeys[i] >>> BITS_PER_ORDINAL;
int length = VarInt.readVInt(arr, pointer);
length += VarInt.sizeOfVInt(length);
if (currentCopyPointer != pointer) {
arr.copy(arr, pointer, currentCopyPointer, length);
}
populatedReverseKeys[i] = populatedReverseKeys[i] << BITS_PER_POINTER | currentCopyPointer;
currentCopyPointer += length;
} else {
freeOrdinalTracker.returnOrdinalToPool(ordinal);
populatedReverseKeys[i] = EMPTY_BUCKET_VALUE;
}
}
byteData.setPosition(currentCopyPointer);
if(focusHoleFillInFewestShards && numShards > 1)
freeOrdinalTracker.sort(numShards);
else
freeOrdinalTracker.sort();
// Reset the array then fill with compacted values
// Volatile store not required, could use plain store
// See VarHandles for JDK >= 9
for (int i = 0; i < pao.length(); i++) {
pao.lazySet(i, EMPTY_BUCKET_VALUE);
}
populateNewHashArray(pao, populatedReverseKeys);
size = usedOrdinals.cardinality();
pointersByOrdinal = null;
unusedPreviousOrdinals = null;
}
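The shifting step of compact can be sketched with a plain byte[]; the records and used arrays here are illustrative stand-ins for the packed keys and the usedOrdinals bit set:

```java
public class CompactionSketch {
    // Shift the byte ranges of still-used records down over the holes left
    // by unused ones, returning the new end-of-data position.
    static long compact(byte[] data, long[][] records, boolean[] used) {
        long copyPointer = 0;
        for (int i = 0; i < records.length; i++) {   // records sorted by pointer
            long pointer = records[i][0], length = records[i][1];
            if (!used[i]) continue;                  // its space becomes reusable
            if (copyPointer != pointer) {
                System.arraycopy(data, (int) pointer, data, (int) copyPointer, (int) length);
            }
            records[i][0] = copyPointer;             // rewrite pointer, like the key update
            copyPointer += length;
        }
        return copyPointer;
    }

    public static void main(String[] args) {
        byte[] data = {'A', 'A', 'B', 'B', 'B', 'C', 'C'};
        long[][] records = {{0, 2}, {2, 3}, {5, 2}}; // three records: (pointer, length)
        boolean[] used = {true, false, true};        // the middle record was dropped
        long end = compact(data, records, used);
        System.out.println(end + " " + new String(data, 0, (int) end));
    }
}
```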
resize
resize adjusts the ordinal map by increasing its capacity. No action is taken if the current capacity is already sufficient for the given size.
Note
resize is NOT thread-safe. The requested size is rounded up to the nearest power of two.
/**
* Resize the ordinal map by increasing its capacity.
* <p>
* No action is take if the current capacity is sufficient for the given size.
* <p>
* WARNING: THIS OPERATION IS NOT THREAD-SAFE.
*
* @param size the size to increase to, rounded up to the nearest power of two.
*/
public void resize(int size) {
size = bucketSize(size);
if (pointersAndOrdinals.length() < size) {
growKeyArray(size);
}
}
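bucketSize itself is not shown above; a common way to round a requested size up to the next power of two is sketched below. This is an assumption for illustration only — the real bucketSize may also fold in the load factor:

```java
public class BucketSize {
    // Round up to the next power of two (returns size itself if already one).
    static int nextPowerOfTwo(int size) {
        if (size <= 1) return 1;
        return Integer.highestOneBit(size - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(100)); // 128
        System.out.println(nextPowerOfTwo(128)); // 128
    }
}
```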
growKeyArray
growKeyArray doubles the length of the array; every value held in the current array must be rehashed and inserted into the new one.
private void growKeyArray() {
int newSize = pointersAndOrdinals.length() << 1;
if (newSize < 0) {
throw new IllegalStateException("New size computed to grow the underlying array for the map is negative. " +
"This is most likely due to the total number of keys added to map has exceeded the max capacity of the keys map can hold. "
+
"Current array size :" + pointersAndOrdinals.length() + " and size to grow :" + newSize);
}
growKeyArray(newSize);
}
private void growKeyArray(int newSize) {
AtomicLongArray pao = pointersAndOrdinals;
assert (newSize & (newSize - 1)) == 0; // power of 2
assert pao.length() < newSize;
// 空歌白石: initialize a fresh empty key array
AtomicLongArray newKeys = emptyKeyArray(newSize);
long[] valuesToAdd = new long[size];
int counter = 0;
/// do not iterate over these values in the same order in which they appear in the hashed array.
/// if we do so, we cause large clusters of collisions to appear (because we resolve collisions with linear probing).
for (int i = 0; i < pao.length(); i++) {
long key = pao.get(i);
if (key != EMPTY_BUCKET_VALUE) {
valuesToAdd[counter++] = key;
}
}
// 空歌白石: sort the values in ascending order
Arrays.sort(valuesToAdd);
// 空歌白石: rehash everything into the new array
populateNewHashArray(newKeys, valuesToAdd, counter);
/// 70% load factor
sizeBeforeGrow = (int) (((float) newSize) * 0.7);
// 空歌白石: publish the new array by swapping the reference
pointersAndOrdinals = newKeys;
}
populateNewHashArray
populateNewHashArray rehashes all existing entries into a new key array.
private void populateNewHashArray(AtomicLongArray newKeys, long[] valuesToAdd) {
populateNewHashArray(newKeys, valuesToAdd, valuesToAdd.length);
}
private void populateNewHashArray(AtomicLongArray newKeys, long[] valuesToAdd, int length) {
assert length <= valuesToAdd.length;
int modBitmask = newKeys.length() - 1;
for (int i = 0; i < length; i++) {
long value = valuesToAdd[i];
if (value != EMPTY_BUCKET_VALUE) {
int hash = rehashPreviouslyAddedData(value);
int bucket = hash & modBitmask;
while (newKeys.get(bucket) != EMPTY_BUCKET_VALUE) {
bucket = (bucket + 1) & modBitmask;
}
// Volatile store not required, could use plain store
// See VarHandles for JDK >= 9
newKeys.lazySet(bucket, value);
}
}
}
/**
 * 空歌白石: computes the hash code of the byte sequence that the given key points to.
 */
private int rehashPreviouslyAddedData(long key) {
long position = key & POINTER_MASK;
int sizeOfData = VarInt.readVInt(byteData.getUnderlyingArray(), position);
position += VarInt.sizeOfVInt(sizeOfData);
return HashCodes.hashCode(byteData.getUnderlyingArray(), position, sizeOfData);
}
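The linear-probing placement performed by populateNewHashArray can be reproduced in isolation; EMPTY here stands in for EMPTY_BUCKET_VALUE:

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class LinearProbeDemo {
    static final long EMPTY = -1L; // stand-in for EMPTY_BUCKET_VALUE

    // Insert a value at hash & mask, walking forward past occupied buckets,
    // exactly as populateNewHashArray does. Table length must be a power of two.
    static int place(AtomicLongArray keys, int hash, long value) {
        int modBitmask = keys.length() - 1;
        int bucket = hash & modBitmask;
        while (keys.get(bucket) != EMPTY) {
            bucket = (bucket + 1) & modBitmask; // linear probe, wrapping around
        }
        keys.lazySet(bucket, value);
        return bucket;
    }

    public static void main(String[] args) {
        AtomicLongArray keys = new AtomicLongArray(8);
        for (int i = 0; i < keys.length(); i++) keys.set(i, EMPTY);
        int a = place(keys, 3, 100); // lands in bucket 3
        int b = place(keys, 3, 200); // collides, probes to bucket 4
        System.out.println(a + " " + b);
    }
}
```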
FreeOrdinalTracker
FreeOrdinalTracker manages a pool of unused ordinals. ByteArrayOrdinalMap uses this data structure to track unused ordinals and hand them out to new records. The goal of this class is to ensure that the "holes" created when unused ordinals are dropped during server processing get reused in subsequent cycles, instead of letting the ordinal space grow without bound — saving memory and reducing GC pressure.
Constructor and fields
// 空歌白石: the freed ordinals; the array starts with room for 64
private int freeOrdinals[];
// 空歌白石: the number of free ordinals currently held in freeOrdinals
private int size;
// 空歌白石: the next empty, never-assigned ordinal
private int nextEmptyOrdinal;
public FreeOrdinalTracker() {
this(0);
}
private FreeOrdinalTracker(int nextEmptyOrdinal) {
this.freeOrdinals = new int[64];
this.nextEmptyOrdinal = nextEmptyOrdinal;
this.size = 0;
}
getFreeOrdinal & returnOrdinalToPool
This section looks at how an ordinal is taken from the pool and how one is returned to it. getFreeOrdinal returns either a previously released ordinal from the sequence 0..n, or the next empty, never-before-assigned ordinal.
When reusing freed ordinals, values are taken from the back of the array.
public int getFreeOrdinal() {
if(size == 0)
return nextEmptyOrdinal++;
return freeOrdinals[--size];
}
When returning an ordinal to the pool, if the free-ordinal array is full it is grown by a factor of 1.5.
public void returnOrdinalToPool(int ordinal) {
if(size == freeOrdinals.length) {
freeOrdinals = Arrays.copyOf(freeOrdinals, freeOrdinals.length * 3 / 2);
}
freeOrdinals[size] = ordinal;
size++;
}
sort
FreeOrdinalTracker keeps its freed ordinals ordered, so after a reclamation the pool must be re-sorted. Re-sorting allows freed ordinals to be handed back starting from the lowest values, which minimizes memory fragmentation.
The first sort method is simple: it sorts freeOrdinals in ascending order with Arrays.sort and then reverses the pool.
public void sort() {
Arrays.sort(freeOrdinals, 0, size);
reverseFreeOrdinalPool();
}
Before analyzing the sharded sort, look at the inner class Shard. A Shard records how many freed ordinals it holds (freeOrdinalCount) and its current write position (currentPos).
private static class Shard {
private int freeOrdinalCount;
private int currentPos;
}
This sort orders freeOrdinals by shard. What is numShards? It can be understood as the number of shards that the records of a type are distributed across.
public void sort(int numShards) {
int shardNumberMask = numShards - 1;
Shard shards[] = new Shard[numShards];
for(int i=0;i<shards.length;i++)
shards[i] = new Shard();
for(int i=0;i<size;i++)
shards[freeOrdinals[i] & shardNumberMask].freeOrdinalCount++;
Shard orderedShards[] = Arrays.copyOf(shards, shards.length);
Arrays.sort(orderedShards, (s1, s2) -> s2.freeOrdinalCount - s1.freeOrdinalCount);
for(int i=1;i<numShards;i++)
orderedShards[i].currentPos = orderedShards[i-1].currentPos + orderedShards[i-1].freeOrdinalCount;
/// each shard will receive the ordinals in ascending order.
Arrays.sort(freeOrdinals, 0, size);
int newFreeOrdinals[] = new int[freeOrdinals.length];
for(int i=0;i<size;i++) {
Shard shard = shards[freeOrdinals[i] & shardNumberMask];
newFreeOrdinals[shard.currentPos] = freeOrdinals[i];
shard.currentPos++;
}
freeOrdinals = newFreeOrdinals;
reverseFreeOrdinalPool();
}
Both sort methods rely on reverseFreeOrdinalPool, which turns the already-ascending ordinal array into descending order. What does this achieve? As getFreeOrdinal() shows, values are taken from the back of the array, so after the reversal the lowest ordinals are handed out first, effectively preventing the memory from hollowing out.
private void reverseFreeOrdinalPool() {
int midpoint = size / 2;
for(int i=0;i<midpoint;i++) {
int temp = freeOrdinals[i];
freeOrdinals[i] = freeOrdinals[size-i-1];
freeOrdinals[size-i-1] = temp;
}
}
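The full lifecycle — return, sort, reverse, then reuse from the back — can be demonstrated with a miniature tracker mirroring the code above (a sketch, not Hollow's class):

```java
import java.util.Arrays;

public class MiniFreeOrdinalTracker {
    private int[] freeOrdinals = new int[64];
    private int size;
    private int nextEmptyOrdinal;

    int getFreeOrdinal() {
        if (size == 0) return nextEmptyOrdinal++;  // pool empty: mint a new ordinal
        return freeOrdinals[--size];               // take from the back of the array
    }

    void returnOrdinalToPool(int ordinal) {
        if (size == freeOrdinals.length)
            freeOrdinals = Arrays.copyOf(freeOrdinals, freeOrdinals.length * 3 / 2);
        freeOrdinals[size++] = ordinal;
    }

    // Ascending sort, then reverse, so the lowest ordinal sits at the back
    // and is handed out first.
    void sort() {
        Arrays.sort(freeOrdinals, 0, size);
        for (int i = 0; i < size / 2; i++) {
            int t = freeOrdinals[i];
            freeOrdinals[i] = freeOrdinals[size - i - 1];
            freeOrdinals[size - i - 1] = t;
        }
    }

    public static void main(String[] args) {
        MiniFreeOrdinalTracker t = new MiniFreeOrdinalTracker();
        t.returnOrdinalToPool(9);
        t.returnOrdinalToPool(2);
        t.returnOrdinalToPool(5);
        t.sort();
        System.out.println(t.getFreeOrdinal()); // 2 — the lowest hole is filled first
        System.out.println(t.getFreeOrdinal()); // 5
    }
}
```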
ThreadSafeBitSet
ThreadSafeBitSet is a lock-free, thread-safe implementation of a BitSet. It uses AtomicLongArray in place of long[] to hold the bits, and performs the appropriate compare-and-swap operations on assignment.
Constructor and fields
public static final int DEFAULT_LOG2_SEGMENT_SIZE_IN_BITS = 14;
private final int numLongsPerSegment;
private final int log2SegmentSize;
private final int segmentMask;
private final AtomicReference<ThreadSafeBitSetSegments> segments;
public ThreadSafeBitSet() {
this(DEFAULT_LOG2_SEGMENT_SIZE_IN_BITS); /// 16384 bits, 2048 bytes, 256 longs per segment
}
public ThreadSafeBitSet(int log2SegmentSizeInBits) {
this(log2SegmentSizeInBits, 0);
}
public ThreadSafeBitSet(int log2SegmentSizeInBits, int numBitsToPreallocate) {
if(log2SegmentSizeInBits < 6)
throw new IllegalArgumentException("Cannot specify fewer than 64 bits in each segment!");
this.log2SegmentSize = log2SegmentSizeInBits;
this.numLongsPerSegment = (1 << (log2SegmentSizeInBits - 6));
this.segmentMask = numLongsPerSegment - 1;
long numBitsPerSegment = numLongsPerSegment * 64;
int numSegmentsToPreallocate = numBitsToPreallocate == 0 ? 1 : (int)(((numBitsToPreallocate - 1) / numBitsPerSegment) + 1);
segments = new AtomicReference<ThreadSafeBitSetSegments>();
segments.set(new ThreadSafeBitSetSegments(numSegmentsToPreallocate, numLongsPerSegment));
}
set
public void set(int position) {
int segmentPosition = position >>> log2SegmentSize; /// which segment -- div by num bits per segment
int longPosition = (position >>> 6) & segmentMask; /// which long in the segment -- remainder of div by num bits per segment
int bitPosition = position & 0x3F; /// which bit in the long -- remainder of div by num bits in long (64)
AtomicLongArray segment = getSegment(segmentPosition);
long mask = 1L << bitPosition;
// Thread safety: we need to loop until we win the race to set the long value.
while(true) {
// determine what the new long value will be after we set the appropriate bit.
long currentLongValue = segment.get(longPosition);
long newLongValue = currentLongValue | mask;
// if no other thread has modified the value since we read it, we won the race and we are done.
if(segment.compareAndSet(longPosition, currentLongValue, newLongValue))
break;
}
}
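The three-way position split and the CAS retry loop can be reproduced in a small self-contained sketch; the constants mirror the defaults shown above (log2SegmentSize = 14, 256 longs per segment):

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class CasBitSetDemo {
    static final int LOG2_SEGMENT_SIZE = 14;                             // 16384 bits/segment
    static final int SEGMENT_MASK = (1 << (LOG2_SEGMENT_SIZE - 6)) - 1;  // 255

    // The same three-way split used by set()/get()/clear().
    static int segmentIndex(int position) { return position >>> LOG2_SEGMENT_SIZE; }
    static int longIndex(int position)    { return (position >>> 6) & SEGMENT_MASK; }
    static int bitIndex(int position)     { return position & 0x3F; }

    // CAS loop: retry until our OR of the target bit wins the race.
    static void setBit(AtomicLongArray segment, int longIdx, int bitIdx) {
        long mask = 1L << bitIdx;
        long current;
        do {
            current = segment.get(longIdx);
        } while (!segment.compareAndSet(longIdx, current, current | mask));
    }

    public static void main(String[] args) {
        int position = 16384 + 64 + 3; // segment 1, long 1, bit 3
        System.out.println(segmentIndex(position) + " " + longIndex(position) + " " + bitIndex(position));
        AtomicLongArray seg = new AtomicLongArray(256);
        setBit(seg, longIndex(position), bitIndex(position));
        System.out.println((seg.get(1) & (1L << 3)) != 0);
    }
}
```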
get
public boolean get(int position) {
int segmentPosition = position >>> log2SegmentSize; /// which segment -- div by num bits per segment
int longPosition = (position >>> 6) & segmentMask; /// which long in the segment -- remainder of div by num bits per segment
int bitPosition = position & 0x3F; /// which bit in the long -- remainder of div by num bits in long (64)
AtomicLongArray segment = getSegment(segmentPosition);
long mask = 1L << bitPosition;
return ((segment.get(longPosition) & mask) != 0);
}
getSegment
/**
* Get the segment at <code>segmentIndex</code>. If this segment does not yet exist, create it.
*
* @param segmentIndex the segment index
* @return the segment
*/
private AtomicLongArray getSegment(int segmentIndex) {
ThreadSafeBitSetSegments visibleSegments = segments.get();
while(visibleSegments.numSegments() <= segmentIndex) {
/// Thread safety: newVisibleSegments contains all of the segments from the currently visible segments, plus extra.
/// all of the segments in the currently visible segments are canonical and will not change.
ThreadSafeBitSetSegments newVisibleSegments = new ThreadSafeBitSetSegments(visibleSegments, segmentIndex + 1, numLongsPerSegment);
/// because we are using a compareAndSet, if this thread "wins the race" and successfully sets this variable, then the segments
/// which are newly defined in newVisibleSegments become canonical.
if(segments.compareAndSet(visibleSegments, newVisibleSegments)) {
visibleSegments = newVisibleSegments;
} else {
/// If we "lose the race" and are growing the ThreadSafeBitSet segments larger,
/// then we will gather the new canonical sets from the update which we missed on the next iteration of this loop.
/// Newly defined segments in newVisibleSegments will be discarded, they do not get to become canonical.
visibleSegments = segments.get();
}
}
return visibleSegments.getSegment(segmentIndex);
}
maxSetBit
maxSetBit returns the index of the highest set bit, which bounds the data that needs to be written to a snapshot or a delta.
public long maxSetBit() {
ThreadSafeBitSetSegments segments = this.segments.get();
int segmentIdx = segments.numSegments() - 1;
for (; segmentIdx >= 0; segmentIdx--) {
AtomicLongArray segment = segments.getSegment(segmentIdx);
for (int longIdx = segment.length() - 1; longIdx >= 0; longIdx--) {
long l = segment.get(longIdx);
if (l != 0)
return (segmentIdx << log2SegmentSize) + (longIdx * 64) + (63 - Long.numberOfLeadingZeros(l));
}
}
return -1;
}
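The `63 - Long.numberOfLeadingZeros(l)` term maps a nonzero word to the index of its highest set bit:

```java
public class HighestBitDemo {
    // Index of the highest set bit within a long word, as used by maxSetBit().
    // Only meaningful for nonzero words (maxSetBit skips zero words).
    static int highestBit(long word) {
        return 63 - Long.numberOfLeadingZeros(word);
    }

    public static void main(String[] args) {
        System.out.println(highestBit(1L));       // 0
        System.out.println(highestBit(0b1010L));  // 3
        System.out.println(highestBit(1L << 62)); // 62
    }
}
```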
nextSetBit
public int nextSetBit(int fromIndex) {
if (fromIndex < 0)
throw new IndexOutOfBoundsException("fromIndex < 0: " + fromIndex);
int segmentPosition = fromIndex >>> log2SegmentSize; /// which segment -- div by num bits per segment
ThreadSafeBitSetSegments segments = this.segments.get();
if(segmentPosition >= segments.numSegments())
return -1;
int longPosition = (fromIndex >>> 6) & segmentMask; /// which long in the segment -- remainder of div by num bits per segment
int bitPosition = fromIndex & 0x3F; /// which bit in the long -- remainder of div by num bits in long (64)
AtomicLongArray segment = segments.getSegment(segmentPosition);
long word = segment.get(longPosition) & (0xffffffffffffffffL << bitPosition);
while (true) {
if (word != 0)
return (segmentPosition << (log2SegmentSize)) + (longPosition << 6) + Long.numberOfTrailingZeros(word);
if (++longPosition > segmentMask) {
segmentPosition++;
if(segmentPosition >= segments.numSegments())
return -1;
segment = segments.getSegment(segmentPosition);
longPosition = 0;
}
word = segment.get(longPosition);
}
}
cardinality
cardinality counts the number of set bits across the entire bit set.
/**
* @return the number of bits which are set in this bit set.
*/
public int cardinality() {
ThreadSafeBitSetSegments segments = this.segments.get();
int numSetBits = 0;
for(int i=0;i<segments.numSegments();i++) {
AtomicLongArray segment = segments.getSegment(i);
for(int j=0;j<segment.length();j++) {
numSetBits += Long.bitCount(segment.get(j));
}
}
return numSetBits;
}
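The per-word counting can be demonstrated with Long.bitCount directly:

```java
public class CardinalityDemo {
    // Summing Long.bitCount over every word yields the total number of set bits,
    // exactly as cardinality() does across all segments.
    static int cardinality(long[] words) {
        int numSetBits = 0;
        for (long w : words) numSetBits += Long.bitCount(w);
        return numSetBits;
    }

    public static void main(String[] args) {
        long[] words = {0b1011L, 0L, 1L << 63};
        System.out.println(cardinality(words)); // 3 + 0 + 1 = 4
    }
}
```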
clear
clear relies on AtomicLongArray's compareAndSet — the CAS algorithm — to clear the bit at position in a thread-safe way.
public void clear(int position) {
int segmentPosition = position >>> log2SegmentSize; /// which segment -- div by num bits per segment
int longPosition = (position >>> 6) & segmentMask; /// which long in the segment -- remainder of div by num bits per segment
int bitPosition = position & 0x3F; /// which bit in the long -- remainder of div by num bits in long (64)
AtomicLongArray segment = getSegment(segmentPosition);
long mask = ~(1L << bitPosition);
// Thread safety: we need to loop until we win the race to set the long value.
while(true) {
// determine what the new long value will be after we set the appropriate bit.
long currentLongValue = segment.get(longPosition);
long newLongValue = currentLongValue & mask;
// if no other thread has modified the value since we read it, we won the race and we are done.
if(segment.compareAndSet(longPosition, currentLongValue, newLongValue))
break;
}
}
clearAll
clearAll sets every bit to 0; it does not release any memory.
/**
* Clear all bits to 0.
*/
public void clearAll() {
ThreadSafeBitSetSegments segments = this.segments.get();
for(int i=0;i<segments.numSegments();i++) {
AtomicLongArray segment = segments.getSegment(i);
for(int j=0;j<segment.length();j++) {
segment.set(j, 0L);
}
}
}
andNot
Returns a new bit set containing all bits that are set in this bit set but NOT set in the other — in other words, a bitwise AND of this set with the bitwise NOT of the other.
/**
* Return a new bit set which contains all bits which are contained in this bit set, and which are NOT contained in the <code>other</code> bit set.<p>
*
* In other words, return a new bit set, which is a bitwise and with the bitwise not of the other bit set.
*
* @param other the other bit set
* @return the resulting bit set
*/
public ThreadSafeBitSet andNot(ThreadSafeBitSet other) {
if(other.log2SegmentSize != log2SegmentSize)
throw new IllegalArgumentException("Segment sizes must be the same");
ThreadSafeBitSetSegments thisSegments = this.segments.get();
ThreadSafeBitSetSegments otherSegments = other.segments.get();
ThreadSafeBitSetSegments newSegments = new ThreadSafeBitSetSegments(thisSegments.numSegments(), numLongsPerSegment);
for(int i=0;i<thisSegments.numSegments();i++) {
AtomicLongArray thisArray = thisSegments.getSegment(i);
AtomicLongArray otherArray = (i < otherSegments.numSegments()) ? otherSegments.getSegment(i) : null;
AtomicLongArray newArray = newSegments.getSegment(i);
for(int j=0;j<thisArray.length();j++) {
long thisLong = thisArray.get(j);
long otherLong = (otherArray == null) ? 0 : otherArray.get(j);
newArray.set(j, thisLong & ~otherLong);
}
}
ThreadSafeBitSet andNot = new ThreadSafeBitSet(log2SegmentSize);
andNot.segments.set(newSegments);
return andNot;
}
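The core of andNot is the per-word expression `thisLong & ~otherLong`:

```java
public class AndNotDemo {
    // Per-word core of andNot(): keep bits set here that are NOT set in other.
    static long andNot(long thisLong, long otherLong) {
        return thisLong & ~otherLong;
    }

    public static void main(String[] args) {
        long thisLong  = 0b1111L; // bits 0..3 set
        long otherLong = 0b0101L; // bits 0 and 2 set
        System.out.println(Long.toBinaryString(andNot(thisLong, otherLong))); // 1010
    }
}
```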
orAll
/**
* Return a new bit set which contains all bits which are contained in *any* of the specified bit sets.
*
* @param bitSets the other bit sets
* @return the resulting bit set
*/
public static ThreadSafeBitSet orAll(ThreadSafeBitSet... bitSets) {
if(bitSets.length == 0)
return new ThreadSafeBitSet();
int log2SegmentSize = bitSets[0].log2SegmentSize;
int numLongsPerSegment = bitSets[0].numLongsPerSegment;
ThreadSafeBitSetSegments segments[] = new ThreadSafeBitSetSegments[bitSets.length];
int maxNumSegments = 0;
for(int i=0;i<bitSets.length;i++) {
if(bitSets[i].log2SegmentSize != log2SegmentSize)
throw new IllegalArgumentException("Segment sizes must be the same");
segments[i] = bitSets[i].segments.get();
if(segments[i].numSegments() > maxNumSegments)
maxNumSegments = segments[i].numSegments();
}
ThreadSafeBitSetSegments newSegments = new ThreadSafeBitSetSegments(maxNumSegments, numLongsPerSegment);
AtomicLongArray segment[] = new AtomicLongArray[segments.length];
for(int i=0;i<maxNumSegments;i++) {
for(int j=0;j<segments.length;j++) {
segment[j] = i < segments[j].numSegments() ? segments[j].getSegment(i) : null;
}
AtomicLongArray newSegment = newSegments.getSegment(i);
for(int j=0;j<numLongsPerSegment;j++) {
long value = 0;
for(int k=0;k<segments.length;k++) {
if(segment[k] != null)
value |= segment[k].get(j);
}
newSegment.set(j, value);
}
}
ThreadSafeBitSet or = new ThreadSafeBitSet(log2SegmentSize);
or.segments.set(newSegments);
return or;
}
ThreadSafeBitSetSegments
private static class ThreadSafeBitSetSegments {
private final AtomicLongArray segments[];
private ThreadSafeBitSetSegments(int numSegments, int segmentLength) {
AtomicLongArray segments[] = new AtomicLongArray[numSegments];
for(int i=0;i<numSegments;i++) {
segments[i] = new AtomicLongArray(segmentLength);
}
/// Thread safety: Because this.segments is final, the preceding operations in this constructor are guaranteed to be visible to any
/// other thread which accesses this.segments.
this.segments = segments;
}
private ThreadSafeBitSetSegments(ThreadSafeBitSetSegments copyFrom, int numSegments, int segmentLength) {
AtomicLongArray segments[] = new AtomicLongArray[numSegments];
for(int i=0;i<numSegments;i++) {
segments[i] = i < copyFrom.numSegments() ? copyFrom.getSegment(i) : new AtomicLongArray(segmentLength);
}
/// see above re: thread-safety of this assignment
this.segments = segments;
}
public int numSegments() {
return segments.length;
}
public AtomicLongArray getSegment(int index) {
return segments[index];
}
}
serializeBitsTo
public void serializeBitsTo(DataOutputStream os) throws IOException {
ThreadSafeBitSetSegments segments = this.segments.get();
os.writeInt(segments.numSegments() * numLongsPerSegment);
for(int i=0;i<segments.numSegments();i++) {
AtomicLongArray arr = segments.getSegment(i);
for(int j=0;j<arr.length();j++) {
os.writeLong(arr.get(j));
}
}
}
equals & hashCode & toString
@Override
public boolean equals(Object obj) {
if(!(obj instanceof ThreadSafeBitSet))
return false;
ThreadSafeBitSet other = (ThreadSafeBitSet)obj;
if(other.log2SegmentSize != log2SegmentSize)
throw new IllegalArgumentException("Segment sizes must be the same");
ThreadSafeBitSetSegments thisSegments = this.segments.get();
ThreadSafeBitSetSegments otherSegments = other.segments.get();
for(int i=0;i<thisSegments.numSegments();i++) {
AtomicLongArray thisArray = thisSegments.getSegment(i);
AtomicLongArray otherArray = (i < otherSegments.numSegments()) ? otherSegments.getSegment(i) : null;
for(int j=0;j<thisArray.length();j++) {
long thisLong = thisArray.get(j);
long otherLong = (otherArray == null) ? 0 : otherArray.get(j);
if(thisLong != otherLong)
return false;
}
}
for(int i=thisSegments.numSegments();i<otherSegments.numSegments();i++) {
AtomicLongArray otherArray = otherSegments.getSegment(i);
for(int j=0;j<otherArray.length();j++) {
long l = otherArray.get(j);
if(l != 0)
return false;
}
}
return true;
}
@Override
public int hashCode() {
int result = log2SegmentSize;
result = 31 * result + Arrays.hashCode(segments.get().segments);
return result;
}
/**
* @return a new BitSet with same bits set
*/
public BitSet toBitSet() {
BitSet resultSet = new BitSet();
int ordinal = this.nextSetBit(0);
while(ordinal!=-1) {
resultSet.set(ordinal);
ordinal = this.nextSetBit(ordinal + 1);
}
return resultSet;
}
@Override
public String toString() {
return toBitSet().toString();
}
HashCodes
HashCodes is mainly used by Hollow to compute hash values for collection types such as Set and Map.
hashCode
// 空歌白石: the perturbation seed 0xeab524b9 used by MurmurHash
private static final int MURMURHASH_SEED = 0xeab524b9;
public static int hashCode(ByteDataArray data) {
return hashCode(data.getUnderlyingArray(), 0, (int) data.length());
}
public static int hashCode(final String data) {
if(data == null)
return -1;
int arrayLen = calculateByteArrayLength(data);
if(arrayLen == data.length()) {
return hashCode(new ByteData() {
@Override
public byte get(long position) {
return (byte)(data.charAt((int)position) & 0x7F);
}
}, 0, data.length());
} else {
byte[] array = createByteArrayFromString(data, arrayLen);
return hashCode(array);
}
}
public static int hashCode(byte[] data) {
return hashCode(new ArrayByteData(data), 0, data.length);
}
private static int calculateByteArrayLength(String data) {
int length = data.length();
for(int i=0;i<data.length();i++) {
if(data.charAt(i) > 0x7F)
length += VarInt.sizeOfVInt(data.charAt(i)) - 1;
}
return length;
}
private static byte[] createByteArrayFromString(String data, int arrayLen) {
byte array[] = new byte[arrayLen];
int pos = 0;
for(int i=0;i<data.length();i++) {
pos = VarInt.writeVInt(array, pos, data.charAt(i));
}
return array;
}
/**
* MurmurHash3. Adapted from:<p>
*
* https://github.com/yonik/java_util/blob/master/src/util/hash/MurmurHash3.java<p>
*
* On 11/19/2013 the license for this file read:<p>
*
* The MurmurHash3 algorithm was created by Austin Appleby. This java port was authored by
* Yonik Seeley and is placed into the public domain. The author hereby disclaims copyright
* to this source code.
* <p>
* This produces exactly the same hash values as the final C++
* version of MurmurHash3 and is thus suitable for producing the same hash values across
* platforms.
* <p>
* The 32 bit x86 version of this hash should be the fastest variant for relatively short keys like ids.
* <p>
* Note - The x86 and x64 versions do _not_ produce the same results, as the
* algorithms are optimized for their respective platforms.
* <p>
* See http://github.com/yonik/java_util for future updates to this file.
*
* @param data the data to hash
* @param offset the offset
* @param len the length
* @return the hash code
*/
public static int hashCode(ByteData data, long offset, int len) {
final int c1 = 0xcc9e2d51;
final int c2 = 0x1b873593;
int h1 = MURMURHASH_SEED;
long roundedEnd = offset + (len & 0xfffffffffffffffcL); // round down to
// 4 byte block
for (long i = offset; i < roundedEnd; i += 4) {
// little endian load order
int k1 = (data.get(i) & 0xff) | ((data.get(i + 1) & 0xff) << 8) | ((data.get(i + 2) & 0xff) << 16) | (data.get(i + 3) << 24);
k1 *= c1;
k1 = (k1 << 15) | (k1 >>> 17); // ROTL32(k1,15);
k1 *= c2;
h1 ^= k1;
h1 = (h1 << 13) | (h1 >>> 19); // ROTL32(h1,13);
h1 = h1 * 5 + 0xe6546b64;
}
// tail
int k1 = 0;
switch (len & 0x03) {
case 3:
k1 = (data.get(roundedEnd + 2) & 0xff) << 16;
// fallthrough
case 2:
k1 |= (data.get(roundedEnd + 1) & 0xff) << 8;
// fallthrough
case 1:
k1 |= (data.get(roundedEnd) & 0xff);
k1 *= c1;
k1 = (k1 << 15) | (k1 >>> 17); // ROTL32(k1,15);
k1 *= c2;
h1 ^= k1;
}
// finalization
h1 ^= len;
// fmix(h1);
h1 ^= h1 >>> 16;
h1 *= 0x85ebca6b;
h1 ^= h1 >>> 13;
h1 *= 0xc2b2ae35;
h1 ^= h1 >>> 16;
return h1;
}
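The little-endian 4-byte block load in the main loop can be checked in isolation:

```java
public class LittleEndianLoad {
    // The 4-byte little-endian load used per block in the MurmurHash3 loop.
    static int loadLE(byte[] d, int i) {
        return (d[i] & 0xff)
             | ((d[i + 1] & 0xff) << 8)
             | ((d[i + 2] & 0xff) << 16)
             | (d[i + 3] << 24); // top byte needs no mask: the shift discards overflow
    }

    public static void main(String[] args) {
        byte[] bytes = {0x01, 0x02, 0x03, 0x04};
        System.out.printf("0x%08x%n", loadLE(bytes, 0)); // 0x04030201
    }
}
```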
hash
public static int hashLong(long key) {
key = (~key) + (key << 18);
key ^= (key >>> 31);
key *= 21;
key ^= (key >>> 11);
key += (key << 6);
key ^= (key >>> 22);
return (int) key;
}
public static int hashInt(int key) {
key = ~key + (key << 15);
key = key ^ (key >>> 12);
key = key + (key << 2);
key = key ^ (key >>> 4);
key = key * 2057;
key = key ^ (key >>> 16);
return key;
}
hashTableSize
/**
* Determine size of hash table capable of storing the specified number of elements with a load
* factor applied.
*
* @param numElements number of elements to be stored in the table
* @return size of hash table, always a power of 2
* @throws IllegalArgumentException when numElements is negative or exceeds
* {@link com.netflix.hollow.core.HollowConstants#HASH_TABLE_MAX_SIZE}
*/
public static int hashTableSize(int numElements) throws IllegalArgumentException {
if (numElements < 0) {
throw new IllegalArgumentException("cannot be negative; numElements="+numElements);
} else if (numElements > HASH_TABLE_MAX_SIZE) {
throw new IllegalArgumentException("exceeds maximum number of buckets; numElements="+numElements);
}
if (numElements == 0)
return 1;
if (numElements < 3)
return numElements * 2;
// Apply load factor to number of elements and determine next
// largest power of 2 that fits in an int
int sizeAfterLoadFactor = (int)((long)numElements * 10 / 7);
int bits = 32 - Integer.numberOfLeadingZeros(sizeAfterLoadFactor - 1);
return 1 << bits;
}
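A worked example of the sizing logic, mirroring the method above; the max-size bound here is an illustrative stand-in for HollowConstants.HASH_TABLE_MAX_SIZE:

```java
public class HashTableSizeDemo {
    static final int HASH_TABLE_MAX_SIZE = 1 << 30; // illustrative bound

    // Same logic as HashCodes.hashTableSize: ~70% load factor, power-of-two size.
    static int hashTableSize(int numElements) {
        if (numElements < 0 || numElements > HASH_TABLE_MAX_SIZE)
            throw new IllegalArgumentException("numElements=" + numElements);
        if (numElements == 0) return 1;
        if (numElements < 3) return numElements * 2;
        int sizeAfterLoadFactor = (int) ((long) numElements * 10 / 7);
        int bits = 32 - Integer.numberOfLeadingZeros(sizeAfterLoadFactor - 1);
        return 1 << bits;
    }

    public static void main(String[] args) {
        System.out.println(hashTableSize(100)); // 100 * 10 / 7 = 142 -> 256
        System.out.println(hashTableSize(7));   // 10 -> 16
    }
}
```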
Conclusion
This article walked through Hollow's byte-level compression work. The VarInt and ZigZag algorithms are encoding building blocks used by many compression schemes, and ByteDataArray, ThreadSafeBitSet, and HashCodes all take full advantage of them.