背景
SortedDocValues和BinaryDocValues存储的都是二进制,并且同名字段在每个文档中最多只能有一个值。不同的是,SortedDocValues在存储的时候会对所有value做全局排序,所以存储结构会复杂很多,后面我们会重点分析。
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.BinaryDocValuesField;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public class DocValueDemo {
    public static void main(String[] args) throws IOException {
        Directory directory = FSDirectory.open(new File("D:\\code\\lucene-9.1.0-learning\\data").toPath());
        WhitespaceAnalyzer analyzer = new WhitespaceAnalyzer();
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
        indexWriterConfig.setUseCompoundFile(false);
        IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);
        Document document = new Document();
        // 一个doc不能有同名的BinaryDocValuesField,存储二进制
        document.add(new BinaryDocValuesField("name", new BytesRef("zjc".getBytes(StandardCharsets.UTF_8))));
        // 一个doc不能有同名的SortedDocValuesField,存储二进制,存储的时候会全局排序
        document.add(new SortedDocValuesField("nickname", new BytesRef("alpha".getBytes(StandardCharsets.UTF_8))));
        indexWriter.addDocument(document);
        indexWriter.flush();
        indexWriter.commit();
        indexWriter.close();
    }
}
前置知识
本文涉及到的一些知识在之前的文章中都做了详细的介绍,后续碰到时不再重复介绍。
- DirectMonotonicWriter:用来压缩存储单调递增的long集合,详见《多值编码压缩算法》
- BytesRefHash:存储字符串,并且按字符串出现的顺序分配唯一的id,相同的字符串不会重复存储,BytesRefHash的具体介绍详见《内存中倒排信息的构建》
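下面给出两者基本用法的简单示意(基于Lucene 9.x的API,类名PrerequisiteDemo只是演示用的假设名字,仅说明语义,细节以实际版本为准):
import org.apache.lucene.store.ByteBuffersDataOutput;
import org.apache.lucene.store.ByteBuffersIndexOutput;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.BytesRefHash;
import org.apache.lucene.util.packed.DirectMonotonicWriter;

public class PrerequisiteDemo {
    public static void main(String[] args) throws Exception {
        // BytesRefHash:按出现顺序为不重复的字符串分配id,重复add返回 -(id + 1)
        BytesRefHash hash = new BytesRefHash();
        int first = hash.add(new BytesRef("alpha"));  // 第一次出现,返回新分配的id:0
        int dup = hash.add(new BytesRef("alpha"));    // 重复出现,返回 -1,真实id为 -dup - 1
        int[] sortedIds = hash.sort();                // 前 hash.size() 个元素是按字符串字典序排序后的id

        // DirectMonotonicWriter:压缩存储单调递增的long序列,元信息和数据分别写到两个输出中
        ByteBuffersDataOutput metaBuffer = new ByteBuffersDataOutput();
        ByteBuffersDataOutput dataBuffer = new ByteBuffersDataOutput();
        try (ByteBuffersIndexOutput meta = new ByteBuffersIndexOutput(metaBuffer, "demo", "meta");
             ByteBuffersIndexOutput data = new ByteBuffersIndexOutput(dataBuffer, "demo", "data")) {
            DirectMonotonicWriter writer = DirectMonotonicWriter.getInstance(meta, data, 3, 16);
            writer.add(0);   // 必须单调递增
            writer.add(64);
            writer.add(128);
            writer.finish();
        }
        System.out.println(first + " " + dup + " " + sortedIds[0]);
    }
}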
存储方案
SortedDocValues的存储涉及到两个概念:TermDict和TermsIndex。
因为SortedDocValues是有序的,所以支持根据序号获取value,和根据value查询序号。TermDict是用来根据序号获取value的,TermsIndex是加速根据value获取序号的索引。从逻辑上理解,整个存储结构可以类比跳表,如下图所示:
TermDict
从上面的示意图可以看出,如果没有其他的辅助结构,根据序号获取value就只能从头遍历。为了避免遍历操作,可以为每个value建立一条索引信息指向它,这样就能直接根据序号定位到value;但是如果对全部的value都建立索引,索引的大小会随数据量线性增长。因此Lucene做了权衡:对数据进行分块,只为每个块建立索引。此时,先根据序号计算出所在的块,再在块内顺序遍历定位目标value。
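用一小段示意代码说明这种"先定位block、再在block内遍历"的方式(这里假设block大小为64,对应后文源码中的TERMS_DICT_BLOCK_LZ4_SHIFT = 6,类名OrdToBlockDemo只是演示用的名字):
public class OrdToBlockDemo {
    public static void main(String[] args) {
        int blockShift = 6;                      // 2^6 = 64,即每个block有64个value
        long blockMask = (1L << blockShift) - 1; // 63
        long ord = 200;                          // 目标value的序号
        long blockIndex = ord >>> blockShift;    // 200 / 64 = 3,目标在第4个block中
        long withinBlock = ord & blockMask;      // 200 % 64 = 8,block内的偏移,需要从block第一个value顺序遍历到这里
        System.out.println(blockIndex + ", " + withinBlock);
    }
}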
不仅如此,为了提高空间利用率,Lucene还对block进行压缩存储。这里并不是直接用LZ4等通用压缩算法去压缩原始的value,而是先利用SortedDocValues中block内数据有序这一特点,通过共享前缀的方式避免公共前缀的重复存储:在block中,只有第一个value是完整存储的,后面的value只存储与前一个value的公共前缀长度、后缀长度以及后缀内容;前缀编码之后的数据再整体用LZ4压缩。
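下面是一个前缀共享编码的简化示意,只演示单个value如何拆成公共前缀长度、后缀长度和后缀内容,token字节的布局与后文源码一致,LZ4压缩部分省略(类名PrefixShareDemo只是演示用的假设):
import java.nio.charset.StandardCharsets;

public class PrefixShareDemo {
    public static void main(String[] args) {
        byte[] previous = "abcdef".getBytes(StandardCharsets.UTF_8); // 前一个value
        byte[] current = "abddef".getBytes(StandardCharsets.UTF_8);  // 当前value
        // 计算和前一个value的公共前缀长度
        int prefixLength = 0;
        int limit = Math.min(previous.length, current.length);
        while (prefixLength < limit && previous[prefixLength] == current[prefixLength]) {
            prefixLength++;
        }
        int suffixLength = current.length - prefixLength;
        // 公共前缀长度和后缀长度编码在一个token字节中:低4位是prefixLength,高4位是suffixLength - 1
        byte token = (byte) (Math.min(prefixLength, 15) | (Math.min(15, suffixLength - 1) << 4));
        // 实际写入的内容:token(必要时再补两个VInt)以及后缀字节,公共前缀不需要重复存储
        System.out.println("token=" + token + ", prefixLength=" + prefixLength
                + ", suffix=" + new String(current, prefixLength, suffixLength, StandardCharsets.UTF_8));
    }
}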
TermsIndex
TermsIndex其实是TermDict的一种补充结构,用来加速根据value获取序号。在TermDict中,查询某个value需要在block内遍历,遍历是比较慢的,因此TermDict的block比较小,在当前版本中是64个value。如果value数量比较小,TermDict本身可以用每个block的第一个value来充当索引;但是value数量比较大的时候,每64个value一条索引会导致索引数据量比较大,因此有了TermsIndex。
TermsIndex用来加速定位目标value在TermDict的哪个block中,或者是哪几个block中。每1024个value构建一条索引。为了压缩索引的大小,每条索引并不完整存储对应的value,而是只存储它和它前面紧邻的那个value(也就是前一组的最后一个value)的最长公共前缀再加上后缀的第一个字节,也就是在不影响排序(二分查找)的前提下能区分两者的最短前缀。举个例子,假设有以下7个有序的value:
a
ab
abcd
abcdef
abddef
abdeee
abefff
假设我们是每3个value构建一个索引,得到:
value | index |
---------|----------|
a | (空) |
ab |  |
abcd |  |
abcdef | abcde |
abddef |  |
abdeee |  |
abefff | abe |
注意:第一条索引不需要存储任何内容(长度为0);"abcdef"这条索引存储的是它和前一个value "abcd" 的公共前缀再加一个字节,即"abcde";"abefff"这条索引存储的是它和前一个value "abdeee" 的公共前缀"ab"再加一个字节,即"abe"。
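下面用一小段示意代码说明索引内容的计算方式,StringHelper.sortKeyLength就是后文源码中用到的工具方法,取值沿用上面表格中的最后一条索引(类名TermsIndexSortKeyDemo只是演示用的假设):
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.StringHelper;

public class TermsIndexSortKeyDemo {
    public static void main(String[] args) {
        BytesRef previous = new BytesRef("abdeee"); // 索引对应value的前一个value
        BytesRef current = new BytesRef("abefff");  // 当前索引对应的value
        // 返回能把current和previous区分开的最短前缀长度:公共前缀长度2再加1,即3
        int sortKeyLength = StringHelper.sortKeyLength(previous, current);
        // 索引中实际写入的是current的前sortKeyLength个字节,也就是"abe"
        System.out.println(new BytesRef(current.bytes, current.offset, sortKeyLength).utf8ToString());
    }
}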
文件格式
dvm
整体结构
字段详解
- Numeric:binary的id存储和NumericDocValues一模一样,因此和NumericDocValues的dvm字段一样。
- TermDict
  - ValueCount:去重后的value总数
  - TermDictAddressBlockShift:存储TermDict各个block起始位置的DirectMonotonicWriter的分块参数
  - TermDictAddressBlockMetas:DirectMonotonicWriter的元信息
  - MaxLength:最长的binary的长度
  - MaxBlockLength:最大的block的大小
  - TermDictValueBlocksOffset:TermDict数据在dvd文件中的起始位置
  - TermDictValueBlocksLength:TermDict数据在dvd文件中的长度
  - TermDictAddressBlocksOffset:TermDict各个block的起始位置数据在dvd文件中的起始位置
  - TermDictAddressBlocksLength:TermDict各个block的起始位置数据在dvd文件中的长度
- TermsIndex
  - TermsIndexShift:决定TermsIndex的block的大小
  - TermsIndexAddressBlockMetas:存储TermsIndex各个block起始位置的DirectMonotonicWriter的元信息
  - TermsIndexValueBlocksOffset:TermsIndex数据在dvd文件中的起始位置
  - TermsIndexValueBlocksLength:TermsIndex数据在dvd文件中的长度
  - TermsIndexAddressBlocksOffset:TermsIndex各个block的起始位置数据在dvd文件中的起始位置
  - TermsIndexAddressBlocksLength:TermsIndex各个block的起始位置数据在dvd文件中的长度
dvd
整体结构
字段详解
- Numeric:binary的id集合,使用和NumericDocValues一样的存储方式
- TermDictBlock:TermDict数据block
  - FirstTermLength:block中第一个term的长度
  - FirstTerm:block中第一个term
  - Length:block剩余数据在LZ4压缩前的长度
  - Value:其他的term(前缀编码后整体用LZ4压缩)
    - LengthInfo:长度信息
      - LengthCode:byte类型,高4位是suffixLength - 1,低4位是prefixLength
      - PrefixLength:如果prefixLength大于等于15,则额外存储prefixLength - 15
      - SuffixLength:如果suffixLength大于等于16,则额外存储suffixLength - 16
    - Suffix:suffix的内容
- TermDictBlockAddress:TermDictBlock在dvd文件中的起始位置
- TermsIndexBlock:TermsIndex数据block
  - PrefixWithOneByteOffSuffix:block第一个binary和前一个binary的公共前缀再加上后缀的第一个字节
- TermsIndexBlockAddress:TermsIndexBlock在dvd文件中的起始位置
源码解析
构建
数据收集
临时存储SortedDocValues的工具是BytesRefHash:所有的binary去重后存放在BytesRefHash中,并按出现顺序分配唯一的id;每个doc对应的id则临时存放在PackedLongValues中。
在持久化的时候,会把这些id和包含该field的docID集合,根据是否需要对doc进行排序,封装到BufferedSortedDocValues或者SortingSortedDocValues中。
SortedDocValuesWriter
class SortedDocValuesWriter extends DocValuesWriter<SortedDocValues> {
// 存储所有的binary,并且会为每个binary分配一个唯一id,id是按binary出现顺序分配
final BytesRefHash hash;
// 临时存储所有的id
private final PackedLongValues.Builder pending;
// 包含此字段的doc集合
private final DocsWithFieldSet docsWithField;
private final Counter iwBytesUsed;
private long bytesUsed;
private final FieldInfo fieldInfo;
private int lastDocID = -1;
// 存储所有的id
private PackedLongValues finalOrds;
// 根据binary排序得到的id列表
private int[] finalSortedValues;
// 下标是id,值是其排序的序号
private int[] finalOrdMap;
public SortedDocValuesWriter(FieldInfo fieldInfo, Counter iwBytesUsed, ByteBlockPool pool) {
this.fieldInfo = fieldInfo;
this.iwBytesUsed = iwBytesUsed;
hash =
new BytesRefHash(
pool,
BytesRefHash.DEFAULT_CAPACITY,
new DirectBytesStartArray(BytesRefHash.DEFAULT_CAPACITY, iwBytesUsed));
pending = PackedLongValues.deltaPackedBuilder(PackedInts.COMPACT);
docsWithField = new DocsWithFieldSet();
bytesUsed = pending.ramBytesUsed() + docsWithField.ramBytesUsed();
iwBytesUsed.addAndGet(bytesUsed);
}
// 新增一个binary
public void addValue(int docID, BytesRef value) {
if (docID <= lastDocID) { // 说明一个docID同个字段只能有一个value
throw new IllegalArgumentException(
"DocValuesField \""
+ fieldInfo.name
+ "\" appears more than once in this document (only one value is allowed per field)");
}
if (value == null) {
throw new IllegalArgumentException(
"field \"" + fieldInfo.name + "\": null value not allowed");
}
if (value.length > (BYTE_BLOCK_SIZE - 2)) { // binary的长度有限制
throw new IllegalArgumentException(
"DocValuesField \""
+ fieldInfo.name
+ "\" is too large, must be <= "
+ (BYTE_BLOCK_SIZE - 2));
}
// 新增一个docValue
addOneValue(value);
// 记录包含此Field的docID
docsWithField.add(docID);
lastDocID = docID;
}
private void addOneValue(BytesRef value) {
// 如果value是第一次出现,就会为其分配一个唯一的id,如果value之前出现过,则返回 -(id + 1)
int termID = hash.add(value);
if (termID < 0) { // 说明之前出现过这个value
termID = -termID - 1;
} else {
iwBytesUsed.addAndGet(2 * Integer.BYTES);
}
// 记录id
pending.add(termID);
updateBytesUsed();
}
private void updateBytesUsed() {
final long newBytesUsed = pending.ramBytesUsed() + docsWithField.ramBytesUsed();
iwBytesUsed.addAndGet(newBytesUsed - bytesUsed);
bytesUsed = newBytesUsed;
}
@Override
SortedDocValues getDocValues() {
int valueCount = hash.size();
if (finalSortedValues == null) {
updateBytesUsed();
assert finalOrdMap == null && finalOrds == null;
// 根据binary排序得到的id列表
finalSortedValues = hash.sort();
finalOrds = pending.build();
finalOrdMap = new int[valueCount];
}
for (int ord = 0; ord < valueCount; ord++) {
finalOrdMap[finalSortedValues[ord]] = ord;
}
return new BufferedSortedDocValues(
hash, finalOrds, finalSortedValues, finalOrdMap, docsWithField.iterator());
}
private int[] sortDocValues(int maxDoc, Sorter.DocMap sortMap, SortedDocValues oldValues)
throws IOException {
int[] ords = new int[maxDoc];
Arrays.fill(ords, -1);
int docID;
while ((docID = oldValues.nextDoc()) != NO_MORE_DOCS) {
int newDocID = sortMap.oldToNew(docID);
ords[newDocID] = oldValues.ordValue();
}
return ords;
}
@Override
public void flush(SegmentWriteState state, Sorter.DocMap sortMap, DocValuesConsumer dvConsumer)
throws IOException {
final int valueCount = hash.size();
if (finalOrds == null) {
updateBytesUsed();
finalSortedValues = hash.sort();
finalOrds = pending.build();
finalOrdMap = new int[valueCount];
for (int ord = 0; ord < valueCount; ord++) {
finalOrdMap[finalSortedValues[ord]] = ord;
}
}
final int[] sorted;
if (sortMap != null) {
sorted =
sortDocValues(
state.segmentInfo.maxDoc(),
sortMap,
new BufferedSortedDocValues(
hash, finalOrds, finalSortedValues, finalOrdMap, docsWithField.iterator()));
} else {
sorted = null;
}
dvConsumer.addSortedField(
fieldInfo,
new EmptyDocValuesProducer() {
@Override
public SortedDocValues getSorted(FieldInfo fieldInfoIn) {
if (fieldInfoIn != fieldInfo) {
throw new IllegalArgumentException("wrong fieldInfo");
}
final SortedDocValues buf =
new BufferedSortedDocValues(
hash, finalOrds, finalSortedValues, finalOrdMap, docsWithField.iterator());
if (sorted == null) {
return buf;
}
return new SortingSortedDocValues(buf, sorted);
}
});
}
}
持久化
相比于之前介绍的几种DocValues,SortedDocValues的存储结构会比较复杂一些:
- 使用和NumericDocValues一样的存储结构,存储所有binary的id
- 构建TermDict
- 构建TermsIndex
@Override
public void addSortedField(FieldInfo field, DocValuesProducer valuesProducer) throws IOException {
meta.writeInt(field.number);
meta.writeByte(Lucene90DocValuesFormat.SORTED);
doAddSortedField(field, valuesProducer);
}
private void doAddSortedField(FieldInfo field, DocValuesProducer valuesProducer)
throws IOException {
// 每个doc的SortedDocValues只存储其id,id也是整型,所以和NumericDocValues的存储方案一样
writeValues(
field,
new EmptyDocValuesProducer() {
@Override
public SortedNumericDocValues getSortedNumeric(FieldInfo field) throws IOException {
SortedDocValues sorted = valuesProducer.getSorted(field);
NumericDocValues sortedOrds =
new NumericDocValues() {
@Override
public long longValue() throws IOException {
return sorted.ordValue();
}
@Override
public boolean advanceExact(int target) throws IOException {
return sorted.advanceExact(target);
}
@Override
public int docID() {
return sorted.docID();
}
@Override
public int nextDoc() throws IOException {
return sorted.nextDoc();
}
@Override
public int advance(int target) throws IOException {
return sorted.advance(target);
}
@Override
public long cost() {
return sorted.cost();
}
};
return DocValues.singleton(sortedOrds);
}
},
true);
addTermsDict(DocValues.singleton(valuesProducer.getSorted(field)));
}
private void addTermsDict(SortedSetDocValues values) throws IOException {
// value总数
final long size = values.getValueCount();
meta.writeVLong(size);
// 63
int blockMask = Lucene90DocValuesFormat.TERMS_DICT_BLOCK_LZ4_MASK;
// 6
int shift = Lucene90DocValuesFormat.TERMS_DICT_BLOCK_LZ4_SHIFT;
meta.writeInt(DIRECT_MONOTONIC_BLOCK_SHIFT);
ByteBuffersDataOutput addressBuffer = new ByteBuffersDataOutput();
ByteBuffersIndexOutput addressOutput =
new ByteBuffersIndexOutput(addressBuffer, "temp", "temp");
long numBlocks = (size + blockMask) >>> shift;
DirectMonotonicWriter writer =
DirectMonotonicWriter.getInstance(
meta, addressOutput, numBlocks, DIRECT_MONOTONIC_BLOCK_SHIFT);
// 记录前一个处理的value,用来算公共前缀
BytesRefBuilder previous = new BytesRefBuilder();
long ord = 0;
long start = data.getFilePointer();
int maxLength = 0, maxBlockLength = 0;
TermsEnum iterator = values.termsEnum();
LZ4.FastCompressionHashTable ht = new LZ4.FastCompressionHashTable();
ByteArrayDataOutput bufferedOutput = new ByteArrayDataOutput(termsDictBuffer);
// 遍历所有的value
for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
if ((ord & blockMask) == 0) { // 每逢64的倍数,也就是block的第一个value
if (bufferedOutput.getPosition() > 0) {
maxBlockLength =
Math.max(maxBlockLength, compressAndGetTermsDictBlockLength(bufferedOutput, ht));
bufferedOutput.reset(termsDictBuffer);
}
// block的起始位置
writer.add(data.getFilePointer() - start);
// block第一个value都是完整存储的
data.writeVInt(term.length);
data.writeBytes(term.bytes, term.offset, term.length);
} else {
// 公共前缀长度
final int prefixLength = StringHelper.bytesDifference(previous.get(), term);
// 剩下的后缀长度
final int suffixLength = term.length - prefixLength;
// Will write (suffixLength + 1 byte + 2 vint) bytes. Grow the buffer in need.
bufferedOutput = maybeGrowBuffer(bufferedOutput, suffixLength + 11);
// 低4位存储prefixLength(最大15),高4位存储suffixLength - 1(最大15)
bufferedOutput.writeByte(
(byte) (Math.min(prefixLength, 15) | (Math.min(15, suffixLength - 1) << 4)));
if (prefixLength >= 15) { // 如果前缀长度大于等于15,需要额外存储前缀长度差值
bufferedOutput.writeVInt(prefixLength - 15);
}
if (suffixLength >= 16) { // 如果后缀长度大于等于16,需要额外存储后缀长度差值
bufferedOutput.writeVInt(suffixLength - 16);
}
// 写入后缀
bufferedOutput.writeBytes(term.bytes, term.offset + prefixLength, suffixLength);
}
maxLength = Math.max(maxLength, term.length);
previous.copyBytes(term);
++ord;
}
// Compress and write out the last block
if (bufferedOutput.getPosition() > 0) {
maxBlockLength =
Math.max(maxBlockLength, compressAndGetTermsDictBlockLength(bufferedOutput, ht));
}
writer.finish();
meta.writeInt(maxLength);
// Write one more int for storing max block length.
meta.writeInt(maxBlockLength);
meta.writeLong(start);
meta.writeLong(data.getFilePointer() - start);
start = data.getFilePointer();
addressBuffer.copyTo(data);
meta.writeLong(start);
meta.writeLong(data.getFilePointer() - start);
// Now write the reverse terms index
writeTermsIndex(values);
}
private int compressAndGetTermsDictBlockLength(
ByteArrayDataOutput bufferedOutput, LZ4.FastCompressionHashTable ht) throws IOException {
int uncompressedLength = bufferedOutput.getPosition();
data.writeVInt(uncompressedLength);
long before = data.getFilePointer();
LZ4.compress(termsDictBuffer, 0, uncompressedLength, data, ht);
int compressedLength = (int) (data.getFilePointer() - before);
// Block length will be used for creating buffer for decompression, one corner case is that
// compressed length might be bigger than un-compressed length, so just return the bigger one.
return Math.max(uncompressedLength, compressedLength);
}
private ByteArrayDataOutput maybeGrowBuffer(ByteArrayDataOutput bufferedOutput, int termLength) {
int pos = bufferedOutput.getPosition(), originalLength = termsDictBuffer.length;
if (pos + termLength >= originalLength - 1) {
termsDictBuffer = ArrayUtil.grow(termsDictBuffer, originalLength + termLength);
bufferedOutput = new ByteArrayDataOutput(termsDictBuffer, pos, termsDictBuffer.length - pos);
}
return bufferedOutput;
}
private void writeTermsIndex(SortedSetDocValues values) throws IOException {
final long size = values.getValueCount();
// 10
meta.writeInt(Lucene90DocValuesFormat.TERMS_DICT_REVERSE_INDEX_SHIFT);
long start = data.getFilePointer();
long numBlocks =
1L
+ ((size + Lucene90DocValuesFormat.TERMS_DICT_REVERSE_INDEX_MASK)
>>> Lucene90DocValuesFormat.TERMS_DICT_REVERSE_INDEX_SHIFT);
ByteBuffersDataOutput addressBuffer = new ByteBuffersDataOutput();
DirectMonotonicWriter writer;
try (ByteBuffersIndexOutput addressOutput =
new ByteBuffersIndexOutput(addressBuffer, "temp", "temp")) {
writer =
DirectMonotonicWriter.getInstance(
meta, addressOutput, numBlocks, DIRECT_MONOTONIC_BLOCK_SHIFT);
TermsEnum iterator = values.termsEnum();
BytesRefBuilder previous = new BytesRefBuilder();
long offset = 0;
long ord = 0;
for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
if ((ord & Lucene90DocValuesFormat.TERMS_DICT_REVERSE_INDEX_MASK) == 0) { // 每1024个
writer.add(offset);
// 和前一个term最长公共前缀+1
final int sortKeyLength;
if (ord == 0) {
sortKeyLength = 0;
} else {
sortKeyLength = StringHelper.sortKeyLength(previous.get(), term);
}
offset += sortKeyLength;
data.writeBytes(term.bytes, term.offset, sortKeyLength);
} else if ((ord & Lucene90DocValuesFormat.TERMS_DICT_REVERSE_INDEX_MASK)
== Lucene90DocValuesFormat.TERMS_DICT_REVERSE_INDEX_MASK) {
previous.copyBytes(term);
}
++ord;
}
writer.add(offset);
writer.finish();
meta.writeLong(start);
meta.writeLong(data.getFilePointer() - start);
start = data.getFilePointer();
addressBuffer.copyTo(data);
meta.writeLong(start);
meta.writeLong(data.getFilePointer() - start);
}
}
读取
读取逻辑比较简单,最核心的是读取构建TermDict和TermsIndex所需的元信息。下面我们简单看下这个过程的源码,至于TermDict具体怎么使用,后面再说。
private SortedEntry readSorted(IndexInput meta) throws IOException {
SortedEntry entry = new SortedEntry();
entry.ordsEntry = new NumericEntry();
readNumeric(meta, entry.ordsEntry);
entry.termsDictEntry = new TermsDictEntry();
readTermDict(meta, entry.termsDictEntry);
return entry;
}
private static void readTermDict(IndexInput meta, TermsDictEntry entry) throws IOException {
entry.termsDictSize = meta.readVLong();
final int blockShift = meta.readInt();
final long addressesSize =
(entry.termsDictSize + (1L << TERMS_DICT_BLOCK_LZ4_SHIFT) - 1)
>>> TERMS_DICT_BLOCK_LZ4_SHIFT;
entry.termsAddressesMeta = DirectMonotonicReader.loadMeta(meta, addressesSize, blockShift);
entry.maxTermLength = meta.readInt();
entry.maxBlockLength = meta.readInt();
entry.termsDataOffset = meta.readLong();
entry.termsDataLength = meta.readLong();
entry.termsAddressesOffset = meta.readLong();
entry.termsAddressesLength = meta.readLong();
entry.termsDictIndexShift = meta.readInt();
final long indexSize =
(entry.termsDictSize + (1L << entry.termsDictIndexShift) - 1) >>> entry.termsDictIndexShift;
entry.termsIndexAddressesMeta = DirectMonotonicReader.loadMeta(meta, 1 + indexSize, blockShift);
entry.termsIndexOffset = meta.readLong();
entry.termsIndexLength = meta.readLong();
entry.termsIndexAddressesOffset = meta.readLong();
entry.termsIndexAddressesLength = meta.readLong();
}
@Override
public SortedDocValues getSorted(FieldInfo field) throws IOException {
SortedEntry entry = sorted.get(field.name);
return getSorted(entry);
}
private SortedDocValues getSorted(SortedEntry entry) throws IOException {
// Specialize the common case for ordinals: single block of packed integers.
final NumericEntry ordsEntry = entry.ordsEntry;
if (ordsEntry.blockShift < 0 // single block
&& ordsEntry.bitsPerValue > 0) { // more than 1 value
if (ordsEntry.gcd != 1 || ordsEntry.minValue != 0 || ordsEntry.table != null) {
throw new IllegalStateException("Ordinals shouldn't use GCD, offset or table compression");
}
final RandomAccessInput slice =
data.randomAccessSlice(ordsEntry.valuesOffset, ordsEntry.valuesLength);
final LongValues values =
getDirectReaderInstance(slice, ordsEntry.bitsPerValue, 0L, ordsEntry.numValues);
if (ordsEntry.docsWithFieldOffset == -1) { // dense
return new BaseSortedDocValues(entry) {
private final int maxDoc = Lucene90DocValuesProducer.this.maxDoc;
private int doc = -1;
@Override
public int ordValue() throws IOException {
return (int) values.get(doc);
}
@Override
public boolean advanceExact(int target) throws IOException {
doc = target;
return true;
}
@Override
public int docID() {
return doc;
}
@Override
public int nextDoc() throws IOException {
return advance(doc + 1);
}
@Override
public int advance(int target) throws IOException {
if (target >= maxDoc) {
return doc = NO_MORE_DOCS;
}
return doc = target;
}
@Override
public long cost() {
return maxDoc;
}
};
} else if (ordsEntry.docsWithFieldOffset >= 0) { // sparse but non-empty
final IndexedDISI disi =
new IndexedDISI(
data,
ordsEntry.docsWithFieldOffset,
ordsEntry.docsWithFieldLength,
ordsEntry.jumpTableEntryCount,
ordsEntry.denseRankPower,
ordsEntry.numValues);
return new BaseSortedDocValues(entry) {
@Override
public int ordValue() throws IOException {
return (int) values.get(disi.index());
}
@Override
public boolean advanceExact(int target) throws IOException {
return disi.advanceExact(target);
}
@Override
public int docID() {
return disi.docID();
}
@Override
public int nextDoc() throws IOException {
return disi.nextDoc();
}
@Override
public int advance(int target) throws IOException {
return disi.advance(target);
}
@Override
public long cost() {
return disi.cost();
}
};
}
}
final NumericDocValues ords = getNumeric(entry.ordsEntry);
return new BaseSortedDocValues(entry) {
@Override
public int ordValue() throws IOException {
return (int) ords.longValue();
}
@Override
public boolean advanceExact(int target) throws IOException {
return ords.advanceExact(target);
}
@Override
public int docID() {
return ords.docID();
}
@Override
public int nextDoc() throws IOException {
return ords.nextDoc();
}
@Override
public int advance(int target) throws IOException {
return ords.advance(target);
}
@Override
public long cost() {
return ords.cost();
}
};
}
TermsDict
在TermsDict中,重点是支持两种查询方式:
- 根据value查询序号:首先利用TermsIndex,通过二分查找快速确定value所在的序号范围;然后在这个序号范围对应的TermDict block之间,根据每个block的第一个value继续二分查找,确定目标block;最后顺序遍历该block,得到value对应的序号。
- 根据序号获取value:这个比较简单,直接根据序号计算出所在的TermDict block,然后顺序遍历这个block即可。
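在看具体实现之前,先给一个用户视角的使用示意,对应上面两种查询方式(假设读取的是文章开头示例写入的nickname字段,类名SortedDocValuesReadDemo只是演示用的假设):
import java.io.IOException;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.SortedDocValues;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

public class SortedDocValuesReadDemo {
    public static void read(Directory directory) throws IOException {
        try (DirectoryReader reader = DirectoryReader.open(directory)) {
            for (LeafReaderContext ctx : reader.leaves()) {
                SortedDocValues dv = ctx.reader().getSortedDocValues("nickname");
                if (dv == null) {
                    continue;
                }
                // 根据value查询序号:内部依赖TermsIndex和TermDict block上的二分查找
                int ord = dv.lookupTerm(new BytesRef("alpha"));
                if (ord >= 0) {
                    // 根据序号获取value:内部先定位TermDict的block,再在block内顺序遍历
                    BytesRef value = dv.lookupOrd(ord);
                    System.out.println(value.utf8ToString());
                }
            }
        }
    }
}
下面是TermsDict的具体实现: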
private class TermsDict extends BaseTermsEnum {
static final int LZ4_DECOMPRESSOR_PADDING = 7;
final TermsDictEntry entry;
final LongValues blockAddresses;
final IndexInput bytes;
final long blockMask;
// TermsIndex的索引信息
final LongValues indexAddresses;
// TermsIndex的数据
final IndexInput indexBytes;
final BytesRef term;
long ord = -1;
BytesRef blockBuffer = null;
// TermDictBlock的数据
ByteArrayDataInput blockInput = null;
long currentCompressedBlockStart = -1;
long currentCompressedBlockEnd = -1;
TermsDict(TermsDictEntry entry, IndexInput data) throws IOException {
this.entry = entry;
RandomAccessInput addressesSlice =
data.randomAccessSlice(entry.termsAddressesOffset, entry.termsAddressesLength);
blockAddresses =
DirectMonotonicReader.getInstance(entry.termsAddressesMeta, addressesSlice, merging);
bytes = data.slice("terms", entry.termsDataOffset, entry.termsDataLength);
blockMask = (1L << TERMS_DICT_BLOCK_LZ4_SHIFT) - 1;
RandomAccessInput indexAddressesSlice =
data.randomAccessSlice(entry.termsIndexAddressesOffset, entry.termsIndexAddressesLength);
indexAddresses =
DirectMonotonicReader.getInstance(
entry.termsIndexAddressesMeta, indexAddressesSlice, merging);
indexBytes = data.slice("terms-index", entry.termsIndexOffset, entry.termsIndexLength);
term = new BytesRef(entry.maxTermLength);
// add 7 padding bytes can help decompression run faster.
int bufferSize = entry.maxBlockLength + LZ4_DECOMPRESSOR_PADDING;
blockBuffer = new BytesRef(new byte[bufferSize], 0, bufferSize);
}
@Override
public BytesRef next() throws IOException {
if (++ord >= entry.termsDictSize) {
return null;
}
if ((ord & blockMask) == 0L) { // 如果当前block已经结束了,需要加载下一个block
decompressBlock();
} else {
DataInput input = blockInput;
final int token = Byte.toUnsignedInt(input.readByte());
int prefixLength = token & 0x0F;
int suffixLength = 1 + (token >>> 4);
if (prefixLength == 15) {
prefixLength += input.readVInt();
}
if (suffixLength == 16) {
suffixLength += input.readVInt();
}
term.length = prefixLength + suffixLength;
// 读取suffix的内容,拼接在prefix的后面
input.readBytes(term.bytes, prefixLength, suffixLength);
}
return term;
}
// 根据序号获取value
public void seekExact(long ord) throws IOException {
if (ord < 0 || ord >= entry.termsDictSize) {
throw new IndexOutOfBoundsException();
}
// 根据序号得到在哪个block
final long blockIndex = ord >>> TERMS_DICT_BLOCK_LZ4_SHIFT;
// 获取block的起始位置
final long blockAddress = blockAddresses.get(blockIndex);
// 定位到block的起始位置
bytes.seek(blockAddress);
// block的起始序号,要开始遍历了
this.ord = (blockIndex << TERMS_DICT_BLOCK_LZ4_SHIFT) - 1;
do {
next();
} while (this.ord < ord); // 遍历获取指定的序号对应的value
}
// 根据index获取TermsIndex中的value
private BytesRef getTermFromIndex(long index) throws IOException {
assert index >= 0 && index <= (entry.termsDictSize - 1) >>> entry.termsDictIndexShift;
final long start = indexAddresses.get(index);
term.length = (int) (indexAddresses.get(index + 1) - start);
indexBytes.seek(start);
indexBytes.readBytes(term.bytes, 0, term.length);
return term;
}
// 获取text在TermsIndex的哪个block中
private long seekTermsIndex(BytesRef text) throws IOException {
long lo = 0L;
long hi = (entry.termsDictSize - 1) >> entry.termsDictIndexShift;
while (lo <= hi) {
final long mid = (lo + hi) >>> 1;
getTermFromIndex(mid);
final int cmp = term.compareTo(text);
if (cmp <= 0) {
lo = mid + 1;
} else {
hi = mid - 1;
}
}
assert hi < 0 || getTermFromIndex(hi).compareTo(text) <= 0;
assert hi == ((entry.termsDictSize - 1) >> entry.termsDictIndexShift)
|| getTermFromIndex(hi + 1).compareTo(text) > 0;
return hi;
}
// 获取TermDict中block的第一个term,第一个term是完整存储的
private BytesRef getFirstTermFromBlock(long block) throws IOException {
assert block >= 0 && block <= (entry.termsDictSize - 1) >>> TERMS_DICT_BLOCK_LZ4_SHIFT;
final long blockAddress = blockAddresses.get(block);
bytes.seek(blockAddress);
term.length = bytes.readVInt();
bytes.readBytes(term.bytes, 0, term.length);
return term;
}
// 获取text所在的TermDict的block编号
private long seekBlock(BytesRef text) throws IOException {
// 获取text所在TermsIndex的block编号
long index = seekTermsIndex(text);
if (index == -1L) {
return -1L;
}
// 根据TermsIndex的block编号反推可能的序号区间
long ordLo = index << entry.termsDictIndexShift;
long ordHi = Math.min(entry.termsDictSize, ordLo + (1L << entry.termsDictIndexShift)) - 1L;
// 根据序号区间得到TermDict的区间
long blockLo = ordLo >>> TERMS_DICT_BLOCK_LZ4_SHIFT;
long blockHi = ordHi >>> TERMS_DICT_BLOCK_LZ4_SHIFT;
while (blockLo <= blockHi) { // 二分查找
final long blockMid = (blockLo + blockHi) >>> 1;
getFirstTermFromBlock(blockMid);
final int cmp = term.compareTo(text);
if (cmp <= 0) {
blockLo = blockMid + 1;
} else {
blockHi = blockMid - 1;
}
}
assert blockHi < 0 || getFirstTermFromBlock(blockHi).compareTo(text) <= 0;
assert blockHi == ((entry.termsDictSize - 1) >>> TERMS_DICT_BLOCK_LZ4_SHIFT)
|| getFirstTermFromBlock(blockHi + 1).compareTo(text) > 0;
return blockHi;
}
@Override
public SeekStatus seekCeil(BytesRef text) throws IOException {
// 定位text所在的TermDict的block
final long block = seekBlock(text);
if (block == -1) {
if (entry.termsDictSize == 0) {
ord = 0;
return SeekStatus.END;
} else {
seekExact(0L);
return SeekStatus.NOT_FOUND;
}
}
final long blockAddress = blockAddresses.get(block);
this.ord = block << TERMS_DICT_BLOCK_LZ4_SHIFT;
bytes.seek(blockAddress);
decompressBlock();
while (true) { // 遍历block
int cmp = term.compareTo(text);
if (cmp == 0) {
return SeekStatus.FOUND;
} else if (cmp > 0) {
return SeekStatus.NOT_FOUND;
}
if (next() == null) {
return SeekStatus.END;
}
}
}
private void decompressBlock() throws IOException {
// 第一个value的长度
term.length = bytes.readVInt();
// 读取第一个value
bytes.readBytes(term.bytes, 0, term.length);
long offset = bytes.getFilePointer();
if (offset < entry.termsDataLength - 1) {
// Avoid decompress again if we are reading a same block.
if (currentCompressedBlockStart != offset) {
int decompressLength = bytes.readVInt();
// Decompress the remaining of current block
LZ4.decompress(bytes, decompressLength, blockBuffer.bytes, 0);
currentCompressedBlockStart = offset;
currentCompressedBlockEnd = bytes.getFilePointer();
} else {
// Skip decompression but need to re-seek to block end.
bytes.seek(currentCompressedBlockEnd);
}
// Reset the buffer.
blockInput = new ByteArrayDataInput(blockBuffer.bytes, 0, blockBuffer.length);
}
}
@Override
public BytesRef term() throws IOException {
return term;
}
@Override
public long ord() throws IOException {
return ord;
}
@Override
public long totalTermFreq() throws IOException {
return -1L;
}
@Override
public PostingsEnum postings(PostingsEnum reuse, int flags) throws IOException {
throw new UnsupportedOperationException();
}
@Override
public ImpactsEnum impacts(int flags) throws IOException {
throw new UnsupportedOperationException();
}
@Override
public int docFreq() throws IOException {
throw new UnsupportedOperationException();
}
}