ShardRequestCache is an ES-level implementation. It is an LRU cache, considers an entry for caching after a single access, and is mainly used to cache aggregation results. NodeQueryCache is a Lucene-level implementation. It is also an LRU cache, considers caching only after a query reaches a certain access frequency, and is mainly used to cache filter sub-queries.
fielddata
Aggregation queries on text fields use fielddata to load the field's values.
(1) fielddata consumes a lot of heap memory and is disabled by default.
(2) Aggregating on a text field rarely makes sense; use it with caution.
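If a text aggregation is truly needed, fielddata must be switched on per field in the mapping. A minimal sketch with illustrative index and field names; the keyword sub-field is the usual, cheaper alternative, because it aggregates via on-disk doc_values instead of heap-resident fielddata:

PUT my_index/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "fielddata": true,
      "fields": {
        "raw": { "type": "keyword" }
      }
    }
  }
}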
Shard request cache
A shard-level query cache: every shard keeps its own cache.
Caching policy
Not every shard-level query result is cached. By default only size=0 requests (typically aggregations) are eligible, and non-deterministic requests, e.g. those containing now(), are never cached.
Cache settings
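A minimal sketch of the relevant settings, assuming an index named my_index; the defaults noted are the documented ones:

# Dynamic index-level switch (enabled by default)
PUT my_index/_settings
{ "index.requests.cache.enable": true }

# Per-request override; only size=0 requests are cacheable by default
GET my_index/_search?request_cache=true
{
  "size": 0,
  "aggs": { "by_status": { "terms": { "field": "status" } } }
}

# Node-level ceiling in elasticsearch.yml (defaults to 1% of the heap)
indices.requests.cache.size: 2%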
Node Query Cache
NodeQueryCache is implemented at the Lucene level and is enabled by default; the ES layer adds some policy control and statistics on top.
Caching policy
Not every filter query is cached; eligibility depends on the query type, its usage frequency, and the segment size, as the code analysis below shows.
Cache settings
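A sketch of the corresponding settings; the node-level defaults match the IndicesQueryCache source analyzed below, and my_index is illustrative:

# elasticsearch.yml, node level
indices.queries.cache.size: 10%
indices.queries.cache.count: 10000

# Static index-level switch; can only be set at index creation time (or on a closed index)
PUT my_index
{
  "settings": { "index.queries.cache.enabled": false }
}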
Code analysis
Filter cache instantiation
When an Elasticsearch Node is instantiated it creates an IndicesService, whose constructor instantiates an IndicesQueryCache.
IndicesQueryCache / ElasticsearchLRUQueryCache instantiation
public class IndicesQueryCache implements QueryCache, Closeable {
    // Upper bound on the cache's memory footprint
    public static final Setting<ByteSizeValue> INDICES_CACHE_QUERY_SIZE_SETTING =
        Setting.memorySizeSetting("indices.queries.cache.size", "10%", Property.NodeScope);
    // Upper bound on the number of cached entries
    public static final Setting<Integer> INDICES_CACHE_QUERY_COUNT_SETTING =
        Setting.intSetting("indices.queries.cache.count", 10_000, 1, Property.NodeScope);
    // LRU-style cache
    private final LRUQueryCache cache;

    public IndicesQueryCache(Settings settings) {
        final ByteSizeValue size = INDICES_CACHE_QUERY_SIZE_SETTING.get(settings);
        final int count = INDICES_CACHE_QUERY_COUNT_SETTING.get(settings);
        logger.debug("using [node] query cache with size [{}] max filter count [{}]",
            size, count);
        // Instantiate the LRU cache: at most 10,000 entries or 10% of the heap
        cache = new ElasticsearchLRUQueryCache(count, size.getBytes());
        sharedRamBytesUsed = 0;
    }
}
When IndicesQueryCache is instantiated it is given both a memory ceiling and an entry-count ceiling, and LRU eviction is driven by these two limits. Both limits are node-level, i.e. shared by all indices and shards on the node.
IndicesQueryCache implements the QueryCache interface, which Lucene provides for caching query results.
Lucene QueryCache and QueryCachingPolicy: the cache class and the caching policy
public interface QueryCache {
    /**
     * Return a wrapper around the provided <code>weight</code> that will cache
     * matching docs per-segment accordingly to the given <code>policy</code>.
     * NOTE: The returned weight will only be equivalent if scores are not needed.
     * @see Collector#scoreMode()
     */
    Weight doCache(Weight weight, QueryCachingPolicy policy);
}

public interface QueryCachingPolicy {
    /** Callback that is called every time that a cached filter is used.
     * This is typically useful if the policy wants to track usage statistics
     * in order to make decisions. */
    void onUse(Query query);

    /** Whether the given {@link Query} is worth caching.
     * This method will be called by the {@link QueryCache} to know whether to
     * cache. It will first attempt to load a {@link DocIdSet} from the cache.
     * If it is not cached yet and this method returns <tt>true</tt> then a
     * cache entry will be generated. Otherwise an uncached scorer will be
     * returned. */
    boolean shouldCache(Query query) throws IOException;
}
Lucene supplies the QueryCache and QueryCachingPolicy interfaces for caching search results. ES uses ElasticsearchLRUQueryCache as its QueryCache implementation and, by default, Lucene's UsageTrackingQueryCachingPolicy as the QueryCachingPolicy implementation, which decides whether to cache and what bookkeeping to do at search time.
Lucene IndexSearcher instantiation
When Elasticsearch builds Lucene's IndexSearcher it supplies both the QueryCache and the QueryCachingPolicy. The QueryCache side uses the decorator pattern: OptOutQueryCache handles Security-related opt-outs, and IndicesQueryCache is the instance created in the code above, internally holding an ElasticsearchLRUQueryCache.
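For readers who want to see this wiring outside of ES, here is a minimal plain-Lucene sketch (the index path and the size numbers are illustrative, not what ES uses):

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.LRUQueryCache;
import org.apache.lucene.search.UsageTrackingQueryCachingPolicy;
import org.apache.lucene.store.FSDirectory;
import java.nio.file.Paths;

public class SearcherCacheWiring {
    public static void main(String[] args) throws Exception {
        try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("/tmp/index")))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // At most 10,000 cached queries using up to 64 MB of RAM (illustrative numbers)
            searcher.setQueryCache(new LRUQueryCache(10_000, 64L * 1024 * 1024));
            // Cache a filter only once it has been used often enough
            searcher.setQueryCachingPolicy(new UsageTrackingQueryCachingPolicy());
            // ... run searches; cacheable filters now go through the caching wrapper
        }
    }
}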
IndexSearcher search
Create Cache Weight
// org.apache.lucene.search.IndexSearcher
public class IndexSearcher {
    // query cache
    private QueryCache queryCache;
    // query caching policy
    private QueryCachingPolicy queryCachingPolicy;

    public Weight createWeight(Query query, boolean needsScores, float boost) throws IOException {
        final QueryCache queryCache = this.queryCache;
        Weight weight = query.createWeight(this, needsScores, boost);
        // Results are cacheable only when no scores are needed and a cache is configured
        if (needsScores == false && queryCache != null) {
            // The query cache wraps the Weight and returns a CachingWrapperWeight
            weight = queryCache.doCache(weight, queryCachingPolicy);
        }
        return weight;
    }
}
Caching a search result requires two preconditions:
- A QueryCache is configured; ES takes care of this when it instantiates the IndexSearcher, as shown above.
- The search does not need scores computed in real time. No scoring means no idf computation, so additions, updates, and deletions of documents cannot affect the cached result.
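For example, a query in filter context needs no scores and is therefore a caching candidate (request shape illustrative):

GET my_index/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "published" } },
        { "range": { "publish_date": { "gte": "2020-01-01" } } }
      ]
    }
  }
}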
Cached results are per-segment. The inverted index is immutable, so a segment, once written, never changes; this immutability is what makes caching safe. When documents in a segment are updated or deleted, the changes are recorded in the .liv file instead of rewriting the original segment files.
QueryCachingPolicy#onUse(Query query): pre-screening by query type
public class UsageTrackingQueryCachingPolicy implements QueryCachingPolicy {
    public void onUse(Query query) {
        // Queries that should never be cached are not tracked at all
        if (shouldNeverCache(query)) {
            return;
        }
        // Record the query's hash value
        int hashCode = query.hashCode();
        synchronized (this) {
            // Track recently used filters; recording hash codes captures request frequency
            recentlyUsedFilters.add(hashCode);
        }
    }

    private static boolean shouldNeverCache(Query query) {
        // Term queries are not cached: they are already so fast that caching adds nothing
        if (query instanceof TermQuery) {
            // We do not bother caching term queries since they are already plenty fast.
            return true;
        }
        // MatchAllDocsQuery: its iterator is faster than iterating a cached bit set
        if (query instanceof MatchAllDocsQuery) {
            // MatchAllDocsQuery has an iterator that is faster than what a bit set could do.
            return true;
        }
        // A query that matches no documents
        if (query instanceof MatchNoDocsQuery) {
            return true;
        }
        if (query instanceof BooleanQuery) {
            BooleanQuery bq = (BooleanQuery) query;
            // No clauses: effectively the same as match-all
            if (bq.clauses().isEmpty()) {
                return true;
            }
        }
        if (query instanceof DisjunctionMaxQuery) {
            DisjunctionMaxQuery dmq = (DisjunctionMaxQuery) query;
            if (dmq.getDisjuncts().isEmpty()) {
                return true;
            }
        }
        return false;
    }
}
So the QueryCache has requirements on the query type. When a query qualifies, its usage count is tracked via its hashcode, and that count feeds the caching decision later.
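This works because Lucene queries define equals() and hashCode() structurally, so "the same" filter sent in different requests maps to the same hash. A small standalone sketch:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.TermQuery;

public class QueryHashDemo {
    public static void main(String[] args) {
        TermQuery q1 = new TermQuery(new Term("status", "published"));
        TermQuery q2 = new TermQuery(new Term("status", "published"));
        System.out.println(q1.equals(q2));                   // true: structural equality
        System.out.println(q1.hashCode() == q2.hashCode());  // true: same frequency bucket
    }
}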
LRUQueryCache#shouldCache: pre-screening by segment size
public class LRUQueryCache implements QueryCache, Accountable {
    // Check whether the segment qualifies for caching, regardless of the query
    private boolean shouldCache(LeafReaderContext context) throws IOException {
        // The worst-case cache entry must fit comfortably under the RAM ceiling
        return cacheEntryHasReasonableWorstCaseSize(ReaderUtil.getTopLevelContext(context).reader().maxDoc())
            // The segment must hold at least 10,000 docs and at least 3% of all docs
            && leavesToCache.test(context);
    }

    private boolean cacheEntryHasReasonableWorstCaseSize(int maxDoc) {
        // The worst-case (dense) is a bit set which needs one bit per document,
        // i.e. 8 doc IDs per byte; note maxDoc here is the top-level reader's doc count
        final long worstCaseRamUsage = maxDoc / 8;
        // The cache's memory ceiling, 10% of the heap by default
        final long totalRamAvailable = maxRamBytesUsed;
        // A worst-case entry must stay below 20% of the cache ceiling (i.e. 2% of the heap).
        // The cache is node-level: admitting one huge entry would evict many existing entries,
        // much like a full table scan polluting MySQL's buffer pool.
        return worstCaseRamUsage * 5 < totalRamAvailable;
    }

    public LRUQueryCache(int maxSize, long maxRamBytesUsed) {
        this(maxSize, maxRamBytesUsed, new MinSegmentSizePredicate(10000, .03f), 10);
    }

    // pkg-private for testing
    static class MinSegmentSizePredicate implements Predicate<LeafReaderContext> {
        private final int minSize;
        private final float minSizeRatio;

        MinSegmentSizePredicate(int minSize, float minSizeRatio) {
            this.minSize = minSize;
            this.minSizeRatio = minSizeRatio;
        }

        public boolean test(LeafReaderContext context) {
            // maxDoc is the largest doc ordinal, i.e. how many documents this segment holds
            final int maxDoc = context.reader().maxDoc();
            // Segments with fewer than 10,000 docs are never cached
            if (maxDoc < minSize) {
                return false;
            }
            final IndexReaderContext topLevelContext = ReaderUtil.getTopLevelContext(context);
            // docs in this segment / docs across the whole index
            final float sizeRatio = (float) context.reader().maxDoc() / topLevelContext.reader().maxDoc();
            // The segment must hold at least 3% of all docs
            return sizeRatio >= minSizeRatio;
        }
    }
}
The QueryCache also has requirements on segment size: segments with too many or too few docs are not cached. Too many, and admitting the entry could evict a large amount of hot cached data, forcing those queries to be re-run later; too few, and caching is pointless because re-running the query is just as fast.
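A rough worked example of the 20% rule: with a 1 GB heap, maxRamBytesUsed defaults to 10%, about 100 MB. The check worstCaseRamUsage * 5 < totalRamAvailable then requires maxDoc / 8 * 5 < 100 MB, so once the index holds more than roughly 167 million documents (100 MB * 8 / 5 = 167,772,160) the worst-case bit set is considered too large and nothing in that index gets cached.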
QueryCachingPolicy#shouldCache: should the current search result be cached
public class UsageTrackingQueryCachingPolicy implements QueryCachingPolicy {
    public boolean shouldCache(Query query) throws IOException {
        // Same check we saw above
        if (shouldNeverCache(query)) {
            return false;
        }
        // How often this query has been used, looked up via the hashcode recorded in onUse
        final int frequency = frequency(query);
        // Returns 2 for filters that must scan the whole index to build a DocIdSetIterator
        // (MultiTermQuery, point-based queries, TermInSetQuery), 5 for everything else
        final int minFrequency = minFrequencyToCache(query);
        // Cache once the recent usage count reaches the required minimum
        return frequency >= minFrequency;
    }

    int frequency(Query query) {
        int hashCode = query.hashCode();
        synchronized (this) {
            // How many times this hashcode was recorded, i.e. the query's usage count
            return recentlyUsedFilters.frequency(hashCode);
        }
    }

    protected int minFrequencyToCache(Query query) {
        // Costly queries are cached sooner
        if (isCostly(query)) {
            return 2;
        } else {
            // default: cache after the filter has been seen 5 times
            int minFrequency = 5;
            if (query instanceof BooleanQuery || query instanceof DisjunctionMaxQuery) {
                // If you keep reusing a boolean query that looks like "A OR B" and never use
                // A or B on their own, then after 5 uses we would cache A, B, and "A OR B",
                // which is wasteful. Caching compound queries earlier means only "A OR B"
                // gets cached in that scenario.
                minFrequency--;
            }
            return minFrequency;
        }
    }
}
On a QueryCache miss, the policy decides whether this search result deserves caching; if it qualifies, the search is executed and its result stored. Concretely: a costly filter (e.g. a MultiTermQuery) is cached after 2 recent uses, a BooleanQuery or DisjunctionMaxQuery after 4, and everything else after 5.
LRUQueryCache#cacheImpl: building the cache entry
public class LRUQueryCache implements QueryCache, Accountable {
    /**
     * Default cache implementation: uses {@link RoaringDocIdSet} for sets that have a density < 1%
     * and a {@link BitDocIdSet} over a {@link FixedBitSet} otherwise.
     */
    protected CacheAndCount cacheImpl(BulkScorer scorer, int maxDoc) throws IOException {
        // scorer.cost() is the number of docs this query matches in the current segment
        if (scorer.cost() * 100 >= maxDoc) {
            // FixedBitSet is faster for dense sets and will enable the random-access
            // optimization in ConjunctionDISI.
            // Matches exceed 1% of the segment: store them in a bit set
            return cacheIntoBitSet(scorer, maxDoc);
        } else {
            // Sparse matches: store the doc IDs in a RoaringDocIdSet
            return cacheIntoRoaringDocIdSet(scorer, maxDoc);
        }
    }

    // Cache the matching doc IDs in a bit set
    private static CacheAndCount cacheIntoBitSet(BulkScorer scorer, int maxDoc) throws IOException {
        final FixedBitSet bitSet = new FixedBitSet(maxDoc);
        long cost[] = new long[1];
        scorer.score(new LeafCollector() {
            @Override
            public void setScorer(Scorable scorer) throws IOException {}

            @Override
            public void collect(int doc) throws IOException {
                cost[0]++;
                // Only the docId is cached
                bitSet.set(doc);
            }
        }, null);
        return new CacheAndCount(new BitDocIdSet(bitSet, cost[0]), (int) cost[0]);
    }

    // https://www.elastic.co/cn/blog/frame-of-reference-and-roaring-bitmaps
    private static CacheAndCount cacheIntoRoaringDocIdSet(BulkScorer scorer, int maxDoc) throws IOException {
        RoaringDocIdSet.Builder builder = new RoaringDocIdSet.Builder(maxDoc);
        scorer.score(new LeafCollector() {
            @Override
            public void setScorer(Scorable scorer) throws IOException {}

            @Override
            public void collect(int doc) throws IOException {
                builder.add(doc);
            }
        }, null);
        RoaringDocIdSet cache = builder.build();
        return new CacheAndCount(cache, cache.cardinality());
    }
}
When building the cache entry: if the matched docs exceed 1% of the segment, a bit set is used because it is compact for dense results (one bit per doc in the segment); below 1%, the doc IDs are stored as arrays inside a RoaringDocIdSet, which stays small for sparse results and is cheap to iterate.
With these steps the matching doc-ID set is cached, and the next identical search on the same segment can reuse it.
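A standalone sketch of the two storage shapes (segment size and densities are illustrative); both classes come from org.apache.lucene.util:

import org.apache.lucene.util.FixedBitSet;
import org.apache.lucene.util.RoaringDocIdSet;

public class DocIdSetFootprint {
    public static void main(String[] args) throws Exception {
        int maxDoc = 1_000_000;

        // Dense case: a FixedBitSet always costs maxDoc bits (~122 KB here),
        // no matter how many docs actually matched.
        FixedBitSet bits = new FixedBitSet(maxDoc);
        for (int doc = 0; doc < maxDoc; doc += 50) { // ~2% density
            bits.set(doc);
        }
        System.out.println("FixedBitSet: " + bits.ramBytesUsed() + " bytes");

        // Sparse case: a RoaringDocIdSet stores doc IDs block by block
        // and stays small when few docs matched.
        RoaringDocIdSet.Builder builder = new RoaringDocIdSet.Builder(maxDoc);
        for (int doc = 0; doc < maxDoc; doc += 2000) { // ~0.05% density
            builder.add(doc); // doc IDs must be added in increasing order
        }
        RoaringDocIdSet roaring = builder.build();
        System.out.println("RoaringDocIdSet: " + roaring.ramBytesUsed() + " bytes");
    }
}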
Query-time lookup
public class LRUQueryCache implements QueryCache, Accountable {
    // This method lives in LRUQueryCache's inner class CachingWrapperWeight, the
    // decorator returned by doCache(); `in` is the original Weight being wrapped.
    @Override
    public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException {
        // On first use, report the query to the policy (onUse records its hashcode)
        if (used.compareAndSet(false, true)) {
            policy.onUse(getQuery());
        }
        if (in.isCacheable(context) == false) {
            // this segment is not suitable for caching
            return in.scorerSupplier(context);
        }
        // Short-circuit: Check whether this segment is eligible for caching
        // before we take a lock because of #get
        if (shouldCache(context) == false) {
            return in.scorerSupplier(context);
        }
        final IndexReader.CacheHelper cacheHelper = context.reader().getCoreCacheHelper();
        if (cacheHelper == null) {
            // this reader has no cache helper
            return in.scorerSupplier(context);
        }
        // If the lock is already busy, prefer using the uncached version than waiting
        if (lock.tryLock() == false) {
            return in.scorerSupplier(context);
        }
        CacheAndCount cached;
        try {
            // Cache lookup, keyed by the query and the segment's core cache key
            cached = get(in.getQuery(), cacheHelper);
        } finally {
            lock.unlock();
        }
        if (cached == null) {
            // Cache miss: ask the policy whether this query is worth caching
            if (policy.shouldCache(in.getQuery())) {
                final ScorerSupplier supplier = in.scorerSupplier(context);
                if (supplier == null) {
                    putIfAbsent(in.getQuery(), CacheAndCount.EMPTY, cacheHelper);
                    return null;
                }
                final long cost = supplier.cost();
                return new ScorerSupplier() {
                    @Override
                    public Scorer get(long leadCost) throws IOException {
                        // skip cache operation which would slow query down too much
                        if (cost / skipCacheFactor > leadCost) {
                            return supplier.get(leadCost);
                        }
                        // Execute the query in full, cache the matching doc IDs, then serve from the cache
                        Scorer scorer = supplier.get(Long.MAX_VALUE);
                        CacheAndCount cached =
                            cacheImpl(new DefaultBulkScorer(scorer), context.reader().maxDoc());
                        putIfAbsent(in.getQuery(), cached, cacheHelper);
                        DocIdSetIterator disi = cached.iterator();
                        if (disi == null) {
                            // docIdSet.iterator() is allowed to return null when empty but we want a non-null
                            // iterator here
                            disi = DocIdSetIterator.empty();
                        }
                        return new ConstantScoreScorer(
                            CachingWrapperWeight.this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi);
                    }

                    @Override
                    public long cost() {
                        return cost;
                    }
                };
            } else {
                return in.scorerSupplier(context);
            }
        }
        assert cached != null;
        if (cached == CacheAndCount.EMPTY) {
            return null;
        }
        final DocIdSetIterator disi = cached.iterator();
        if (disi == null) {
            return null;
        }
        // Cache hit: serve a constant-score scorer over the cached doc-ID set
        return new ScorerSupplier() {
            @Override
            public Scorer get(long leadCost) throws IOException {
                return new ConstantScoreScorer(
                    CachingWrapperWeight.this, 0f, ScoreMode.COMPLETE_NO_SCORES, disi);
            }

            @Override
            public long cost() {
                return disi.cost();
            }
        };
    }
}
Cache monitoring
Node-level monitoring
The hit rate is hitCount / (hitCount + missCount). For node 81 below, that is 1086360487 / (1086360487 + 4899825499) ≈ 18%.
GET /_cat/nodes?v&h=name,queryCacheMemory,queryCacheHitCount,queryCacheMissCount,fielddataMemory,requestCacheMemory,requestCacheHitCount,requestCacheMissCount
name queryCacheMemory queryCacheHitCount queryCacheMissCount fielddataMemory requestCacheMemory requestCacheHitCount requestCacheMissCount
81 1.7gb 1086360487 4899825499 320.4mb 189.9mb 3317905 87650679
84 1.6gb 1032360913 4583644806 294.9mb 128.3mb 3121582 87026805
82 1.5gb 1026487641 4691926256 318.5mb 131.3mb 3122880 86956484
89 1.7gb 1098099554 5015683302 324mb 193.3mb 3497803 88120856
80 1.6gb 998874442 4422520272 316.9mb 171.4mb 3122396 86982810
85 1.6gb 1048961594 4710553367 288.7mb 193.6mb 3282795 87465086
86 1.7gb 1054740975 4748200509 305mb 175mb 3349060 87793603
87 1.6gb 1056530219 4783851200 350.7mb 191.8mb 3360473 87024550
83 1.8gb 1082470724 5060248077 309.4mb 190.2mb 3440978 87846063
88 1.6gb 1076018640 5045370721 315.9mb 187.7mb 3649191 88161705
Index-level monitoring
GET test_index/_stats/query_cache,fielddata,request_cache?pretty&human
{
"_shards": {
"total": 4,
"successful": 4,
"failed": 0
},
"_all": {
"primaries": {
"query_cache": {
"memory_size": "44.5mb", //使用的size
"memory_size_in_bytes": 46740615,
"total_count": 65615435, //历史查询总条数 total=hit+miss
"hit_count": 13057809,//命中的
"miss_count": 52557626,//未命中的
"cache_size": 661,//当前缓存的条数
"cache_count": 55634,//历史缓存总条数
"evictions": 54973//被驱逐的条数
},
"fielddata": {
"memory_size": "9.5mb",
"memory_size_in_bytes": 9998504,
"evictions": 0
},
"request_cache": {
"memory_size": "152.8kb",
"memory_size_in_bytes": 156472,
"evictions": 42,
"hit_count": 57560,
"miss_count": 759391
}
},
"total": {
"query_cache": {
"memory_size": "91mb",
"memory_size_in_bytes": 95466199,
"total_count": 133520906,
"hit_count": 26688002,
"miss_count": 106832904,
"cache_size": 1269,
"cache_count": 115109,
"evictions": 113840
},
"fielddata": {
"memory_size": "18.4mb",
"memory_size_in_bytes": 19327272,
"evictions": 0
},
"request_cache": {
"memory_size": "304.1kb",
"memory_size_in_bytes": 311424,
"evictions": 81,
"hit_count": 115871,
"miss_count": 1518861
}
}
},
"indices": {
"test_index": {
"uuid": "4hgWxos1ShKO7a5xGFVkwQ",
"health": "green",
"status": "open",
"primaries": {
"query_cache": {
"memory_size": "44.5mb",
"memory_size_in_bytes": 46740615,
"total_count": 65615435,
"hit_count": 13057809,
"miss_count": 52557626,
"cache_size": 661,
"cache_count": 55634,
"evictions": 54973
},
"fielddata": {
"memory_size": "9.5mb",
"memory_size_in_bytes": 9998504,
"evictions": 0
},
"request_cache": {
"memory_size": "152.8kb",
"memory_size_in_bytes": 156472,
"evictions": 42,
"hit_count": 57560,
"miss_count": 759391
}
},
"total": {
"query_cache": {
"memory_size": "91mb",
"memory_size_in_bytes": 95466199,
"total_count": 133520906,
"hit_count": 26688002,
"miss_count": 106832904,
"cache_size": 1269,
"cache_count": 115109,
"evictions": 113840
},
"fielddata": {
"memory_size": "18.4mb",
"memory_size_in_bytes": 19327272,
"evictions": 0
},
"request_cache": {
"memory_size": "304.1kb",
"memory_size_in_bytes": 311424,
"evictions": 81,
"hit_count": 115871,
"miss_count": 1518861
}
}
}
}
}
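If hit rates stay poor or cache memory runs high, the caches can also be cleared by hand; the flags select which cache to drop:

POST test_index/_cache/clear?query=true&request=true&fielddata=true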