Previous article: Elasticsearch 8.5.3 Source Code Analysis (2): TransportAction Mapping and Shard Routing Model - 掘金 (juejin.cn)
Overview
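This article again uses the GET /{index}/_doc/{id} API as the basis for analyzing how Elasticsearch nodes interact. The previous article ended with the request being handed to TransportService; this one analyzes the TransportService.sendRequest method and the processing that follows it.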
As the figure above shows, the TransportService.sendRequest method first obtains a Transport.Connection through the getConnection method. The code is as follows:
public Transport.Connection getConnection(DiscoveryNode node) {
if (isLocalNode(node)) {
return localNodeConnection;
} else {
return connectionManager.getConnection(node);
}
}
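The method checks whether the target node is the local node. If it is, it returns localNodeConnection, which is an instance of an anonymous Transport.Connection class.
If the target is a remote node, it returns a TcpTransport.NodeChannels instance.
Implementation of localNodeConnection: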
private final Transport.Connection localNodeConnection = new Transport.Connection() {
......
@Override
public void sendRequest(long requestId, String action, TransportRequest request, TransportRequestOptions options) {
sendLocalRequest(requestId, action, request, options);
}
......
}
As shown, its sendRequest implementation simply calls the sendLocalRequest method.
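The local/remote split described above can be sketched in plain Java. This is only an illustration under stated assumptions: ConnectionSketch, its Connection interface, and the string node ids are hypothetical stand-ins, not the real Transport.Connection or ConnectionManager types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; not the real Elasticsearch classes.
class ConnectionSketch {
    interface Connection {
        String sendRequest(long requestId, String action, String request);
    }

    private final String localNodeId;
    final List<String> log = new ArrayList<>();

    // Anonymous class: a "connection" to ourselves that never touches the network.
    private final Connection localNodeConnection = new Connection() {
        @Override
        public String sendRequest(long requestId, String action, String request) {
            log.add("local:" + action);
            return "handled locally: " + request;
        }
    };

    ConnectionSketch(String localNodeId) {
        this.localNodeId = localNodeId;
    }

    Connection getConnection(String nodeId) {
        if (localNodeId.equals(nodeId)) {
            return localNodeConnection;
        }
        // Stand-in for connectionManager.getConnection(node).
        return (requestId, action, request) -> {
            log.add("remote:" + nodeId + ":" + action);
            return "sent to " + nodeId + ": " + request;
        };
    }

    public static void main(String[] args) {
        ConnectionSketch transport = new ConnectionSketch("node-1");
        System.out.println(transport.getConnection("node-1").sendRequest(1L, "get", "doc-42"));
        System.out.println(transport.getConnection("node-2").sendRequest(2L, "get", "doc-42"));
    }
}
```

The caller uses the same sendRequest call in both cases; only the object returned by getConnection decides whether the network is involved.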
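Read flow of a Get request:
- The client sends a Get request to some ES node.
- The node that receives the request acts as the coordinating node and uses the shard routing rules to determine which shard holds the data.
- If that shard is on the current node, the data is read through the shard data reading module.
- If that shard is on another node, the request is forwarded through the Netty remote communication module to the node that holds the shard, which reads the data through its shard data reading module and returns it to the coordinating node.
- Once the coordinating node has the data, it returns the response to the client.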
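The steps above can be sketched as a toy coordinating node. Everything here is an assumption made for illustration: GetFlowSketch, the in-memory maps, and the use of String.hashCode as the routing hash are placeholders (Elasticsearch applies its own routing hash to the routing value), and "forwarding" is reduced to a map lookup.

```java
import java.util.Map;

// Toy model of the coordinating-node decision; all names are hypothetical.
class GetFlowSketch {
    private final String localNode;
    private final Map<Integer, String> shardToNode;              // shard id -> node holding it
    private final Map<String, Map<String, String>> storePerNode; // node -> (doc id -> source)

    GetFlowSketch(String localNode,
                  Map<Integer, String> shardToNode,
                  Map<String, Map<String, String>> storePerNode) {
        this.localNode = localNode;
        this.shardToNode = shardToNode;
        this.storePerNode = storePerNode;
    }

    // Placeholder routing rule: hash of the id modulo the number of shards.
    int shardFor(String docId) {
        return Math.floorMod(docId.hashCode(), shardToNode.size());
    }

    String get(String docId) {
        String target = shardToNode.get(shardFor(docId));
        String source = storePerNode.get(target).get(docId);
        String path = target.equals(localNode) ? "local read" : "forwarded to " + target;
        return path + " -> " + source;
    }
}
```

With two shards, an id that hashes to the local shard is answered directly, while any other id takes the forwarding branch.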
Request handling on the local node
Next we analyze how the TransportService.sendLocalRequest method is implemented.
Source of the TransportService.sendLocalRequest method:
private void sendLocalRequest(long requestId, final String action, final TransportRequest request, TransportRequestOptions options) {
// Build the response channel that is later used to send the response back to the caller
final DirectResponseChannel channel = new DirectResponseChannel(localNode, action, requestId, this, threadPool);
try {
onRequestSent(localNode, requestId, action, request, options);
onRequestReceived(requestId, action);
@SuppressWarnings("unchecked")
// Look up the RequestHandlerRegistry registered for this action
final RequestHandlerRegistry<TransportRequest> reg = (RequestHandlerRegistry<TransportRequest>) getRequestHandler(action);
if (reg == null) {
assert false : action;
throw new ActionNotFoundTransportException("Action [" + action + "] not found");
}
final String executor = reg.getExecutor();
if (ThreadPool.Names.SAME.equals(executor)) {
// Handle the request on the current thread
try (var ignored = threadPool.getThreadContext().newTraceContext()) {
try {
// Process the request
reg.processMessageReceived(request, channel);
} catch (Exception e) {
handleSendToLocalException(channel, e, action);
}
}
} else {
// Otherwise hand the request to the thread pool named by the executor
...
// Process the request
reg.processMessageReceived(request, channel);
...
}
...
TransportGetAction.asyncShardOperation
protected void asyncShardOperation(GetRequest request, ShardId shardId, ActionListener<GetResponse> listener) throws IOException {
IndexService indexService = indicesService.indexServiceSafe(shardId.getIndex());
// Get the target shard that holds the data
IndexShard indexShard = indexService.getShard(shardId.id());
if (request.realtime()) {
// Realtime: data that has not been refreshed yet can still be read from the translog
super.asyncShardOperation(request, shardId, listener);
} else {
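// Non-realtime: wait until the shard is search-active, i.e. pending data has been refreshed and is visible to reads (the default refresh interval is 1s)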
indexShard.awaitShardSearchActive(b -> {
try {
super.asyncShardOperation(request, shardId, listener);
} catch (Exception ex) {
listener.onFailure(ex);
}
});
}
}
The TransportSingleShardAction.asyncShardOperation method runs asynchronously on a thread pool: it first calls TransportGetAction.shardOperation to obtain the GetResponse, then calls listener.onResponse to send the response toward the client.
protected void asyncShardOperation(Request request, ShardId shardId, ActionListener<Response> listener) throws IOException {
threadPool.executor(getExecutor(request, shardId)).execute(ActionRunnable.supply(listener, () -> shardOperation(request, shardId)));
}
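The "run the operation on an executor, then route the result or the exception to a listener" pattern can be sketched with the JDK alone. This is a minimal sketch, assuming a hypothetical Listener interface and supply helper that only mimic the role ActionListener and ActionRunnable.supply play in the code above; it is not the Elasticsearch implementation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SupplySketch {
    // Hypothetical minimal listener with a success path and a failure path.
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    // Wrap a computation so that its result or its exception ends up at the listener.
    static <T> Runnable supply(Listener<T> listener, Callable<T> operation) {
        return () -> {
            T result;
            try {
                result = operation.call();
            } catch (Exception e) {
                listener.onFailure(e);
                return;
            }
            listener.onResponse(result);
        };
    }

    // Demo helper: run the operation on a worker thread and wait for the listener to fire.
    static String runAndWait(Callable<String> operation) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CompletableFuture<String> done = new CompletableFuture<>();
        try {
            executor.execute(supply(new Listener<String>() {
                @Override
                public void onResponse(String response) {
                    done.complete("ok:" + response);
                }

                @Override
                public void onFailure(Exception e) {
                    done.complete("failed:" + e.getMessage());
                }
            }, operation));
            return done.join();
        } finally {
            executor.shutdown();
        }
    }
}
```

Because failures are delivered through onFailure rather than thrown, the thread that submitted the work never has to catch exceptions raised on the worker thread.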
The TransportGetAction.shardOperation method calls ShardGetService.get to obtain the document data as a GetResult and wraps it in a GetResponse.
protected GetResponse shardOperation(GetRequest request, ShardId shardId) throws IOException {
IndexService indexService = indicesService.indexServiceSafe(shardId.getIndex());
IndexShard indexShard = indexService.getShard(shardId.id());
// request.refresh() indicates whether to refresh before reading
if (request.refresh() && request.realtime() == false) {
indexShard.refresh("refresh_flag_get");
}
// Read the data from the shard
GetResult result = indexShard.getService()
.get(
request.id(),
request.storedFields(),
request.realtime(),
request.version(),
request.versionType(),
request.fetchSourceContext(),
request.isForceSyntheticSource()
);
return new GetResponse(result);
}
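Request handling on a remote node
If the requested data is not on the current node, the request is forwarded to the remote node that holds the target shard. Communication with remote nodes again goes through Netty. Netty4Transport.java initializes the Netty configuration for node-to-node communication, and the business handler bound to it is Netty4MessageInboundHandler.java.
The core code for talking to a remote node is in OutboundHandler: the sendMessage method serializes the message, and the internalSend method hands it to Netty4TcpChannel for sending.
After the remote node receives the message, the core processing logic is in the InboundPipeline and InboundHandler classes; from there on the flow is the same as on the local node.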
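To make "serialize on the sender, parse on the receiver" concrete, here is a minimal length-prefixed framing sketch. It is deliberately not the Elasticsearch wire format: the field layout, the FrameSketch class, and the Message type are all assumptions chosen for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative framing only; the layout below is an assumption, not the real transport protocol.
class FrameSketch {
    static final class Message {
        final long requestId;
        final String action;
        final String payload;

        Message(long requestId, String action, String payload) {
            this.requestId = requestId;
            this.action = action;
            this.payload = payload;
        }
    }

    // Sender side: [body length][request id][action length][action][payload length][payload]
    static byte[] encode(Message m) {
        byte[] action = m.action.getBytes(StandardCharsets.UTF_8);
        byte[] payload = m.payload.getBytes(StandardCharsets.UTF_8);
        int bodyLength = Long.BYTES + Integer.BYTES + action.length + Integer.BYTES + payload.length;
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + bodyLength);
        buf.putInt(bodyLength);
        buf.putLong(m.requestId);
        buf.putInt(action.length).put(action);
        buf.putInt(payload.length).put(payload);
        return buf.array();
    }

    // Receiver side: check the length prefix, then read the fields back in the same order.
    static Message decode(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        int bodyLength = buf.getInt();
        if (bodyLength != buf.remaining()) {
            throw new IllegalArgumentException("incomplete frame");
        }
        long requestId = buf.getLong();
        byte[] action = new byte[buf.getInt()];
        buf.get(action);
        byte[] payload = new byte[buf.getInt()];
        buf.get(payload);
        return new Message(requestId,
            new String(action, StandardCharsets.UTF_8),
            new String(payload, StandardCharsets.UTF_8));
    }
}
```

The request id travels with the frame so that the sender can later match the response to the pending request, and the action name tells the receiver which handler to look up.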
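Fetching the document data
As analyzed above, both the local node and the remote node ultimately obtain the data through the ShardGetService.get method. Next we analyze how ShardGetService.get works.
The InternalEngine.get method.
As the IndexShard.getEngine method shows, each index shard corresponds to one Engine instance, and every method on Engine is a shard-level operation.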
public GetResult get(
Get get,
MappingLookup mappingLookup,
DocumentParser documentParser,
Function<Engine.Searcher, Engine.Searcher> searcherWrapper
) {
assert Objects.equals(get.uid().field(), IdFieldMapper.NAME) : get.uid().field();
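// Acquire the shard-level read lock; this call blocks while another operation holds the write lock.
// A Get may need to read the latest data from the translog, whereas a flush writes the in-memory segments to disk and trims the translog.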
try (ReleasableLock ignored = readLock.acquire()) {
ensureOpen();
// Reads are realtime by default
if (get.realtime()) {
final VersionValue versionValue;
// Acquire the per-document lock on the version map
try (Releasable ignore = versionMap.acquireLock(get.uid().bytes())) {
// While holding the lock, fetch the latest version recorded for this document
versionValue = getVersionFromMap(get.uid().bytes());
}
if (versionValue != null) {
if (versionValue.isDelete()) {
// The latest version of the document is a delete
return GetResult.NOT_EXISTS;
}
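// Version conflicts are possible under concurrency
// Check, according to the version type, whether the current version conflicts with the expected version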
if (get.versionType().isVersionConflictForReads(versionValue.version, get.version())) {
throw new VersionConflictEngineException(
shardId,
"[" + get.id() + "]",
get.versionType().explainConflictForReads(versionValue.version, get.version())
);
}
if (get.getIfSeqNo() != SequenceNumbers.UNASSIGNED_SEQ_NO
&& (get.getIfSeqNo() != versionValue.seqNo || get.getIfPrimaryTerm() != versionValue.term)) {
throw new VersionConflictEngineException(
shardId,
get.id(),
get.getIfSeqNo(),
get.getIfPrimaryTerm(),
versionValue.seqNo,
versionValue.term
);
}
// By default the latest data is read from the translog; isReadFromTranslog has the same value as the realtime property.
if (get.isReadFromTranslog()) {
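// An update is first written to the translog; only the next refresh (by default after one second) makes it visible to the searcher. Within that window the searcher still sees the old state, so the latest data has to be read from the translog.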
if (versionValue.getLocation() != null) {
try {
final Translog.Operation operation = translog.readOperation(versionValue.getLocation());
if (operation != null) {
// Return the latest data from the translog
return getFromTranslog(get, (Translog.Index) operation, mappingLookup, documentParser, searcherWrapper);
}
} catch (IOException e) {
maybeFailEngine("realtime_get", e); // lets check if the translog has failed with a tragic event
throw new EngineException(shardId, "failed to read operation from translog", e);
}
} else {
trackTranslogLocation.set(true);
}
}
assert versionValue.seqNo >= 0 : versionValue;
// If the data was not returned from the translog, refresh so that it becomes visible to reads
refreshIfNeeded("realtime_get", versionValue.seqNo);
}
// Read the shard data through the Searcher's reader
return getFromSearcher(get, acquireSearcher("realtime_get", SearcherScope.INTERNAL, searcherWrapper), false);
} else {
// we expose what has been externally expose in a point in time snapshot via an explicit refresh
return getFromSearcher(get, acquireSearcher("get", SearcherScope.EXTERNAL, searcherWrapper), false);
}
}
}
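The realtime branch is easier to follow in a toy model. The sketch below is an assumption-heavy simplification: RealtimeGetSketch and its maps are hypothetical, the translog is a plain list, and version checks, locking, and the refreshIfNeeded fallback are left out. It only shows why a realtime Get can see a document before a refresh while a non-realtime Get cannot.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a realtime Get; every name here is hypothetical, not the Elasticsearch API.
class RealtimeGetSketch {
    // Newest state of a document since the last refresh.
    private static final class VersionValue {
        final boolean deleted;
        final int translogLocation; // index into the translog list, -1 for deletes

        VersionValue(boolean deleted, int translogLocation) {
            this.deleted = deleted;
            this.translogLocation = translogLocation;
        }
    }

    private final List<String> translog = new ArrayList<>();
    private final Map<String, VersionValue> versionMap = new HashMap<>();
    private final Map<String, String> unrefreshed = new HashMap<>(); // null value = pending delete
    private final Map<String, String> searchable = new HashMap<>();  // what the searcher sees

    void index(String id, String source) {
        translog.add(source);
        versionMap.put(id, new VersionValue(false, translog.size() - 1));
        unrefreshed.put(id, source);
    }

    void delete(String id) {
        versionMap.put(id, new VersionValue(true, -1));
        unrefreshed.put(id, null);
    }

    // Makes pending changes visible to the searcher and clears the version map.
    void refresh() {
        for (Map.Entry<String, String> e : unrefreshed.entrySet()) {
            if (e.getValue() == null) {
                searchable.remove(e.getKey());
            } else {
                searchable.put(e.getKey(), e.getValue());
            }
        }
        unrefreshed.clear();
        versionMap.clear();
    }

    // Returns the document source, or null if the document does not exist.
    String get(String id, boolean realtime) {
        if (realtime) {
            VersionValue v = versionMap.get(id);
            if (v != null) {
                if (v.deleted) {
                    return null; // corresponds to GetResult.NOT_EXISTS
                }
                return translog.get(v.translogLocation);
            }
        }
        return searchable.get(id);
    }
}
```

Usage: after index("1", "v1") a call to get("1", true) returns "v1" immediately, while get("1", false) returns null until refresh() has run.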
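When checking for concurrent version conflicts, the source performs two checks:
- version
- ifSeqNo and ifPrimaryTerm
The values involved differ in scope:
- version applies to a single document: every modification of the document (including deletion) increments it, version = version + 1.
- ifSeqNo is compared with the sequence number, which is scoped to a single shard of the index: every modification (including deletion) of any document in the shard increments it, seqNo = seqNo + 1.
- ifPrimaryTerm is compared with the primary term, which increments each time the primary shard changes, for example after a node goes offline and restarts or when a failover elects a new primary shard.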
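The three counters and the second check can be sketched as follows. The conflict condition is copied from the source shown earlier; everything else (the SeqNoCheckSketch class, the long[] triple, the starting values, and the constant standing in for SequenceNumbers.UNASSIGNED_SEQ_NO) is an assumption for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy shard that hands out version, seqNo and primaryTerm; names and start values are hypothetical.
class SeqNoCheckSketch {
    // Assumption: stands in for the "not set" marker SequenceNumbers.UNASSIGNED_SEQ_NO.
    static final long UNASSIGNED_SEQ_NO = -2L;

    private long nextSeqNo = 0;   // shard-wide counter
    private long primaryTerm = 1; // bumped whenever a new primary takes over
    private final Map<String, long[]> docs = new HashMap<>(); // id -> {version, seqNo, primaryTerm}

    // Index or update a document and return its {version, seqNo, primaryTerm}.
    long[] index(String id) {
        long[] old = docs.get(id);
        long version = old == null ? 1 : old[0] + 1; // per-document counter
        long[] current = { version, nextSeqNo++, primaryTerm };
        docs.put(id, current);
        return current;
    }

    void promoteNewPrimary() {
        primaryTerm++;
    }

    // Same condition as the ifSeqNo/ifPrimaryTerm check in the Get path.
    static boolean isConflict(long ifSeqNo, long ifPrimaryTerm, long seqNo, long term) {
        return ifSeqNo != UNASSIGNED_SEQ_NO && (ifSeqNo != seqNo || ifPrimaryTerm != term);
    }
}
```

Note that a write to a different document in the same shard advances the sequence number but leaves this document's version and its own recorded seqNo unchanged, which is why the check compares against the values stored with the document.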
InternalEngine.getFromSearcher
protected final GetResult getFromSearcher(Get get, Engine.Searcher searcher, boolean uncachedLookup) throws EngineException {
final DocIdAndVersion docIdAndVersion;
try {
if (uncachedLookup) {
// Do not cache the PerThreadIDVersionAndSeqNoLookup
docIdAndVersion = VersionsAndSeqNoResolver.loadDocIdAndVersionUncached(searcher.getIndexReader(), get.uid(), true);
} else {
docIdAndVersion = VersionsAndSeqNoResolver.loadDocIdAndVersion(searcher.getIndexReader(), get.uid(), true);
}
} catch (Exception e) {
Releasables.closeWhileHandlingException(searcher);
// TODO: A better exception goes here
throw new EngineException(shardId, "Couldn't resolve version", e);
}
// Concurrent version conflict check
if (docIdAndVersion != null) {
if (get.versionType().isVersionConflictForReads(docIdAndVersion.version, get.version())) {
Releasables.close(searcher);
throw new VersionConflictEngineException(
shardId,
"[" + get.id() + "]",
get.versionType().explainConflictForReads(docIdAndVersion.version, get.version())
);
}
if (get.getIfSeqNo() != SequenceNumbers.UNASSIGNED_SEQ_NO
&& (get.getIfSeqNo() != docIdAndVersion.seqNo || get.getIfPrimaryTerm() != docIdAndVersion.primaryTerm)) {
Releasables.close(searcher);
throw new VersionConflictEngineException(
shardId,
get.id(),
get.getIfSeqNo(),
get.getIfPrimaryTerm(),
docIdAndVersion.seqNo,
docIdAndVersion.primaryTerm
);
}
}
if (docIdAndVersion != null) {
// don't release the searcher on this path, it is the
// responsibility of the caller to call GetResult.release
return new GetResult(searcher, docIdAndVersion);
} else {
Releasables.close(searcher);
return GetResult.NOT_EXISTS;
}
}
The VersionsAndSeqNoResolver.loadDocIdAndVersion method.
It reads the document's version, seqNo and primaryTerm through a LeafReader, then builds and returns a DocIdAndVersion instance.
public static DocIdAndVersion loadDocIdAndVersion(IndexReader reader, Term term, boolean loadSeqNo) throws IOException {
// Each LeafReaderContext has its own PerThreadIDVersionAndSeqNoLookup. The array of lookups is cached in a ThreadLocal so that it is not re-created on every call (for example during an mget).
PerThreadIDVersionAndSeqNoLookup[] lookups = getLookupState(reader, term.field());
List<LeafReaderContext> leaves = reader.leaves();
// Iterate from last to first: the most recent document updates usually live in the last segment
for (int i = leaves.size() - 1; i >= 0; i--) {
final LeafReaderContext leaf = leaves.get(i);
PerThreadIDVersionAndSeqNoLookup lookup = lookups[leaf.ord];
// Read the document's version, seqNo and primaryTerm through the LeafReader, then build the DocIdAndVersion
DocIdAndVersion result = lookup.lookupVersion(term.bytes(), loadSeqNo, leaf);
if (result != null) {
return result;
}
}
return null;
}
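The newest-segment-first loop can be illustrated with a toy model. This is only a sketch under assumptions: SegmentLookupSketch and Segment are hypothetical, each segment is a map from id to version plus a set of deleted ids, and an update is modeled as "mark the old copy deleted, add the new copy to a newer segment".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of per-segment id lookup; names are hypothetical, not Lucene or Elasticsearch types.
class SegmentLookupSketch {
    static final class Segment {
        final Map<String, Long> versions = new HashMap<>(); // doc id -> version stored in this segment
        final Set<String> deleted = new HashSet<>();        // ids whose copy here is no longer live
    }

    int segmentsVisited; // how many segments the last lookup touched

    // Walk the segments from newest to oldest and return the first live copy's version.
    Long lookupVersion(List<Segment> segments, String id) {
        segmentsVisited = 0;
        for (int i = segments.size() - 1; i >= 0; i--) {
            segmentsVisited++;
            Segment segment = segments.get(i);
            Long version = segment.versions.get(id);
            if (version != null && !segment.deleted.contains(id)) {
                return version;
            }
        }
        return null; // the id is not live in any segment
    }
}
```

A recently updated document is found after visiting a single segment, whereas a document that has not changed for a long time costs a walk back to the older segments.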
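IndexReader and IndexReaderContext
IndexReader is an abstract class that provides an interface for accessing index data. Any changes made to the index are not visible until a new IndexReader is opened.
An IndexReader instance is usually constructed by calling the DirectoryReader.open(Directory) method.
There are two types of IndexReader:
- LeafReader: LeafReader is an abstract class that provides an interface for accessing an index. Searching an index is done entirely through this abstract interface, so any subclass that implements it is searchable. It supports retrieving stored fields, doc values, terms, and postings.
- CompositeReader: CompositeReader is also an abstract class. It is composed of multiple LeafReader sub-readers and cannot retrieve postings directly; retrieval has to go through the LeafReader sub-readers. All of them can be obtained through IndexReader.leaves().
Note: IndexReader instances are completely thread safe, which means multiple threads can call any of their methods concurrently.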