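1. Code version
This source analysis is based on Impala 4.2.0.
2. Core execution path
This section only traces the main call path through the Impala BE and FE. SQL parsing, semantic analysis, and query plan generation in the FE, and resource scheduling and query task execution in the BE, are analyzed in later sections.
3. Main process entry point
be/src/service/daemon-main.cc is the program entry point. impalad, catalogd, and statestored are the same binary; the role to start is chosen from the name the binary was invoked as (the file name component of argv[0]), and admissiond is dispatched the same way. The rest of this analysis follows the impalad execution path.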
int main(int argc, char** argv) {
  path cmd_line_path(argv[0]);
  string daemon = cmd_line_path.filename().string();
  if (daemon == "statestored") {
    return StatestoredMain(argc, argv);
  }
  if (daemon == "impalad") {
    return ImpaladMain(argc, argv);
  }
  if (daemon == "catalogd") {
    return CatalogdMain(argc, argv);
  }
  if (daemon == "admissiond") {
    return AdmissiondMain(argc, argv);
  }
  cerr << "Unknown daemon name: " << daemon
       << " (valid options: impalad, catalogd, statestored)" << endl;
  exit(1);
}
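The dispatch above depends only on the last path component of argv[0]. The following is a minimal, self-contained sketch of that idea, not Impala code: the `Role` enum and the `BaseName` and `RoleFromArgv0` helpers are hypothetical names introduced for illustration, and `std::string_view` stands in for the `path::filename()` call used in the original.

```cpp
#include <string_view>

// Hypothetical role identifiers for this sketch; these are not Impala types.
enum class Role { kImpalad, kCatalogd, kStatestored, kAdmissiond, kUnknown };

// Returns the final path component of argv[0], e.g. "impalad" for
// "/opt/impala/bin/impalad". Stand-in for path::filename() in the original.
constexpr std::string_view BaseName(std::string_view argv0) {
  const auto slash = argv0.rfind('/');
  return slash == std::string_view::npos ? argv0 : argv0.substr(slash + 1);
}

// Maps the invoked binary name to a role; any other name is kUnknown,
// which corresponds to the "Unknown daemon name" branch in main().
constexpr Role RoleFromArgv0(std::string_view argv0) {
  const std::string_view name = BaseName(argv0);
  if (name == "impalad") return Role::kImpalad;
  if (name == "catalogd") return Role::kCatalogd;
  if (name == "statestored") return Role::kStatestored;
  if (name == "admissiond") return Role::kAdmissiond;
  return Role::kUnknown;
}

// Compile-time usage check: the directory part of the path is ignored.
static_assert(RoleFromArgv0("/opt/impala/bin/impalad") == Role::kImpalad);
```

One consequence of this design is that a single build artifact can be installed under several names (for example through symlinks or copies), and each name starts a different service.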
4. Service startup flow
sequenceDiagram
daemon-main.cc->>impalad-main.cc: ImpaladMain()
impalad-main.cc->>init.cc: InitCommonRuntime(argc,argv,init_jvm=true)
init.cc->>jni-util.cc: InitLibhdfs() <br/>creates the FE JVM by calling an HDFS (libhdfs) method
impalad-main.cc->>impala-server.cc:Start()
impala-server.cc->> exec-env.cc: StartStatestoreSubscriberService()<br/>subscribes to statestore updates
impala-server.cc->> exec-env.cc: StartKrpcService()<br/>starts the RPC service used between cluster nodes
impala-server.cc->> impala-server.cc: starts the service listeners
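The startup flow is an ordered chain of steps. The sketch below models that chain under one stated assumption: each step reports a status and startup stops at the first failure. `Status`, `StartupStep`, and `RunStartup` are hypothetical stand-ins written for this example and are not Impala's types; only the step names come from the diagram.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Minimal stand-in for a status type; hypothetical, not impala::Status.
struct Status {
  bool ok = true;
  std::string msg;
};

// One named startup step, e.g. "StartKrpcService".
struct StartupStep {
  std::string name;
  std::function<Status()> run;
};

// Runs the steps in order and returns the first failure, prefixed with the
// step name. Names of steps that completed are appended to *completed, so a
// later step never runs after an earlier one failed.
Status RunStartup(const std::vector<StartupStep>& steps,
                  std::vector<std::string>* completed) {
  assert(completed != nullptr);
  for (const StartupStep& step : steps) {
    const Status s = step.run();
    if (!s.ok) return Status{false, step.name + ": " + s.msg};
    completed->push_back(step.name);
  }
  return Status{};
}
```

In this model, a failure in StartStatestoreSubscriberService would prevent StartKrpcService and the client listeners from starting, which matches the top-to-bottom order of the diagram.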
5. Request handling flow
sequenceDiagram
cmd client ->> impala-server : ExecuteStatement()
impala-server ->> impala-server : ExecuteStatementCommon()
impala-server ->> impala-server : Execute()
impala-server ->> impala-server :ExecuteInternal()
impala-server ->> QueryDriver : RunFrontendPlanner()
QueryDriver ->> frontend.cc : GetExecRequest()
frontend.cc ->> JniFrontend: createExecRequest()
JniFrontend ->> Frontend.java : createExecRequest()<br/>the FE generates the query plan
impala-server ->> ClientRequestState :set_result_metadata() sets the result header
impala-server ->> ClientRequestState : Exec()
ClientRequestState ->> ClientRequestState :ExecQueryOrDmlRequest()
ClientRequestState ->>AdmissionControlClient :SubmitForAdmission()
ClientRequestState ->> Coordinator : Exec()
impala-hs2-server ->> ClientRequestState : WaitAsync() waits for the results to come back
impala-hs2-server ->> impala-hs2-server : FetchInternal()
impala-hs2-server ->> ClientRequestState: FetchRows()
ClientRequestState ->> ClientRequestState :FetchRowsInternal()
ClientRequestState ->> Coordinator :GetNext()
Coordinator ->> PlanRootSink:GetNext()
impala-hs2-server ->> impala-server : WaitForResults()
impala-server ->> ClientRequestState : BlockOnWait() waits for the results
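The fetch half of the diagram (FetchRows() down to PlanRootSink GetNext()) is a pull model: each fetch asks the layer below for the next batch of rows. The sketch below illustrates that pattern in isolation. `RootSink` and `FetchAll` are hypothetical, simplified stand-ins; the real classes have different signatures and also handle threading, timeouts, and row-batch formats, all of which are left out here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the root sink that holds the final result rows.
class RootSink {
 public:
  explicit RootSink(std::vector<std::string> rows) : rows_(std::move(rows)) {}

  // Appends up to max_rows rows to *out and sets *eos to true once every
  // row has been handed out.
  void GetNext(std::size_t max_rows, std::vector<std::string>* out, bool* eos) {
    assert(out != nullptr && eos != nullptr);
    const std::size_t n = std::min(max_rows, rows_.size() - next_);
    out->insert(out->end(), rows_.begin() + next_, rows_.begin() + next_ + n);
    next_ += n;
    *eos = (next_ == rows_.size());
  }

 private:
  std::vector<std::string> rows_;
  std::size_t next_ = 0;  // Index of the first row not yet returned.
};

// Hypothetical client loop: keeps pulling fixed-size batches until eos.
std::vector<std::string> FetchAll(RootSink* sink, std::size_t batch_size) {
  assert(sink != nullptr && batch_size > 0);
  std::vector<std::string> all;
  bool eos = false;
  while (!eos) sink->GetNext(batch_size, &all, &eos);
  return all;
}
```

With three buffered rows and a batch size of two, the first call returns two rows with eos unset and the second returns the last row with eos set, so the caller decides how much it pulls per round trip.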
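The BE request-handling call stack is as follows. GetExecRequest makes a JNI call, and at that point execution enters the FE, which generates the execution plan.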
Status Frontend::GetExecRequest(
    const TQueryCtx& query_ctx, TExecRequest* result) {
  return JniUtil::CallJniMethod(fe_, create_exec_request_id_, query_ctx, result);
}
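The call above takes a typed request and fills in a typed result, even though the BE and the FE run in different language runtimes. A common shape for such a boundary, and an assumption about what CallJniMethod does rather than something shown in this excerpt, is: serialize the request to bytes, invoke the method on the other side, then deserialize the returned bytes. The sketch below models only that shape, without a JVM. `Bytes`, `ForeignMethod`, `QueryCtx`, `ExecRequest`, `Serialize`, `Deserialize`, and `CallAcrossBoundary` are all hypothetical names, and the "serialization" is a plain string copy.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical byte-oriented boundary standing in for the JNI call into the FE.
using Bytes = std::string;
using ForeignMethod = std::function<Bytes(const Bytes&)>;

struct QueryCtx { std::string sql; };      // Stand-in for TQueryCtx.
struct ExecRequest { std::string plan; };  // Stand-in for TExecRequest.

// Toy serialization: a real implementation would use a wire format.
Bytes Serialize(const QueryCtx& ctx) { return ctx.sql; }
ExecRequest Deserialize(const Bytes& bytes) { return ExecRequest{bytes}; }

// Serializes the request, calls across the boundary, and deserializes the
// reply into *result. Returns false if no method is bound.
bool CallAcrossBoundary(const ForeignMethod& method, const QueryCtx& ctx,
                        ExecRequest* result) {
  assert(result != nullptr);
  if (!method) return false;
  *result = Deserialize(method(Serialize(ctx)));
  return true;
}
```

Passing bytes rather than objects keeps the two sides decoupled: the caller only needs a method handle (here `fe_` and `create_exec_request_id_` play that role in the original) and a message format that both sides agree on.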