Presto - Coordinator Query Flow, Part 1


1. Overview of the Presto Coordinator flow

Let's start with the overall flow of query execution on the Coordinator. (Figure: overall Coordinator execution flow.)

  1. The client submits the SQL to the Coordinator's QueuedStatementResource through the RESTful endpoint /v1/statement.

  2. QueuedStatementResource builds a Query object, caches it in the queries map, and returns a nextUri to the client.

  3. After receiving nextUri, the client calls the queued/{queryId}/{slug}/{token} endpoint to ask the Coordinator for the state of the Query.

  4. The Coordinator looks up the Query object that was cached in queries in step 2 and starts the actual execution process.

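  5. Syntax analysis: the statement is parsed with ANTLR4, and the result is an AST (the AST is also a tree; each node is a Statement object, and Statement does not distinguish between node types and attributes. For the input statement select field from table, select and field are treated as peers).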

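  6. Semantic analysis: the Analyzer first rewrites the statement (explain, describe, show), then StatementAnalyzer performs semantic analysis on it. This covers validation, such as whether the catalog, schema, and table exist and whether the where, select, group by, order by, offset, limit, and window clauses are legal. Semantic analysis produces an Analysis object that holds the analysis results together with the statement.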

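  7. Building and optimizing the logical plan: LogicalPlanner builds a LogicalPlan from the Analysis object (the LogicalPlan is the in-code counterpart of a relational expression, with nodes such as Project, Filter, and Scan, and the nodes also carry Symbol attributes). Optimizing the logical plan includes rule-based optimization and statistics-based optimization; the most common optimizations are projection pushdown and filter pushdown.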

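  8. Fragmenting the logical plan (PlanFragmenter): a LogicalPlan only expresses, logically, how to compute the result. Actual execution happens on Workers, and for speed a query plan is usually executed by many Workers together; the plan is fragmented so that the LogicalPlan can run on multiple machines. The resulting distributed logical plan, SubPlan, splits the original LogicalPlan into fragments that can each run on a single Worker, connected by ExchangeNodes (which exchange data).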

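  9. Combining the plan with data source information to produce a StageExecutionPlan: the LogicalPlan ultimately has to run against real data, so DistributedExecutionPlanner binds the logical plan to the splits of the data source and plans which splits will be read.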

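  10. Wrapping the logical plan in schedulable objects and scheduling the distributed logical plan (scheduling here means submitting the plan to worker nodes, not actually executing it).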
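
Steps 1 to 4 can be sketched from the client's side as a loop: submit once, then keep following nextUri until a response no longer carries one. The following is a minimal sketch under assumptions, not Presto code: `PollingClientSketch`, `run`, and `fakeCoordinator` are hypothetical names, the transport is a plain function instead of HTTP, the slug is a fixed placeholder, and the transition from the queued endpoint to the executing endpoint is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class PollingClientSketch
{
    // Follows nextUri until the server stops returning one; returns every URI that was visited.
    static List<String> run(String firstUri, Function<String, Optional<String>> transport)
    {
        List<String> visited = new ArrayList<>();
        Optional<String> next = Optional.of(firstUri);
        while (next.isPresent()) {
            String uri = next.get();
            visited.add(uri);
            next = transport.apply(uri);
        }
        return visited;
    }

    // Hypothetical stand-in for the Coordinator: the initial submit yields token 1,
    // and each poll of the queued endpoint yields the next token until token 3.
    static Optional<String> fakeCoordinator(String uri)
    {
        String prefix = "/v1/statement/queued/q1/slug/";
        if (uri.equals("/v1/statement")) {
            return Optional.of(prefix + 1);
        }
        long token = Long.parseLong(uri.substring(prefix.length()));
        return token < 3 ? Optional.of(prefix + (token + 1)) : Optional.empty();
    }

    public static void main(String[] args)
    {
        System.out.println(run("/v1/statement", PollingClientSketch::fakeCoordinator));
    }
}
```

The point of the sketch is that the client never builds URIs itself; it only follows what the server hands back, which is why the server can encode the token and slug however it likes.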

2. Submitting the statement

2.1 Entry point

The client submits the query to QueuedStatementResource#postStatement. The core code is:

public Response postStatement(
        String statement,
        @Context HttpServletRequest servletRequest,
        @Context HttpHeaders httpHeaders,
        @Context UriInfo uriInfo)
{
    // Build the session context
    String remoteAddress = servletRequest.getRemoteAddr();
    Optional<Identity> identity = Optional.ofNullable((Identity) servletRequest.getAttribute(AUTHENTICATED_IDENTITY));
    MultivaluedMap<String, String> headers = httpHeaders.getRequestHeaders();
    SessionContext sessionContext = new HttpRequestSessionContext(headers, alternateHeaderName, remoteAddress, identity, groupProvider);
    
    // Create a Query object and cache it in queries
    Query query = new Query(statement, sessionContext, dispatchManager);
    queries.put(query.getQueryId(), query);

    servletRequest.setAttribute(AUTHENTICATED_IDENTITY, null);
    // Build the response that carries nextUri and return it to the client
    return createQueryResultsResponse(query.getQueryResults(query.getLastToken(), uriInfo), compressionEnabled);
}

2.2 Query#getQueryResults

Next, look at Query#getQueryResults:

public QueryResults getQueryResults(long token, UriInfo uriInfo)
{
    ... 
    synchronized (this) {
        // If the query submission has not finished, return a "queued" result
        if (querySubmissionFuture == null || !querySubmissionFuture.isDone()) {
            return createQueryResults(
                    token + 1,
                    uriInfo,
                    DispatchInfo.queued(NO_DURATION, NO_DURATION));
        }
    }
    ...
    // Otherwise, return the result built from the actual dispatch info
    return createQueryResults(token + 1, uriInfo, dispatchInfo.get());
}

2.3 QueuedStatementResource#createQueryResults

Moving on to QueuedStatementResource#createQueryResults, we can see that it obtains nextUri and returns it to the client:

private QueryResults createQueryResults(long token, UriInfo uriInfo, DispatchInfo dispatchInfo)
{   // The generated nextUri, for example: http://localhost:8080/v1/statement/queued/20220421_024338_00005_d8sfy/ydb35816e9373477bf403fb4e4796242165e8a17f/1
    URI nextUri = getNextUri(token, uriInfo, dispatchInfo);

    Optional<QueryError> queryError = dispatchInfo.getFailureInfo()
            .map(this::toQueryError);
    // Build the result and return it
    return QueuedStatementResource.createQueryResults(
            queryId,
            nextUri,
            queryError,
            uriInfo,
            dispatchInfo.getElapsedTime(),
            dispatchInfo.getQueuedTime());
}

2.4 QueuedStatementResource#getQueuedUri

Finally, here is how nextUri is generated. As the code below shows, it has the form /v1/statement/queued/{queryId}/{slug}/{token}:

private static URI getQueuedUri(QueryId queryId, Slug slug, long token, UriInfo uriInfo)
{
    return uriInfo.getBaseUriBuilder()
            .replacePath("/v1/statement/queued/")
            .path(queryId.toString())
            .path(slug.makeSlug(QUEUED_QUERY, token))
            .path(String.valueOf(token))
            .replaceQuery("")
            .build();
}
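
To make the shape of the URI concrete, here is a minimal sketch that assembles it with the JDK only. The slug scheme is an assumption for illustration: the source does not show how Slug#makeSlug works, so `makeSlug` below is a hypothetical stand-in that derives a "y"-prefixed HMAC-SHA1 hex string from the token (chosen only because it has the same shape as the example URI above). `QueuedUriSketch`, `queuedUri`, and the `key` parameter are likewise hypothetical.

```java
import java.net.URI;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class QueuedUriSketch
{
    // Hypothetical slug: "y" plus the hex HMAC-SHA1 of the token, so the slug changes with every token.
    static String makeSlug(byte[] key, long token)
    {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            byte[] digest = mac.doFinal(ByteBuffer.allocate(Long.BYTES).putLong(token).array());
            StringBuilder slug = new StringBuilder("y");
            for (byte b : digest) {
                slug.append(String.format("%02x", b));
            }
            return slug.toString();
        }
        catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assembles <base>/v1/statement/queued/{queryId}/{slug}/{token}.
    static URI queuedUri(String base, String queryId, byte[] key, long token)
    {
        return URI.create(base + "/v1/statement/queued/" + queryId + "/" + makeSlug(key, token) + "/" + token);
    }
}
```

Under this assumed scheme a client cannot derive the URI for token 2 from the URI for token 1 without the key, which is one plausible reason for putting a slug in the path at all.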

3. The client polls the Coordinator for the Query status (which triggers execution of the Query)

3.1 Entry point for the status request

When the status is requested, the handler calls query.waitForDispatched() and bounds the returned future with a timeout scheduled on timeoutExecutor:

public void getStatus(
        @PathParam("queryId") QueryId queryId,
        @PathParam("slug") String slug,
        @PathParam("token") long token,
        @QueryParam("maxWait") Duration maxWait,
        @Context UriInfo uriInfo,
        @Suspended AsyncResponse asyncResponse)
{
    // Look up the Query object in queries by queryId
    Query query = getQuery(queryId, slug, token);

    // wait for query to be dispatched, up to the wait timeout
    ListenableFuture<?> futureStateChange = addTimeout(
            query.waitForDispatched(), // wait for the query to be dispatched
            () -> null,
            WAIT_ORDERING.min(MAX_WAIT_TIME, maxWait),
            timeoutExecutor);
    ...
}
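
The effect of wrapping the dispatch future in a timeout can be sketched with the JDK's CompletableFuture. This is an analogy under assumptions, not the addTimeout helper used above: `StatusTimeoutSketch`, `effectiveWait`, and `withTimeout` are hypothetical names, the 1 second cap is an assumed value, and treating a missing maxWait as "use the cap" is also an assumption.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class StatusTimeoutSketch
{
    // Assumed server-side cap on how long one status request may block.
    static final Duration MAX_WAIT_TIME = Duration.ofSeconds(1);

    // Picks the smaller of the client's maxWait and the cap; a missing maxWait falls back to the cap.
    static Duration effectiveWait(Duration maxWait)
    {
        if (maxWait == null || maxWait.compareTo(MAX_WAIT_TIME) > 0) {
            return MAX_WAIT_TIME;
        }
        return maxWait;
    }

    // Returns a future that completes with the dispatch result, or with null once the wait expires.
    // copy() keeps the timeout from completing the original dispatch future.
    static <T> CompletableFuture<T> withTimeout(CompletableFuture<T> dispatched, Duration maxWait)
    {
        return dispatched.copy()
                .completeOnTimeout(null, effectiveWait(maxWait).toMillis(), TimeUnit.MILLISECONDS);
    }
}
```

Either way the request returns within the bounded wait: if dispatch finished, the client gets the new state; if not, it gets another queued response and polls again.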

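3.2 Query#waitForDispatched

Looking at Query#waitForDispatched, execution of the query is delegated to DispatchManager. If the query has not been created yet, it is created with DispatchManager#createQuery, and the method then waits for the Query to be dispatched: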

private ListenableFuture<?> waitForDispatched()
{
    // if query submission has not finished, wait for it to finish
    synchronized (this) {
        if (querySubmissionFuture == null) {
            querySubmissionFuture = dispatchManager.createQuery(queryId, slug, sessionContext, query);
        }
        if (!querySubmissionFuture.isDone()) {
            return querySubmissionFuture;
        }
    }

    // otherwise, wait for the query to finish
    return dispatchManager.waitForDispatched(queryId);
}
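
The first half of this method is a "submit at most once" pattern: whichever status request arrives first triggers the submission, and later requests reuse the same future. Below is a minimal sketch of just that pattern; `LazySubmissionSketch`, `submitter`, and `waitForSubmission` are hypothetical names, and the supplier merely stands in for the call to dispatchManager.createQuery.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class LazySubmissionSketch
{
    // Stands in for the call that actually submits the query.
    private final Supplier<CompletableFuture<Void>> submitter;
    // Guarded by "this"; null until the first caller triggers submission.
    private CompletableFuture<Void> submission;

    LazySubmissionSketch(Supplier<CompletableFuture<Void>> submitter)
    {
        this.submitter = submitter;
    }

    // The first caller triggers submission; every later caller gets the same future back.
    synchronized CompletableFuture<Void> waitForSubmission()
    {
        if (submission == null) {
            submission = submitter.get();
        }
        return submission;
    }
}
```

Because the check and the assignment happen under the same lock, two concurrent status requests cannot both submit the query.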

3.3 DispatchManager#createQuery

DispatchManager#createQuery hands the work off asynchronously to createQueryInternal:

public ListenableFuture<?> createQuery(QueryId queryId, Slug slug, SessionContext sessionContext, String query)
{
    DispatchQueryCreationFuture queryCreationFuture = new DispatchQueryCreationFuture(); // lets the calling thread wait for the dispatch thread
    dispatchExecutor.execute(() -> {
        try {
            createQueryInternal(queryId, slug, sessionContext, query, resourceGroupManager);
        }
        finally {
            queryCreationFuture.set(null);
        }
    });
    return queryCreationFuture;
}
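
The try/finally here guarantees that the future is completed even if creation throws, so a waiting caller is never left hanging. A minimal sketch of the same pattern with JDK types follows; `CreationFutureSketch` and `submit` are hypothetical names, and how failures inside the work are reported to the client is outside the scope of the sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

class CreationFutureSketch
{
    // Runs the work on the executor and always completes the future,
    // whether the work returns normally or throws.
    static CompletableFuture<Void> submit(Executor executor, Runnable work)
    {
        CompletableFuture<Void> created = new CompletableFuture<>();
        executor.execute(() -> {
            try {
                work.run();
            }
            finally {
                // Signals "creation attempt finished", not "creation succeeded".
                created.complete(null);
            }
        });
        return created;
    }
}
```

Note that the future only tells the waiter that the attempt is over; the outcome itself has to be read from somewhere else, which in the source is the dispatch state of the query.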

3.4 DispatchManager#createQueryInternal

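Now look at DispatchManager#createQueryInternal (code unrelated to the main flow has been removed). It first parses the query through QueryPreparer#prepareQuery to obtain preparedQuery. preparedQuery is the query after syntax parsing; it contains a Statement object, and that statement is the parsed AST: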

private <C> void createQueryInternal(QueryId queryId, Slug slug, SessionContext sessionContext, String query, ResourceGroupManager<C> resourceGroupManager)
{
    ...
    // Prepare the query: parse the SQL text into a PreparedQuery
    preparedQuery = queryPreparer.prepareQuery(session, query);
    ...
    // Create the dispatchQuery and submit it to resourceGroupManager for execution
    DispatchQuery dispatchQuery = dispatchQueryFactory.createDispatchQuery(
                    session,
                    query,
                    preparedQuery,
                    slug,
                    selectionContext.getResourceGroupId());
    ...
    resourceGroupManager.submit(dispatchQuery, selectionContext, dispatchExecutor);
    ...
}

public static class PreparedQuery
{
    private final Statement statement; // the parsed syntax tree (AST)
    private final List<Expression> parameters; 
    private final Optional<String> prepareSql;
}

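3.5 QueryPreparer#prepareQuery

In QueryPreparer#prepareQuery, the parsing work is handed to SqlParser#createStatement, so at this point execution has reached the syntax-parsing stage. The next part continues with how the syntax tree is generated.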

public PreparedQuery prepareQuery(Session session, String query)
        throws ParsingException, TrinoException
{
    Statement wrappedStatement = sqlParser.createStatement(query, createParsingOptions(session));
    return prepareQuery(session, wrappedStatement);
}