GaussDB-Stream方式的Hint

37 阅读4分钟

GaussDB-Stream方式的Hint

功能描述

指明stream使用的方法,可以为broadcast和redistribute,或者直接指定生成gather计划。

语法格式

| ``` [no] broadcast|redistribute|local_roundrobin( [@queryblock] table_list) gather( [@queryblock] REL|JOIN|ALL)

| -------------------------------------------------------------------------------------------------------------------- |

#### 参数说明

-   @queryblock请参见[指定Hint所处的查询块Queryblock](https://doc.hcs.huawei.com/db/zh-cn/gaussdbqlh/24.1.30/devg-dist/gaussdb-12-0272.html#ZH-CN_TOPIC_0000001911585745)章节,可省略,表示在当前查询块生效。

-   broadcast、redistribute和local_roundrobin表示数据分布方法。

    -   no表示hint的stream方式不使用。
    -   table_list为进行stream操作的单表或多表join结果集,请参见[参数说明](https://doc.hcs.huawei.com/db/zh-cn/gaussdbqlh/24.1.30/devg-dist/gaussdb-12-0275.html#ZH-CN_TOPIC_0000001865746472__zh-cn_topic_0000001656059552_section35948678143011)。

<!---->

-   gather

    []()[]()gather hint可以指定三种计划生成方式:

    -   REL:只生成基于基表的gather路径,然后再在CN上执行剩余计划。

    -   JOIN:尽可能生成基于join的gather路径,在能下推的join子计划上面(join下面不包含重分布节点)添加gather路径,剩余计划在CN上执行。对于需要重分布节点的join计划则无法生成基于join的gather路径,会回退生成基于基表的gather路径。

        ![](https://p6-xtjj-sign.byteimg.com/tos-cn-i-73owjymdk6/4a1f7aa7a4154a6ea02ea3930671923c~tplv-73owjymdk6-jj-mark-v1:0:0:0:0:5o6Y6YeR5oqA5pyv56S-5Yy6IEAgamVycnl3YW5nMTk4Mw==:q75.awebp?rk3s=f64ab15b&x-expires=1771398322&x-signature=snYuyJTEsdfont%2BxQPfAAPgfgRs%3D)

        在指定hint(JOIN)后,对于分布表和复制表做连接的情况会导致无法生成hint(JOIN)期望的计划,因为优化器已经寻找更优的计划进行替代。

    -   ALL:基于最优方式选择Gather Rel或Gather Join路径。

#### 示例

对[示例](https://doc.hcs.huawei.com/db/zh-cn/gaussdbqlh/24.1.30/devg-dist/gaussdb-12-0271.html#ZH-CN_TOPIC_0000001911666285__zh-cn_topic_0000001656058848_section671421102912)中原语句使用如下hint:

| ```
explain select /*+ no redistribute(store_sales store_returns item store) leading(((store_sales store_returns item store) customer)) */ i_product_name product_name ... 
``` |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |

原计划中,(store_sales store_returns item store)和customer做join时,前者做了重分布,此hint表示禁止前者混合表做重分布,但仍然保持join顺序,则生成计划如下所示:

![](https://p6-xtjj-sign.byteimg.com/tos-cn-i-73owjymdk6/a83b18569fb44b219590c8d55b424d54~tplv-73owjymdk6-jj-mark-v1:0:0:0:0:5o6Y6YeR5oqA5pyv56S-5Yy6IEAgamVycnl3YW5nMTk4Mw==:q75.awebp?rk3s=f64ab15b&x-expires=1771398322&x-signature=e%2BRxUXrJu00VilnQsIqtstQGEY8%3D)

对语句进行Gather Hint指定:

1.  生成基表Gather计划 /* +GATHER(REL)*/。

    ```
    gaussdb=# explain select /*+ GATHER(REL)*/* from t1, t2, t3 where t1.c2 = t2.c2 and t2.c2 = t3.c2;
     id |               operation               | E-rows | E-width | E-costs 
    ----+---------------------------------------+--------+---------+---------
      1 | ->  Hash Join (2,8)                   |     20 |      36 | 44.10
      2 |    ->  Hash Join (3,5)                |     20 |      24 | 29.22
      3 |       ->  Streaming (type: GATHER)    |     20 |      12 | 14.35
      4 |          ->  Seq Scan on t1           |     20 |      12 | 13.13
      5 |       ->  Hash                        |     20 |      12 | 14.35
      6 |          ->  Streaming (type: GATHER) |     20 |      12 | 14.35
      7 |             ->  Seq Scan on t2        |     20 |      12 | 13.13
      8 |    ->  Hash                           |     20 |      12 | 14.35
      9 |       ->  Streaming (type: GATHER)    |     20 |      12 | 14.35
     10 |          ->  Seq Scan on t3           |     20 |      12 | 13.13
    (10 rows)

     Predicate Information (identified by plan id) 
    -----------------------------------------------
       1 --Hash Join (2,8)
             Hash Cond: (t1.c2 = t3.c2)
       2 --Hash Join (3,5)
             Hash Cond: (t1.c2 = t2.c2)
    (4 rows)
    ```

1.  生成可下推计划的Join Gather计划 /*+ GATHER(REL)*/。

    ```
    gaussdb=# explain select /*+ GATHER(JOIN)*/* from t1, t2, t3 where t1.c1 = t2.c1 and t2.c2 = t3.c2;
     id |             operation              | E-rows | E-width | E-costs 
    ----+------------------------------------+--------+---------+---------
      1 | ->  Hash Join (2,7)                |     20 |      36 | 42.37
      2 |    ->  Streaming (type: GATHER)    |     20 |      24 | 27.49
      3 |       ->  Hash Join (4,5)          |     20 |      24 | 26.56
      4 |          ->  Seq Scan on t1        |     20 |      12 | 13.13
      5 |          ->  Hash                  |     21 |      12 | 13.13
      6 |             ->  Seq Scan on t2     |     20 |      12 | 13.13
      7 |    ->  Hash                        |     20 |      12 | 14.35
      8 |       ->  Streaming (type: GATHER) |     20 |      12 | 14.35
      9 |          ->  Seq Scan on t3        |     20 |      12 | 13.13
    (9 rows)

     Predicate Information (identified by plan id) 
    -----------------------------------------------
       1 --Hash Join (2,7)
             Hash Cond: (t2.c2 = t3.c2)
       3 --Hash Join (4,5)
             Hash Cond: (t1.c1 = t2.c1)
    (4 rows)
    ```

1.  生成最优方式的Gather计划 /*+ GATHER(ALL)*/。

    []()[]()会基于最优方式及规则选择GATHER(REL)或者GATHER(JOIN)路径。

    ```
    gaussdb=# explain select /*+ GATHER(ALL)*/* from t1, t2, t3 where t1.c1 = t2.c1 and t2.c2 = t3.c2;
     id |             operation              | E-rows | E-width | E-costs 
    ----+------------------------------------+--------+---------+---------
      1 | ->  Hash Join (2,7)                |     20 |      36 | 42.37
      2 |    ->  Streaming (type: GATHER)    |     20 |      24 | 27.49
      3 |       ->  Hash Join (4,5)          |     20 |      24 | 26.56
      4 |          ->  Seq Scan on t1        |     20 |      12 | 13.13
      5 |          ->  Hash                  |     21 |      12 | 13.13
      6 |             ->  Seq Scan on t2     |     20 |      12 | 13.13
      7 |    ->  Hash                        |     20 |      12 | 14.35
      8 |       ->  Streaming (type: GATHER) |     20 |      12 | 14.35
      9 |          ->  Seq Scan on t3        |     20 |      12 | 13.13
    (9 rows)

     Predicate Information (identified by plan id) 
    -----------------------------------------------
       1 --Hash Join (2,7)
             Hash Cond: (t2.c2 = t3.c2)
       3 --Hash Join (4,5)
             Hash Cond: (t1.c1 = t2.c1)
    (4 rows)
    ```

    local_ronndrobin hint使用:

    ```
    SET enable_fast_query_shipping=off; --关闭FQS优化
    SET query_dop=4; --设置并行度
    gaussdb=# EXPLAIN(costs off) SELECT /*+ local_roundrobin(t2) scandop(t2 1) scandop(t1 4)*/* FROM t1, t2  WHERE t1.c1 = t2.c1;
     id |                         operation
    ----+------------------------------------------------------------
      1 | ->  Streaming (type: GATHER)
      2 |    ->  Streaming(type: LOCAL GATHER dop: 1/4)
      3 |       ->  Nested Loop (4,6)
      4 |          ->  Streaming(type: LOCAL BROADCAST dop: 4/4)
      5 |             ->  Seq Scan on t1
      6 |          ->  Materialize
      7 |             ->  Streaming(type: LOCAL ROUNDROBIN dop: 4/1)
      8 |                ->  Seq Scan on t2
    (8 rows)

     Predicate Information (identified by plan id)
    -----------------------------------------------
       3 --Nested Loop (4,6)
             Join Filter: (t1.c1 = t2.c1)
    (2 rows)
    ```

    可以看到成功选择了local_ronndrobin的数据分布方式,但是使用约束较多(并行度设置,关闭FQS优化)一般不建议使用。

    ![](https://p6-xtjj-sign.byteimg.com/tos-cn-i-73owjymdk6/de3a80e5af4d4da494817e546216b0ab~tplv-73owjymdk6-jj-mark-v1:0:0:0:0:5o6Y6YeR5oqA5pyv56S-5Yy6IEAgamVycnl3YW5nMTk4Mw==:q75.awebp?rk3s=f64ab15b&x-expires=1771398322&x-signature=VYS%2Bxy8tWburDWWgXjbgeJ98DGA%3D)

    local_roundrobin hint只有在表扫描并行度为1的时候才会生效,建议和[scandop hint](https://doc.hcs.huawei.com/db/zh-cn/gaussdbqlh/24.1.30/devg-dist/gaussdb-12-0295.html#ZH-CN_TOPIC_0000001865585608)一起使用。

更多详情请参考GaussDB 文档中心:<https://doc.hcs.huawei.com/db/zh-cn/gaussdbqlh/24.1.30/productdesc/qlh_03_0001.html>