在JOIN中使用jOOQ的隐式连接 ...ON子句从jOOQ 3.11开始，类型安全的隐式JOIN ，并且在jOOQ 3

从jOOQ 3.11开始，类型安全的隐式JOIN ，并且在jOOQ 3.17的DML语句中也得到了支持。今天，我想关注隐式JOIN 的一个有点奇怪但非常强大的用例，当从显式JOIN'sON 子句中连接其他表时。

该使用案例

jOOQ代码生成器在查询各种字典视图时大量使用了jOOQ。在PostgreSQL中，大多数查询都是去查询SQL标准的information_schema ，但是每当这个时候，标准的元数据是不够的，我们也不得不去查询pg_catalog ，它更完整，但是也更有技术含量。

对于很多information_schema 视图，存在一个几乎等价的pg_catalog 表，包含相同的信息。比如说。

`information_schema`	`pg_catalog`
`schemata`	`pg_namespace`
`tables` 或`user_defined_types`	`pg_class`
`columns` 或`attributes`	`pg_attribute`

有趣的是，PostgreSQL作为一个ORDBMS，表和用户定义的类型是一回事，在类型系统中经常可以互换，但这是未来博文的主题。

这篇博文的重点是，通常情况下，当查询像information_schema.attributes 这样的视图时，我们还必须查询pg_catalog.pg_attribute ，以获得额外的数据。例如，为了找到一个UDT（用户定义类型）属性的声明数组维度，我们必须访问pg_catalog.pg_attribute.attndims ，因为这个信息在information_schema 。另见jOOQ功能请求#252，我们将添加对H2 / PostgreSQL多维数组的支持。

所以，我们可能会有一个这样的UDT。

CREATE TYPE u_multidim_a AS (
  i integer[][],
  n numeric(10, 5)[][][],
  v varchar(10)[][][][]
);

从attributes 视图中访问pg_attribute 表的典型SQL方式是。

SELECT 
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  pg_a.attndims 
FROM information_schema.attributes AS is_a
  JOIN pg_attribute AS pg_a 
    ON is_a.attribute_name = pg_a.attname 
  JOIN pg_class AS pg_c
    ON is_a.udt_name = pg_c.relname 
    AND pg_a.attrelid = pg_c.oid
  JOIN pg_namespace AS pg_n
    ON is_a.udt_schema = pg_n.nspname 
    AND pg_c.relnamespace = pg_n.oid
WHERE is_a.data_type = 'ARRAY'    
ORDER BY
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  is_a.ordinal_position

要可视化：

                +----- udt_schema = nspname ------> pg_namespace
                |                                      ^
                |                                      |
                |                                     oid
                |                                      =
                |                                     relnamespace
                |                                      |
                |                                      v 
                +------- udt_name = relname ------> pg_class
                |                                      ^
                |                                      |
                |                                     oid
                |                                      =
                |                                     attrelid
                |                                      |
                |                                      v 
is.attributes <-+- attribute_name = attname ------> pg_attribute

现在，我们可以看到一些集成测试的用户定义类型，包含多维数组。

|udt_schema|udt_name    |attribute_name|attndims|
|----------|------------|--------------|--------|
|public    |u_multidim_a|i             |2       |
|public    |u_multidim_a|n             |3       |
|public    |u_multidim_a|v             |4       |
|public    |u_multidim_b|a1            |1       |
|public    |u_multidim_b|a2            |2       |
|public    |u_multidim_b|a3            |3       |
|public    |u_multidim_c|b             |2       |

但是，看看所有这些JOIN 的表达式。它们肯定不好玩。我们必须拼出从pg_attribute 到pg_namespace 的整个路径，只是为了确保我们没有从其他UDTs或其他模式中获取任何模糊的数据。

使用隐式连接代替

这就是隐式JOIN 的力量所在。我们真正想用 SQL写的是这个。

SELECT 
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  pg_a.attndims 

-- This table we need
FROM information_schema.attributes AS is_a

-- And also this one
JOIN pg_attribute AS pg_a 
  ON is_a.attribute_name = pg_a.attname 

-- But the path joins from pg_attribute to pg_namespace should 
-- be implicit
  AND pg_a.pg_class.relname = is_a.udt_name
  AND pg_a.pg_class.pg_namespace.nspname = is_a.udt_schema
WHERE is_a.data_type = 'ARRAY'    
ORDER BY
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  is_a.ordinal_position

这并没有缩短多少，但是不再需要考虑如何连接不同的步骤，这绝对是非常方便的。请注意，与其他情况不同的是，我们在SELECT 或WHERE 中通过这些路径使用了隐式连接，这次我们是在JOIN .. ON 子句中使用它们！在jOOQ中，我们可以写。

Attributes isA = ATTRIBUTES.as("is_a");
PgAttribute pgA = PgAttribute.as("pg_a");

ctx.select(
       isA.UDT_SCHEMA,
       isA.UDT_NAME,
       isA.ATTRIBUTE_NAME,
       pgA.ATTNDIMS)
   .from(isA)
   .join(pgA)
     .on(isA.ATTRIBUTE_NAME.eq(pgA.ATTNAME))
     .and(isA.UDT_NAME.eq(pgA.pgClass().RELNAME))
     .and(isA.UDT_SCHEMA.eq(pgA.pgClass().pgNamespace().NSPNAME))
   .where(isA.DATA_TYPE.eq("ARRAY"))
   .orderBy(
       isA.UDT_SCHEMA,
       isA.UDT_NAME,
       isA.ATTRIBUTE_NAME,
       isA.ORDINAL_POSITION)
   .fetch();

生成的SQL看起来与原来的略有不同，因为jOOQ的隐式JOIN 算法不会平坦JOIN ，以保留任何潜在的JOIN 操作符的优先权，这在有LEFT JOIN 、FULL JOIN 或其他操作符存在的情况下是很重要的。输出看起来更像这样。

FROM information_schema.attributes AS is_a
  JOIN (
    pg_catalog.pg_attribute AS pg_a
      JOIN (
        pg_catalog.pg_class AS alias_70236485
          JOIN pg_catalog.pg_namespace AS alias_96617829
            ON alias_70236485.relnamespace = alias_96617829.oid
      )
        ON pg_a.attrelid = alias_70236485.oid
    )
    ON (
      is_a.attribute_name = pg_a.attname
      AND is_a.udt_name = alias_70236485.relname
      AND is_a.udt_schema = alias_96617829.nspname
    )

正如你所看到的，"可读的 "表别名（is_a 和pg_a ）是用户提供的，而 "不可读的 "系统生成的别名（alias_70236485 和alias_96617829 ）是来自隐含的JOIN 。而且，同样重要的是，这些隐式连接被嵌入到属于它们的地方，与路径根pg_a ，我们从那里开始了路径表达。这是我们保留正确的JOIN 操作符优先级语义的唯一方法，例如，如果我们在is_a 和之间使用了LEFT JOIN 。pg_a

未来的改进

在未来，可能会有更好的JOIN 路径，可以直接连接这样的图，因为每次要连接information_schema.attributes 和pg_catalog.pg_attribute ，都要在(udt_schema, udt_name, attribute_name) 元组上重复同样的等价关系，虽然隐式JOIN 已经很有帮助，但很容易看出这可以进一步改进。理想的查询应该是。

SELECT 
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  pg_a.attndims 
FROM information_schema.attributes AS is_a

-- Magic here
MAGIC JOIN pg_attribute AS pg_a 
  ON jooq_do_your_thing
WHERE is_a.data_type = 'ARRAY'    
ORDER BY
  is_a.udt_schema,
  is_a.udt_name,
  is_a.attribute_name,
  is_a.ordinal_position

但我们还没有完全达到这个目的。

获得对这些连接路径的访问

information_schema 视图和pg_catalog 表都没有暴露任何外键元数据，这些数据是隐式连接路径表达式和其他jOOQ代码生成功能的先决条件。这不是一个很大的问题，因为你可以指定合成外键给代码生成器，正是出于这个原因。也请看我们之前的博文，关于信息模式查询的合成外键。在这种情况下，我们所需要的至少是这个规范。

<configuration>
  <generator>
    <database>
      <syntheticObjects>
        <foreignKeys>
          <foreignKey>
            <tables>pg_attribute</tables>
            <fields><field>attrelid</field></fields>
            <referencedTable>pg_class</referencedTable>
          </foreignKey>
          <foreignKey>
            <tables>pg_class</tables>
            <fields><field>relnamespace</field></fields>
            <referencedTable>pg_namespace</referencedTable>
          </foreignKey>
        </foreignKeys>
      </syntheticObjects>
    </database>
  </generator>
</configuration>

然后ta-dah，我们就有了前面例子中看到的JOIN 路径。