使用PostGIS的乐趣:曼德布洛特集,生活游戏等

300 阅读9分钟

即将发布的jOOQ 3.16将通过issue #982最终提供对各种RDBMS GIS扩展的支持。这本身就是个好消息,在未来的博文中会涉及到,当整合准备好后。这里的帖子是关于其他方面的。

增加对这种功能的支持是拖延的一个重要原因。你看,当你开发一个后端库时,你所做的都是实现算法、API和其他抽象的东西。但GIS是不同的。有了GIS,像我这样的SQL开发人员突然有了一个 "用户界面",而有了 "用户界面 "就唤醒了我内心的程序员小孩。

我惊喜地发现,DBeaver,我首选的SQL编辑器,对GIS的WKT 格式有开箱即用的支持。例如,运行一个这样的查询:

SELECT st_geomfromtext('polygon((0 0, 2 3, 0 6, -2 3, 0 0))');

其输出结果是

[

让我们来玩玩GIS

那么,自然而然地,除了玩弄它,还有什么其他事情要做呢?最明显的下一个生成的东西是你最喜欢的标志,这恰好是非常容易映射到多边形。让我们一步一步地看一下GIS的一些特征。而在这篇博文中,我将使用PostGIS,你可以非常容易地从docker hub获得它

WITH
  -- These "sprites" are just reusable 2D polygons, which I declared
  -- to avoid having to deal with the lengthy WKT format
  sprites AS (
    SELECT 

      -- We project the original 1x1 square, and a scaled 4x4 version
      s AS square,
      st_scale(s, 4, 4) AS square4

    -- Here, we're creating a square called "s"
    FROM (VALUES 
      (st_polygonfromtext('polygon ((0 0, 1 0, 1 1, 0 1, 0 0))'))
    ) t (s)
  )

-- st_union combines 2 polygons into a multipolygon
SELECT st_union(square, st_translate(square4, 2, 0))
FROM sprites

上述查询的输出结果是

MULTIPOLYGON (((0 0, 1 0, 1 1, 0 1, 0 0)), ((2 0, 6 0, 6 4, 2 4, 2 0)))

或者,使用DBeaver的可视化工具:

](i0.wp.com/blog.jooq.o…)

st_union ,与其他任何集合的联合没有什么不同。请注意,我已经把较大的方块向右翻译了2个单位,所以它们没有重叠。否则,联合体就会是较大的正方形。

一个多边形描述了二维数平面上的一组点。创建这两组点的联合是很自然的。我们也可以创建两个正方形的差值来代替:

WITH
  sprites AS (
    SELECT 
      s AS square,
      st_scale(s, 4, 4) AS square4
    FROM (VALUES 
      (st_polygonfromtext('polygon ((0 0, 1 0, 1 1, 0 1, 0 0))'))
    ) t (s)
  )
SELECT st_difference(square4, square)
FROM sprites

结果是:

POLYGON ((0 4, 4 4, 4 0, 1 0, 1 1, 0 1, 0 4))

或者,从视觉上看:

通过这些简单的工具,我们现在可以创建jOOQ标志的所有4个字母。作为提醒,这些工具是:

  • st_polygonfromtext:从WKT文本表示创建一个多边形
  • st_scale:缩放任何几何体到一个新的尺寸
  • st_translate:将任何几何体平移到一个新的位置
  • st_union:合并两个几何体(或更多,如果作为一个聚合函数使用)
  • st_difference:从另一个几何体中删除一个几何体

无论是PostGIS还是ISO/IEC 13249-3:2016标准版,GIS的API都很庞大,但现在这些简单的工具就足够了。让我们来看看这个查询:

WITH
  -- Creating the two squares to work with
  sprites AS (
    SELECT 
      s AS square,
      st_scale(s, 4, 4) AS square4
    FROM (VALUES 
      (st_polygonfromtext('polygon ((0 0, 1 0, 1 1, 0 1, 0 0))'))
    ) t (s)
  ),
  
  -- All the 4 letters j, o1, o2, q of the jOOQ logo
  letters AS (
    SELECT
    
      -- The letter j
      st_difference(
        st_difference(
          st_difference(
            square4, 
            st_translate(st_scale(square, 2, 3), 1, 1)
          ),
          st_translate(st_scale(square, 1, 2), 0, 2)
        ),
        st_translate(st_scale(square, 1, 0.5), 3, 2.5)
      ) AS j,
      
      -- The letter o
      st_difference(square4, st_translate(square, 1, 1)) AS o1,
      
      -- The letter o
      st_difference(square4, st_translate(square, 2, 2)) AS o2,
      
      -- The letter q
      st_union(
        st_difference(
          square4, 
          st_translate(st_scale(square, 2, 2), 1, 1)
        ),
        st_translate(st_scale(square, 1, 2.5), 1.5, -1)
      ) as q
    FROM sprites
  )
SELECT

  -- Combine all the 4 letters
  st_union(v)
FROM
  letters,
  
  -- Arrange the 4 letters next to each other
  LATERAL (VALUES
    (st_translate(j, 0, 5)),
    (st_translate(o1, 5, 5)),
    (o2),
    (st_translate(q, 5, 0))
  ) t (v);

这就产生了:

很好,是吧?

下一步:曼德尔布罗特集

对于任何拖延者来说,下一步自然是生成曼德布罗特集。为了让我们对曼德布罗特背后的东西有所准备,请看一下这个整洁的视频(更多的是拖延)。

有不同的方法可以用SQL来计算它,这里是我想出的一个方法:

WITH RECURSIVE

  -- These are the dimensions that you can play around with
  dims (r1, r2, i1, i2, s, it, p) AS (
    VALUES (
      
      -- The dimensions of the real axis
      -2::float, 1::float, 
      
      -- The dimensions of the imaginary axis
      -1.5::float, 1.5::float, 
      
      -- The step size on each axis, per pixel or sprite
      0.01::float, 
      
      -- The maximum number of iterations
      100, 
      
      -- "Infinity", i.e. when to stop
      256.0::float
    )
  ),
  
  -- The square again, as before
  sprites (s) AS (VALUES 
    (st_polygonfromtext('polygon ((0 0, 0 1, 1 1, 1 0, 0 0))'))
  ),
  
  -- The number plane, as ints
  n1 (r, i) AS (
    SELECT r, i 
    FROM 
      dims, 
      generate_series((r1 / s)::int, (r2 / s)::int) r,
      generate_series((i1 / s)::int, (i2 / s)::int) i
  ),
  
  -- The number plane as scaled floats
  n2 (r, i) AS (
    SELECT r::float * s::float, i::float * s::float
    FROM dims, n1
  ),
  
  -- The recursive calculation of the Mandelbrot formula
  -- zn = (zn-1)^2 + c
  l (cr, ci, zr, zi, g, it, p) AS (
    SELECT r, i, 0::float, 0::float, 0, it, p FROM n2, dims
    UNION ALL
    SELECT cr, ci, zr*zr - zi*zi + cr, 2*zr*zi + ci, g + 1, it, p 
    FROM l
    
    -- The recursions stops when we reach the maximum
    WHERE g < it
    
    -- Or, when we reach "infinity"
    AND zr*zr + zi*zi < p
  ),
  
  -- Find the last calculated value per point in the
  -- complex number plane c (cr, ci), discard the others
  x (cr, ci, zr, zi, g) AS (
    SELECT DISTINCT ON (cr, ci) cr, ci, zr, zi, g
    FROM l
    ORDER BY cr, ci, g DESC
  )
  
-- Turn every calculated point into a square
SELECT 
  st_union(
    st_translate(sprites.s, round(cr / dims.s), round(ci / dims.s))
  )
FROM x, sprites, dims

-- Retain only the points *not* belonging to the Mandelbrot set
-- You can also inverse the equation to retain the points that
-- belong to the set
WHERE zr*zr + zi*zi > p;

注意,我在这里使用了一个人为的 "无穷大",因为:

  1. 在这个变焦范围内,这可以加快事情的进展,但没有什么明显的区别。
  2. 我不知道如何让PostgreSQL溢出浮点运算到浮点无穷大,就像Java或CockroachDB或其他IEEE 754浮点运算实现那样。希望得到任何帮助,请看这个Stack Overflow问题

输出是众所周知的形状

你可以玩玩这个,用不同的 "dims "值来放大,比如说:

dims (r1, r2, i1, i2, s, it, p) as (values (
  (-0.925-0.032)::float, (-0.925+0.032)::float, 
  (0.266-0.032)::float, (0.266+0.032)::float, 
  0.00005::float, 100, 256.0::float)
)

这将产生一些非常整齐的缩放,都是用SQL:

当然,这不是计算这些东西的最有效方法,但这不是这里的重点,不是吗?

生命的游戏

我可以当书呆子了:

人生游戏的下一步?比如说滑翔枪。

- Aleksey Shipilëv(@shipilev)2021年11月14

当然,我也不得不接受这个挑战!现在,"生命游戏"是一个简单的 "游戏",我们有x,y自然数平面(例如 "像素"),通过每个坐标,我们联系该 "细胞 "是死是活。然后,我们建立以下一套规则:

  1. 任何有两个或三个活邻居的活细胞都能存活。
  2. 任何有三个活邻居的死细胞都会变成活细胞。
  3. 所有其他活细胞在下一代中死亡。同样地,所有其他的死细胞都保持死亡。

不,在SQL中使用空间扩展很难将事物 "动画化",所以作为一种变通方法,我只是将生命游戏中100×100像素的瓦片迭代显示在彼此旁边。第一次迭代只是一个起点,例如一个随机数量的 "活 "细胞:

WITH RECURSIVE

  -- Again generate a "sprite"a from a square polygon
  sprites (s) AS (VALUES (
    st_polygonfromtext('polygon ((0 0, 0 1, 1 1, 1 0, 0 0))')
  )),

  -- Generate the x, y number plane and associate a boolean value
  -- with each coordinate, generated randomly.
  -- 10% of all cells are alive
  m (x, y, b) AS (
    SELECT x, y, 
      CASE WHEN random() > 0.9 THEN 1 ELSE 0 END 
    FROM 
      generate_series(1, 100) x, 
      generate_series(1, 100) y
  )
SELECT st_union(st_translate(sprites.s, x::float, y::float))
FROM m, sprites

-- Display only "alive" cells
WHERE b = 1

这将产生类似的东西:

现在,我们要做的就是对游戏公式进行迭代。现在,递归SQL有相当多的限制。例如,我无法让PostgreSQL聚合递归数据,也无法自我连接递归表以找到任何单元格的最近邻居。但是通过窗口函数,这绝对是可能的。所以,这里就有了:

WITH RECURSIVE
  sprites (s) AS (VALUES (
    st_polygonfromtext('polygon ((0 0, 0 1, 1 1, 1 0, 0 0))')
  )),
  m (x, y, b) AS (
    SELECT x, y, 
      CASE WHEN random() > 0.9 THEN 1 ELSE 0 END
    FROM 
      generate_series(1, 100) x, 
      generate_series(1, 100) y
  ),
  
  -- This is the recursion of the Game of Life
  e (x, y, b, g) AS (
  
    -- The recursion starts with the above initial data
    SELECT x, y, b, 1 FROM m
    UNION ALL
    SELECT
      e.x, e.y,
      CASE
      
        -- If this cell is alive, and 2 or 3
        -- neighbors are alive, too, then this
        -- cell stays alive
        WHEN e.b = 1 AND
          sum(e.b) OVER w1
        + sum(e.b) OVER w2
        + sum(e.b) OVER w3 IN (2, 3)
          THEN 1
          
        -- If this cell is dead, and 3 neighbors
        -- are alive, then this cell becomes alive
        WHEN e.b = 0 AND
          sum(e.b) OVER w1
        + sum(e.b) OVER w2
        + sum(e.b) OVER w3 = 3
          THEN 1
          
        -- Otherwise, this cell dies or stays dead
        ELSE 0
      END,
      e.g + 1
    FROM e
    
    -- The recursion stops after 100 iterations
    WHERE e.g < 100
    WINDOW
    
      -- We order the data set by x, y. In SQL, there isn't 
      -- a notion of a 2-dimensional number plane. We only
      -- have 1 dimension.
      o AS (ORDER BY x, y),
      
      -- w1 is all the neighbors of the previous row in the x, y
      -- plane, which is all the SQL rows that are 101-99 rows
      -- before the current row.
      w1 AS (o ROWS BETWEEN 101 PRECEDING AND  99 PRECEDING),
      
      -- w2 is all the neighbors of the current row in the x, y
      -- number plane, excluding the current cell
      w2 AS (o ROWS BETWEEN   1 PRECEDING AND   1 FOLLOWING 
               EXCLUDE CURRENT ROW),
               
      -- w3 is all the neighbors of the next row
      w3 AS (o ROWS BETWEEN  99 FOLLOWING AND 101 FOLLOWING)
  )

-- Finally, combine all the iterations  
SELECT st_union(st_translate(
  sprites.s, 
  (x + (g - 1) % 10 * 120)::float, 
  (y - (g - 1) / 10 * 120)::float
))
FROM e, sprites
WHERE b = 1
;

把窗口的规格想成如下:

----+-------------+-------------+-------------+-------------+----
... | 105: (1, 5) | 106: (1, 6) | 107: (1, 7) | 108: (1, 8) | ...
... | 205: (2, 5) | 

因此,在一个100×100的网格中,x=3,y=7可以被编码为307,其邻居为

  • w1:206、207、208,即307的前101位和前99位之间。
  • w2:306,308,即在307的前1位和后1位之间
  • w3: 406, 407, 408, 即在307的99次方和101次方之间。

输出看起来像这样:

或者,放大到迭代1-4:

或21-24:

这真是太酷了,哈哈!

滑翔机枪

书呆子的办法是将滑翔机枪的图案做成动画,这只是意味着我们必须用一些恒定的东西来取代第一次迭代的随机生成。

WITH RECURSIVE
  sprites (s) AS (VALUES (
    st_polygonfromtext('polygon ((0 0, 0 1, 1 1, 1 0, 0 0))')
  )),
  m (x, y, b) AS (
    SELECT x, y, 

      -- Initial data here
      CASE WHEN (x, y) IN (
        (2, 6), (2, 7), (3, 6), (3, 7), (12, 6), (12, 7), (12, 8), 
        (13, 5), (13, 9), (14, 4), (14, 10), (15, 4), (15, 10),
        (16, 7), (17, 5), (17, 9), (18, 6), (18, 7), (18, 8), 
        (19, 7), (22, 4), (22, 5), (22, 6), (23, 4), (23, 5), 
        (23, 6), (24, 3), (24, 7), (26, 2), (26, 3), (26, 7), 
        (26, 8), (36, 4), (36, 5), (37, 4), (37, 5)
      ) THEN 1 ELSE 0 END 
    FROM 
      generate_series(1, 100) x, 
      generate_series(1, 100) y
  ),
  e (x, y, b, g) AS (
    SELECT x, y, b, 1 FROM m
    UNION ALL
    SELECT
      e.x, e.y,
      CASE
        WHEN e.b = 1 AND
          sum(e.b) OVER w1
        + sum(e.b) OVER w2
        + sum(e.b) OVER w3 IN (2, 3)
          THEN 1
        WHEN e.b = 0 AND
          sum(e.b) OVER w1
        + sum(e.b) OVER w2
        + sum(e.b) OVER w3 = 3
          THEN 1
        ELSE 0
      END,
      e.g + 1
    FROM e
    WHERE e.g < 100
    WINDOW
      o AS (ORDER BY x, y),
      w1 AS (o ROWS BETWEEN 101 PRECEDING AND  99 PRECEDING),
      w2 AS (o ROWS BETWEEN   1 PRECEDING AND   1 FOLLOWING 
               EXCLUDE CURRENT ROW),
      w3 AS (o ROWS BETWEEN  99 FOLLOWING AND 101 FOLLOWING)
  )
SELECT st_union(st_translate(
  sprites.s, 
  (x + (g - 1) % 10 * 120)::float, 
  (y - (g - 1) / 10 * 120)::float
))
FROM e, sprites
WHERE b = 1
;

而且,正如你所看到的,这把枪是有效的:

它从这个开始:

并最终产生了众所周知的滑翔机: