如何用SQL绘制ASCII条形图

114 阅读3分钟

不需要昂贵的Tableau订阅费用。抛弃微软的Excel。只要使用原生的PostgreSQL,就可以快速地将你的数据可视化!

这是我有过的一个想法。正如你可能知道的,jOOQ可以从你的jOOQ结果中产生花哨的图表。但这需要你使用jOOQ,而你可能没有使用jOOQ,因为你没有用Java/Kotlin/Scala编码(否则,你就会使用jOOQ)。这没关系。我想,那为什么不直接用SQL(具体来说是PostgreSQL)来做呢?毕竟,我是我。

我把这称为反向@lukaseder 。"数据库是无用的,在应用程序中做所有的事情"。

- Christian Nockemann (@nockemannc)2022年8月31

所以,有了下面这个花哨的查询,我将在这个github repo上进一步维护和开发,你将能够轻松地直接从你最喜欢的SQL编辑器中绘制任何东西。

看一下吧看吧:

-- The example uses https://www.jooq.org/sakila, but you can just replace
-- the "source" table with anything else
with 

  -- This part is what you can modify to adapt to your own needs
  --------------------------------------------------------------

  -- Your data producing query here 
  source (key, value) as (
    select payment_date::date::timestamp, sum(amount)
    from payment
    where extract(year from payment_date) < 2006
    group by payment_date::date::timestamp
    order by payment_date::date::timestamp
  ),
  
  -- Some configuration items:
  constants as (
    select
    
      -- the height of the y axis
      15 as height, 

      -- the width of the x axis, if normalise_x, otherwise, ignored
      25 as width, 

      -- the bar characters
      '##' as characters,

      -- the characters between bars
      ' ' as separator,
      
      -- the padding of the labels on the y axis
      10 as label_pad, 
      
      -- whether to normalise the data on the x axis by
      -- - filling gaps (if int, bigint, numeric, timestamp, 
      --   timestamptz)
      -- - scaling the x axis to "width"
      true as normalise_x
  ),
  
  -- The rest doesn't need to be touched
  --------------------------------------
  
  -- Pre-calculated dimensions of the source data
  source_dimensions (kmin, kmax, kstep, vmin, vmax) as (
    select 
      min(key), max(key), 
      (max(key) - min(key)) / max(width), 
      min(value), max(value)
    from source, constants
  ),
  
  -- Normalised data, which fills the gaps in case the key data
  -- type can be generated with generate_series (int, bigint, 
  -- numeric, timestamp, timestamptz)
  source_normalised (key, value) as (
    select k, coalesce(sum(source.value), 0)
    from source_dimensions
      cross join constants
      cross join lateral 
        generate_series(kmin, kmax, kstep) as t (k)
      left join source 
        on source.key >= t.k and source.key < t.k + kstep
    group by k
  ),

  -- Replace source_normalised by source if you don't like the 
  -- normalised version
  actual_source (i, key, value) as (
    select row_number() over (order by key), key, value 
    from source_normalised, constants
    where normalise_x
    union all
    select row_number() over (order by key), key, value
    from source, constants
    where not normalise_x
  ),
    
  -- Pre-calculated dimensions of the actual data
  actual_dimensions (
    kmin, kmax, kstep, vmin, vmax, width_or_count
  ) as (
    select 
      min(key), max(key), 
      (max(key) - min(key)) / max(width), 
      min(value), max(value), 
      case
        when every(normalise_x) then least(max(width), count(*)::int) 
        else count(*)::int 
      end
    from actual_source, constants
  ),
  
  -- Additional convenience
  dims_and_consts as (
    with 
      temp as (
        select *, 
        (length(characters) + length(separator)) 
          * width_or_count as bar_width
      from actual_dimensions, constants
    )
    select *,
      (bar_width - length(kmin::text) - length(kmax::text)) 
        as x_label_pad
    from temp
  ),
  
  -- A cartesian product for all (x, y) data points
  x (x) as (
    select generate_series(1, width_or_count) from dims_and_consts
  ),
  y (y) as (
    select generate_series(1, height) from dims_and_consts
  ),

  -- Rendering the ASCII chart
  chart (rn, chart) as (
    select
      y,
      lpad(y * (vmax - vmin) / height || '', label_pad) 
        || ' | ' 
        || string_agg(
             case 
               when height * actual_source.value / (vmax - vmin) 
                 >= y then characters 
               else repeat(' ', length(characters)) 
             end, separator 
             order by x
           )
    from 
      x left join actual_source on actual_source.i = x, 
      y, dims_and_consts
    group by y, vmin, vmax, height, label_pad
    union all
    select 
      0, 
      repeat('-', label_pad) 
        || '-+-' 
        || repeat('-', bar_width)
    from dims_and_consts
    union all
    select 
      -1, 
      repeat(' ', label_pad) 
        || ' | ' 
        || case 
             when x_label_pad < 1 then '' 
             else kmin || repeat(' ', x_label_pad) || kmax 
           end
    from dims_and_consts
  )
select chart
from chart
order by rn desc
;

sakila数据库中运行,你会得到这个漂亮的图表。

chart                                                                                   |
----------------------------------------------------------------------------------------+
11251.7400 |                                                       ##                   |
10501.6240 |                                                       ##                   |
9751.50800 |                                                       ##                   |
9001.39200 |                                                       ##                   |
8251.27600 |                                                       ##                   |
7501.16000 |                                     ##                ##             ## ## |
6751.04400 |                                     ##                ##             ## ## |
6000.92800 |                                     ##                ##             ## ## |
5250.81200 |                   ##                ##             ## ##             ## ## |
4500.69600 |                   ##                ##             ## ##             ## ## |
3750.58000 |                   ## ##             ## ##          ## ##             ## ## |
3000.46400 |                   ## ##             ## ##          ## ##             ## ## |
2250.34800 |    ##             ## ##          ## ## ##          ## ## ##          ## ## |
1500.23200 | ## ##             ## ##          ## ## ##          ## ## ##          ## ## |
750.116000 | ## ##             ## ##          ## ## ##          ## ## ##          ## ## |
-----------+----------------------------------------------------------------------------|
           | 2005-05-24 00:00:00                                     2005-08-23 00:00:00|

这不是很了不起吗?

它是如何工作的?

该查询有3个部分:

  • source:实际的查询,产生数据。这是你可以替代和放置你自己的东西,而不是
  • constants:配置部分,在这里你可以调整尺寸、柱状图字符等。
  • 其余部分,你不需要篡改。

source 就是这样的一个查询:

  source (key, value) as (
    select payment_date::date::timestamp, sum(amount)
    from payment
    where extract(year from payment_date) < 2006
    group by payment_date::date::timestamp
    order by payment_date::date::timestamp
  )

它从支付表中产生每个支付日期的所有收入。而payment_date 是一个timestamp ,我们把它铸成日期,以便能够得到每日收入。但是为了使用PostgreSQL的generate_series 来填补空白,我们必须把日期值铸回时间戳,因为令人惊讶的是,PostgreSQL中没有本地的generate_series(date, date) 函数。

你所要做的就是以键/值的形式产生一组数据。你可以用其他任何东西来代替它,例如,得到累积收入。

  source (key, value) as (
    select 
      payment_date::date::timestamp,
      sum(sum(amount)) over (order by payment_date::date::timestamp)
    from payment
    where extract(year from payment_date) < 2006
    group by payment_date::date::timestamp
    order by payment_date::date::timestamp
  )

......为此,你(目前)必须将规范化的数据打回false (间隙的填充还不正确)。另外,为了节省空间,我把条形图变得更细了。

  '#' as characters,
  '' as separator,
  false as normalise_x

现在你会得到这个漂亮的图表,它显示了收入的指数增长,我们非常希望向我们的经理展示这个指数(它实际上不是指数,因为现在,间隙没有被尊重,但咄,这只是生成的样本数据)。

chart                                                |
-----------------------------------------------------+
66872.4100 |                                        #|
62414.2493 |                                       ##|
57956.0886 |                                     ####|
53497.9280 |                                   ######|
49039.7673 |                                  #######|
44581.6066 |                               ##########|
40123.4460 |                              ###########|
35665.2853 |                            #############|
31207.1246 |                          ###############|
26748.9640 |                       ##################|
22290.8033 |                     ####################|
17832.6426 |                   ######################|
13374.4820 |                #########################|
8916.32133 |            #############################|
4458.16066 |        #################################|
-----------+-----------------------------------------|
           | 2005-05-24 00:00:00  2005-08-23 00:00:00|