本文已参与「新人创作礼」活动,一起开启掘金创作之路
jsonpath是用来解析json数据的工具,类似于xpath,jsonpath可以解析十分复杂的json数据。
PostgreSQL json发展历史:
PostgreSQL从9.2开始就支持json数据类型,但是由于解析json数据的性能很差,导致并不受大家青睐,而是选择使用nosql数据库代替。于是从pg9.4开始支持了jsonb数据类型,相较于json类型,jsonb由于并不需要每次使用时都去进行解析,因此性能提升很多,都是还支持索引查询等。
而从pg12开始对于json的支持更加强大:sql 2016的sql/json标准有15条, PG 12 实现了14条, 远远超过oracle(18c 11/15), mysql(8.0.4 5/15), sqlserver(2017 2/15)最新版本。
同时在pg12中引入了jsonpath类型,以及一系列相关的函数,使得json数据的查询性能更进一步,功能也愈发强大。
JSONPATH语法:
JSONpath 函数表达式语法如下:
- 点号 . 表示引用 Json 数据的元素
- 方括号 [] 表示引用数组元素
- Json 数据中的数组元素下标从0开始
JSONpath中的变量如下:
- $ 符号表示要查询的Json文本的变量
- $varname 表示指定变量
- @ 指在 filter 表达式中表示当前路径元素的变量
JSONPATH使用举例:
简单查询:
bill@bill=>SELECT jsonb_path_query_array('[1,2,3,4,5]', '$[*] ? (@ > 3)');
jsonb_path_query_array
------------------------
[4, 5]
(1 row)
创建测试表:
CREATE TABLE house(js jsonb);
INSERT INTO house VALUES
('{
"address": {
"city":"Moscow",
"street": "Ulyanova, 7A"
},
"lift": false,
"floor": [
{
"level": 1,
"apt": [
{"no": 1, "area": 40, "rooms": 1},
{"no": 2, "area": 80, "rooms": 3},
{"no": 3, "area": 50, "rooms": 2}
]
},
{
"level": 2,
"apt": [
{"no": 4, "area": 100, "rooms": 3},
{"no": 5, "area": 60, "rooms": 2}
]
}
]
}');
查询:
bill@bill=>select jsonb_pretty(js) from house ;
jsonb_pretty
----------------------------------
{ +
"lift": false, +
"floor": [ +
{ +
"apt": [ +
{ +
"no": 1, +
"area": 40, +
"rooms": 1 +
}, +
{ +
"no": 2, +
"area": 80, +
"rooms": 3 +
}, +
{ +
"no": 3, +
"area": 50, +
"rooms": 2 +
} +
], +
"level": 1 +
}, +
{ +
"apt": [ +
{ +
"no": 4, +
"area": 100,+
"rooms": 3 +
}, +
{ +
"no": 5, +
"area": 60, +
"rooms": 2 +
} +
], +
"level": 2 +
} +
], +
"address": { +
"city": "Moscow", +
"street": "Ulyanova, 7A"+
} +
}
(1 row)
该数据的层次结构如下图:
看上去该数据层次挺复杂的,但是实际中可能数据层次远比这个复杂的多,那么我们看看如何使用jsonpath来进行查询的:
bill@bill=>SELECT jsonb_path_query_array(js, '$.floor[0, 1].apt[1 to last]') FROM house;
jsonb_path_query_array
-----------------------------------------------------------------------------------------------------------
[{"no": 2, "area": 80, "rooms": 3}, {"no": 3, "area": 50, "rooms": 2}, {"no": 5, "area": 60, "rooms": 2}]
(1 row)
而如果不用jsonpath的话,我们可能需要这么写:
bill@bill=>SELECT jsonb_agg(apt) FROM (
bill(# SELECT apt->generate_series(1, jsonb_array_length(apt) - 1) FROM (
bill(# SELECT js->'floor'->unnest(array[0, 1])->'apt' FROM house
bill(# ) apts(apt)
bill(# ) apts(apt);
jsonb_agg
-----------------------------------------------------------------------------------------------------------
[{"no": 2, "area": 80, "rooms": 3}, {"no": 3, "area": 50, "rooms": 2}, {"no": 5, "area": 60, "rooms": 2}]
(1 row)
相比之下,使用jsonpath相关的函数查询简便太多了。
又比如,我们需要判断json数据中是否包含某个值,可以这样:
bill@bill=>SELECT jsonb_path_exists(js, '$.** ? (@ == "Moscow")') FROM house;
jsonb_path_exists
-------------------
t
(1 row)
而如果不使用jsonpath呢?
bill@bill=>WITH RECURSIVE t(value) AS (
bill(# SELECT * FROM house UNION ALL (
bill(# SELECT COALESCE(kv.value, e.value) AS value
bill(# FROM t
bill(# LEFT JOIN LATERAL jsonb_each (
bill(# CASE WHEN jsonb_typeof(t.value) = 'object' THEN t.value
bill(# ELSE NULL END
bill(# ) kv ON true
bill(# LEFT JOIN LATERAL jsonb_array_elements (
bill(# CASE WHEN jsonb_typeof(t.value) = 'array' THEN t.value
bill(# ELSE NULL END
bill(# ) e ON true
bill(# WHERE kv.value IS NOT NULL OR e.value IS NOT NULL
bill(# )
bill(# ) SELECT EXISTS (SELECT 1 FROM t WHERE value = '"Moscow"');
exists
--------
t
(1 row)
那么如何使用jsonpath进行数据过滤呢?已上面这张表为例,我们查询apt.no大于3的数据:
bill@bill=>SELECT jsonb_path_query(js, '$.floor.apt.no ? (@>3)') FROM house;
jsonb_path_query
------------------
4
5
(2 rows)
同样,jsonpath也支持索引的使用:
bill@bill=>CREATE INDEX ON house USING gin (js);
CREATE INDEX
bill@bill=>SET ENABLE_SEQSCAN TO OFF;
SET
bill@bill=>EXPLAIN (COSTS OFF) SELECT * FROM house
bill-# WHERE js @? '$.floor[*].apt[*] ? (@.rooms == 3)';
QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on house
Recheck Cond: (js @? '$."floor"[*]."apt"[*]?(@."rooms" == 3)'::jsonpath)
-> Bitmap Index Scan on house_js_idx
Index Cond: (js @? '$."floor"[*]."apt"[*]?(@."rooms" == 3)'::jsonpath)
(4 rows)
除此之外,jsonpath支持了20多种相关的函数,是不是十分强大,赶快用起来吧!
postgres=# \df *.*json*path*
List of functions
Schema | Name | Result data type | Argument data types
| Type
------------+------------------------------+------------------+------------------------------------------------------------------------------------------
-+------
pg_catalog | gin_consistent_jsonb_path | boolean | internal, smallint, jsonb, integer, internal, internal, internal, internal
| func
pg_catalog | gin_extract_jsonb_path | internal | jsonb, internal, internal
| func
pg_catalog | gin_extract_jsonb_query_path | internal | jsonb, internal, smallint, internal, internal, internal, internal
| func
pg_catalog | gin_triconsistent_jsonb_path | "char" | internal, smallint, jsonb, integer, internal, internal, internal
| func
pg_catalog | json_extract_path | json | from_json json, VARIADIC path_elems text[]
| func
pg_catalog | json_extract_path_text | text | from_json json, VARIADIC path_elems text[]
| func
pg_catalog | jsonb_delete_path | jsonb | jsonb, text[]
| func
pg_catalog | jsonb_extract_path | jsonb | from_json jsonb, VARIADIC path_elems text[]
| func
pg_catalog | jsonb_extract_path_text | text | from_json jsonb, VARIADIC path_elems text[]
| func
pg_catalog | jsonb_path_exists | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_exists_opr | boolean | jsonb, jsonpath
| func
pg_catalog | jsonb_path_exists_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_match | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_match_opr | boolean | jsonb, jsonpath
| func
pg_catalog | jsonb_path_match_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_array | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_array_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_first | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_first_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonb_path_query_tz | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false
| func
pg_catalog | jsonpath_in | jsonpath | cstring
| func
pg_catalog | jsonpath_out | cstring | jsonpath
| func
pg_catalog | jsonpath_recv | jsonpath | internal
| func
pg_catalog | jsonpath_send | bytea | jsonpath
| func
(25 rows)
参考链接:
github.com/digoal/blog…
www.postgres.cn/docs/12/fun…
www.postgres.cn/docs/12/dat…