大家好,我是砸锅。一个摸鱼八年的后端开发。熟悉 Go、Lua。第十五天还是继续和大家一起学习 Rust😊
設計一個 SQL 查询工具
流程大部分都是从各种数据源里操作数据,例如数据获取、过滤、投影、排序。首先需要一个 SQL 解析器,然后需要一个将数据源加载为 DataFrame。所以选定了 sqlparser-rs、polars,然后接下来就是将 sqlparser 解析出来的抽象语法树(Abstract Syntax Tree),映射到 polars 的 DataFrame 的操作
抽象语法树是用来描述复杂语法规则的工具,SQL 或者 DSL、编程语言都有通过 AST 描述,polars 内部也有自己的 AST,所以 SQL 查询工具的需求本质就是如何将一个 SQL AST 转换成另一个 DataFrame AST,在两个数据结构之间进行转换
对于不熟悉的第三方库,可以先写 example 测试一下
Cargo.toml
[package]
name = "queryer"
version = "0.1.0"
edition = "2021"
[[example]]
name = "dialect"
[dependencies]
anyhow = "1" # 错误处理
async-trait = "0.1" # 允许 trait 里有 async fn
sqlparser = "0.10" # SQL 解析器
polars = { version = "0.15", features = ["json", "lazy"] } # DataFrame 库
reqwest = { version = "0.11", default-features = false, features = [
"rustls-tls",
] } # 我们的老朋友 HTTP 客户端
tokio = { version = "1", features = ["fs"] } # 我们的老朋友异步库,我们这里需要异步文件处理
tracing = "0.1" # 日志处理
[dev-dependencies]
tracing-subscriber = "0.2" # 日志处理
tokio = { version = "1", features = [
"full",
] } # 在 example 下我们需要更多的 tokio feature
dialect.rs
use sqlparser::{dialect::GenericDialect, parser::Parser};
fn main() {
tracing_subscriber::fmt::init();
let sql = "SELECT a a1, b, 123, myfunc(b), * \
FROM data_source \
WHERE a > b AND b < 100 AND c BETWEEN 10 AND 20 \
ORDER BY a DESC, b \
LIMIT 50 OFFSET 10";
let ast = Parser::parse_sql(&GenericDialect::default(), sql);
println!("{:#?}", ast)
}
执行结果:
➜ queryer git:(master) ✗ cargo run --example dialect
Finished dev [unoptimized + debuginfo] target(s) in 0.47s
Running `target/debug/examples/dialect`
Ok(
[
Query(
Query {
with: None,
body: Select(
Select {
distinct: false,
top: None,
projection: [
ExprWithAlias {
expr: Identifier(
Ident {
value: "a",
quote_style: None,
},
),
alias: Ident {
value: "a1",
quote_style: None,
},
},
...
最终解析效果:
➜ queryer git:(master) ✗ cargo run --example covid --quiet
Ok(shape: (13, 5)
+-----------------------+--------------+-----------+--------------+------------+
| location | total_cases | new_cases | total_deaths | new_deaths |
| --- | --- | --- | --- | --- |
| str | f64 | f64 | f64 | f64 |
+=======================+==============+===========+==============+============+
| "Asia" | 2.08859351e8 | 8.4798e4 | 1.605175e6 | 377 |
+-----------------------+--------------+-----------+--------------+------------+
| "Colombia" | 6.352923e6 | 4567 | 1.42186e5 | 101 |
+-----------------------+--------------+-----------+--------------+------------+
| "Germany" | 3.7739472e7 | 1.1317e4 | 1.65441e5 | 127 |
+-----------------------+--------------+-----------+--------------+------------+
| "High income" | 4.26794547e8 | 1.01266e5 | 2.838402e6 | 376 |
+-----------------------+--------------+-----------+--------------+------------+
| ... | ... | ... | ... | ... |
+-----------------------+--------------+-----------+--------------+------------+
| "Spain" | 1.3731478e7 | 8801 | 1.18434e5 | 251 |
+-----------------------+--------------+-----------+--------------+------------+
| "Sweden" | 2.693458e6 | 2985 | 2.3279e4 | 259 |
+-----------------------+--------------+-----------+--------------+------------+
| "United Kingdom" | 2.4274361e7 | 1.5121e4 | 2.17262e5 | 1007 |
+-----------------------+--------------+-----------+--------------+------------+
| "Upper middle income" | 1.4404686e8 | 2.0479e4 | 2.592276e6 | 194 |
+-----------------------+--------------+-----------+--------------+------------+
| "World" | 6.70246597e8 | 1.22817e5 | 6.823644e6 | 588 |
+-----------------------+--------------+-----------+--------------+------------+
)
➜ queryer git:(master) ✗