openLooKeng One-Click Installation and Deployment

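openLooKeng is a data virtualization technology developed by Huawei's technical team on top of Trino (originally called PrestoSQL). Compared with the open-source Trino project, openLooKeng adds further optimizations and enhancements, and the feature that drew me to it is its ability to analyze data across multiple sources. The notes below briefly record how I installed and tested it.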

Installation environment

Operating system: CentOS
Memory: more than 4 GB (my test machine is a virtual machine with 8 cores and 8 GB)

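One-click installation

Honestly, the installation is painless, which deserves praise. Just run the command below; after a successful install, the service is started automatically by default.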

wget -O - https://download.openlookeng.io/install.sh|bash

[A small hiccup during installation]

[ERROR] There is not enough memory for openLooKeng to install. OpenLooKeng requires more than 4GB JVM memory.
[ERROR] openLooKeng installation failed.

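The error message makes it clear that there was too little memory. The virtual machine had initially been given a little under 4 GB, which triggered the installation error; raising it to 8 GB resolved the problem.

The install log also shows that a default system user, openlkadmin, is created for openLooKeng. It is worth reading through the install log once installation finishes.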
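To catch this before launching the installer, the memory can be checked up front. The following is a minimal sketch, not the installer's own check: the function names has_enough_memory and mem_total_kb are made up for illustration, and the 4 GiB threshold is only my reading of the error message above. It assumes a Linux host where /proc/meminfo is available.

```shell
# Hypothetical helper: succeeds when total_kb is strictly greater than min_gb GiB.
has_enough_memory() {
    total_kb=$1
    min_gb=$2
    [ "$total_kb" -gt $((min_gb * 1024 * 1024)) ]
}

# Hypothetical helper: prints MemTotal (in kB) as reported by the Linux kernel.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Example call, left commented out so pasting the block has no side effects:
# if has_enough_memory "$(mem_total_kb)" 4; then
#     echo "memory looks sufficient"
# else
#     echo "less than or equal to 4 GiB visible to the OS; the installer may refuse to run"
# fi
```

Note that a virtual machine configured with exactly 4 GB usually reports slightly less than that in MemTotal, because the kernel reserves some memory for itself, so it is safer to allocate noticeably more than the stated minimum.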

[INFO] Create user openlkadmin.

[Log output of a successful installation]

[INFO] Installed openLooKeng cluster success. 
[INFO] Starting openLooKeng service now... 
[INFO] start openLooKeng service on localhost...
waiting cluster to start.......

[INFO] You can see more details in /home/openlkadmin/.openlkadmin/logs/launcher.log and /home/openlkadmin/.openlkadmin/logs/server.log.
[INFO] Started openLooKeng server success.
[INFO] Execute /opt/openlookeng/bin/stop.sh by user 'openlkadmin', to stop openLooKeng cluster.
[INFO] Execute /opt/openlookeng/bin/openlk-cli, to start openLooKeng client.

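The log shows that the installation succeeded and the service started normally. Startup details can be found in server.log, and the log messages themselves explain each step.
After a successful one-click install, the default configuration runs the coordinator and the worker on the same node. For a distributed cluster deployment, follow the official documentation.
The command-line client is another frequently used tool; see the documentation for usage.
A web-based query management UI is available in a browser at http://localhost:8090 . It resembles a typical database management tool; see the documentation.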

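Querying data and first impressions

Querying from the command line

Once the installation and startup have succeeded, open the client with the command /opt/openlookeng/bin/openlk-cli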

show catalogs; -- list the available catalogs
use tpcds.sf10; -- switch to the sf10 schema of the tpcds catalog

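Our tests use the tpcds catalog, whose test-data schemas include sf1, sf10, sf100, and so on. tpcds splits its data into schemas by size: the larger the number after sf, the larger the data volume, but every schema contains the same tables. The test script below is run against this catalog.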
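Because every schema holds the same tables, one quick way to get a feel for the scale factors is to run the same statement against each of them. The sketch below only prints the commands rather than executing them. The CLI path comes from the install log; the --catalog, --schema and --execute options are an assumption on my part that openlk-cli keeps the option names of the Presto CLI it derives from, and count_cmd is a made-up helper name.

```shell
# Path reported by the installer log.
CLI=/opt/openlookeng/bin/openlk-cli

# Hypothetical helper: prints (does not run) a CLI command that counts the rows
# of store_returns in one tpcds schema.
# Assumption: --catalog, --schema and --execute behave as in the Presto CLI.
count_cmd() {
    printf '%s --catalog tpcds --schema %s --execute "select count(*) from store_returns"\n' "$CLI" "$1"
}

# Emit one command per scale factor mentioned in the text.
for schema in sf1 sf10 sf100; do
    count_cmd "$schema"
done
```

Printing the commands first makes it easy to review them; piping the output into sh would then run them one after another.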

with
    customer_total_return
    as
    (
        select sr_customer_sk as ctr_customer_sk 
            , sr_store_sk as ctr_store_sk 
            , sum(SR_FEE) as ctr_total_return
        from store_returns 
            , date_dim
        where sr_returned_date_sk = d_date_sk and d_year =2000
        group by sr_customer_sk 
            ,sr_store_sk
    )
select c_customer_id
from customer_total_return ctr1 
    , store 
    , customer
where ctr1.ctr_total_return > (
        select avg(ctr_total_return)*1.2
        from customer_total_return ctr2
        where ctr1.ctr_store_sk = ctr2.ctr_store_sk
    ) 
    and s_store_sk = ctr1.ctr_store_sk 
    and s_state = 'NM' 
    and ctr1.ctr_customer_sk = c_customer_sk
order by c_customer_id
limit 100;

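Screenshot of the test result

The screenshot shows that the query took 18 seconds, along with the number of rows returned. It is worth pointing out that the CLI displays the execution stages and a progress percentage while the query runs, which is a nice touch.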

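One thing I noticed during testing is worth recording. When the single node ran the test script above, CPU usage on the virtual machine (8 cores, 8 GB) was close to 100%, and this held after switching between schemas of different data sizes.

In other words, CPU usage jumps straight to 100% as soon as a query runs, regardless of data volume. This may be because all of the computation happens on a single node; the monitoring chart probably illustrates it better.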
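For anyone who wants to reproduce this observation without a monitoring tool, overall CPU usage on Linux can be estimated from two samples of the first line of /proc/stat. This is a rough sketch under that assumption, not the method behind the chart; cpu_busy_pct is a made-up helper name.

```shell
# Hypothetical helper: given two samples of the aggregate "cpu" line from
# /proc/stat, prints the percentage of time spent non-idle between them.
# Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal;
# guest fields are skipped because they are already counted in user/nice.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6   # idle + iowait
        }
        NR == 1 { t0 = total; i0 = idle }
        NR == 2 {
            dt = total - t0
            di = idle - i0
            if (dt > 0) printf "%d\n", 100 * (dt - di) / dt
            else print 0
        }'
}

# Example call, left commented out; run it while a query is executing:
# s1=$(head -n 1 /proc/stat); sleep 1; s2=$(head -n 1 /proc/stat)
# cpu_busy_pct "$s1" "$s2"
```

Sampling once per second in a loop while the query runs gives a rough usage curve that can be compared across sf1, sf10 and sf100.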

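The one-click install defaults to 1 GB of memory

During testing, running the script above against the tpcds.sf100 schema did not succeed. The query failed with an error that roughly says there was not enough memory. The error message is as follows: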

Query 20210311_084550_00044_pba3g failed: Query exceeded per-node total memory limit of 1GB [Allocated: 
1023.73MB, Delta: 956.27kB, Top Consumers: {HashAggregationOperator=900.85MB, 
HashBuilderOperator=108.25MB, PartitionedOutputOperator=10.77MB}]

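The monitoring chart shows memory growing steadily while the query runs, so presumably all of the computation happens in memory. Since the one-click install defaults to 1 GB, I raised the memory setting in config.properties under /opt/openlookeng/hetu-server-1.1.0/etc, and the query then succeeded.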
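The change can also be scripted. The sketch below rewrites a single key in a properties file. Two things in it are assumptions rather than facts from my notes: set_property is a made-up helper, and the property name query.max-total-memory-per-node is inferred from the wording of the error message ("per-node total memory limit") and from the name Presto uses for that limit, so check your own config.properties for the actual key before relying on it. The 2GB value is only an example.

```shell
# Hypothetical helper: sets key=value in a properties file, replacing any
# existing line for that key and leaving all other lines untouched.
set_property() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every line that does not start with "key=" (literal match, no regex).
    awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
    # Append the new setting.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example call, left commented out. The property name is an assumption:
# set_property /opt/openlookeng/hetu-server-1.1.0/etc/config.properties \
#     query.max-total-memory-per-node 2GB
```

The service has to be restarted for the new value to take effect, and the limit needs to stay within what the JVM heap of the node can actually provide.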

These are my notes and takeaways from a first encounter with openLooKeng. I would be glad if they are of some help to you.