[Performance Optimization] Rewriting MySQL Queries for ClickHouse: Query Performance Optimization

SpringBoot Startup Takes Over 10 Minutes: The Database Part

Background

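Last time we talked about startup being extremely slow because there were too many tables, so we switched the DB layer to ClickHouse and spread the data across multiple databases. That raised a new problem: our lead said we could not use join queries such as inner join directly. I did some reading and learned that ClickHouse pulls the joined data into memory to process it, so the concern is presumably that a large join could exhaust the server's memory and take it down. So I was told to solve the problem one table at a time.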

Process

The SQL below is just one of the queries. The tables and calculations differ from query to query, so only one is shown here.

1. Rewriting the join query as single-table queries

select count(a.id), e.module_name
from a
inner join b on ...
inner join c on ...
inner join d on ...
inner join e on ...
where c.condition = :param
group by e.module_name

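On its face this is a very simple join. The catch is that the filter condition lives on table c, the module lives on table e, and the data I actually need to count lives in table a. At first glance I thought looping over the modules would do, so my first step was: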

select count(a.id) from a
where a.id in (
    select b.aId from b where b.cId in (
        select c.bId from c where c.condition = :param and c.dId in (
            select dId from d where d.eId in (
                select eId from e where e.module_name = :module_name
            )
        )
    )
)

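At first glance this SQL looks like a pile of crap for what should be a simple join, but there was no way around it because this is a temporary solution. I first query the modules that survive the filter condition, then pass each module in and run the query once per module.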
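The per-module loop described above can be sketched roughly as follows. This is only an illustration under assumptions: the class `PerModuleCount`, the method `countPerModule`, and the `countForModule` callback are hypothetical names, and the callback stands in for executing the nested-IN count query for a single module. A stub replaces the real query so the sketch runs without a database.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch: none of these names come from the original project.
class PerModuleCount {

    // countForModule stands in for running the nested-IN count query for one module_name.
    static Map<String, Long> countPerModule(List<String> moduleNames,
                                            ToLongFunction<String> countForModule) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String moduleName : moduleNames) {
            // One query per module: the number of round trips grows with the module count.
            result.put(moduleName, countForModule.applyAsLong(moduleName));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up module names; in the real flow they come from the filtered module query.
        List<String> modules = Arrays.asList("order", "user");
        // Stub in place of a real query so the sketch runs on its own.
        Map<String, Long> counts = countPerModule(modules, name -> (long) name.length());
        System.out.println(counts);
    }
}
```

Because every module costs one full execution of the nested query, the total time scales with the number of modules.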

Running it

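It took 20 seconds, because there are many modules and a lot of data on top of that. That obviously won't do and the business side would never accept it, so I optimized again.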

2. Fetching one table once and computing several business metrics from it

select a.id, a.sex, a.status from a
where a.id in (
    select b.aId from b where b.cId in (
        select c.bId from c where c.condition = :param and c.dId in (
            select dId from d where d.eId in (
                select eId from e where e.module_name = :module_name
            )
        )
    )
)

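This looks just like the previous version. The difference, as the heading says, is that the rows needed by several business calculations are fetched from the one table in a single query, and the counts, totals, and similar logic are then computed in Java code. There are still more complex calculations to deal with, though.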
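As a minimal sketch of computing those metrics in Java: the `Row` class mirrors the three selected columns (a.id, a.sex, a.status), but the class names, field types, and sample values are all assumptions made for illustration, not the project's real model.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch: names, types, and sample data are assumptions.
class RowMetrics {

    // One fetched row with the columns a.id, a.sex, a.status.
    static final class Row {
        final long id;
        final String sex;
        final String status;

        Row(long id, String sex, String status) {
            this.id = id;
            this.sex = sex;
            this.status = status;
        }
    }

    // Counts rows per key (for example per sex or per status) in one pass.
    static Map<String, Long> countBy(List<Row> rows, Function<Row, String> key) {
        Map<String, Long> counts = new TreeMap<>();
        for (Row row : rows) {
            counts.merge(key.apply(row), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for the result of the single query.
        List<Row> rows = Arrays.asList(
                new Row(1, "F", "active"),
                new Row(2, "M", "active"),
                new Row(3, "F", "closed"));
        System.out.println("total = " + rows.size());
        System.out.println("by sex = " + countBy(rows, r -> r.sex));
        System.out.println("by status = " + countBy(rows, r -> r.status));
    }
}
```

The point of the design is that the database is hit once, and each additional metric is just another pass over rows already in memory.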

3. Grouping on table a, with calculation and filtering on table b

select a.num from a inner join d on a.code = d.id
and a.test_id ... -- filter condition on test_id
group by d.group_name

This is also a join query. Simply applying the method above works here too, which gives:

select a.num from a where a.test_id ... /* filter condition on test_id */ and a.code in (
    select d.id from d where d.group_name = :group_name
)

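This is where a problem shows up: shouldn't the modules first be narrowed down by the condition on table a? Otherwise there are too many modules, and most of the queries are wasted I/O. So the first step is to query table d with a nested subquery on table a to keep only the groups that have usable data, and then query table a for the rows associated with each of those modules.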
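The filter-first order can be sketched in memory as below. This is a hypothetical illustration, not the project's code: `ARow` and `DRow` stand in for rows of tables a and d, the column types are assumed, and each step that would be a single-table query in the real system is a loop over a list here. The a.num values are collected per group without aggregating them, since the aggregation itself is not specified above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.IntPredicate;

// Hypothetical in-memory sketch; names, types, and sample data are assumptions.
class FilterGroupsFirst {

    // Stand-in for a row of table a (columns code, test_id, num).
    static final class ARow {
        final long code;
        final int testId;
        final int num;

        ARow(long code, int testId, int num) {
            this.code = code;
            this.testId = testId;
            this.num = num;
        }
    }

    // Stand-in for a row of table d (columns id, group_name).
    static final class DRow {
        final long id;
        final String groupName;

        DRow(long id, String groupName) {
            this.id = id;
            this.groupName = groupName;
        }
    }

    // Returns the a.num values per group, visiting only groups that have
    // at least one row in a passing the test_id filter.
    static Map<String, List<Integer>> numsPerUsableGroup(List<ARow> aRows, List<DRow> dRows,
                                                         IntPredicate testIdFilter) {
        // Step 1: codes that survive the filter on a (the nested query on a).
        Set<Long> usableCodes = new HashSet<>();
        for (ARow a : aRows) {
            if (testIdFilter.test(a.testId)) {
                usableCodes.add(a.code);
            }
        }
        // Step 2: keep only the d rows whose id is a usable code; other groups are skipped.
        Map<Long, String> groupByCode = new LinkedHashMap<>();
        for (DRow d : dRows) {
            if (usableCodes.contains(d.id)) {
                groupByCode.put(d.id, d.groupName);
            }
        }
        // Step 3: collect a.num for the surviving groups only.
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (ARow a : aRows) {
            String group = groupByCode.get(a.code);
            if (group != null && testIdFilter.test(a.testId)) {
                result.computeIfAbsent(group, g -> new ArrayList<>()).add(a.num);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<ARow> aRows = Arrays.asList(
                new ARow(1, 7, 10), new ARow(1, 7, 5), new ARow(2, 9, 3));
        List<DRow> dRows = Arrays.asList(
                new DRow(1, "g1"), new DRow(2, "g2"), new DRow(3, "g3"));
        // Only test_id 7 passes the made-up filter, so groups g2 and g3 are never visited.
        System.out.println(numsPerUsableGroup(aRows, dRows, id -> id == 7));
    }
}
```

With the sample data, only one of the three groups remains after the first two steps, which is the saving the filter-first order is after.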