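Why ClickHouse queries are fast

1: For each query, ClickHouse uses as much of the available CPU as it can. Each query is executed by multiple threads that load data in parallel. ......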

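Things to watch when querying ClickHouse

Limit the number of concurrent queries on a ClickHouse cluster. Because a single query can already occupy most of the CPU, ClickHouse is not suited to high-concurrency workloads.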

join

join algorithms

grace_hash
hash
parallel_hash
partial_merge
direct
auto
full_sorting_merge
prefer_partial_merge

By default (when join_algorithm is not set), ClickHouse uses hash or direct.
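
As a minimal sketch of how one of the algorithms listed above can be selected for a single query, the statement below sets join_algorithm in a SETTINGS clause. The two numbers() inputs are stand-ins chosen only so the statement runs without any user tables; they are not from the original notes.

```sql
-- Illustrative only: join two generated number ranges and ask
-- ClickHouse to use the parallel_hash algorithm for this one query.
SELECT count()
FROM numbers(1000) AS l
INNER JOIN numbers(500) AS r ON l.number = r.number
SETTINGS join_algorithm = 'parallel_hash';
```

Setting the algorithm per query keeps the choice local; `SET join_algorithm = '...'` would apply it to the whole session instead.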

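Query settings

Read limit settings

max_rows_to_read: the maximum number of rows a single query may read
max_bytes_to_read: the maximum number of bytes a single query may read
read_overflow_mode: what to do when a query reads more data than the limit allows (throw or break)
max_rows_to_read_leaf: the maximum number of rows a distributed query may read on each node
max_bytes_to_read_leaf: the maximum number of bytes a distributed query may read on each node
read_overflow_mode_leaf: what to do when a query reads more data on a node than the leaf limit allows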
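
The read limits above can be applied to a session with SET statements. This is a sketch only: the numeric values are arbitrary illustrations, not recommendations from the original notes.

```sql
-- Illustrative values; tune them to your own workload.
SET max_rows_to_read = 100000000;         -- at most 100 million rows per query
SET max_bytes_to_read = 10000000000;      -- at most about 10 GB per query
SET read_overflow_mode = 'throw';         -- fail the query when a limit is exceeded

-- Per-node limits for distributed queries.
SET max_rows_to_read_leaf = 50000000;
SET max_bytes_to_read_leaf = 5000000000;
SET read_overflow_mode_leaf = 'throw';
```

With 'break' instead of 'throw', the query stops reading and returns a partial result rather than raising an error.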

Memory limit settings

max_memory_usage: the maximum amount of memory a single query may consume.
max_block_size: the maximum number of rows in each block loaded from a table.

Limiting CPU usage

max_threads: the maximum number of threads a single query may use at each stage of execution
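
A small sketch of capping both memory and CPU for one query, using the max_memory_usage and max_threads settings described above. The numbers_mt() input and the specific values are assumptions made purely for illustration.

```sql
-- Illustrative only: sum a generated range with at most 4 threads
-- and at most about 2 GB of memory for this query.
SELECT sum(number)
FROM numbers_mt(100000000)
SETTINGS max_threads = 4, max_memory_usage = 2000000000;
```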

Result size limits

max_result_rows: the maximum number of rows a single query may return
result_overflow_mode: what to do when the result exceeds the limit (throw or break)
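
The following sketch shows the two result settings working together. The generated numbers() input and the limit of 1000 are illustrative assumptions.

```sql
-- Illustrative only: stop producing output once the result limit is reached
-- instead of raising an error.
SELECT number
FROM numbers(1000000)
SETTINGS max_result_rows = 1000, result_overflow_mode = 'break';
```

In 'break' mode the limit is checked per block, so the number of returned rows can be somewhat larger than max_result_rows.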

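Concurrency settings

max_concurrent_queries: the maximum number of queries a node runs concurrently
max_concurrent_insert_queries: the maximum number of concurrent insert queries on a node
max_concurrent_select_queries: the maximum number of concurrent select queries on a node
max_concurrent_queries_for_user: the maximum number of concurrent queries per user; the default is 0, meaning no limit
max_concurrent_queries_for_all_users: the maximum number of concurrent queries across all users; the default is 0, meaning no limit
queue_max_wait_ms: how long a request waits in the queue once the concurrency limit has been reached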
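
To check what these limits are currently set to, the values can be read from system tables. This is a sketch under one assumption: the server is recent enough to provide system.server_settings, which older releases do not have.

```sql
-- Server-level limits (assumes system.server_settings is available).
SELECT name, value
FROM system.server_settings
WHERE name LIKE 'max_concurrent%';

-- User-level and query-level limits for the current session.
SELECT name, value, changed
FROM system.settings
WHERE name IN (
    'max_concurrent_queries_for_user',
    'max_concurrent_queries_for_all_users',
    'queue_max_wait_ms'
);
```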

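join settings

max_bytes_in_join: the maximum amount of memory each join may use
max_rows_in_join: the maximum number of rows each join may process
join_overflow_mode: what to do when a join exceeds its limits (throw or break)
join_use_nulls: how to fill the empty fields produced by an outer join; the default is 0, meaning empty fields are filled with the default value of the column type
join_algorithm: the algorithm used to execute the join
join_default_strictness: the default join strictness; the default value is ALL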
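
A minimal sketch of the join settings in use. The generated inputs and the limit values are illustrative assumptions; the point is that with join_use_nulls = 1 the unmatched right-hand column is NULL, whereas with the default of 0 it would hold the type's default value (0 for a number).

```sql
-- Illustrative only: left rows 0, 1, 2 joined against right rows 0, 1.
SELECT
    l.number AS left_value,
    r.number AS right_value   -- NULL for the unmatched left row when join_use_nulls = 1
FROM numbers(3) AS l
LEFT JOIN (SELECT number FROM numbers(2)) AS r ON l.number = r.number
SETTINGS
    join_use_nulls = 1,
    max_rows_in_join = 1000000,
    join_overflow_mode = 'throw';
```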

Query monitoring

QPS
Details of currently running queries
Slow query monitoring
Details of the 100 most recent failed queries

QPS

select * from  system.events where event = 'SelectQuery'

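Note: this SQL returns the cumulative number of SELECT queries. The actual QPS figure has to be computed from this counter with a Prometheus expression.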
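
When Prometheus is not at hand, a rough QPS figure can also be estimated from system.query_log. This is an alternative technique, not the Prometheus approach the note refers to, and it assumes query logging is enabled on the node. Rows are flushed to the log periodically, so the most recent minute may be incomplete.

```sql
-- Approximate SELECT QPS per minute over the last 10 minutes on this node.
SELECT
    toStartOfMinute(event_time) AS minute,
    count() / 60 AS approx_qps
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_kind = 'Select'
  AND event_time >= now() - INTERVAL 10 MINUTE
GROUP BY minute
ORDER BY minute;
```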

Details of currently running queries

select 
  hostName() as hostName,
  user,
  is_cancelled,
  query_id,
  query,
  elapsed,
  memory_usage,
  rows_read,
  bytes_read,
  total_rows_approx
from 
  clusterAllReplicas('cluster_name', 'system.processes')

Slow query monitoring

select 
   user,
   query_start_time,
   is_initial_query,
   query,
   query_duration_ms,
   databases[1] as database,
   tables[1] as table,
   read_rows,
   read_bytes,
   length(columns) as columnLength,
   memory_usage,
   result_rows,
   result_bytes,
   query_id,
   Settings
from 
   clusterAllReplicas('cluster_name', 'system.query_log')
where 
   query_duration_ms >= <slow_query_threshold_ms>
  and type = 2                -- 2 = QueryFinish
  and query_kind = 'Select' 
  and query_start_time >= '<start_time>'
  and query_start_time <= '<end_time>'

Details of the 100 most recent failed queries

select 
   user,
   query_start_time,
   query,
   query_duration_ms,
   databases[1] as database,
   tables[1] as table,
   read_rows,
   read_bytes,
   length(columns) as columnLength,
   memory_usage,
   result_rows,
   result_bytes,
   query_id
from 
   clusterAllReplicas('cluster_name', 'system.query_log')
where 
  query_start_time >= '<start_time>'
  and query_start_time <= '<end_time>'
  and type <> 1               -- exclude QueryStart
  and type <> 2               -- exclude QueryFinish, leaving only the exception types
  and query_kind = 'Select' 
order by 
   query_start_time  desc 
limit 100
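
To see why queries fail rather than only which ones failed, the failures can be grouped by error code. This sketch is an addition to the original query: it reads the local system.query_log only, and the one-day window and the limit of 20 are arbitrary choices.

```sql
-- Most frequent SELECT failures on this node since yesterday,
-- with one sample error message per error code.
SELECT
    exception_code,
    count() AS failures,
    any(exception) AS sample_exception
FROM system.query_log
WHERE type IN ('ExceptionBeforeStart', 'ExceptionWhileProcessing')
  AND query_kind = 'Select'
  AND event_date >= today() - 1
GROUP BY exception_code
ORDER BY failures DESC
LIMIT 20;
```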
