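Cluster monitoring

Uptime of each cluster node
HTTP connection count
TCP connection count
Number of databases in the cluster
Number of tables in the cluster

Uptime of each cluster node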

select
    hostName, upDay
from (
    select
        hostName() as hostName,
        intDiv(uptime(), 3600 * 24) as upDay  -- uptime in whole days
    from clusterAllReplicas('cluster_name', 'system.clusters')
    where cluster = 'cluster_name'
) a
group by hostName, upDay

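HTTP connection count

select * from system.metrics where metric = 'HTTPConnection'

Note: the values in system.metrics are instantaneous; each query returns only the value at that moment.

TCP connection count

select * from system.metrics where metric = 'TCPConnection'

Note: the values in system.metrics are instantaneous; each query returns only the value at that moment.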
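
Because system.metrics exposes only the current value, a trend over time has to come from another source. The following is a minimal sketch, assuming that the system.metric_log table is enabled on the server and uses its default layout with one CurrentMetric_* column per metric. The one-minute bucket and the "today" filter are illustrative choices, not part of the original notes.

```sql
-- Sketch: per-minute peak connection counts for today, on the node being queried.
-- Assumption: system.metric_log is enabled and has the default wide layout.
select
    toStartOfMinute(event_time) as minute,
    max(CurrentMetric_HTTPConnection) as max_http_connections,
    max(CurrentMetric_TCPConnection) as max_tcp_connections
from system.metric_log
where event_date = today()
group by minute
order by minute
```

To see every node at once, the same query can read from clusterAllReplicas('cluster_name', 'system.metric_log') with hostName() added to the select list and the group by clause.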

Number of databases in the cluster

select count(*) from system.databases

Number of tables in the cluster

select count(*) from system.tables
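
Both counts above cover only the node you are connected to, and they include the built-in system databases. A possible cluster-wide variant is sketched below, assuming the same clusterAllReplicas pattern used elsewhere in this document and assuming that system, INFORMATION_SCHEMA and information_schema are the databases you want to leave out.

```sql
-- Sketch: number of user tables visible on each node of the cluster.
-- Assumption: 'cluster_name' is a placeholder for a cluster defined in system.clusters.
select
    hostName() as host,
    count() as table_num
from clusterAllReplicas('cluster_name', 'system.tables')
where database not in ('system', 'INFORMATION_SCHEMA', 'information_schema')
group by host
order by host
```

Nodes that report different table_num values are a first hint that some tables were not created everywhere.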

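Table data volume monitoring

Data volume of each table in the cluster
Distribution of a given table's data across nodes

Data volume of each table in the cluster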

select
    database,
    name,
    sum(total_rows) as total_rows,   -- number of rows, summed over every replica
    sum(total_bytes) as total_bytes  -- data size in bytes, summed over every replica
from
    clusterAllReplicas('cluster_name', 'system.tables')
group by
    database,
    name
order by total_bytes desc

Distribution of a given table's data across nodes

select
    hostName() as hostName,
    database,
    name,
    total_rows,
    total_bytes
from
    clusterAllReplicas('cluster_name', 'system.tables')
where
    database = 'database_name'
    and name = 'table_name'
order by total_rows desc

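Table partition monitoring

Partitioning scheme of a table
Number of partitions in a table
Per-partition data volume and part count of a table
Partition range of a table
Unpartitioned tables in the cluster and their data volume
Tables in the cluster with unreasonable partitioning

Partitioning scheme of a table

select name, partition_key from system.tables where database = 'database_name' and name = 'table_name'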

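Number of partitions in a table

select count(*) as partition_num
from (
    select partition_id
    from clusterAllReplicas('cluster_name', 'system.parts')
    where database = 'database_name' and table = 'table_name'
        and active = 1  -- ignore parts that were already merged away or dropped
    group by partition_id
) t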

Per-partition data volume and part count of a table

select
    database,
    table,
    partition,
    count(*) as part_num,                         -- number of parts in the partition
    sum(rows) as rows,                            -- number of rows in the partition
    sum(data_compressed_bytes) as data_compress,  -- compressed data size in bytes
    formatReadableSize(data_compress) as r_data,
    divide(data_compress, rows) as row_rate       -- average compressed bytes per row
from
    clusterAllReplicas('cluster_name', 'system.parts')
where active = 1
    and database = 'database_name'
    and table = 'table_name'
group by
    database,
    table,
    partition
order by partition

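Partition range of a table

select max(partition) as max_partition, min(partition) as min_partition
from clusterAllReplicas('cluster_name', 'system.parts')
where database = 'database_name' and table = 'table_name'
    and active = 1  -- ignore parts that were already merged away or dropped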

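Unpartitioned tables in the cluster and their data volume

select
    database,
    table,
    sum(bytes_on_disk) as bytes_on_disk,  -- size of the table's data on disk, in bytes
    sum(rows) as rows
from clusterAllReplicas('cluster_name', 'system.parts')
where
    partition = 'tuple()'  -- tables created without PARTITION BY report this value
    and active = 1
group by
    database, table
order by
    bytes_on_disk desc,
    database, table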
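
The list at the top of this section also mentions checking for tables with unreasonable partitioning, but gives no query and no definition of "unreasonable". One possible reading is a table with a very large number of partitions. The sketch below uses that reading; the threshold of 1000 is purely an assumption and should be adjusted to your own standard.

```sql
-- Sketch: tables with a suspiciously large number of active partitions.
-- Assumption: "unreasonable" means more than 1000 partitions (hypothetical threshold).
select
    database,
    table,
    uniqExact(partition_id) as partition_num,  -- distinct partitions across all nodes
    count() as part_num                        -- active parts, summed over all replicas
from clusterAllReplicas('cluster_name', 'system.parts')
where active = 1
    and database <> 'system'
group by database, table
having partition_num > 1000
order by partition_num desc
```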

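Table comment monitoring

Tables in the cluster that have columns without comments
Columns of a given table that have no comment

Tables in the cluster that have columns without comments

select
    database, table
from
    clusterAllReplicas('cluster_name', 'system.columns')
where comment = ''
    and database <> 'system'
group by database, table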

Columns of a given table that have no comment

select
    database, table, name
from
    clusterAllReplicas('cluster_name', 'system.columns')
where comment = ''
    and database = 'database_name'
    and table = 'table_name'
group by database, table, name

Table compression monitoring

Tables in the cluster that use the default compression method
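
No query is given for this item. The sketch below is one possible approach. It assumes that "default compression" means no column of the table declares an explicit CODEC, in which case the compression_codec column of system.columns is empty.

```sql
-- Sketch: tables in which no column has an explicit compression codec.
-- Assumption: an empty compression_codec means the server default is used.
select
    database,
    table
from clusterAllReplicas('cluster_name', 'system.columns')
where database not in ('system', 'INFORMATION_SCHEMA', 'information_schema')
group by database, table
having countIf(compression_codec <> '') = 0
order by database, table
```

This also lists views and Distributed tables, which store no data themselves. Joining against system.tables and filtering on engine like '%MergeTree%' would narrow the result if that matters.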

Table TTL monitoring

MergeTree-family tables in the cluster that have no table-level TTL

MergeTree-family tables in the cluster that have no table-level TTL

select
    database, name,
    sum(total_rows) as total_rows,
    sum(total_bytes) as total_bytes
from
    clusterAllReplicas('cluster_name', 'system.tables')
where
    engine like '%MergeTree%'
    and engine_full not like '%TTL%'
group by
    database, name

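Replicated table monitoring

ReplicatedMergeTree tables whose ZooKeeper path does not follow the rule

ReplicatedMergeTree tables whose ZooKeeper path does not follow the rule

Note: the rule here is that the path under which ClickHouse stores the table in ZooKeeper must end with database_name/table_name.

select
    database,
    name,
    engine,
    engine_full,
    'database_name/table_name' as standard_form
from
    clusterAllReplicas('cluster_name', 'system.tables')
where
    engine like '%ReplicatedMergeTree%'
    and engine <> 'SystemReplicatedMergeTreeSettings'
    -- the first engine argument is the ZooKeeper path; it must end with database/name'
    and endsWith(splitByString(',', engine_full)[1], concat(database, '/', name, '\'')) = 0
group by
    database, name, engine, engine_full
order by
    engine, database, name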

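Distributed table monitoring

Check whether Distributed tables follow a fixed naming rule

Check whether Distributed tables follow a fixed naming rule

select database, name
from clusterAllReplicas('default', 'system.tables')
where engine = 'Distributed'
    -- rule checked here: the Distributed table `name` must point at a local table called `name$local`
    -- engine_full separates arguments with ', ', so strip whitespace, quotes and a closing parenthesis first
    and replaceRegexpAll(splitByString(',', engine_full)[3], '[\\s\')]', '') <> concat(name, '$local')
group by database, name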

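Table metadata inconsistency monitoring

Tables with inconsistent columns in the cluster
Check MergeTree and Distributed tables for metadata that differs across the cluster
Given a database name and table name, find which columns of that table are inconsistent across the cluster
Tables that are unevenly distributed in the cluster (present on some nodes but missing on others)
Columns of a given table whose metadata is unreasonable in the cluster
Monitoring the number of columns per table

Tables with inconsistent columns across cluster nodes

select
    database, table
from (
    select
        database, table,
        max(column_num) as max_column_num,
        min(column_num) as min_column_num
    from (
        -- column_num: number of nodes on which this column exists with this type
        select database, table, name, type, count(*) as column_num
        from clusterAllReplicas('cluster_name', 'system.columns')
        group by database, table, name, type
    ) a
    group by database, table
) b
-- a table is flagged when some column is present on fewer nodes than another column
where max_column_num > min_column_num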
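
The list at the top of this section also mentions two checks for which no query is given: finding the inconsistent columns of one specific table, and finding tables that exist on only some nodes. The sketches below are one way to approach both. They assume that every table is expected on every replica of the cluster, and they use system.one only as a convenient way to count the nodes.

```sql
-- Sketch 1: columns of one table that are missing, or typed differently, on some nodes.
select
    name,
    type,
    count() as node_num  -- number of nodes that have this column with this type
from clusterAllReplicas('cluster_name', 'system.columns')
where database = 'database_name'
    and table = 'table_name'
group by name, type
having node_num < (select count() from clusterAllReplicas('cluster_name', 'system.one'))
order by name;

-- Sketch 2: tables that exist on some nodes but not on all of them.
select
    database,
    name,
    count() as node_num  -- number of nodes on which the table exists
from clusterAllReplicas('cluster_name', 'system.tables')
where database not in ('system', 'INFORMATION_SCHEMA', 'information_schema')
group by database, name
having node_num < (select count() from clusterAllReplicas('cluster_name', 'system.one'))
order by database, name;
```

If some tables are deliberately created on a subset of nodes, the second query will report them as well, so its output needs to be read against your own deployment rules.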
