Collecting logs with VictoriaLogs


I. Preparation and environment

VictoriaLogs monitoring dashboard

http://172.16.224.84:3000/d/XqCOFEX4z/victorialogs-cluster?orgId=1&var-DS_PROMETHEUS=ff9igzd62e60wf&var-job=victoria-84-logs&var-instance=All&var-version=v1.43.1&var-instance_storage_all=All&var-instance_storage=All&var-instance_insert_all=All&var-instance_insert=All&var-instance_select_all=All&var-instance_select=All&from=now-12h&to=now

Metrics endpoint

http://172.16.224.84:9491/metrics

vmui URL

http://172.16.224.85:9481/select/vmui

Environment (hosts)

172.16.224.84
172.16.224.85


# ### 
# logstash
On 84 (172.16.224.84):
cd /opt/victoria-logs-data/logstash-9.1.3

sudo -u logstash /opt/victoria-logs-data/logstash-9.1.3/bin/logstash -f /opt/victoria-logs-data/logstash-9.1.3/config/conf.d

nohup sudo -u logstash  /opt/victoria-logs-data/logstash-9.1.3/bin/logstash -f /opt/victoria-logs-data/logstash-9.1.3/config/conf.d &

# Restart script
sh /tmp/restart_logstash.sh


# Logstash configuration directory
/opt/victoria-logs-data/logstash-9.1.3/config/conf.d


# Log file (tailf is deprecated on newer distributions, so use tail -f)
tail -n 100 -f /opt/victoria-logs-data/logstash-9.1.3/logs/logstash-plain.log






sudo kill -9 $(ps -ef | awk '/Xms8g/ && $3 != 0 {print $2}' | head -1)
#sudo kill -9 $(ps -ef | awk '/logstash-9.1.3/ && $3 != 0 {print $2}' | head -1)
echo ''
cd /tmp && nohup sudo -u logstash  /opt/victoria-logs-data/logstash-9.1.3/bin/logstash -f /opt/victoria-logs-data/logstash-9.1.3/config/conf.d &
#cd /tmp && nohup sudo -u logstash  /opt/victoria-logs-data/logstash-9.1.3/bin/logstash -f /opt/victoria-logs-data/logstash-9.1.3/config &
echo ''
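
The kill-and-restart lines above can be wrapped into one small script. The sketch below is only an illustration: it assumes the Logstash JVM is the only process whose command line contains `Xms8g` (the same pattern used above), and the helper names `pick_pids` and `restart_logstash` are hypothetical. Unlike the lines above, it sends SIGTERM first and falls back to `kill -9` only if the process is still alive after a grace period.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, paths are the ones used above.
LS_HOME=/opt/victoria-logs-data/logstash-9.1.3

# pick_pids PATTERN: read `ps -ef` output on stdin and print the PIDs whose
# line matches PATTERN, skipping processes whose parent PID is 0 and any awk
# process (so a `ps | awk` pipeline does not match itself).
pick_pids() {
  awk -v pat="$1" '$0 ~ pat && $3 != 0 && $8 != "awk" { print $2 }'
}

# restart_logstash: stop the running instance (SIGTERM, then SIGKILL if it is
# still alive after about 30 s) and start a new one in the background.
restart_logstash() {
  pid=$(ps -ef | pick_pids 'Xms8g' | head -1)
  if [ -n "$pid" ]; then
    sudo kill "$pid"
    for i in $(seq 1 30); do
      sudo kill -0 "$pid" 2>/dev/null || break
      sleep 1
    done
    sudo kill -0 "$pid" 2>/dev/null && sudo kill -9 "$pid"
  fi
  cd /tmp || return 1
  nohup sudo -u logstash "$LS_HOME/bin/logstash" -f "$LS_HOME/config/conf.d" &
}
```

Nothing runs until `restart_logstash` is called explicitly, so the file can be sourced and `pick_pids` tried against `ps -ef` output first.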



II. VictoriaLogs deployment

# ### 
# 172.16.224.84
# ### 

# 
cd /opt/victoria-logs

# vlstorage node
nohup ./victoria-logs-prod -httpListenAddr=:9491 -storageDataPath=/data/vl-storage-84 -retentionPeriod=3d > vlstorage-84.log 2>&1 &


# vlinsert node (the same binary also answers select requests), configured with 2 storage nodes
nohup ./victoria-logs-prod -httpListenAddr=:9481 -storageNode=172.16.224.84:9491,172.16.224.85:9491 -insert.disableCompression=false > vlinsert-84.log 2>&1 &

# vlselect
nohup ./victoria-logs-prod -httpListenAddr=:9471 -storageNode=172.16.224.84:9491,172.16.224.85:9491 > vlselect-84.log 2>&1 &
# ### 
# 172.16.224.85
# ### 


# 
cd /opt/victoria-logs

# vlstorage node
nohup ./victoria-logs-prod -httpListenAddr=:9491 -storageDataPath=/data/vl-storage-85 -retentionPeriod=3d > vlstorage-85.log 2>&1 &


# vlinsert (writes its own log file so it does not overwrite vlstorage-85.log)
nohup ./victoria-logs-prod -httpListenAddr=:9481 -storageNode=172.16.224.84:9491,172.16.224.85:9491 -insert.disableCompression=false > vlinsert-85.log 2>&1 &

# vlselect
nohup ./victoria-logs-prod -httpListenAddr=:9471 -storageNode=172.16.224.84:9491,172.16.224.85:9491 > vlselect-85.log 2>&1 &

Adjustments



# ### 
# 172.16.224.84 and 172.16.224.85, revised setup
# ### 

# vlstorage 
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9491 -storageDataPath=/opt/victoria-logs-data -retentionPeriod=5d   > /opt/victoria/vlstorage.log 2>&1 &

# vlselect
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9471 -storageNode=172.16.224.84:9491,172.16.224.85:9491  > /opt/victoria/vlselect.log 2>&1 &






III. Log volume comparison

  1. Size of the 01-09 folder on host 85

  2. Size of the 01-08 folder on host 84

  3. The conclusion in Section XII below



IV. Cross-site architecture


# Collects y4 logs
172.21.20.131
172.21.20.130

# logstash
172.21.20.133 



# Collects y3 logs
172.16.224.84
172.16.224.85


Port allocation:
vlstorage   9491
vlinsert    9481
vlselect    9471 9472 9473 9474 9475

  1. Deployment

172.21.20.131
172.21.20.130
172.21.20.133


mkdir /opt/victoria-logs-data
mkdir /opt/victoria

cd /opt/victoria && wget http://172.21.240.67:8666/victoria-logs-linux-amd64-v1.43.1.tar.gz


# Extract
tar zxvf victoria-logs-linux-amd64-v1.43.1.tar.gz

# vlstorage
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9491 -storageDataPath=victoria-logs-data -retentionPeriod=3d   > /opt/victoria/vlstorage.log 2>&1 &

# vlinsert
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9481 -storageNode=172.21.20.131:9491,172.21.20.132:9491 -insert.disableCompression=false > /opt/victoria/vlinsert.log 2>&1 &

# vlselect
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9471 -storageNode=172.21.20.131:9491,172.21.20.132:9491 > /opt/victoria/vlselect.log 2>&1 &
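
After the three processes are started it helps to confirm that each port answers before Logstash is pointed at it. The sketch below assumes the `/health` HTTP endpoint served by VictoriaMetrics components is available in this build; the function names `http_probe` and `check_nodes` are hypothetical, and the probe command is passed in as an argument so the loop can be exercised without a live cluster.

```shell
#!/usr/bin/env bash
# http_probe URL: succeed if URL answers within 3 s with an HTTP status
# below 400 (curl -f treats 4xx/5xx responses as failures).
http_probe() {
  curl -fsS --max-time 3 -o /dev/null "$1"
}

# check_nodes PROBE HOST:PORT...: run PROBE against http://HOST:PORT/health
# for every node and print "HOST:PORT up" or "HOST:PORT down".
check_nodes() {
  probe=$1
  shift
  for node in "$@"; do
    if "$probe" "http://$node/health" >/dev/null 2>&1; then
      echo "$node up"
    else
      echo "$node down"
    fi
  done
}

# Example (not run here), following the port allocation listed above:
# check_nodes http_probe 172.21.20.131:9491 172.21.20.131:9481 172.21.20.131:9471
```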

  2. Logstash deployment



scp 172.16.224.84:/opt/victoria-logs-data/logstash-9.1.3-linux-x86_64.tar.gz .
scp logstash-9.1.3-linux-x86_64.tar.gz 172.21.240.67:/data/package/


cd /opt/victoria && wget http://172.21.240.67:8666/logstash-9.1.3-linux-x86_64.tar.gz


# 
tar zxvf logstash-9.1.3-linux-x86_64.tar.gz
# 
cd /opt/victoria-logs-data/logstash-9.1.3/config



VII. Simulating a vlstorage process failure

# ### 
# 172.21.20.131
# ### 

# vlstorage
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9491 -storageDataPath=/opt/victoria-logs-data -retentionPeriod=5d   > /opt/victoria/vlstorage.log 2>&1 &

# vlinsert
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9481 -storageNode=172.21.20.131:9491,172.21.20.132:9491 -insert.disableCompression=false   > /opt/victoria/vlinsert.log 2>&1 &

# vlselect
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9471 -storageNode=172.21.20.131:9491,172.21.20.132:9491  > /opt/victoria/vlselect.log 2>&1 &




# ### 
# 172.21.20.132
# ### 
# vlstorage
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9491 -storageDataPath=/opt/victoria-logs-data -retentionPeriod=5d   > /opt/victoria/vlstorage.log 2>&1 &

# vlinsert
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9481 -storageNode=172.21.20.131:9491,172.21.20.132:9491 -insert.disableCompression=false   > /opt/victoria/vlinsert.log 2>&1 &

# vlselect
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9471 -storageNode=172.21.20.131:9491,172.21.20.132:9491  > /opt/victoria/vlselect.log 2>&1 &

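vlselect on 131

vlselect on 132

Kill vlstorage on 132

vlselect on 131 returns an error

vlselect on 132 returns an error

Observation

The vlstorage failure on 132 leaves vlselect on both 131 and 132 unable to return results, but it does not affect ingestion.


Cause: node 132 goes down → its vlstorage port closes → vlselect finds that a required part of the data is missing → the whole query fails → vmui shows connection refused. Restoring node 132, or removing it from the -storageNode list, clears the error immediately.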

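[Solutions]

  1. Degrade gracefully: quickly remove the failed vlstorage node.

  2. Wait for a newer VictoriaLogs release.

VictoriaMetrics has reportedly put replication-factor support (replicationFactor, a cluster consistency mode) for vlstorage on its roadmap. Once that is available, a setting such as

-replicationFactor=2

would keep the same block on two storage nodes, so that queries degrade automatically on a single-node failure without affecting the front end.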

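Does an outage of the vlstorage process on 132 affect ingestion through vlinsert?

  1. Kill the vlstorage process on 132

  2. Watch vmui

http://172.21.20.131:9471/select/vmui/

http://172.21.20.132:9471/select/vmui/

  3. Wait 5 minutes

Outage window: the 5 minutes between 17:25 and 17:30 on 2026.01.13.

  4. Restore the vlstorage process on 132

  5. Watch vmui

http://172.21.20.131:9471/select/vmui/

http://172.21.20.132:9471/select/vmui/

Observation:

Ingestion was not interrupted.

Conclusion:

With 2 vlstorage nodes configured in vlinsert, when one of them fails:

  1. A single failed vlstorage process does not block ingestion overall.
  2. It does affect queries through vlselect.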


VIII. Cross-site queries with vlselect

  1. Start a vlselect on port 9461 on host 85 (172.16.224.85)

# vlselect 
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9461 -storageNode=172.21.20.131:9491,172.21.20.132:9491  > /opt/victoria/vlselect9461.log 2>&1 &

  2. Check in vmui

http://172.16.224.85:9461 query result: 2,709,467 hits

http://172.21.20.132:9471 query result: 2,709,467 hits

Conclusion:

Cross-site queries work.

If a vlstorage process fails, edit the vlselect startup command to remove the failed vlstorage from -storageNode, then restart vlselect.
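
Removing a failed node amounts to rebuilding the `-storageNode` value without it and restarting vlselect. A minimal sketch, with a hypothetical helper name `drop_node`, that only does the string manipulation and leaves the restart to the operator:

```shell
#!/usr/bin/env bash
# drop_node LIST NODE: print the comma-separated LIST with every exact
# occurrence of NODE removed.
drop_node() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -vxF -- "$2" | paste -sd, -
}

nodes=172.21.20.131:9491,172.21.20.132:9491
remaining=$(drop_node "$nodes" 172.21.20.132:9491)
echo "-storageNode=$remaining"
```

The exact-line match (`grep -x`) keeps `172.21.20.13:9491` from accidentally matching `172.21.20.131:9491`.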




IX. Logstash output with two vlinsert endpoints

output {
  elasticsearch {
    hosts => [
      "http://172.16.224.85:9481/insert/elasticsearch/",
      "http://172.16.224.86:9481/insert/elasticsearch/"
    ]
    parameters => { ... }
  }
}




X. Dedicated vlinsert and vlselect roles

  1. Startup commands

# vlinsert
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9481 -storageNode=172.21.20.131:9491,172.21.20.132:9491 -insert.disableCompression=false   > /opt/victoria/vlinsert.log 2>&1 &


# vlselect
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9471 -storageNode=172.21.20.131:9491,172.21.20.132:9491  > /opt/victoria/vlselect.log 2>&1 &
  2. Single-role configuration
# Disable select endpoints on vlinsert
./victoria-logs-prod -storageNode=... -select.disable
# Disable insert endpoints on vlselect
./victoria-logs-prod -storageNode=... -insert.disable
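
Putting the two flags together with the startup commands from step 1 gives one command line per role. The sketch below only prints the commands so they can be reviewed before anything is started; the function name `vl_cmd` is hypothetical, and the flags are the ones quoted above (`-select.disable`, `-insert.disable`).

```shell
#!/usr/bin/env bash
BIN=/opt/victoria/victoria-logs-prod
NODES=172.21.20.131:9491,172.21.20.132:9491

# vl_cmd ROLE: print the start command for ROLE (vlinsert or vlselect) with
# the endpoints of the other role disabled.
vl_cmd() {
  case "$1" in
    vlinsert) echo "$BIN -httpListenAddr=:9481 -storageNode=$NODES -select.disable" ;;
    vlselect) echo "$BIN -httpListenAddr=:9471 -storageNode=$NODES -insert.disable" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

vl_cmd vlinsert
vl_cmd vlselect
```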



XI. VictoriaLogs replication mechanism




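XII. Querying two vlstorage clusters from one vlselect

Architecture

[Image: architecture diagram]

Notes:

  1. VictoriaLogs components

| Node role | Purpose | Notes |
| --- | --- | --- |
| vlselect | Query component of VictoriaLogs; performs searches across nodes. | |
| vlstorage | Storage component; stores the ingested data. | Data is stored locally only. |
| vlinsert | Ingestion component. | In this architecture it writes only to the local vlstorage. |

  2. Writes on each node are load-balanced by nginx

The ingestion port is 9481. Behind the nginx load balancer there are actually 4 vlinsert ports, all on the local node.

  3. Version: v1.43.1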
[root@xfzconpys0388 victoria]# ./victoria-logs-prod --version                                                                                                
victoria-logs-20251226-223354-tags-v1.43.1-0-g66d23fbb3d 
  4. vlstorage on 131

total: 37946505

  5. vlstorage on 84

total: 486807500

  6. vlselect configuration for the merged query

# 
# vlselect 
nohup /opt/victoria/victoria-logs-prod -httpListenAddr=:9461 -storageNode=172.21.20.131:9491,172.21.20.132:9491,172.16.224.84:9491,172.16.224.85:9491  > /opt/victoria/vlselect.log 2>&1 &

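  7. Result

Result: 524754005. The merged query returns exactly the sum of the totals of the two VictoriaLogs clusters.

Conclusion

Cross-site queries work, and the result merges the data from both sets of vlstorage nodes.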
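
The additivity claim is easy to re-check with shell arithmetic, using the two per-cluster totals and the merged total reported above:

```shell
#!/usr/bin/env bash
y4_total=37946505     # total reported by vlstorage on 131
y3_total=486807500    # total reported by vlstorage on 84
merged=524754005      # total reported by the merged vlselect on port 9461

sum=$((y4_total + y3_total))
if [ "$sum" -eq "$merged" ]; then
  echo "match: $sum"
else
  echo "mismatch: $sum vs $merged"
fi
```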




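XIII. Feature investigation

  1. LDAP integration for user authentication
  2. Exporting query results

| No. | Feature | Solution | Supported | Notes |
| --- | --- | --- | --- | --- |
| 1 | Account authentication | Ops R&D builds a front-end page that embeds vmui into 百胜云 | Cannot integrate with LDAP | |
| 2 | Exporting query result sets | | Supported | |
| 3 | Advanced queries (similar to Dev Tools) | LogsQL, with word-level matching | | docs.victoriametrics.com/victorialog… |
| 4 | Aggregation queries | | Not yet | |
| 5 | Degradation plan for vector consumption | | | |
| 6 | Elastic scaling plan for vector | | | |
| 7 | Automated linkage between vector and topic creation | | | |
| 8 | VictoriaLogs monitoring | | | |
| 9 | vector resource usage monitoring | | | |
| 10 | VictoriaLogs log volume statistics | | | To be followed up |
| 11 | VictoriaLogs monitoring | | | |
| 12 | Log retention period | 10 days | | |
| 13 | Kafka consumer group names | Use *-vector-y3, *-vector-y4, *-vector-ent3, *-vector-ent4 | | |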



XIV. Log volume comparison

  1. mid-uh-go-server (y4)

2026.01.14, as of 10:00

victorialogs: 586.3 million entries

Size in ES: 232.5 GB

  2. mid-orderhub (y4)

victorialogs: 193.2 million entries

Size in ES: 47.7 GB

Share of victorialogs disk usage

Note: uh : orderhub = 3 : 1; uh takes 52.5 GB and orderhub 17.5 GB, 70 GB in total.

  3. Comparison table

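| Service | victorialogs (entries) | victorialogs (GB) | ES (GB) | ES (entries) | ES / victorialogs (size) |
| --- | --- | --- | --- | --- | --- |
| mid-uh-go-server | 593.8 million (about 594,000,000) | 52.5 GB | 237.5 GB | 647,329,978 (about 650 million) | 4.5 : 1 |
| mid-orderhub | 193.2 million (193,200,000) | 17.5 GB | 47.7 GB | 196,746,436 (about 197 million) | 2.7 : 1 |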

ES total size: about 290 GB

victorialogs total size: 70 GB

Comparison: ES : victorialogs is roughly 4 : 1
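
The ratios follow directly from the sizes in the table. A small sketch (the helper name `ratio` is hypothetical) that recomputes them with awk, since plain shell arithmetic is integer-only:

```shell
#!/usr/bin/env bash
# ratio A B: print A / B rounded to one decimal place.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 237.5 52.5   # mid-uh-go-server: ES GB / victorialogs GB
ratio 47.7 17.5    # mid-orderhub: ES GB / victorialogs GB
ratio 290 70       # totals: ES GB / victorialogs GB
```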

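  4. zstd compression ratio and raw log size

Note: zstd compresses plain text by about 5x.

  5. Conclusion

VictoriaLogs uses 70 GB of disk in total. ES uses about 290 GB after zstd compression, which corresponds to about 1450 GB before compression.

Replacing Volcengine ES (火山ES) with VictoriaLogs cuts the storage cost to about 1/4.

Additional note: ES (raw logs) : victorialogs size = 1500 : 70, roughly 20x.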

  6. Logs for 01.15

# victorialogs
90 GB in total:
uh accounts for 37%
orderhub accounts for 11%

uh + orderhub = 43.2 GB

# ES
uh + orderhub = 288 GB

Comparison

ES : victorialogs = 288 : 43.2, about 6.67
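
The 01.15 figures can be recomputed with awk. The sketch takes the 90 GB total, the 37% and 11% shares, and the 288 GB ES size from the notes above as inputs; the variable names are only illustrative.

```shell
#!/usr/bin/env bash
vl_total_gb=90   # victorialogs disk usage on 01.15
uh_pct=37        # share of uh (mid-uh-go-server)
oh_pct=11        # share of orderhub (mid-orderhub)
es_gb=288        # ES size of uh + orderhub

# victorialogs size of uh + orderhub, one decimal place
vl_gb=$(awk -v t="$vl_total_gb" -v a="$uh_pct" -v b="$oh_pct" \
  'BEGIN { printf "%.1f", t * (a + b) / 100 }')
# ES size divided by victorialogs size, two decimal places
es_to_vl=$(awk -v e="$es_gb" -v v="$vl_gb" 'BEGIN { printf "%.2f", e / v }')

echo "victorialogs uh + orderhub: $vl_gb GB"
echo "ES : victorialogs = $es_to_vl"
```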