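Introduction to Fluent Bit
Fluent Bit, packaged as td-agent-bit in the release used here, is an open-source log processor and forwarder. It collects data such as metrics and logs from different sources, can enrich that data with filters, and uses output plugins to ship it to multiple destination systems. It is also well suited to log collection in containerized environments such as Kubernetes.
Fluent Bit is written in C, which keeps it fast and light on resources. It is also a sub-project of Fluentd, a CNCF project.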
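Collecting Chaitin SafeLine WAF syslog
SafeLine WAF logs can only be forwarded to other systems through SafeLine's built-in syslog sender. Commonly used collectors for receiving syslog include: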
- rsyslog
- filebeat
- logstash
- fluentd
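Each of these collectors has its own strengths and weaknesses; this article covers only Fluent Bit.
Readers in a hurry will want to go straight to the configuration and copy it. That works: as long as the data source is the same, the configuration can be reused almost as-is, so the configuration files come first.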
td-agent-bit.conf
td-agent-bit.conf is the main Fluent Bit configuration file. It uses @INCLUDE to pull in our custom input configuration files:
[SERVICE]
flush 5
daemon Off
log_level info
parsers_file parsers.conf
plugins_file plugins.conf
http_server Off
http_listen 0.0.0.0
http_port 2020
storage.metrics on
@INCLUDE input-*.conf
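To make the effect of the wildcard in @INCLUDE concrete, here is a minimal Python sketch of shell-style pattern matching. It only illustrates which file names a pattern like input-*.conf would select; it is not how Fluent Bit itself resolves includes. The directory listing is hypothetical, and only input-syslog-waf.conf actually appears in this article.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing; input-nginx.conf is made up for illustration.
files = [
    "td-agent-bit.conf",
    "parsers.conf",
    "plugins.conf",
    "input-syslog-waf.conf",
    "input-nginx.conf",
]

def included(names, pattern="input-*.conf"):
    """Return the names selected by a shell-style wildcard pattern."""
    return sorted(name for name in names if fnmatchcase(name, pattern))

print(included(files))
```

Under this naming scheme, adding a new data source means dropping another input-*.conf file next to the main configuration, with no change to the main file.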
input-syslog-waf.conf
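As a matter of personal habit, I put each data source in its own file, both to keep the main configuration file minimal and to isolate each source for easier debugging.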
[INPUT]
Name tail
Tag waf
Path /data/syslog/host/10.2.34.11/*
DB /var/log/td-agent/waf-log.db
Parser waf-syslog
Mem_Buf_Limit 24m
Refresh_Interval 1
[FILTER]
Name nest
Match *
Operation lift
Nested_under message
Remove_prefix message
[OUTPUT]
Name es
Match waf*
Host 10.2.53.2
Port 9200
HTTP_User elastic
HTTP_Passwd mypassword
Logstash_Format on
Logstash_Prefix waf-threat
Logstash_DateFormat %Y.%m.%d
Trace_Output on
Trace_Error on
Retry_Limit 3
Tag_key tag
[OUTPUT]
Name stdout
Match *
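The nest filter with Operation lift is what moves the fields decoded under message up to the top level of the record. The following Python sketch imitates that step under my reading of the options: the function name lift and its parameters are hypothetical, and the real filter may differ in edge cases.

```python
def lift(record, nested_under="message", remove_prefix=None):
    """Move the keys found under record[nested_under] up to the top level."""
    nested = record.get(nested_under)
    if not isinstance(nested, dict):
        return dict(record)  # nothing to lift, return a copy unchanged
    # Keep every top-level key except the container being lifted.
    out = {k: v for k, v in record.items() if k != nested_under}
    for key, value in nested.items():
        # Assumption: the prefix is stripped only from keys that start with it.
        if remove_prefix and key.startswith(remove_prefix):
            key = key[len(remove_prefix):]
        out[key] = value
    return out

# Record shape after the parser has decoded the JSON payload (fields shortened).
parsed = {"message": {"action": "allow", "dest_port": 80, "src_port": 35928}}
print(lift(parsed, "message", "message"))
```

After this step the record carries action, dest_port and the other fields directly, which matches the flat layout of the document stored in ES.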
parsers.conf
parsers.conf is the parser configuration file shipped with the package. I added the section that parses the WAF logs to it as well:
[PARSER]
Name waf-syslog
Format regex
Regex (?<message>\{.*})
Time_Key timestamp
Time_Format %Y-%m-%dT%H:%M:%S %z
# Command | Decoder | Field | Optional Action
# =============|==================|=================
Decode_Field_As escaped_utf8 message do_next
Decode_Field_As json message
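As a rough illustration of what this parser does, the Python sketch below captures everything from the first { to the last } into a message field and then decodes it as JSON. It is an approximation, not Fluent Bit's implementation: Python writes named groups as (?P<name>...) where the parser uses (?<name>...), the escaped_utf8 decoding step is left out, and the helper parse_line and the shortened sample line are made up for this example.

```python
import json
import re

# Same pattern as the Regex line above, in Python's named-group syntax.
PATTERN = re.compile(r"(?P<message>\{.*})")

def parse_line(line):
    """Return {"message": <decoded JSON>} or None if no JSON object is found."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # json.loads raises ValueError if the captured text is not valid JSON.
    return {"message": json.loads(match.group("message"))}

# Shortened form of a SafeLine syslog line: header, then a JSON payload.
line = (
    "2021-09-21T23:59:24+08:00 0684c3476940 /mario/mario[1] "
    '{"action":"allow","dest_port":80,"timestamp":1632239964}'
)
record = parse_line(line)
print(record["message"]["action"])
```

Because the pattern is not anchored, the syslog header in front of the JSON object is simply ignored.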
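Raw data example
The raw data below is what SafeLine sends over syslog, as received by rsyslog with no filtering applied.
Because the data contains sensitive information, domain names and IP addresses in it are replaced with ***.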
2021-09-21T23:59:24+08:00 0684c3476940 /mario/mario[1] {"action":"allow","attack_type":"none","body":"","cookie":"","country":"CN","decode_path":"","dest_ip":"10.2.120.250","dest_port":80,"event_id":"23756fd838ee47c8bf54d31766915a64","host":"***","location":"","method":"GET","module":"","node":"chaitin-safeline","payload":"","protocol":"http","province":"北京","reason":"whitelist","referer":"","req_header_raw":"GET /p/login/index HTTP/1.1\\r\\nX-Forwarded-Proto: http\\r\\nHost: ***\\r\\nX-Forwarded-For: ***\\r\\nX-Forwarded-For: ***\\r\\nX-Real-IP: ***\\r\\nUser-Agent: Go-http-client/1.1\\r\\nAccept-Charset: utf-8\\r\\nAccept-Encoding: gzip\\r\\n\\r\\n","resp_body":"","resp_header_raw":"","resp_reason_phrase":"","resp_status_code":"","risk_level":"none","rule_id":"/1@time-1631893715","selector_id":"","session":"","src_ip":"***","src_port":35928,"timestamp":1632239964,"timestamp_human":"2021-09-21 23:59:24","urlpath":"/p/login/index","user_agent":"Go-http-client/1.1"}
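The raw line has a simple layout: an ISO 8601 time, a host name, a program tag, and then the JSON payload. The Python sketch below splits a shortened copy of the sample on that layout and checks that the header time agrees with the timestamp field inside the payload. The helper split_header is hypothetical, and the layout is inferred from this one sample only.

```python
import json
from datetime import datetime

def split_header(line):
    """Split a line into (time, host, program, payload) on the first three spaces."""
    ts, host, program, payload = line.split(" ", 3)
    return datetime.fromisoformat(ts), host, program, payload

# Shortened copy of the sample above; only the timestamp field is kept.
line = (
    "2021-09-21T23:59:24+08:00 0684c3476940 /mario/mario[1] "
    '{"timestamp":1632239964,"timestamp_human":"2021-09-21 23:59:24"}'
)
when, host, program, payload = split_header(line)
event = json.loads(payload)

# The header time carries a +08:00 offset; the payload stores epoch seconds.
print(int(when.timestamp()) == event["timestamp"])
```

The check shows that timestamp is in epoch seconds, while timestamp_human is the same instant written in local time (+08:00).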
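Filtered data example
The filtered data is written to ES and is shown here in ES document format.
The raw record above and the filtered record here are not the same event, but the result of the processing is the same.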
{ "_index": "waf-threat-2021.09.22", "_type": "_doc", "_id": "KOxODnwBGLD5dr227ftA", "_version": 1, "_score": null, "_source": { "@timestamp": "2021-09-22T16:20:23.511Z", "action": "allow", "attack_type": "none", "body": "", "cookie": "", "country": "CN", "decode_path": "", "dest_ip": "10.2.120.250", "dest_port": 81, "event_id": "a8d64637368f4c65b9c24a62e7df7949", "host": "***", "location": "", "method": "GET", "module": "", "node": "chaitin-safeline", "payload": "", "protocol": "http", "province": "北京", "reason": "whitelist", "referer": "", "req_header_raw": "GET / HTTP/1.0\r\nHost: ***\r\nX-Forwarded-For: ***\r\nX-Forwarded-Proto: https\r\nX-Real-IP: ***\r\nConnection: close\r\nUser-Agent: Go-http-client/1.1\r\nAccept-Charset: utf-8\r\nAccept-Encoding: gzip\r\n\r\n", "resp_body": "", "resp_header_raw": "", "resp_reason_phrase": "", "resp_status_code": "", "risk_level": "none", "rule_id": "/1@time-1631893715", "selector_id": "", "session": "", "src_ip": "***", "src_port": 33754, "timestamp": 1632327623, "timestamp_human": "2021-09-23 00:20:23", "urlpath": "/", "user_agent": "Go-http-client/1.1" }, "fields": { "@timestamp": [ "2021-09-22T16:20:23.511Z" ] }, "sort": [ 1632327623511 ] }
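The _index value in this document follows from Logstash_Format, Logstash_Prefix and Logstash_DateFormat in the output section: the prefix, a hyphen, and the formatted date. The Python sketch below rebuilds the name from the record's epoch timestamp. It assumes the date is rendered in UTC, which is consistent with this sample but not something the configuration states, and the function index_name is hypothetical.

```python
from datetime import datetime, timezone

def index_name(epoch_seconds, prefix="waf-threat", date_format="%Y.%m.%d"):
    """Build a Logstash-style index name: <prefix>-<formatted date>."""
    # Assumption: the date part is computed in UTC.
    day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"{prefix}-{day.strftime(date_format)}"

# "timestamp" value from the filtered sample record.
print(index_name(1632327623))
```

This also explains why a record whose timestamp_human reads 2021-09-23 00:20:23 in local time (+08:00) lands in the index for 2021.09.22.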
ES screenshot
Fluent Bit key concepts
- Detailed introduction to be added