1. Kafka introduction:
Kafka is often called a next-generation distributed messaging system. It is an open-source project of the non-profit Apache Software Foundation (ASF), which also hosts open-source software such as HTTP Server, Hadoop, ActiveMQ, and Tomcat. Comparable messaging systems include RabbitMQ, ActiveMQ, and ZeroMQ; Kafka's main advantages are that it is distributed and that, combined with ZooKeeper, it supports dynamic scale-out.
web(106-108)
Install the JDK
apt install openjdk-8-jdk -y
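A quick sanity check, not in the original steps, to confirm the JDK landed on the PATH:
java -version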
Install ZooKeeper and Kafka
ZooKeeper official download: archive.apache.org/dist/zookee… or zookeeper.apache.org/releases.ht…
Kafka official download: kafka.apache.org/downloads
cd /usr/local/src/
#unpack zookeeper
tar xf zookeeper-3.4.14.tar.gz
#create a symlink
ln -sv /usr/local/src/zookeeper-3.4.14 /usr/local/zookeeper
Create the directory where ZooKeeper stores its data
mkdir /usr/local/zookeeper/data
web1(106)
Copy the sample config file
cd /usr/local/zookeeper/conf/
cp zoo_sample.cfg zoo.cfg
ZooKeeper configuration (show the active settings):
grep "^[a-zA-Z]" zoo.cfg
#Heartbeat interval, in milliseconds, used between servers and between clients and servers
tickTime=2000
#Maximum number of heartbeats (ticks of 2000 ms) a follower may take to complete its initial connection to the leader
initLimit=10
#Maximum number of ticks between a request and an acknowledgment once a follower is connected to the leader; a follower that cannot reach the leader within this window (5*2000 ms) is marked unavailable
syncLimit=5
#Custom directory where ZooKeeper stores its data
dataDir=/usr/local/zookeeper/data
#Port on which ZooKeeper listens for client connection requests
clientPort=2181
#Maximum number of concurrent client connections from a single IP address
maxClientCnxns=4096
#Number of snapshots (and corresponding transaction logs) to retain when autopurge runs
autopurge.snapRetainCount=256
#Interval, in hours, between autopurge runs
autopurge.purgeInterval=2
#server.<id>=<server IP>:<leader/follower data-sync port>:<leader-election port>
server.1=192.168.37.106:2888:3888
server.2=192.168.37.107:2888:3888
server.3=192.168.37.108:2888:3888
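To make the timeout arithmetic above concrete, here is a small shell sketch using the values from this zoo.cfg; it only multiplies the settings, nothing more:
tickTime=2000; initLimit=10; syncLimit=5
echo "initial connection timeout: $((tickTime * initLimit)) ms"   #20000 ms
echo "sync timeout: $((tickTime * syncLimit)) ms"                 #10000 ms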
Copy the config to the other nodes
scp zoo.cfg 192.168.37.107:/usr/local/zookeeper/conf
scp zoo.cfg 192.168.37.108:/usr/local/zookeeper/conf
echo "1" > /usr/local/zookeeper/data/myid
web2(107)
echo "2" > /usr/local/zookeeper/data/myid
web3(108)
echo "3" > /usr/local/zookeeper/data/myid
web(106-108)
Start the ZooKeeper service
/usr/local/zookeeper/bin/zkServer.sh start
Check the status; the cluster should now have 1 'leader' and 2 'followers'
/usr/local/zookeeper/bin/zkServer.sh status
Mode: leader <--
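The role of every node can also be checked remotely with ZooKeeper's four-letter commands (enabled by default in the 3.4.x series; this assumes nc is installed):
for ip in 192.168.37.106 192.168.37.107 192.168.37.108; do
  echo -n "$ip -> "; echo stat | nc $ip 2181 | grep Mode
done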
web(106-108)
Unpack Kafka
cd /usr/local/src/
tar xf kafka_2.12-2.3.0.tgz
Create a symlink
ln -sv /usr/local/src/kafka_2.12-2.3.0 /usr/local/kafka
web1(106)
cd /usr/local/kafka/config
#edit the config file
vim server.properties
21 broker.id=1
31 listeners=PLAINTEXT://192.168.37.106:9092
#retain logs for the given number of hours (lower it if log volume is heavy, raise it if light)
103 log.retention.hours=168
#all ZooKeeper addresses
123 zookeeper.connect=192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181
Copy the file to the other nodes
scp server.properties 192.168.37.107:/usr/local/kafka/config/
scp server.properties 192.168.37.108:/usr/local/kafka/config/
web2(107)
cd /usr/local/kafka/config
#edit the config file
vim server.properties
21 broker.id=2
31 listeners=PLAINTEXT://192.168.37.107:9092
123 zookeeper.connect=192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181
web3(108)
cd /usr/local/kafka/config
#edit the config file
vim server.properties
21 broker.id=3
31 listeners=PLAINTEXT://192.168.37.108:9092
123 zookeeper.connect=192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181
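Editing server.properties by hand on three nodes is error-prone; a hypothetical automation sketch (assumes GNU sed and the stock Kafka 2.3 config layout; set id and ip per host, i.e. 1/106, 2/107, 3/108):
id=3; ip=192.168.37.108
sed -i \
  -e "s/^broker.id=.*/broker.id=${id}/" \
  -e "s|^#\?listeners=.*|listeners=PLAINTEXT://${ip}:9092|" \
  /usr/local/kafka/config/server.properties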
web(106-108)
Start Kafka as a daemon
/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
Check the log to confirm it started successfully
tail -f /usr/local/kafka/logs/server.log
...(middle of the output omitted)
[2023-05-17 16:48:29,221] INFO Kafka startTimeMs: 1684313309213 (org.apache.kafka.common.utils.AppInfoParser)
[2023-05-17 16:48:29,228] INFO [KafkaServer id=1] started (kafka.server.KafkaServer)
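Optionally confirm that each broker is listening on its port (run on every node):
ss -ntl | grep 9092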
Install ZooInspector.zip and the Java package (see the Windows-host part of 'Monitoring hosts and Tomcat', section 2).
Test creating a topic:
On any Kafka server, e.g. web1(106):
/usr/local/kafka/bin/kafka-topics.sh --create --zookeeper 192.168.37.108:2181 --partitions 3 --replication-factor 3 --topic logstashtest
Created topic logstashtest. <-- the logstashtest topic has been created
Test describing the topic:
This can be run on any Kafka server:
/usr/local/kafka/bin/kafka-topics.sh --describe --zookeeper 192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181 --topic logstashtest
Topic:logstashtest PartitionCount:3 ReplicationFactor:3 Configs:
Topic: logstashtest Partition: 0 Leader: 3 Replicas: 3,2,1 Isr: 3,2,1
Topic: logstashtest Partition: 1 Leader: 1 Replicas: 1,3,2 Isr: 1,3,2
Topic: logstashtest Partition: 2 Leader: 2 Replicas: 2,1,3 Isr: 2,1,3
Reading the output: logstashtest has three partitions, 0, 1, and 2. Partition 0's leader is broker 3 (its broker.id); the partition has three replicas, and all of them are in the Isr (in-sync replica) set, meaning they are eligible to be elected leader.
Topic command help:
/usr/local/kafka/bin/kafka-topics.sh --help
Delete a topic (this relies on delete.topic.enable=true, the default in Kafka 2.x):
/usr/local/kafka/bin/kafka-topics.sh --delete --zookeeper 192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181 --topic logstashtest
List all topics:
#create a topic first
/usr/local/kafka/bin/kafka-topics.sh --create --zookeeper 192.168.37.108:2181 --partitions 3 --replication-factor 3 --topic linux01
Created topic linux01.
#list all topics
/usr/local/kafka/bin/kafka-topics.sh --list --zookeeper 192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181
linux01
Test producing messages:
/usr/local/kafka/bin/kafka-console-producer.sh --broker-list 192.168.37.106:9092,192.168.37.107:9092,192.168.37.108:9092 --topic linux01
#the messages are distributed across the topic's partitions
>linux01
>linux02
>linux03
web2(107)
Test consuming the data from another Kafka server (the output order may differ from the input order because Kafka only guarantees ordering within a single partition):
/usr/local/kafka/bin/kafka-console-consumer.sh --topic linux01 --bootstrap-server 192.168.37.108:9092 --from-beginning
linux02
linux03
linux01
web1(106)
cd /etc/logstash/conf.d/
cat log-to-kafka.conf
input {
  stdin {}
}
output {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topic_id => "syslog-37-106"
  }
}
Restart the service
systemctl restart logstash
Check the config syntax
/usr/share/logstash/bin/logstash -f log-to-kafka.conf -t
Run in the foreground
/usr/share/logstash/bin/logstash -f log-to-kafka.conf
#type some test input
111
222
333
In ZooInspector, right-click 'linux01' and choose reconnect
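Alternatively, skip ZooInspector and verify from the command line by consuming the topic on any broker:
/usr/local/kafka/bin/kafka-console-consumer.sh --topic syslog-37-106 --bootstrap-server 192.168.37.106:9092 --from-beginning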
logstash1(103)
Use Logstash to read the test data back out of Kafka and into Elasticsearch
cd /etc/logstash/conf.d/
cat kafka-to-es.conf
input {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "syslog-37-106"
  }
}
output {
  elasticsearch {
    hosts => ["http://192.168.37.102:9200"]
    index => "kafka-syslog-37-106-%{+YYYY.MM.dd}"
  }
}
Restart the service
systemctl restart logstash
Check the config syntax
/usr/share/logstash/bin/logstash -f kafka-to-es.conf -t
Run in the foreground
/usr/share/logstash/bin/logstash -f kafka-to-es.conf
web1(106)
/usr/share/logstash/bin/logstash -f log-to-kafka.conf
#type more input
777
888
999
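To confirm the pipeline end to end, query Elasticsearch for the new index (assuming it is reachable at 192.168.37.102:9200 as configured above):
curl -s 'http://192.168.37.102:9200/_cat/indices?v' | grep kafka-syslog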
Add the index to Kibana
web(106)
cd /etc/logstash/conf.d/
vim log-to-kafka.conf
input {
  file {
    path => "/var/log/syslog"
    type => "syslog-37-106"
  }
}
output {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topic_id => "syslog-37-106"
  }
}
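One caveat now that the logstash service, not an interactive shell, reads /var/log/syslog: on Ubuntu that file is usually owned by root:adm, so the logstash user may need to be added to the adm group (an assumption about local permissions, not a step from the original):
usermod -aG adm logstash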
Restart the service
systemctl restart logstash
logstash(103)
Route with a conditional on [type]
cd /etc/logstash/conf.d/
vim kafka-to-es.conf
input {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "syslog-37-106"
  }
}
output {
  if [type] == "syslog-37-106" {
    elasticsearch {
      hosts => ["http://192.168.37.102:9200"]
      index => "kafka-syslog-37-106-%{+YYYY.MM.dd}"
    }
  }
}
Restart the service
systemctl restart logstash
The system log from 106 should now arrive.
web1(106)
Check whether nginx is listening on port 80; if not, start it with the command in parentheses (/apps/nginx/sbin/nginx)
ss -nltp|grep 80
LISTEN 0 128 0.0.0.0:80 0.0.0.0:* users:(("nginx",pid=1762,fd=6),("nginx",pid=1761,fd=6))
Collect multiple logs (note that codec => "json" is set on both the Kafka outputs here and the matching Kafka inputs on the consumer side, so fields such as [type] survive the round trip through Kafka)
cat log-to-kafka.conf
input {
  file {
    path => "/var/log/syslog"
    type => "syslog-37-106"
    codec => "json"
  }
  file {
    path => "/var/log/access.log"
    type => "nginx-access-log-37-106"
    codec => "json"
  }
}
output {
  if [type] == "syslog-37-106" {
    kafka {
      bootstrap_servers => "192.168.37.106:9092"
      topic_id => "syslog-37-106"
      codec => "json"
    }
  }
  if [type] == "nginx-access-log-37-106" {
    kafka {
      bootstrap_servers => "192.168.37.106:9092"
      topic_id => "nginx-access-log-37-106"
      codec => "json"
    }
  }
}
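For codec => "json" to parse the access log, nginx must write it as JSON in the first place; a minimal log_format sketch for nginx.conf (the field names are illustrative assumptions, not the original's format):
log_format json_log '{"@timestamp":"$time_iso8601","clientip":"$remote_addr",'
                    '"request":"$request","status":"$status"}';
access_log /var/log/access.log json_log;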
Check the config syntax
/usr/share/logstash/bin/logstash -f log-to-kafka.conf -t
Restart the service
systemctl restart logstash
Visit nginx in a browser to generate access-log entries
logstash(103)
cat kafka-to-es.conf
input {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "syslog-37-106"
    codec => "json"
  }
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "nginx-access-log-37-106"
    codec => "json"
  }
}
output {
  if [type] == "syslog-37-106" {
    elasticsearch {
      hosts => ["http://192.168.37.102:9200"]
      index => "kafka-syslog-37-106-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "nginx-access-log-37-106" {
    elasticsearch {
      hosts => ["http://192.168.37.102:9200"]
      index => "kafka-nginx-access-log-37-106-%{+YYYY.MM.dd}"
    }
  }
}
Restart the service
systemctl restart logstash
Run in the foreground
/usr/share/logstash/bin/logstash -f kafka-to-es.conf
web1(106)
echo 111 >> /var/log/syslog
Verify in Elasticsearch: refresh 192.168.37.106:5601 and check whether the kafka-nginx-access-log-37-106 and kafka-syslog-37-106 indices appear.
Then add them to Kibana.