Using ZooKeeper and Kafka, and Reading/Writing Kafka with Logstash (Part 5)


1. Kafka overview:

Kafka is billed as a next-generation distributed messaging system. It is an open-source project of the non-profit Apache Software Foundation (ASF), which also hosts open-source software such as HTTP Server, Hadoop, ActiveMQ, and Tomcat. Comparable messaging systems include RabbitMQ, ActiveMQ, and ZeroMQ; Kafka's main advantages are that it is distributed and that, combined with ZooKeeper, it supports dynamic scaling.

Reference: www.infoq.com/cn/articles…


web(106-108)

Install the JDK

apt install openjdk-8-jdk -y

Installation

ZooKeeper official downloads: archive.apache.org/dist/zookee… and zookeeper.apache.org/releases.ht…

Kafka official downloads: kafka.apache.org/downloads

Netdisk downloads: zookeeper, kafka

cd /usr/local/src/

# extract zookeeper
tar xf zookeeper-3.4.14.tar.gz

# create a symlink
ln -sv /usr/local/src/zookeeper-3.4.14 /usr/local/zookeeper

Create the directory where ZooKeeper stores its data

mkdir /usr/local/zookeeper/data

web1(106)

Copy the sample configuration file

cd /usr/local/zookeeper/conf/
cp zoo_sample.cfg zoo.cfg

ZooKeeper configuration

grep "^[a-Z]" zoo.cfg
#服务器与服务器之间和客户端与服务器之间的单次心跳检测时间间隔,单位为毫秒
tickTime=2000
#集群中leader服务器与follower服务器初始连接心跳次数,即多少个2000毫秒
initLimit=10
# leader与follower之间连接完成之后,后期检测发送和应答的心跳次数,如果该follower 在设置的时间内(5*2000)不能与leader 进行通信,那么此 follower 将被视为不可用。
syncLimit=5
#自定义的zookeeper保存数据的目录
dataDir=/usr/local/zookeeper/data
#客户端连接 Zookeeper 服务器的端口,Zookeeper 会监听这个端口,接受客户端的访问请求
clientPort=2181

maxClientCnxns=4096
#设置zookeeper保存保留多少次客户端连接的数据
autopurge.snapRetainCount=256
#设置zookeeper间隔多少小时清理一次保存的客户端数据
autopurge.purgeInterval=2
#服务器编号=服务器IP:LF数据同步端口:LF选举端口
server.1=192.168.37.106:2888:3888
server.2=192.168.37.107:2888:3888
server.3=192.168.37.108:2888:3888

Copy the config to the other hosts

scp zoo.cfg 192.168.37.107:/usr/local/zookeeper/conf

scp zoo.cfg 192.168.37.108:/usr/local/zookeeper/conf

Set each node's ID to match its server.N entry in zoo.cfg. On web1(106):

echo "1" > /usr/local/zookeeper/data/myid

web2(107)

echo "2" > /usr/local/zookeeper/data/myid

web3(108)

echo "3" > /usr/local/zookeeper/data/myid

web(106-108)

Start the ZooKeeper service

/usr/local/zookeeper/bin/zkServer.sh start

Check the status; there should be one 'leader' and two 'followers'

/usr/local/zookeeper/bin/zkServer.sh status
Mode: leader   <--
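To confirm the whole ensemble from one place, the 'stat' four-letter command (served on the client port and enabled by default in ZooKeeper 3.4) can be polled on every node. A small sketch, assuming nc (netcat) is installed:

# print each node's role; expect one leader and two followers
for h in 192.168.37.106 192.168.37.107 192.168.37.108; do
  echo "== $h =="
  echo stat | nc $h 2181 | grep Mode
done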

web(106-108)

Extract Kafka

cd /usr/local/src/
tar xf kafka_2.12-2.3.0.tgz

Create a symlink

ln -sv /usr/local/src/kafka_2.12-2.3.0 /usr/local/kafka

web1(106)

cd /usr/local/kafka/config

# edit the config file
vim server.properties
 21 broker.id=1
 31 listeners=PLAINTEXT://192.168.37.106:9092
# retain logs for this many hours (lower it if log volume is heavy, raise it if not)
103 log.retention.hours=168
# all of the zookeeper addresses
123 zookeeper.connect=192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181

Copy the file to the other brokers

scp server.properties 192.168.37.107:/usr/local/kafka/config/

scp server.properties 192.168.37.108:/usr/local/kafka/config/

web2(107)

cd /usr/local/kafka/config

# edit the config file
vim server.properties
 21 broker.id=2
 31 listeners=PLAINTEXT://192.168.37.107:9092
123 zookeeper.connect=192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181

web3(108)

cd /usr/local/kafka/config

# edit the config file
vim server.properties
 21 broker.id=3
 31 listeners=PLAINTEXT://192.168.37.108:9092
123 zookeeper.connect=192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181
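Editing server.properties by hand on every broker is error-prone. As a minimal alternative sketch, the two per-host values could be set with sed (example values shown for web2; this assumes the file copied from web1 already has broker.id and listeners as uncommented lines):

# set this broker's id and listener address in place
ID=2; IP=192.168.37.107
sed -i -e "s/^broker.id=.*/broker.id=$ID/" \
       -e "s|^listeners=.*|listeners=PLAINTEXT://$IP:9092|" \
       /usr/local/kafka/config/server.properties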

web(106-108)

Start Kafka as a daemon

/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties

Check the log to confirm startup succeeded

tail -f /usr/local/kafka/logs/server.log
...(middle portion omitted)
[2023-05-17 16:48:29,221] INFO Kafka startTimeMs: 1684313309213 (org.apache.kafka.common.utils.AppInfoParser)
[2023-05-17 16:48:29,228] INFO [KafkaServer id=1] started (kafka.server.KafkaServer)
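Another way to confirm that all three brokers are up is to check their registrations in ZooKeeper using the zookeeper-shell.sh tool shipped with Kafka:

/usr/local/kafka/bin/zookeeper-shell.sh 192.168.37.106:2181 ls /brokers/ids
# the output should end with the three broker ids: [1, 2, 3]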

Install ZooInspector.zip and the Java package (see the Windows-host section of "Monitoring hosts and Tomcat (Part 2)").


Test creating a topic:

On any Kafka server, for example web1(106):

/usr/local/kafka/bin/kafka-topics.sh --create  --zookeeper 192.168.37.108:2181 --partitions 3 --replication-factor 3 --topic logstashtest
Created topic logstashtest.    <-- the logstashtest topic has been created

Test describing the topic:

This can be tested on any Kafka server:

/usr/local/kafka/bin/kafka-topics.sh  --describe --zookeeper 192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181  --topic logstashtest
Topic:logstashtest	PartitionCount:3	ReplicationFactor:3	Configs:
	Topic: logstashtest	Partition: 0	Leader: 3	Replicas: 3,2,1	Isr: 3,2,1
	Topic: logstashtest	Partition: 1	Leader: 1	Replicas: 1,3,2	Isr: 1,3,2
	Topic: logstashtest	Partition: 2	Leader: 2	Replicas: 2,1,3	Isr: 2,1,3

Status notes: logstashtest has three partitions, 0, 1, and 2. Partition 0's leader is broker 3 (by broker.id), and it has three replicas, all of which are in the Isr (in-sync replica) set, meaning they are eligible to be elected leader.

View help for the topics command:

/usr/local/kafka/bin/kafka-topics.sh --help

Delete a topic:

/usr/local/kafka/bin/kafka-topics.sh --delete --zookeeper 192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181 --topic logstashtest

List all topics:

# create a topic
/usr/local/kafka/bin/kafka-topics.sh --create  --zookeeper 192.168.37.108:2181 --partitions 3 --replication-factor 3 --topic linux01
Created topic linux01.

# list all topics
/usr/local/kafka/bin/kafka-topics.sh --list --zookeeper 192.168.37.106:2181,192.168.37.107:2181,192.168.37.108:2181
linux01

Kafka Tool installer package

Test producing data with the console producer:

/usr/local/kafka/bin/kafka-console-producer.sh --broker-list  192.168.37.106:9092,192.168.37.107:9092,192.168.37.108:9092 --topic linux01
# the data will be spread across kafka's partitions
>linux01
>linux02
>linux03
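To check that the three messages really were spread across the partitions, the per-partition end offsets can be inspected with the GetOffsetShell tool (the sample output below is illustrative; the exact distribution may differ):

/usr/local/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list 192.168.37.106:9092 --topic linux01
# e.g.:
# linux01:0:1
# linux01:1:1
# linux01:2:1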


web2(107)

Test consuming the data from another Kafka server. Note that the output order differs from the input order, because Kafka only guarantees ordering within a single partition:

/usr/local/kafka/bin/kafka-console-consumer.sh --topic linux01 --bootstrap-server 192.168.37.108:9092 --from-beginning
linux02
linux03
linux01

web1(106)

cd /etc/logstash/conf.d/

cat log-to-kafka.conf
input {
  stdin {}    # read events typed on standard input
}

output {
      kafka {
        bootstrap_servers => "192.168.37.106:9092"
        topic_id => "syslog-37-106"    # destination kafka topic
      }
}

Restart the service

systemctl restart logstash

Check the config syntax

/usr/share/logstash/bin/logstash -f log-to-kafka.conf -t

Run in the foreground

/usr/share/logstash/bin/logstash -f log-to-kafka.conf
# type:
111
222
333
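To confirm these stdin events reached Kafka, the topic can be consumed directly on any broker (with Logstash's default plain codec, each event arrives rendered as a timestamped string rather than the raw input):

/usr/local/kafka/bin/kafka-console-consumer.sh --topic syslog-37-106 \
  --bootstrap-server 192.168.37.106:9092 --from-beginning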

In the Kafka Tool GUI, right-click 'linux01' and choose Reconnect:


logstash1(103)

Use Logstash to test reading the data back from Kafka and writing it to Elasticsearch

cd /etc/logstash/conf.d/

cat kafka-to-es.conf 
input {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "syslog-37-106"    # consume the topic written by web1
  }
}

output {
  elasticsearch {
    hosts => ["http://192.168.37.102:9200"]
    index => "kafka-syslog-37-106-%{+YYYY.MM.dd}"    # one index per day
  }
}

Restart the service

systemctl restart logstash

Check the config syntax

/usr/share/logstash/bin/logstash -f kafka-to-es.conf -t

Run in the foreground

/usr/share/logstash/bin/logstash -f kafka-to-es.conf

web1(106)

/usr/share/logstash/bin/logstash -f log-to-kafka.conf
# type more input:
777
888
999
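Before opening Kibana, the new index's existence can be checked directly against Elasticsearch's _cat API:

curl -s 'http://192.168.37.102:9200/_cat/indices?v' | grep kafka-syslog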


Add the index pattern to Kibana


web1(106)

cd /etc/logstash/conf.d/

vim log-to-kafka.conf
input {
  file {
    path => "/var/log/syslog"
    type => "syslog-37-106"
    # note: the file input tails from the end of the file by default;
    # set start_position => "beginning" to read pre-existing content
  }
}

output {
      kafka {
        bootstrap_servers => "192.168.37.106:9092"
        topic_id => "syslog-37-106"
      }
}

Restart the service

systemctl restart logstash

logstash(103)

Route events by type with a conditional

cd /etc/logstash/conf.d/

vim kafka-to-es.conf
input {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "syslog-37-106"
  }
}

output {
  if [type] == "syslog-37-106" {
    elasticsearch {
      hosts => ["http://192.168.37.102:9200"]
      index => "kafka-syslog-37-106-%{+YYYY.MM.dd}"
    }
  }
}

Restart the service

systemctl restart logstash

The syslog from host 106 should now be received.

web1(106)

Check whether nginx is listening on port 80; if not, start it with the command in parentheses (/apps/nginx/sbin/nginx)

ss -nltp|grep 80
LISTEN 0       128                         0.0.0.0:80             0.0.0.0:*      users:(("nginx",pid=1762,fd=6),("nginx",pid=1761,fd=6))

Collect multiple logs. The json codec is set on both the Kafka outputs and inputs so that event fields such as [type] survive the trip through Kafka and can drive the conditionals on the consumer side.

cat log-to-kafka.conf
input {
  file {
    path => "/var/log/syslog"
    type => "syslog-37-106"
    codec => "json"
  }

  file {
    path => "/var/log/access.log"
    type => "nginx-access-log-37-106"
    codec => "json"
  }
}

output {
  if [type] == "syslog-37-106" {
    kafka {
      bootstrap_servers => "192.168.37.106:9092"
      topic_id => "syslog-37-106"
      codec => "json"
  }}

  if [type] == "nginx-access-log-37-106" {
    kafka {
      bootstrap_servers => "192.168.37.106:9092"
      topic_id => "nginx-access-log-37-106"
      codec => "json"
  }}
}

Check the config syntax

/usr/share/logstash/bin/logstash -f log-to-kafka.conf -t

Restart the service

systemctl restart logstash

Access nginx in a browser to generate access-log entries

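After a few requests, both topics should exist on the brokers; Kafka auto-creates them on first write as long as auto.create.topics.enable is true (the default):

/usr/local/kafka/bin/kafka-topics.sh --list --zookeeper 192.168.37.106:2181
# expect syslog-37-106 and nginx-access-log-37-106 in the listing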

logstash(103)

cat kafka-to-es.conf
input {
  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "syslog-37-106"
    codec => "json"
  }

  kafka {
    bootstrap_servers => "192.168.37.106:9092"
    topics => "nginx-access-log-37-106"
    codec => "json"
  }
}

output {
  if [type] == "syslog-37-106" {
    elasticsearch {
      hosts => ["http://192.168.37.102:9200"]
      index => "kafka-syslog-37-106-%{+YYYY.MM.dd}"
  }}

  if [type] == "nginx-access-log-37-106" {
    elasticsearch {
      hosts => ["http://192.168.37.102:9200"]
      index => "kafka-nginx-access-log-37-106-%{+YYYY.MM.dd}"
  }}
}

Restart the service

systemctl restart logstash

Run in the foreground

/usr/share/logstash/bin/logstash -f kafka-to-es.conf

web1(106)

echo 111 >> /var/log/syslog


Verify via Kibana (192.168.37.106:5601): refresh, then check whether the kafka-nginx-access-log-37-106 and kafka-syslog-37-106 indices now appear.
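A quick per-index document count from the command line can also confirm data is flowing (the index patterns below match the configuration above):

curl -s 'http://192.168.37.102:9200/kafka-syslog-37-106-*/_count'
curl -s 'http://192.168.37.102:9200/kafka-nginx-access-log-37-106-*/_count'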

Add the indices to Kibana and you're done.