Flume 1 (server adp-03) configuration: exec-memory-replicating-avro.conf
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /opt/adp/logs/hive/hive.log
# Replicate the data flow to every channel
a1.sources.r1.selector.type = replicating
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = adp-01
a1.sinks.k1.port = 30000
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = adp-02
a1.sinks.k2.port = 30000
a1.sources.r1.channels = c1 c2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
Flume 2 (server adp-01) configuration: avro-memory-hdfs.conf
a2.sources = r1
a2.channels = c1
a2.sinks = k1
a2.sources.r1.type = avro
a2.sources.r1.bind = adp-01
a2.sources.r1.port = 30000
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = /flume/replicating-case/%Y%m%d/%H
a2.sinks.k1.hdfs.filePrefix = k1
a2.sinks.k1.hdfs.round = true
a2.sinks.k1.hdfs.roundValue = 1
a2.sinks.k1.hdfs.roundUnit = hour
a2.sinks.k1.hdfs.rollInterval = 60
a2.sinks.k1.hdfs.rollSize = 134217728
a2.sinks.k1.hdfs.rollCount = 0
a2.sinks.k1.hdfs.useLocalTimeStamp = true
a2.sinks.k1.hdfs.fileType = DataStream
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
Flume 3 (server adp-02) configuration: avro-memory-file-roll.conf
a3.sources = r1
a3.channels = c1
a3.sinks = k1
a3.sources.r1.type = avro
a3.sources.r1.bind = adp-02
a3.sources.r1.port = 30000
a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100
a3.sinks.k1.type = file_roll
a3.sinks.k1.sink.directory = /home/admin/flume/replicating-case
a3.sinks.k1.sink.pathManager.prefix = k2-
a3.sinks.k1.sink.pathManager.extension = log
a3.sinks.k1.sink.rollInterval = 60
a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1
Create the local output directory on server adp-02:
$ mkdir -p /home/admin/flume/replicating-case
Start the Flume agents:
# Start the agents that listen on the avro port first
# Start on server adp-01
$ flume-ng agent -n a2 -c conf -f avro-memory-hdfs.conf
# Start on server adp-02
$ flume-ng agent -n a3 -c conf -f avro-memory-file-roll.conf
# Start on server adp-03
$ flume-ng agent -n a1 -c conf -f exec-memory-replicating-avro.conf
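Because the exec source only emits what `tail -F` sees, nothing reaches either sink until new lines are appended to the tailed log. The following is a minimal sketch for generating a few test events by hand. The helper name `emit_test_events` and the text of the test lines are assumptions made for illustration and are not part of Flume.

```shell
# Append numbered test lines to a log file that an exec source tails.
# emit_test_events is a hypothetical helper, not a Flume command.
emit_test_events() {
  log_file="$1"   # file watched by `tail -F`
  count="$2"      # number of lines to append
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'replicating-case test event %d\n' "$i" >> "$log_file"
    i=$((i + 1))
  done
}
```

On adp-03 this could be called as `emit_test_events /opt/adp/logs/hive/hive.log 5`. With the replicating selector, each appended line should then show up both in HDFS (via k1) and in the local directory on adp-02 (via k2).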
Check the local directory on server adp-02:
[admin@adp-02 ~]$ ll flume/replicating-case/
total 4
-rw-r--r-- 1 admin admin 0 Apr 26 16:40 k2-1682498404475-1.log
-rw-r--r-- 1 admin admin 0 Apr 26 16:41 k2-1682498404475-2.log
-rw-r--r-- 1 admin admin 0 Apr 26 16:56 k2-1682499400276-1.log
-rw-r--r-- 1 admin admin 0 Apr 26 16:56 k2-1682499408317-1.log
-rw-r--r-- 1 admin admin 0 Apr 26 16:57 k2-1682499436243-1.log
-rw-r--r-- 1 admin admin 0 Apr 26 16:58 k2-1682499436243-2.log
-rw-r--r-- 1 admin admin 0 Apr 26 16:59 k2-1682499436243-3.log
-rw-r--r-- 1 admin admin 0 Apr 26 17:00 k2-1682499436243-4.log
-rw-r--r-- 1 admin admin 0 Apr 26 17:01 k2-1682499436243-5.log
-rw-r--r-- 1 admin admin 0 Apr 26 17:02 k2-1682499436243-6.log
-rw-r--r-- 1 admin admin 0 Apr 26 17:03 k2-1682499436243-7.log
-rw-r--r-- 1 admin admin 0 Apr 26 17:04 k2-1682499436243-8.log
-rw-r--r-- 1 admin admin 0 Apr 26 17:05 k2-1682499436243-9.log
-rw-r--r-- 1 admin admin 2058 Apr 26 17:08 k2-1682500058176-1.log
-rw-r--r-- 1 admin admin 0 Apr 26 17:08 k2-1682500058176-2.log
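The file names in this listing follow the pattern prefix, epoch milliseconds, sequence number, extension. Files that share a timestamp (for example the nine `k2-1682499436243-*` files) appear to belong to one run of the agent, with a new file rolled every 60 seconds whether or not any events arrived, which would explain the many zero-byte files. The sketch below decodes the timestamp part of a name. The helper name `roll_file_time` is hypothetical, and the `date -d @...` form assumes GNU date.

```shell
# Print the UTC time encoded in a file_roll file name such as
# k2-1682498404475-1.log (prefix, epoch milliseconds, sequence number).
# roll_file_time is a hypothetical helper; it assumes GNU date.
roll_file_time() {
  name="$1"
  base="${name%-*}"    # drop the trailing "-<n>.log"
  ms="${base##*-}"     # keep the digits after the last remaining hyphen
  date -u -d "@$((ms / 1000))" '+%Y-%m-%d %H:%M:%S'
}

roll_file_time k2-1682498404475-1.log   # prints 2023-04-26 08:40:04
```

Assuming the servers run in UTC+8, 08:40:04 UTC corresponds to 16:40:04 local time, which agrees with the 16:40 modification time of the first file in the listing.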
Check the HDFS directory from server adp-01:
$ hdfs dfs -ls /flume/replicating-case/20230426/17
Found 2 items
-rw-r--r-- 3 admin supergroup 2058 2023-04-26 17:09 /flume/replicating-case/20230426/17/k1.1682500083007
-rw-r--r-- 3 admin supergroup 343 2023-04-26 17:09 /flume/replicating-case/20230426/17/k1.1682500172970.tmp
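In this listing the file ending in `.tmp` is still being written by the HDFS sink, while the file without the suffix has been rolled and closed. The closed file is 2058 bytes, the same size as `k2-1682500058176-1.log` on adp-02, which is consistent with both sinks having received the same replicated events. The following sketch filters a listing down to closed files only. The helper name `closed_hdfs_files` is hypothetical, and it assumes the sink keeps the default `.tmp` in-use suffix, as in the configuration above.

```shell
# Read `hdfs dfs -ls` output on stdin and print only the files that the
# HDFS sink has already closed, i.e. paths without the in-use ".tmp" suffix.
# closed_hdfs_files is a hypothetical helper name.
closed_hdfs_files() {
  # $1 is the permission string (regular files start with "-"),
  # $NF is the last field, the full HDFS path.
  awk '$1 ~ /^-/ && $NF !~ /\.tmp$/ { print $NF }'
}
```

A possible invocation is `hdfs dfs -ls /flume/replicating-case/20230426/17 | closed_hdfs_files`, which for the listing above would print only the `k1.1682500083007` path.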