Common Flume Selectors: the Replicating Selector

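In Flume, a channel selector decides which channel(s) the events received by a source are written to. There are two main selectors:

replicating selector: copies each event to every channel configured for the source.

multiplexing selector: routes each event to a particular channel based on the value of a specified header.

If no selector is specified, the replicating selector is used by default.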
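
For contrast, a multiplexing selector might be configured along the following lines. This is only a minimal sketch and not part of the test case in this post: it reuses the agent, source, and channel names from the case below, while the header name `logtype` and its values `error` and `access` are hypothetical. Events would need to carry that header already, for example set by an interceptor or by the sending client.

```properties
# Hypothetical sketch: route events by the value of the "logtype" header.
myagent.sources.mysource1.selector.type=multiplexing
myagent.sources.mysource1.selector.header=logtype
# Events with logtype=error go to mychannel1, logtype=access to mychannel2.
myagent.sources.mysource1.selector.mapping.error=mychannel1
myagent.sources.mysource1.selector.mapping.access=mychannel2
# Events whose header is missing or matches no mapping fall back to mychannel1.
myagent.sources.mysource1.selector.default=mychannel1
```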

Example: replicating selector

This example configures one source, two channels, and two sinks.

  • Create a custom conf file
[root@hadoop01 test_conf]# pwd
/usr/local/wyh/apache-flume-1.8.0-bin/test_conf
[root@hadoop01 test_conf]# cat test-replicating-selector.conf
myagent.sources=mysource1
myagent.channels=mychannel1 mychannel2
myagent.sinks=mysink1 mysink2

myagent.sources.mysource1.type=syslogtcp
myagent.sources.mysource1.host=hadoop01
myagent.sources.mysource1.port=8888
myagent.sources.mysource1.selector.type=replicating

myagent.channels.mychannel1.type=memory
myagent.channels.mychannel2.type=memory

myagent.sinks.mysink1.type=hdfs
myagent.sinks.mysink1.hdfs.path=hdfs://hadoop01:8020/test_replicating_selector/
myagent.sinks.mysink1.hdfs.filePrefix=mysink1
myagent.sinks.mysink1.hdfs.fileSuffix=.log
myagent.sinks.mysink1.hdfs.writeFormat=Text
myagent.sinks.mysink1.hdfs.fileType=DataStream

myagent.sinks.mysink2.type=hdfs
myagent.sinks.mysink2.hdfs.path=hdfs://hadoop01:8020/test_replicating_selector/
myagent.sinks.mysink2.hdfs.filePrefix=mysink2
myagent.sinks.mysink2.hdfs.fileSuffix=.log
myagent.sinks.mysink2.hdfs.writeFormat=Text
myagent.sinks.mysink2.hdfs.fileType=DataStream

myagent.sources.mysource1.channels=mychannel1 mychannel2
myagent.sinks.mysink1.channel=mychannel1
myagent.sinks.mysink2.channel=mychannel2
  • Start the Flume agent
[root@hadoop01 test_conf]# flume-ng agent -c /usr/local/wyh/apache-flume-1.8.0-bin/conf -f /usr/local/wyh/apache-flume-1.8.0-bin/test_conf/test-replicating-selector.conf -n myagent -Dflume.root.logger=INFO,console
  • Send test data
[root@hadoop01 ~]# echo "Test Replicating" | nc hadoop01 8888
  • Verify the data

The console shows log output from both mysink1 and mysink2.

The HDFS directory tree also shows two data files, one per sink.

 View the data written to HDFS:

[root@hadoop01 ~]# hdfs dfs -cat /test_replicating_selector/mysink1.1652013959169.log
[root@hadoop01 ~]# hdfs dfs -cat /test_replicating_selector/mysink2.1652013959160.log

 This shows that with the replicating selector, every event is written to each of the channels configured for the source.
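
By default, every channel listed for the source is required, so a failed write to any one of them fails the whole transaction on the source. If one of the copies matters less, the replicating selector's `selector.optional` property can mark channels as optional. A minimal sketch, assuming mychannel2 is the channel that may be skipped (this choice is an assumption for illustration, not part of the test above):

```properties
myagent.sources.mysource1.selector.type=replicating
myagent.sources.mysource1.channels=mychannel1 mychannel2
# Assumption for illustration: mychannel2 is optional, so a failed write to it is ignored.
myagent.sources.mysource1.selector.optional=mychannel2
```

With this setting, mychannel1 remains required, so events still have to reach it for the source's transaction to succeed.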