Flume Common Selectors: multiplexing selector

Example: multiplexing selector

Create a custom conf file

[root@hadoop01 test_conf]# cat test-multiplexing-selector.conf
myagent.sources=mysource1
myagent.channels=mychannel1 mychannel2
myagent.sinks=mysink1 mysink2

myagent.sources.mysource1.type=http
# The HTTP source takes its listen address from bind; it has no host property
myagent.sources.mysource1.bind=hadoop01
myagent.sources.mysource1.port=8888
myagent.sources.mysource1.selector.type=multiplexing
# With the multiplexing selector, specify which header field is used to route events to different channels
myagent.sources.mysource1.selector.header=role
# When role in the header is Student, write the event to mychannel1
myagent.sources.mysource1.selector.mapping.Student=mychannel1
# When role in the header is Teacher, write the event to mychannel2
myagent.sources.mysource1.selector.mapping.Teacher=mychannel2
# When role in the header is neither Student nor Teacher, fall back to mychannel2
myagent.sources.mysource1.selector.default=mychannel2

myagent.channels.mychannel1.type=memory
myagent.channels.mychannel2.type=memory

myagent.sinks.mysink1.type=hdfs
myagent.sinks.mysink1.hdfs.path=hdfs://hadoop01:8020/test_multiplexing_selector/
myagent.sinks.mysink1.hdfs.filePrefix=mysink1
myagent.sinks.mysink1.hdfs.fileSuffix=.log
myagent.sinks.mysink1.hdfs.writeFormat=Text
myagent.sinks.mysink1.hdfs.fileType=DataStream

myagent.sinks.mysink2.type=hdfs
myagent.sinks.mysink2.hdfs.path=hdfs://hadoop01:8020/test_multiplexing_selector/
myagent.sinks.mysink2.hdfs.filePrefix=mysink2
myagent.sinks.mysink2.hdfs.fileSuffix=.log
myagent.sinks.mysink2.hdfs.writeFormat=Text
myagent.sinks.mysink2.hdfs.fileType=DataStream

myagent.sources.mysource1.channels=mychannel1 mychannel2
myagent.sinks.mysink1.channel=mychannel1
myagent.sinks.mysink2.channel=mychannel2
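The routing decision configured above can be mirrored in a few lines of shell. This is only a sketch of the rules in this conf file, not Flume code; `route_role` is a hypothetical helper name introduced for illustration. Matching is on the exact header value, and an empty or missing value falls through to the default.

```shell
# Illustration only: mirrors selector.mapping.* and selector.default
# from the conf file above. route_role is a hypothetical helper,
# not a Flume command.
route_role() {
  case "$1" in
    Student) echo "mychannel1" ;;   # selector.mapping.Student
    Teacher) echo "mychannel2" ;;   # selector.mapping.Teacher
    *)       echo "mychannel2" ;;   # selector.default
  esac
}

# Print the channel chosen for each role used in the tests below.
for role in Student Teacher norole; do
  printf '%s -> %s\n' "$role" "$(route_role "$role")"
done
```

Since mychannel2 is both the Teacher mapping and the default, only Student events end up in mychannel1.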

Start the Flume agent

[root@hadoop01 test_conf]# flume-ng agent -c /usr/local/wyh/apache-flume-1.8.0-bin/conf -f /usr/local/wyh/apache-flume-1.8.0-bin/test_conf/test-multiplexing-selector.conf -n myagent -Dflume.root.logger=INFO,console

Send test data with role set to Teacher

[root@hadoop01 ~]# curl -X POST -d '[{"headers":{"role":"Teacher"},"body":"I am a teacher!"}]' http://hadoop01:8888
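The request body is a JSON array of events, each with a headers object and a body string. Because the same payload shape is reused for every test in this article, it may be convenient to build it with a small function. `build_event` is a hypothetical helper, and it assumes the role and body contain no quotes or backslashes, since it does no JSON escaping.

```shell
# build_event ROLE BODY
# Hypothetical helper: prints a one-event JSON array in the shape used by
# the curl commands in this article. No JSON escaping is performed.
build_event() {
  printf '[{"headers":{"role":"%s"},"body":"%s"}]' "$1" "$2"
}

payload=$(build_event Teacher 'I am a teacher!')
echo "$payload"

# To send it to the HTTP source configured above:
# curl -X POST -d "$payload" http://hadoop01:8888
```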

Verify the data

The console shows execution logs for mysink2 only.

In the HDFS directory tree, only a data file for mysink2 has been created.

View the data in HDFS:

[root@hadoop01 ~]# hdfs dfs -cat /test_multiplexing_selector/mysink2.1652015120789.log

This shows that the multiplexing selector routes events to different channels according to the value of role, as configured.
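Both sinks write to the same HDFS directory, so the file prefix is the only thing that tells their output apart. As a rough sketch, the prefix can be pulled out of a path like this; `sink_prefix` is a hypothetical helper, and it assumes the hdfs.filePrefix values themselves contain no dots, as in this conf file.

```shell
# sink_prefix PATH
# Hypothetical helper (not an HDFS or Flume command): prints the part of
# the file name before the first dot, which in this setup is the
# hdfs.filePrefix of the sink that wrote the file.
sink_prefix() {
  basename "$1" | cut -d. -f1
}

sink_prefix /test_multiplexing_selector/mysink2.1652015120789.log
```

The numeric part of the file name is generated by the sink at run time, so the names on another cluster will differ from the ones shown in this article.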

Send test data with role set to Student

[root@hadoop01 ~]# curl -X POST -d '[{"headers":{"role":"Student"},"body":"I am a student!"}]' http://hadoop01:8888

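Verify the data

The console shows execution logs for mysink1 only.

The HDFS directory tree now contains an additional data file, written by mysink1.

View the data in HDFS: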

[root@hadoop01 ~]# hdfs dfs -cat /test_multiplexing_selector/mysink1.1652015432499.log

Send test data with a role that is neither Student nor Teacher

[root@hadoop01 ~]# curl -X POST -d '[{"headers":{"role":"norole"},"body":"I am a visitor!"}]' http://hadoop01:8888

Verify the data

Execution logs for mysink2 appear in the console.

A new mysink2 data file appears in the HDFS directory tree.

View the newly created mysink2 data file:

[root@hadoop01 ~]# hdfs dfs -cat /test_multiplexing_selector/mysink2.1652015733840.log

This shows that when role matches neither of the two configured mappings, the event is written to mychannel2, as set by selector.default.