Flume Case Study: Failover

[Figure: failover-case.png]

Flume 1 (server adp-01) configuration

a1.sources = r1
a1.channels = c1
a1.sinkgroups = g1
a1.sinks = k1 k2

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
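# Assign a priority to each sink: the highest-priority sink is the active sink, the others are standby sinks
# When the active sink becomes unavailable, data goes to the new active sink, i.e. the highest-priority sink among those still available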
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
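# maxpenalty is the upper bound (in milliseconds) on the backoff applied to a failed sink
# A failed sink enters a cool-down that grows with each consecutive failure, capped at this value; once the cool-down expires the sink is retried, and if it is the highest-priority available sink it becomes the active sink again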
a1.sinkgroups.g1.processor.maxpenalty = 10000

a1.sinks.k1.type = avro
a1.sinks.k1.hostname = adp-02
a1.sinks.k1.port = 30000

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = adp-03
a1.sinks.k2.port = 30000

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
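
The interaction between `priority` and `maxpenalty` in the a1 configuration can be sketched with a small model. This is a simplified illustration, not Flume's actual implementation: the class `FailoverModel`, its method names, and the 1000 ms base penalty with doubling per consecutive failure are assumptions made for the sketch.

```python
# Simplified model of a failover sink group (illustrative only, not Flume code).
# Assumption: the penalty doubles with each consecutive failure, starting from
# a 1000 ms base, and is capped at max_penalty_ms.

FAILURE_PENALTY_MS = 1000  # assumed base penalty


class FailoverModel:
    def __init__(self, priorities, max_penalty_ms):
        self.priorities = dict(priorities)  # sink name -> priority
        self.max_penalty_ms = max_penalty_ms
        self.failures = {}   # sink name -> consecutive failure count
        self.retry_at = {}   # sink name -> earliest time (ms) it may be retried

    def penalty_ms(self, consecutive_failures):
        """Backoff for the given failure count, capped at max_penalty_ms."""
        return min(self.max_penalty_ms,
                   (1 << consecutive_failures) * FAILURE_PENALTY_MS)

    def mark_failed(self, sink, now_ms):
        """Record a failure and put the sink into its cool-down."""
        count = self.failures.get(sink, 0) + 1
        self.failures[sink] = count
        self.retry_at[sink] = now_ms + self.penalty_ms(count)

    def mark_recovered(self, sink):
        """Clear the failure state after a successful delivery."""
        self.failures.pop(sink, None)
        self.retry_at.pop(sink, None)

    def active_sink(self, now_ms):
        """Highest-priority sink that is not in cool-down, or None."""
        eligible = [s for s in self.priorities
                    if self.retry_at.get(s, 0) <= now_ms]
        if not eligible:
            return None
        return max(eligible, key=self.priorities.get)


model = FailoverModel({"k1": 10, "k2": 5}, max_penalty_ms=10000)
print(model.active_sink(0))      # k1
model.mark_failed("k1", now_ms=0)
print(model.active_sink(1000))   # k2
print(model.active_sink(2000))   # k1
```

In this model, k1 is tried again as soon as its cool-down has expired, which is why a larger `maxpenalty` only delays the return to the preferred sink and does not prevent it.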

Flume 2 (server adp-02) configuration

a2.sources = r1
a2.channels = c1
a2.sinks = k1

a2.sources.r1.type = avro
a2.sources.r1.bind = adp-02
a2.sources.r1.port = 30000

a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

a2.sinks.k1.type = logger

a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1

Flume 3 (server adp-03) configuration

a3.sources = r1
a3.channels = c1
a3.sinks = k1

a3.sources.r1.type = avro
a3.sources.r1.bind = adp-03
a3.sources.r1.port = 30000

a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100

a3.sinks.k1.type = logger

a3.sources.r1.channels = c1
a3.sinks.k1.channel = c1

Start the Flume agents:

# Start the agents that listen on the avro port first
# On server adp-02
$ flume-ng agent -n a2 -c conf -f a2.conf
# On server adp-03
$ flume-ng agent -n a3 -c conf -f a3.conf
# On server adp-01
$ flume-ng agent -n a1 -c conf -f a1.conf

Send data from adp-01:

~ nc localhost 44444
hello flume1
OK
hello flume2
OK
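
For scripted tests, the same interaction can be driven without `nc`. The following is a minimal sketch, assuming the netcat source answers each newline-terminated line with one reply line, as in the session shown here; the function name `send_lines` is hypothetical.

```python
import socket


def send_lines(host, port, lines, timeout=5.0):
    """Send each line to a line-oriented TCP service and collect one reply per line."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as conn, \
            conn.makefile("r", encoding="utf-8", newline="\n") as reader:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))
            # Assumption: the service acknowledges every line with a single reply line.
            replies.append(reader.readline().strip())
    return replies


# Example call against the netcat source configured on adp-01:
# send_lines("localhost", 44444, ["hello flume1", "hello flume2"])
```

If acknowledgements are turned off on the source, `readline()` would block until the timeout, so this sketch only fits the acknowledged mode seen in this walkthrough.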

Flume 2 (adp-02) receives all of the data:

[admin@adp-02 ~]
2023-04-27 01:29:28,401 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 31             hello flume1 }
2023-04-27 01:29:29,203 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 32             hello flume2 }
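
The logger sink prints the event body as space-separated hex bytes followed by a text rendering. To check a body by hand, the hex part can be decoded as in this small sketch; the helper name `decode_logger_body` is hypothetical, and UTF-8 is assumed as the encoding.

```python
def decode_logger_body(hex_dump: str) -> str:
    """Decode the space-separated hex bytes from a LoggerSink line into text."""
    # bytes.fromhex skips the spaces between byte pairs.
    return bytes.fromhex(hex_dump).decode("utf-8")


print(decode_logger_body("68 65 6C 6C 6F 20 66 6C 75 6D 65 31"))  # hello flume1
```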

With Flume 2 (adp-02) down, continue sending data from adp-01:

~ nc localhost 44444
hello flume3
OK

Flume 1 (adp-01) logs the failed connection:

java.net.ConnectException: Connection refused: adp-02/10.0.0.25:30000
java.io.IOException: Error connecting to adp-02/10.0.0.25:30000
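
A `Connection refused` error like this means nothing is listening on the target port. One way to confirm that from adp-01 is a plain TCP connection attempt, sketched below; the helper name `is_port_open` is hypothetical, and the check only tests reachability, not whether the listener actually speaks Avro.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call for the avro source on adp-02:
# is_port_open("adp-02", 30000)
```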

Flume 3 (adp-03) receives the data:

[admin@adp-03 ~]
2023-04-27 01:30:32,711 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 33             hello flume3 }

After restarting Flume 2 (adp-02), continue sending data from adp-01:

~ nc localhost 44444
hello flume4
OK

Flume 2 (adp-02) receives the data again, because k1 has the higher priority and is retried once its penalty has expired:

[admin@adp-02 ~]
2023-04-27 01:31:35,706 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 34             hello flume4 }