Apache Flume Big Data Development Tool: Overview and Getting Started (Part 8)

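1. Extending Flume with a Custom Sink

1.1. About custom sinks

As with a custom source, when none of the built-in sinks does what we need, we can write our own sink to deliver the data wherever we want, for example Kafka, MySQL, or a file.

Requirement: send data to a network port, let Flume receive it from that port, and use a custom sink to save the data to a local file.

1.2. Implementing the custom sink

The custom MySink class: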

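package cn.itcast.flumesink;

import java.io.File;

import org.apache.commons.io.FileUtils;
import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;

public class MySink extends AbstractSink implements Configurable {
    private Context context;
    private String filePath = "";
    private String fileName = "";
    private File fileDir;

    // Called once at initialization: keeps the Context and reads the sink's configuration parameters.
    @Override
    public void configure(Context context) {
        try {
            this.context = context;
            filePath = context.getString("filePath");
            fileName = context.getString("fileName");
            fileDir = new File(filePath);
            if (!fileDir.exists()) {
                fileDir.mkdirs();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Called repeatedly by Flume; each call handles at most one event.
    @Override
    public Status process() throws EventDeliveryException {
        Channel channel = this.getChannel();
        Transaction transaction = channel.getTransaction();
        transaction.begin();
        try {
            Event event = channel.take();
            if (event == null) {
                // Empty channel: close the transaction and back off
                // instead of spinning inside an open transaction.
                transaction.commit();
                return Status.BACKOFF;
            }
            // Events from the netcat source arrive without a line terminator, so add one.
            String line = new String(event.getBody(), "UTF-8") + System.lineSeparator();
            FileUtils.write(new File(filePath + File.separator + fileName), line, "UTF-8", true);
            transaction.commit();
            return Status.READY;
        } catch (Exception e) {
            // Roll back so the event stays in the channel and can be retried.
            transaction.rollback();
            e.printStackTrace();
            return Status.BACKOFF;
        } finally {
            transaction.close();
        }
    }
}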
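The take/commit/rollback flow inside process() can be tried out without Flume on the classpath. What follows is a toy model only: ToyChannel, step, and the string statuses are hypothetical stand-ins, not Flume's Channel, Transaction, or Status types. It shows why a failed delivery must roll back (the event stays queued for a retry) and why an empty channel should simply back off.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Toy stand-in for a channel with one open transaction; NOT Flume's API. */
final class ToyChannel {
    private final Deque<String> queue = new ArrayDeque<>();
    // Events taken since the last commit or rollback.
    private final List<String> taken = new ArrayList<>();

    void put(String event) { queue.addLast(event); }

    /** Returns the next event, or null when the channel is empty. */
    String take() {
        String event = queue.pollFirst();
        if (event != null) taken.add(event);
        return event;
    }

    /** Commit: the taken events are removed for good. */
    void commit() { taken.clear(); }

    /** Rollback: the taken events go back to the head of the queue, in order. */
    void rollback() {
        for (int i = taken.size() - 1; i >= 0; i--) queue.addFirst(taken.get(i));
        taken.clear();
    }

    int size() { return queue.size(); }

    /** One process()-style step: take, deliver, commit on success, roll back on failure. */
    static String step(ToyChannel channel, Consumer<String> deliver) {
        String event = channel.take();
        if (event == null) {
            channel.commit();      // nothing to do, end the empty transaction
            return "BACKOFF";
        }
        try {
            deliver.accept(event);
            channel.commit();
            return "READY";
        } catch (RuntimeException e) {
            channel.rollback();    // keep the event for the next attempt
            return "BACKOFF";
        }
    }
}
```

Calling step with a delivery function that throws leaves the event in the channel; calling it again with a working delivery function drains it.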

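Functional test

Package the code as a jar with a packaging plugin and put it in Flume's lib directory. Be sure to bundle the commons-lang dependency into the jar, and check that commons-io (which provides the FileUtils class used above) is on Flume's classpath as well.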
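If bundling extra dependencies is inconvenient, the file-append step can also be done with the JDK alone. A minimal sketch, assuming Java 7 or later; LineAppender and its append method are hypothetical names and not part of Flume or of the sink shown earlier.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: appends one line to dir/fileName using only java.nio. */
final class LineAppender {
    static void append(Path dir, String fileName, String line) throws IOException {
        // Create the target directory (and any parents) if it does not exist yet.
        Files.createDirectories(dir);
        // CREATE makes the file on first use; APPEND keeps earlier lines intact.
        Files.write(dir.resolve(fileName),
                (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

A sink could call LineAppender.append(Paths.get(filePath), fileName, text) in place of the FileUtils call; whether that is preferable depends on which libraries are already on the agent's classpath.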

Write the Flume configuration file (the start command below expects it at conf/filesink.conf):

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = node-1
a1.sources.r1.port = 5678

# Describe the sink
a1.sinks.k1.type = cn.itcast.flumesink.MySink
a1.sinks.k1.filePath=/export/servers
a1.sinks.k1.fileName=filesink.txt

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
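The agent file uses the Java properties format, and the keys under a1.sinks.k1. are what the sink reads by short name in configure (filePath, fileName). The sketch below only illustrates that prefix-to-key mapping with java.util.Properties; SinkProps and subProperties are hypothetical names, and this is not how Flume itself loads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper: collects the properties below a prefix, with the prefix removed. */
final class SinkProps {
    static Map<String, String> subProperties(String conf, String prefix) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(conf));   // lines starting with '#' are skipped as comments
        Map<String, String> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.put(key.substring(prefix.length()), props.getProperty(key));
            }
        }
        return result;
    }
}
```

For the file above, subProperties(text, "a1.sinks.k1.") yields the keys channel, fileName, filePath, and type, which is a quick way to check for typos in key names before starting the agent.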

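Start Flume and test it with telnet:

yum -y install telnet

bin/flume-ng agent -c conf -f conf/filesink.conf -n a1 -Dflume.root.logger=INFO,console

From another terminal, run telnet node-1 5678 to connect to the port and type some data; the lines you enter should end up in /export/servers/filesink.txt.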
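When telnet is not available, or the test needs to be scripted, a few lines of Java can play the same role. This is a rough sketch under stated assumptions: LineSender is a hypothetical name, the host and port come from the configuration above, and the two-second read timeout is an arbitrary choice. It sends newline-terminated lines, signals end of input, and returns whatever the server wrote back (the netcat source is understood to acknowledge each line by default).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

/** Hypothetical test client: sends lines to a TCP port and collects the replies. */
final class LineSender {
    static String send(String host, int port, String... lines) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(2000);               // do not wait forever for replies
            OutputStream out = socket.getOutputStream();
            for (String line : lines) {
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            }
            out.flush();
            socket.shutdownOutput();                 // tell the server no more data is coming

            // Drain the replies until the server closes the connection or the timeout hits.
            ByteArrayOutputStream replies = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buffer = new byte[1024];
            int n;
            try {
                while ((n = in.read(buffer)) != -1) {
                    replies.write(buffer, 0, n);
                }
            } catch (SocketTimeoutException ignored) {
                // The server kept the connection open; return what arrived so far.
            }
            return replies.toString("UTF-8");
        }
    }
}
```

For example, LineSender.send("node-1", 5678, "hello", "flume") should have the same effect as typing those two lines into the telnet session.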