1 - Introduction
This series grew out of my Flink studies: to strengthen my Flink skills, I started making up business scenarios of my own and implementing them.
These posts are closer to notes than to a tutorial, so they won't be especially detailed; please bear with me.
This is hands-on material, so environment setup and language basics (Docker, Flink syntax and the like) are only explained where it really matters, which means a certain technical baseline is assumed.
2 - Prerequisites
- flink - 1.12.0
- elasticsearch - 7.12
- kafka - 2.12-2.5.0
- kibana - 7.12
- filebeat - 7.12
I won't share download links here; please download these yourself.
3 - Code
Flink code
Maven pom dependencies. Don't ask why there are so many; if you do, my answer is "no idea". Just copy them.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.iolo</groupId>
<artifactId>flink_study</artifactId>
<version>1.0.0</version>
<!-- Repository locations: aliyun, apache, and cloudera -->
<repositories>
<repository>
<id>aliyun</id>
<url>http://maven.aliyun.com/nexus/content/groups/public/</url>
</repository>
<repository>
<id>apache</id>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
</repository>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
</repositories>
<properties>
<encoding>UTF-8</encoding>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<java.version>1.8</java.version>
<scala.version>2.12</scala.version>
<flink.version>1.12.0</flink.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-api-scala-bridge_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-api-java-bridge_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- Blink planner, the default since 1.11+ -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner-blink_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-common</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- Flink connectors -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-sql-connector-kafka_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-jdbc_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-csv</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-json</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.bahir</groupId>
<artifactId>flink-connector-redis_2.11</artifactId>
<version>1.0</version>
<exclusions>
<exclusion>
<artifactId>flink-streaming-java_2.11</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
<exclusion>
<artifactId>flink-runtime_2.11</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
<exclusion>
<artifactId>flink-core</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
<exclusion>
<artifactId>flink-java</artifactId>
<groupId>org.apache.flink</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-hive_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-shaded-hadoop-2-uber</artifactId>
<version>2.7.5-10.0</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.38</version>
</dependency>
<!-- Logging -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.7</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.44</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
<version>4.9.1</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch7_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/java</sourceDirectory>
<plugins>
<!-- Compiler plugin -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.5.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
<!--<encoding>${project.build.sourceEncoding}</encoding>-->
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.18.1</version>
<configuration>
<useFile>false</useFile>
<disableXmlReport>true</disableXmlReport>
<includes>
<include>**/*Test.*</include>
<include>**/*Suite.*</include>
</includes>
</configuration>
</plugin>
<!-- Packaging plugin (bundles all dependencies) -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<!--
zip -d learn_spark.jar META-INF/*.RSA META-INF/*.DSA META-INF/*.SF -->
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<!-- Set the jar's main class (optional) -->
<mainClass></mainClass>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
Below is the Flink job; the detailed explanations are in the code comments.
package com.iolo.flink.cases;
import com.alibaba.fastjson.JSONObject;
import com.iolo.common.util.DingDingUtil;
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkFunction;
import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer;
import org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.util.Collector;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
/**
* @author iolo
* @date 2021/3/17
* Real-time alerting on monitored logs
* <p>
* Environment:
* 1 - flink 1.12
* 2 - kafka
* 3 - filebeat
* 4 - a Spring Boot service (with an endpoint that can emit logs of every level)
* 5 - es + kibana
* <p>
* Filebeat tails the Spring Boot service's logs and ships them to Kafka (topic sever_log_to_flink_consumer).
* Flink consumes that topic, parses the records, and when an error log shows up it sends an email or a
* DingTalk message (either by calling back into the Spring Boot service or directly from Flink).
* All logs are then written to ES for analysis in Kibana.
**/
public class case_1_kafka_es_log {
public static void main(String[] args) throws Exception {
// TODO env
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);
Properties props = new Properties();
// Kafka broker addresses
props.setProperty("bootstrap.servers", "127.0.0.1:9092");
// consumer group id
props.setProperty("group.id", "test-consumer-group");
// latest: if there is a committed offset, resume from it; otherwise start from the latest message
// earliest: if there is a committed offset, resume from it; otherwise start from the earliest message
props.setProperty("auto.offset.reset", "latest");
// starts a background thread that checks Kafka's partitions every 5s, enabling dynamic partition discovery
props.setProperty("flink.partition-discovery.interval-millis", "5000");
// auto commit (offsets go to Kafka's internal topic; once checkpointing is enabled they are stored both in the checkpoint and in that topic)
props.setProperty("enable.auto.commit", "true");
// auto commit interval
props.setProperty("auto.commit.interval.ms", "2000");
// TODO source
FlinkKafkaConsumer<String> source = new FlinkKafkaConsumer<>("sever_log_to_flink_consumer", new SimpleStringSchema(), props);
DataStreamSource<String> ds = env.addSource(source);
// TODO transformation
SingleOutputStreamOperator<Tuple3<String, String, String>> result = ds.flatMap(new FlatMapFunction<String, Tuple3<String, String, String>>() {
@Override
public void flatMap(String s, Collector<Tuple3<String, String, String>> collector) throws Exception {
JSONObject json = (JSONObject) JSONObject.parse(s);
String timestamp = json.getString("@timestamp");
String message = json.getString("message");
// with the log pattern used by the Spring Boot service, the bracketed level (e.g. [ERROR]) is the 4th space-separated token
String[] split = message.split(" ");
String level = split[3];
if ("[ERROR]".equalsIgnoreCase(level)) {
System.out.println("error!");
DingDingUtil.dingdingPost("error");
}
collector.collect(Tuple3.of(timestamp, level, message));
}
});
// TODO sink
result.print();
/**
* https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/connectors/elasticsearch.html
*/
List<HttpHost> httpHosts = new ArrayList<>();
httpHosts.add(new HttpHost("127.0.0.1", 9200, "http"));
ElasticsearchSink.Builder<Tuple3<String, String, String>> esSinkBuilder = new ElasticsearchSink.Builder<>(
httpHosts,
new ElasticsearchSinkFunction<Tuple3<String, String, String>>() {
@Override
public void process(Tuple3<String, String, String> stringStringStringTuple3, RuntimeContext runtimeContext, RequestIndexer requestIndexer) {
Map<String, String> json = new HashMap<>();
json.put("@timestamp", stringStringStringTuple3.f0);
json.put("level", stringStringStringTuple3.f1);
json.put("message", stringStringStringTuple3.f2);
IndexRequest item = Requests.indexRequest()
.index("my-log")
.source(json);
requestIndexer.add(item);
}
});
esSinkBuilder.setBulkFlushMaxActions(1);
result.addSink(esSinkBuilder.build());
// TODO execute
env.execute("case_1_kafka_es_log");
}
}
For alerting I set up a DingTalk custom-robot notification. If you need one, the setup guide is easy to find online and it's very straightforward.
https://developers.dingtalk.com/document/app/custom-robot-access/title-jfe-yo9-jl2
package com.iolo.common.util;
import lombok.extern.slf4j.Slf4j;
import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;
import java.io.IOException;
/**
* @author iolo
* @date 2021/3/30
* https://developers.dingtalk.com/document/app/custom-robot-access/title-jfe-yo9-jl2
**/
@Slf4j
public class DingDingUtil {
private static final String url = "https://oapi.dingtalk.com/robot/send?access_token=REPLACE_WITH_YOUR_TOKEN";
/**
* Posts a text message to the DingTalk robot webhook (secured by the access token above).
*
* @param text the alert text to send
* @author fengxinxin
* @date 2021/3/30 5:03 PM
**/
public static void dingdingPost(String text) throws Exception {
MediaType JSON = MediaType.parse("application/json");
OkHttpClient client = new OkHttpClient();
String json = "{\"msgtype\": \"text\",\"text\": {\"content\": \"FlinkLog:" + text + "\"}}";
RequestBody body = RequestBody.create(JSON, json);
Request request = new Request.Builder()
.url(url)
.post(body)
.build();
try (Response response = client.newCall(request).execute()) {
String responseBody = response.body().string();
log.info(responseBody);
} catch (IOException e) {
log.error(e.getMessage());
}
}
}
You can then launch this main method straight from your IDE.
Spring Boot
Download the project directly from the Gitee repo; I won't go through it in detail.
Endpoint: http://127.0.0.1:8080/test/log?level=error&count=10
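Since the Gitee project isn't walked through here, below is a minimal sketch of what such a log-producing controller could look like. Everything except the /test/log path and the level/count parameters is my own assumption for illustration (it also assumes the usual spring-boot-starter-web and Lombok setup), not the actual code from the repo.
package com.iolo.flinklog.controller;

import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

/**
 * Hypothetical sketch (not the actual Gitee code): an endpoint that writes `count`
 * log records at the requested level, e.g. /test/log?level=error&count=10
 */
@Slf4j
@RestController
@RequestMapping("/test")
public class LogController {

    @GetMapping("/log")
    public String produceLog(@RequestParam String level, @RequestParam int count) {
        for (int i = 0; i < count; i++) {
            switch (level.toLowerCase()) {
                case "error":
                    log.error("test error log {}", i);
                    break;
                case "warn":
                    log.warn("test warn log {}", i);
                    break;
                case "debug":
                    log.debug("test debug log {}", i);
                    break;
                default:
                    log.info("test info log {}", i);
            }
        }
        return "ok";
    }
}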
Kafka
Commands. These are all run from Kafka's bin directory; the ZooKeeper used here is the one bundled with Kafka.
# Start ZooKeeper
./zookeeper-server-start.sh ../config/zookeeper.properties
# Start Kafka
./kafka-server-start.sh ../config/server.properties
# Create the topic sever_log_to_flink_consumer
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic sever_log_to_flink_consumer
# Check that it was created
./kafka-topics.sh --list --zookeeper localhost:2181
# Console producer
./kafka-console-producer.sh --broker-list localhost:9092 --topic sever_log_to_flink_consumer
# Console consumer
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic sever_log_to_flink_consumer --from-beginning
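As a side note, you can smoke-test the Flink job without Filebeat by pasting a line like the one below into the console producer. This is only an illustrative sample: the real records come from Filebeat, and the exact message layout depends on your Spring Boot log pattern; the flatMap above assumes the bracketed level is the fourth space-separated token of message.
{"@timestamp":"2021-03-30T09:03:21.000Z","message":"2021-03-30 17:03:21.123 [http-nio-8080-exec-1] [ERROR] com.iolo.TestController - something went wrong"}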
Elasticsearch
From here on I use Docker. You're free to set up the environment your own way, but Docker will show up more and more going forward. Straight to the command:
docker run \
--name fxx-es \
-p 9200:9200 \
-p 9300:9300 \
-v /Users/iOLO/dev/docker/es/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-e "discovery.type=single-node" \
docker.elastic.co/elasticsearch/elasticsearch:7.12.0
Verify that Elasticsearch is up.
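For example, hitting the root endpoint should return a JSON document with the cluster name and the version 7.12.0:
curl http://127.0.0.1:9200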
Kibana
docker run \
--name fxx-kibana \
--link fxx-es:elasticsearch \
-p 5601:5601 \
docker.elastic.co/kibana/kibana:7.12.0
I went into the container to switch the UI to Chinese; you can skip this.
To do it, add i18n.locale: "zh-CN" to the kibana.yml config file.
To verify, open 127.0.0.1:5601.
The concrete steps are covered with screenshots when we get to the hands-on part.
Filebeat
Download: www.elastic.co/cn/download…
Pick the build that matches your machine; mine is a Mac.
After unpacking, edit the config file; here is mine:
# ============================== Filebeat inputs ===============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
  # Change to true to enable this input configuration. (This must be changed.)
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  # (Point this at the log files of the Spring Boot service.)
  paths:
    - /Users/iOLO/dev/Java/flinklog/logs/flink-log.*.log
# ------------------------------ Kafka Output -------------------------------
output.kafka:
  # initial brokers for reading cluster metadata (the Kafka address; this section
  # is taken from the official docs: https://www.elastic.co/guide/en/beats/filebeat/current/kafka-output.html)
  hosts: ["127.0.0.1:9092"]
  # message topic selection + partitioning (the topic Flink consumes; everything
  # else below is left at the official defaults)
  topic: 'sever_log_to_flink_consumer'
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
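The config alone doesn't start anything; from the Filebeat directory, run it in the foreground with the standard command (adjust the -c path if your config file lives elsewhere):
./filebeat -e -c filebeat.yml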
4 - Running it
Once the environment and all the programs are up, don't forget to start the Spring Boot service.
Then generate some logs by hitting 127.0.0.1:8080/test/log?level=error&count=10.
Check DingTalk to see whether the alert came through.
DingTalk: success!!!
Next comes Kibana.
Here is the result page.
And the steps to get there:
First, go to the ES section and choose the index.
Second, following the red box, click through to the ES index page.
Finally, search for your index in the input box; in our code it is my-log. Follow the prompts through the remaining steps. I've already created the pattern here, so I won't demo it again. You'll then end up with the page shown above.
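If you'd rather confirm from the command line that the documents actually reached Elasticsearch, you can also query the index directly; my-log is the index name used in the sink code above:
curl "http://127.0.0.1:9200/my-log/_search?pretty"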
5 - Closing remarks
That's the whole thing. Plenty of people will object that Filebeat + ELK alone can achieve the same result. Sure, but no need to argue: after studying Flink I'm inventing my own business scenarios to practice it hands-on. If you have a better solution, by all means use it.
If there is anything you'd like to see, or scenarios you've run into, feel free to suggest them; I'll weigh them and turn the good ones into future exercises.
Thanks for reading.