Create a Scala project
Create the Scala project with sbt; here I use sbt to package an uber-jar.
sbt configuration
The build.sbt configuration is as follows:
name := "learningKafka"
version := "0.1"
scalaVersion := "2.12.12"

libraryDependencies ++= Seq(
  "org.apache.kafka" %% "kafka" % "2.8.0",
  "org.apache.kafka" % "kafka-clients" % "2.8.0",
  "org.apache.kafka" % "kafka-streams" % "2.8.0",
  "org.apache.kafka" % "connect-api" % "2.8.0",
  "org.apache.avro" % "avro" % "1.10.2",
  "org.apache.kafka" %% "kafka-streams-scala" % "2.8.0",
  "org.slf4j" % "slf4j-api" % "1.7.30",
  "org.slf4j" % "slf4j-simple" % "1.7.30"
)

Compile / mainClass := Some("com.something.bz10.ConsumingApp")

assembly / assemblyJarName := "kafka_stream_example-v1.0.jar"
assembly / test := {}
assembly / mainClass := Some("com.something.bz10.ConsumingApp")
assembly / assemblyMergeStrategy := {
  case "META-INF/MANIFEST.MF" => MergeStrategy.discard
  case x                      => MergeStrategy.first
}
Then create a plugins.sbt file under the project/ directory and add the sbt-assembly plugin (note: plugins go in project/plugins.sbt, not project/build.properties):
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")
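sbt itself is pinned in project/build.properties. The version below is only an example from the sbt 1.x line contemporary with these library versions; use whichever sbt 1.x release you actually have installed:

```properties
sbt.version=1.5.5
```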
Create the project files
Now let's create the Scala source files.
Under main/scala/, create a package named com.something.bz10,
then create the file ConsumingApp.scala in it:
package com.something.bz10
import java.util.Properties
import org.apache.kafka.streams.kstream.Materialized
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala._
import org.apache.kafka.streams.scala.kstream._
import org.apache.kafka.streams.scala.serialization.Serdes.{longSerde, stringSerde}
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}
import java.time.Duration
object ConsumingApp extends App {
  // Set up the configuration
  val props: Properties = {
    val p = new Properties()
    p.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-application")
    p.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "node1:9092,node2:9092,node3:9092")
    p
  }

  // Build the topology
  val builder: StreamsBuilder = new StreamsBuilder
  val textLines: KStream[String, String] = builder.stream[String, String]("TextLinesTopic")
  val wordCounts: KTable[String, Long] = textLines
    .flatMapValues((textLine: String) => textLine.toLowerCase.split("\\W+"))
    .groupBy((_: String, word: String) => word)
    .count()(Materialized.as("counts-store"))
  wordCounts.toStream.to("WordsWithCountsTopic")

  // Run it
  val streams: KafkaStreams = new KafkaStreams(builder.build(), props)
  streams.start()

  sys.ShutdownHookThread {
    streams.close(Duration.ofSeconds(10))
  }
}
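To see what the topology does to a single line of text without running Kafka, the core per-record transformation can be sketched in plain Scala. This is just the lowercase/split/group/count logic, not the Streams API itself:

```scala
object WordCountSketch {
  // Mirror of the topology's per-line logic: lowercase the text,
  // split on non-word characters, group identical words, count each group
  def wordCounts(line: String): Map[String, Long] =
    line.toLowerCase
      .split("\\W+")
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.length.toLong }

  def main(args: Array[String]): Unit =
    println(wordCounts("hello world hello scala hello java hello golang"))
}
```

Unlike this in-memory sketch, the real topology keeps its counts in the fault-tolerant counts-store state store and emits an updated record downstream every time a count changes.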
Package the jar
Package the project with sbt.
First clean out the previous build, then produce a fat jar:
sbt clean assembly
Test that the program works
Create the topic
Run the command:
/opt/kafka_2.13-2.8.0/bin/kafka-topics.sh --bootstrap-server node1:9092,node2:9092,node3:9092 --create --topic TextLinesTopic
Run the jar
Upload the jar to the server and run:
java -jar kafka_stream_example-v1.0.jar
If there are no errors, you will see console output like the following:
[wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamTask - stream-thread [wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] task [0_0] Suspended running
[wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] INFO org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1-restore-consumer, groupId=null] Subscribed to partition(s): wordcount-application-counts-store-changelog-0
[wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.RecordCollectorImpl - stream-thread [wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] task [0_0] Closing record collector clean
[wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamTask - stream-thread [wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] task [0_0] Closed clean
[wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamTask - stream-thread [wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] task [1_0] Suspended running
[wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1] INFO org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=wordcount-application-a2cea16e-297c-44eb-9f2d-3d77098a4846-StreamThread-1-restore-consumer, groupId=null] Unsubscribed all topics or patterns and assigned partitions
Start a console consumer
Run the command:
bin/kafka-console-consumer.sh --bootstrap-server node1:9092,node2:9092,node3:9092 \
--topic WordsWithCountsTopic \
--from-beginning \
--formatter kafka.tools.DefaultMessageFormatter \
--property print.key=true \
--property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
--property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
If everything is working, you will see output like the following. Note that the same word can appear more than once: the KTable emits an updated count downstream each time a word's total changes, so an earlier hello 4 is later superseded by hello 5:
world 1
scala 1
golang 1
hello 4
java 1
hello 5
scala 2
Send messages to TextLinesTopic
Run the command:
bin/kafka-console-producer.sh --bootstrap-server node1:9092,node2:9092,node3:9092 --topic TextLinesTopic
If it starts successfully, you can type any words into the console, separated by spaces:
hello world hello scala hello java hello golang
On the consumer side you will then see output similar to the following (the counts are cumulative over everything previously sent to the topic):
world 7
scala 8
java 4
hello 25
golang 4
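These totals keep growing because the KTable accumulates state in the counts-store across all records ever consumed. Folding successive lines into a running table reproduces that behaviour in plain Scala (a sketch only, no Kafka involved):

```scala
object CumulativeCountSketch {
  // Fold one line of text into a running word-count table,
  // the way the KTable accumulates counts record by record
  def update(table: Map[String, Long], line: String): Map[String, Long] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).foldLeft(table) {
      (acc, word) => acc.updated(word, acc.getOrElse(word, 0L) + 1L)
    }

  def main(args: Array[String]): Unit = {
    val lines = Seq("hello world", "hello scala", "hello java hello golang")
    // Each line updates the same table, so counts carry over between lines
    val finalCounts = lines.foldLeft(Map.empty[String, Long])(update)
    println(finalCounts)
  }
}
```

Sending the same producer line several times therefore keeps raising the published counts, which is why the final output above shows hello at 25 rather than 4.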