Converting Between Flink DataStream/DataSet and Table

1. Introduction

A DataStream or DataSet can be converted to a Table, and a Table can be converted back:

  • DataStream/DataSet to Table
  • Table to DataStream/DataSet

2. Examples

2.1 Converting a Table to a DataStream

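A Table can be converted to a DataStream in one of two modes:

  • Append Mode: emits each row as a plain record. It can only be used when the Table is modified by INSERT changes alone.
  • Retract Mode: emits each change as a (Boolean, record) pair, where true marks an INSERT and false marks a DELETE. It can be used for any Table.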
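To make the difference between the two modes concrete, here is a minimal plain-Scala sketch of the changelog that a running "count per word" result would produce. It does not use Flink at all; the names RetractModel and countChanges are hypothetical and only model the (Boolean, record) encoding described above.

```scala
// Plain-Scala model of a retract stream; no Flink dependency.
object RetractModel {
  // One change message: flag = true means INSERT, false means DELETE.
  type Change[A] = (Boolean, A)

  // Emit the changelog of a running "count per word" result.
  def countChanges(words: List[String]): List[Change[(String, Int)]] = {
    val counts = scala.collection.mutable.Map.empty[String, Int]
    words.flatMap { w =>
      val old = counts.get(w)
      val updated = old.getOrElse(0) + 1
      counts(w) = updated
      // Retract the previous row (if any), then insert the new one.
      old.map(c => (false, (w, c))).toList :+ ((true, (w, updated)))
    }
  }
}
```

Because an updated count first deletes the old row, such a result could not be expressed in Append Mode, which has no way to signal a DELETE.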

Syntax

// get a StreamTableEnvironment; env is the StreamExecutionEnvironment
val tableEnv = StreamTableEnvironment.create(env)
// Table with two fields (String name, Integer age)
val table: Table = ...
// convert the Table into an append DataStream of Row
val dsRow: DataStream[Row] = tableEnv.toAppendStream[Row](table)
// convert the Table into an append DataStream of Tuple2[String, Int]
val dsTuple: DataStream[(String, Int)] =
  tableEnv.toAppendStream[(String, Int)](table)
// convert the Table into a retract DataStream of Row.
//   A retract stream of type X is a DataStream[(Boolean, X)]. 
//   The boolean field indicates the type of the change. 
//   True is INSERT, false is DELETE.
val retractStream: DataStream[(Boolean, Row)] = tableEnv.toRetractStream[Row](table)
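A consumer of a retract stream usually has to apply the flags to rebuild the current contents of the Table. The following is a minimal plain-Scala sketch of that idea, assuming the messages have already been collected into a Seq; Materialize and applyChanges are hypothetical names, not part of the Flink API.

```scala
// Plain-Scala sketch: fold retract messages into the current table contents.
object Materialize {
  // Returns each distinct row together with how many copies are currently present.
  def applyChanges[A](changes: Seq[(Boolean, A)]): Map[A, Int] =
    changes.foldLeft(Map.empty[A, Int]) { case (rows, (isInsert, row)) =>
      // true adds one copy of the row, false removes one copy.
      val n = rows.getOrElse(row, 0) + (if (isInsert) 1 else -1)
      if (n <= 0) rows - row else rows.updated(row, n)
    }
}
```

A count is kept per row because a Table may contain duplicate rows, so a single DELETE should remove only one copy.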

Example

// Imports assume Flink 1.11 to 1.13; in 1.9/1.10 the last package is org.apache.flink.table.api.scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.EnvironmentSettings
import org.apache.flink.table.api.bridge.scala._

object TableToDataStream {
  def main(args: Array[String]): Unit = {
    // Build sample data that will be loaded into a Table
    val data = List(
      Peoject(1L, 1, "Hello"),
      Peoject(2L, 2, "Hello"),
      Peoject(3L, 3, "Hello"),
      Peoject(4L, 4, "Hello"),
      Peoject(5L, 5, "Hello"),
      Peoject(6L, 6, "Hello"),
      Peoject(7L, 7, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(20L, 20, "Hello World"))
    // Set up the streaming environment and a Blink-planner table environment
    val bsEnv = StreamExecutionEnvironment.getExecutionEnvironment
    val bsSettings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
    val tEnv = StreamTableEnvironment.create(bsEnv, bsSettings)
    val stream = bsEnv.fromCollection(data)
    val table = tEnv.fromDataStream(stream)
    // Append Mode: convert the Table to a DataStream of plain records
    val appendStream: DataStream[Peoject] = tEnv.toAppendStream[Peoject](table)
    // Retract Mode: true marks an added record, false marks a retracted record
    val retractStream: DataStream[(Boolean, Peoject)] = tEnv.toRetractStream[Peoject](table)
    retractStream.print()
    bsEnv.execute()
  }
  case class Peoject(user: Long, index: Int, content: String)
}

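Output (the N> prefix is the index of the parallel subtask that printed the record, so the line order varies between runs)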

3> (true,Peoject(6,6,Hello))
2> (true,Peoject(5,5,Hello))
5> (true,Peoject(8,8,Hello World))
6> (true,Peoject(1,1,Hello))
8> (true,Peoject(3,3,Hello))
7> (true,Peoject(2,2,Hello))
7> (true,Peoject(20,20,Hello World))
4> (true,Peoject(7,7,Hello World))
1> (true,Peoject(4,4,Hello))
6> (true,Peoject(8,8,Hello World))
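Every flag in this output is true because the Table is built directly from a collection and no row is ever updated or deleted, which is also why the same Table can be converted in Append Mode. As a rough plain-Scala illustration (AppendOnlyCheck, isAppendOnly and toAppend are hypothetical helpers, not Flink API), the relationship between the two encodings can be sketched as:

```scala
// Plain-Scala sketch: an insert-only changelog carries the same data as an append stream.
object AppendOnlyCheck {
  // True when the changelog contains only INSERT messages.
  def isAppendOnly[A](changes: Seq[(Boolean, A)]): Boolean =
    changes.forall(_._1)

  // Drop the flags when that is safe; return None if any DELETE is present.
  def toAppend[A](changes: Seq[(Boolean, A)]): Option[Seq[A]] =
    if (isAppendOnly(changes)) Some(changes.map(_._2)) else None
}
```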

2.2 Converting a Table to a DataSet

Syntax

// get a BatchTableEnvironment; env is the ExecutionEnvironment
val tableEnv = BatchTableEnvironment.create(env)
// Table with two fields (String name, Integer age)
val table: Table = ...
// convert the Table into a DataSet of Row
val dsRow: DataSet[Row] = tableEnv.toDataSet[Row](table)
// convert the Table into a DataSet of Tuple2[String, Int]
val dsTuple: DataSet[(String, Int)] = tableEnv.toDataSet[(String, Int)](table)

Example

// Imports assume Flink 1.11 to 1.13; in 1.9/1.10 the last package is org.apache.flink.table.api.scala
import org.apache.flink.api.scala._
import org.apache.flink.table.api.Table
import org.apache.flink.table.api.bridge.scala._

object TableToDataSet {
  def main(args: Array[String]): Unit = {
    // Build sample data that will be loaded into a Table
    val data = List(
      Peoject(1L, 1, "Hello"),
      Peoject(2L, 2, "Hello"),
      Peoject(3L, 3, "Hello"),
      Peoject(4L, 4, "Hello"),
      Peoject(5L, 5, "Hello"),
      Peoject(6L, 6, "Hello"),
      Peoject(7L, 7, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(20L, 20, "Hello World"))
    // Set up the batch environment and load the data into a Table
    val fbEnv = ExecutionEnvironment.getExecutionEnvironment
    val fbTableEnv = BatchTableEnvironment.create(fbEnv)
    val collection: DataSet[Peoject] = fbEnv.fromCollection(data)
    val table: Table = fbTableEnv.fromDataSet(collection)
    // Convert the Table back to a DataSet
    val toDataSet: DataSet[Peoject] = fbTableEnv.toDataSet[Peoject](table)
    toDataSet.print()
  }
  case class Peoject(user: Long, index: Int, content: String)
}

Output

Peoject(1,1,Hello)
Peoject(2,2,Hello)
Peoject(3,3,Hello)
Peoject(4,4,Hello)
Peoject(5,5,Hello)
Peoject(6,6,Hello)
Peoject(7,7,Hello World)
Peoject(8,8,Hello World)
Peoject(8,8,Hello World)
Peoject(20,20,Hello World)

2.3 Converting a DataStream to a Table

// get TableEnvironment
// registration of a DataSet is equivalent
val tableEnv = ... // see "Create a TableEnvironment" section
val stream: DataStream[(Long, String)] = ...
// convert the DataStream into a Table with default fields '_1, '_2
val table1: Table = tableEnv.fromDataStream(stream)
// convert the DataStream into a Table with fields 'myLong, 'myString
val table2: Table = tableEnv.fromDataStream(stream, 'myLong, 'myString)

2.4 Converting a DataSet to a Table

// get TableEnvironment
// registration of a DataSet is equivalent
val tableEnv = ... // see "Create a TableEnvironment" section
val dataSet: DataSet[(Long, String)] = ...
// convert the DataSet into a Table with default fields '_1, '_2
val table1: Table = tableEnv.fromDataSet(dataSet)
// convert the DataSet into a Table with fields 'myLong, 'myString
val table2: Table = tableEnv.fromDataSet(dataSet, 'myLong, 'myString)

WeChat official account

WeChat ID: bigdata_limeng