1. Extracting data from a Row, method 1: by the field's positional index
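import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row}

// Without inferSchema, every CSV column is read as a String
val df: DataFrame = spark.read.option("header", true).csv("data/stu2.txt")
val rdd: RDD[Row] = df.rdd
rdd.map(
  row => {
    // get(i) returns Any, so cast to the column's actual type
    val id = row.get(0).asInstanceOf[String]
    val name = row.get(1).asInstanceOf[String]
    (id, name)
  }
).take(10).foreach(println)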
2. Extracting data from a Row, method 2: by field name
rdd.map {
  row => {
    // getAs[T](fieldName) looks the field up by name in the Row's schema
    val id = row.getAs[String]("id")
    val name = row.getAs[String]("name")
    (id, name)
  }
}.take(10).foreach(println)
3. Extracting data from a Row, method 3: by pattern matching with case Row(...)
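// The typed patterns below only match if the columns really hold Int/String/Double values,
// so read the file with inferSchema (assumed here to infer these types); otherwise every
// column is a String and the match throws scala.MatchError.
val typedDf: DataFrame = spark.read.option("header", true).option("inferSchema", true).csv("data/stu2.txt")
val df2rdd: RDD[Row] = typedDf.rdd
df2rdd.map {
  case Row(id: Int, name: String, age: Int, city: String, score: Double) =>
    (id, name, age, city, score)
}.take(10).foreach(println)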
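
The three approaches can also be tried side by side without the data file. The following is only a minimal sketch: the local SparkSession, the name sampleDf, and the two sample rows are assumptions made for illustration and merely stand in for data/stu2.txt. Because the sample columns are typed (Int, String, Int, String, Double), the typed getters and typed patterns apply directly.

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Local session for this sketch only; in spark-shell a `spark` session already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("row-extract-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for data/stu2.txt
val sampleDf = Seq(
  (1, "zs", 18, "beijing", 90.5),
  (2, "ls", 20, "shanghai", 85.0)
).toDF("id", "name", "age", "city", "score")

// Method 1: by positional index, using typed getters instead of get(i) plus a cast
val byIndex = sampleDf.rdd
  .map(row => (row.getInt(0), row.getString(1)))
  .collect()

// Method 2: by field name
val byName = sampleDf.rdd
  .map(row => (row.getAs[Int]("id"), row.getAs[String]("name")))
  .collect()

// Method 3: by pattern matching on the Row; a row that does not fit the pattern
// (wrong type or a null value) would throw scala.MatchError
val byPattern = sampleDf.rdd.map {
  case Row(id: Int, name: String, age: Int, city: String, score: Double) =>
    (id, name, age, city, score)
}.collect()

byIndex.foreach(println)
byName.foreach(println)
byPattern.foreach(println)

spark.stop()
```

Methods 1 and 2 return the same (id, name) pairs here; method 3 returns all five fields as a tuple.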