Hive - 数据治理案例

30 阅读1分钟

因为项目涉密,所以举个例子。

Part001:日期范围炸裂

有一张hive表,表中数据如下所示:

需要根据start_date,end_date转换成每个月,类似于以下这种数据

以下是具体的SQL

select
    tyshxym,
    dwmc,
    name,
    gender,
    concat(substr(add_months(ksrq,pos),1,4),substr(add_months(jsrq,pos),6,2)) as months,
    start_date,
    end_date
  from
  (
    select
      tyshxym,
      dwmc,
      name,
      gender,
      start_date,
      end_date,
      to_date(concat(start_date,'01')) as ksrq,
      to_date(concat(end_date,'01')) as jsrq,
      cast(months_between(
        to_date(concat(start_date,'01')),
        to_date(concat(end_date,'01'))
      ) as int) + 1 as month_count
    from your_database.your_table
    where start_date <= end_date  ) t
  LATERAL VIEW posexplode(split(space(month_count),'|')) pe as pos,val