Flink Window Alignment Algorithm


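Once you enable time windows, Flink computes, for every element in the data stream, which window that element belongs to. The algorithm is as follows.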

Source code

org.apache.flink.streaming.api.windowing.windows.TimeWindow.getWindowStartWithOffset()

/**
 * Method to get the window start for a timestamp.
 *
 * @param timestamp epoch millisecond to get the window start.
 * @param offset The offset which window start would be shifted by.
 * @param windowSize The size of the generated windows.
 * @return window start
 */
public static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
    return timestamp - (timestamp - offset + windowSize) % windowSize;
}
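
To make the role of the offset parameter concrete, here is a minimal sketch that copies the arithmetic above into a standalone class so it runs without Flink. The class name WindowOffsetDemo and the sample timestamp are assumptions made for illustration. It computes the start of a one-hour window for an element stamped 12:05, first with no offset and then with a 5-minute offset.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical demo class; only the formula itself comes from the Flink source quoted above.
class WindowOffsetDemo {

    // Same arithmetic as TimeWindow.getWindowStartWithOffset in the listing above.
    static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        long size = Duration.ofHours(1).toMillis();

        // Assumed sample element: 12:05:00 UTC on the first day of the epoch.
        long timestamp = Instant.parse("1970-01-01T12:05:00Z").toEpochMilli();

        // Offset 0: the window is aligned to the epoch, so it starts at 12:00, not 12:05.
        long aligned = getWindowStartWithOffset(timestamp, 0L, size);
        System.out.println("offset 0:     [" + Instant.ofEpochMilli(aligned)
                + ", " + Instant.ofEpochMilli(aligned + size) + ")");

        // Offset of 5 minutes: every window boundary shifts by 5 minutes, so the start is 12:05.
        long offset = Duration.ofMinutes(5).toMillis();
        long shifted = getWindowStartWithOffset(timestamp, offset, size);
        System.out.println("offset 5 min: [" + Instant.ofEpochMilli(shifted)
                + ", " + Instant.ofEpochMilli(shifted + size) + ")");
    }
}
```

The window end is taken here as start + windowSize and treated as exclusive.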

Explanation

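  • timestamp: the element's timestamp, in milliseconds
  • offset: the offset, 0 by default, in milliseconds
  • windowSize: the window size, in milliseconds
  • WindowStartWithOffset: the window start time, in milliseconds

With the default offset of 0 and a non-negative timestamp, the formula simplifies as follows:

WindowStartWithOffset = timestamp - (timestamp - offset + windowSize) % windowSize
                      = timestamp - (timestamp - 0 + windowSize) % windowSize
                      = timestamp - (timestamp + windowSize) % windowSize
                      = timestamp - (timestamp % windowSize + windowSize % windowSize) % windowSize
                      = timestamp - (timestamp % windowSize + 0) % windowSize
                      = timestamp - timestamp % windowSize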
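
The simplification can be checked numerically. The sketch below is only an illustration: the class name WindowStartSimplification and its helper methods are hypothetical, and it assumes offset = 0 and non-negative timestamps, which is the case the derivation covers.

```java
// Hypothetical helper class for checking the derivation; not part of Flink.
class WindowStartSimplification {

    // The full formula from the Flink source quoted earlier.
    static long fullFormula(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    // The simplified form, valid for offset = 0 and timestamp >= 0.
    static long simplified(long timestamp, long windowSize) {
        return timestamp - timestamp % windowSize;
    }

    // Returns true if both forms agree for every timestamp in [0, maxTimestamp].
    static boolean agreeUpTo(long maxTimestamp, long windowSize) {
        for (long ts = 0; ts <= maxTimestamp; ts++) {
            if (fullFormula(ts, 0L, windowSize) != simplified(ts, windowSize)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Check a 15-second and a 1-minute window over the first 200 seconds of timestamps.
        System.out.println("15 s windows agree: " + agreeUpTo(200_000L, 15_000L));
        System.out.println("1 min windows agree: " + agreeUpTo(200_000L, 60_000L));
    }
}
```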

Example

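Suppose I want to compute, for each 1-minute period, the online status of all devices of a given type. The main idea is to key the stream by device type with keyBy() and then open a 1-minute event-time window.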

Assume the heartbeats occur at the times below. According to the algorithm, each heartbeat falls into the following window:

Heartbeat time             Window start time
2020-10-09 09:53:08,655    2020-10-09 09:53:00,000
2020-10-09 09:53:11,655    2020-10-09 09:53:00,000
2020-10-09 09:53:31,655    2020-10-09 09:53:00,000
2020-10-09 09:53:47,655    2020-10-09 09:53:00,000
2020-10-09 09:53:59,655    2020-10-09 09:53:00,000
2020-10-09 09:54:00,655    2020-10-09 09:54:00,000
2020-10-09 09:54:01,655    2020-10-09 09:54:00,000
2020-10-09 09:54:07,655    2020-10-09 09:54:00,000
2020-10-09 09:54:07,6552020-10-09 09:54:00,000

If the window size is changed to 15 seconds, the window start times change accordingly:

Heartbeat time             Window start time
2020-10-09 09:53:08,655    2020-10-09 09:53:00,000
2020-10-09 09:53:11,655    2020-10-09 09:53:00,000
2020-10-09 09:53:31,655    2020-10-09 09:53:30,000
2020-10-09 09:53:47,655    2020-10-09 09:53:45,000
2020-10-09 09:53:59,655    2020-10-09 09:53:45,000
2020-10-09 09:54:00,655    2020-10-09 09:54:00,000
2020-10-09 09:54:01,655    2020-10-09 09:54:00,000
2020-10-09 09:54:07,655    2020-10-09 09:54:00,000
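
The window start times for these heartbeats can be reproduced with a short program. This is a sketch under stated assumptions: the class HeartbeatWindows and its methods are hypothetical, the formula is copied from the source quoted earlier, and the heartbeat times are interpreted as UTC. For 1-minute and 15-second windows the choice of time zone should not change the result, since zone offsets are whole minutes.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class; only the window-start formula comes from Flink.
class HeartbeatWindows {

    // Matches the timestamp layout used in the tables, e.g. 2020-10-09 09:53:08,655.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    // Parses an event time (assumed UTC), applies the formula with offset 0, and formats the window start.
    static String windowStartOf(String eventTime, long windowSizeMillis) {
        long ts = LocalDateTime.parse(eventTime, FMT).toInstant(ZoneOffset.UTC).toEpochMilli();
        long start = getWindowStartWithOffset(ts, 0L, windowSizeMillis);
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(start), ZoneOffset.UTC).format(FMT);
    }

    public static void main(String[] args) {
        String[] heartbeats = {
            "2020-10-09 09:53:08,655", "2020-10-09 09:53:11,655",
            "2020-10-09 09:53:31,655", "2020-10-09 09:53:47,655",
            "2020-10-09 09:53:59,655", "2020-10-09 09:54:00,655",
            "2020-10-09 09:54:01,655", "2020-10-09 09:54:07,655"
        };
        for (String hb : heartbeats) {
            System.out.println(hb
                    + "  1 min -> " + windowStartOf(hb, 60_000L)
                    + "  15 s -> " + windowStartOf(hb, 15_000L));
        }
    }
}
```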

References

The official Flink documentation also describes window alignment; see "Time Windows are Aligned to the Epoch" in the Surprises section.

Official documentation