Grafana schedule design

The grafana schedule periodically evaluates alert rules.

Grafana has built a fair amount of machinery around this schedule.

A naive ticker might look like this:

func (s *schedule) Run() {
   // interval is the tick period, e.g. 10 * time.Second
   t := time.NewTicker(interval)
   defer t.Stop()
   for range t.C {
       processTick(time.Now())
   }
}

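It has one problem: if processTick takes too long, ticks produced by time.NewTicker may be lost, because a time.Ticker drops ticks for slow receivers.

A ticker that does not drop ticks

Grafana therefore implements its own ticker package to guarantee that no tick is lost: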


// Ticker is a ticker to power the alerting scheduler. it's like a time.Ticker, except:
// * it doesn't drop ticks for slow receivers, rather, it queues up.  so that callers are in control to instrument what's going on.
// * it ticks on interval marks or very shortly after. this provides a predictable load pattern
//   (this shouldn't cause too much load contention issues because the next steps in the pipeline just process at their own pace)
// * the timestamps are used to mark "last datapoint to query for" and as such, are a configurable amount of seconds in the past
type Ticker struct {
	C        chan time.Time
	clock    clock.Clock
	last     time.Time
	interval time.Duration
	metrics  *metrics.Ticker
	stopCh   chan struct{}
}

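A predictable ticker

With multiple grafana instances, each instance's goroutine is first scheduled at a random moment, so the first tick would land on an arbitrary nanosecond-precision timestamp.

For example, if the first tick would happen at 2022-06-06 17:15:27.9064497, it is instead aligned to baseInterval (10s here), that is, 2022-06-06 17:15:20.

This makes the question "why did my rule fire?" easier to answer.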

github.com/grafana/gra…

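Evenly spread load

By default the grafana schedule ticks every 10s and evaluates the rules that are due at that tick.

To avoid overloading the Prometheus server, the evaluations are spread evenly across that time range.

For example, with 100 rules and a 10s interval, consecutive rule evaluations start 100ms apart.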

	// 1. It is used by the internal scheduler's timer to tick at this interval.
	// 2. to spread evaluations of rules that need to be evaluated at the current tick T. In other words, the evaluation of rules at the tick T will be evenly spread in the interval from T to T+scheduler_tick_interval.
	//    For example, if there are 100 rules that need to be evaluated at tick T, and the base interval is 10s, rules will be evaluated every 100ms.
	// 3. It increases delay between rule updates and state reset.
	// NOTE:
	// 1. All alert rule intervals should be times of this interval. Otherwise, the rules will not be evaluated. It is not recommended to set it lower than 10s or odd numbers. Recommended: 10s, 30s, 1m
	// 2. The increasing of the interval will affect how slow alert rule updates will reset the state, and therefore reset notification. Higher the interval - slower propagation of the changes.

github.com/grafana/gra…

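Jitter: randomized execution

The scheme above still has a problem. Take a rule that is evaluated every 5m while the scheduler checks every 10s whether it is due.

That rule is always evaluated right at the start of minute 00, 05, 10 and so on, which still tends to put a high load on the database.

So each rule's evaluation interval is divided into n buckets, each one baseInterval wide.

For example, 5m / 10s gives 30 buckets, and each rule is assigned to one of these 30 buckets in a pseudo-random way.

Rules are thereby scattered across the evaluation interval, which reduces the peak load.

A given rule always lands in the same bucket, so the bucket is computed from a hash of the rule; this way different grafana instances also run it at the same time.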

github.com/grafana/gra…

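Removing the monotonic clock

While scheduling, some metrics have to be recorded.

The monotonic clock reading must be stripped at that point; otherwise the measured difference does not reflect the elapsed wall-clock time.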

func (sch *schedule) schedulePeriodic(ctx context.Context, t *ticker.T) error {
    dispatcherGroup, ctx := errgroup.WithContext(ctx)
    for {
       select {
       case tick := <-t.C:
          // We use Round(0) on the start time to remove the monotonic clock.
          // This is required as ticks from the ticker and time.Now() can have
          // a monotonic clock that when subtracted do not represent the delta
          // in wall clock time.
          start := time.Now().Round(0)
          sch.metrics.BehindSeconds.Set(start.Sub(tick).Seconds())

          sch.processTick(ctx, dispatcherGroup, tick)

          sch.metrics.SchedulePeriodicDuration.Observe(time.Since(start).Seconds())
       case <-ctx.Done():
          // waiting for all rule evaluation routines to stop
          waitErr := dispatcherGroup.Wait()
          return waitErr
       }
    }
}

github.com/grafana/gra…

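Implementation

In the schedule design, every tick does the following:

  • Compute the rule diff
  • Stop the old rules
  • Set up the new rules
  • For every rule that is due, evaluate it, passing in the tick's timestamp as the current time.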