Prometheus pulls data from each target via scrape and stores it in the TSDB.
targetScraper
For each target, data is pulled with an HTTP GET request; the context (ctx) enforces the HTTP timeout.
The X-Prometheus-Scrape-Timeout-Seconds header passes the scrape timeout to the target, so the client side can bound its own collection time.
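Below is a simplified sketch of that request path, not the actual targetScraper code; the URL, client, and error handling are illustrative. It shows the context deadline, the timeout hint header, and the gzip handling and pooled buffer described next.

package main

import (
	"compress/gzip"
	"context"
	"fmt"
	"io"
	"net/http"
	"os"
	"strconv"
	"time"
)

// scrapeOnce sketches one scrape: an HTTP GET bounded by a context deadline,
// with the timeout also advertised to the target via a header.
func scrapeOnce(ctx context.Context, url string, timeout time.Duration, w io.Writer) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Accept-Encoding", "gzip")
	// Tell the target how long it has, so it can bound its own collection.
	req.Header.Set("X-Prometheus-Scrape-Timeout-Seconds",
		strconv.FormatFloat(timeout.Seconds(), 'f', -1, 64))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	// Decompress if the target answered with gzip, then copy into the writer
	// (in Prometheus this is the pooled bytes.Buffer described below).
	var body io.Reader = resp.Body
	if resp.Header.Get("Content-Encoding") == "gzip" {
		gzr, err := gzip.NewReader(resp.Body)
		if err != nil {
			return err
		}
		defer gzr.Close()
		body = gzr
	}
	_, err = io.Copy(w, body)
	return err
}

func main() {
	if err := scrapeOnce(context.Background(), "http://localhost:9100/metrics", 10*time.Second, os.Stdout); err != nil {
		fmt.Println("scrape failed:", err)
	}
}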
The (typically gzip-compressed) response body is decompressed and written into a writer.
That writer is a bytes.Buffer wrapping a []byte slice.
A pool (built on sync.Pool) of []byte buffers in several size classes is maintained; lastScrapeSize, the size of the most recent scrape, is used to pick a buffer large enough for the next response.
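A minimal sketch of that buffer reuse, using only the standard library (Prometheus has its own util/pool package; bucketedPool here is a hypothetical simplification): one sync.Pool per size class, and lastScrapeSize picks the bucket.

package main

import (
	"bytes"
	"sync"
)

// bucketedPool is a hypothetical, simplified stand-in for Prometheus'
// bucketed buffer pool: one sync.Pool per power-of-two size class.
type bucketedPool struct {
	sizes []int
	pools []*sync.Pool
}

func newBucketedPool(minSize, maxSize int) *bucketedPool {
	p := &bucketedPool{}
	for s := minSize; s <= maxSize; s *= 2 {
		size := s
		p.sizes = append(p.sizes, size)
		p.pools = append(p.pools, &sync.Pool{
			New: func() interface{} { return make([]byte, 0, size) },
		})
	}
	return p
}

// get returns a []byte whose capacity is at least n (e.g. lastScrapeSize).
func (p *bucketedPool) get(n int) []byte {
	for i, s := range p.sizes {
		if n <= s {
			return p.pools[i].Get().([]byte)[:0]
		}
	}
	return make([]byte, 0, n) // bigger than the largest bucket
}

// put returns a buffer to the largest bucket it can still satisfy.
func (p *bucketedPool) put(b []byte) {
	for i := len(p.sizes) - 1; i >= 0; i-- {
		if cap(b) >= p.sizes[i] {
			p.pools[i].Put(b[:0])
			return
		}
	}
}

func main() {
	bufs := newBucketedPool(1<<10, 1<<20)
	lastScrapeSize := 64 * 1024

	// Wrap the pooled slice in a bytes.Buffer and hand it to the scraper
	// as the io.Writer target, as the scrape loop does.
	b := bufs.get(lastScrapeSize)
	buf := bytes.NewBuffer(b)
	buf.WriteString("# HELP up ...\n")

	bufs.put(buf.Bytes())
}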
fanoutstorage
fanoutstorage is made up of a primary storage plus secondary storages:
- Writes go to the primary first, then to each secondary in order (see the sketch after this list).
- Reads: a failure on the primary is returned immediately; the secondaries only count as failed when all of them fail.
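A rough sketch of that write path, with storage.Appender trimmed down to a hypothetical one-method interface and labels reduced to a plain string; it is the ordering that matters here, not the exact types.

package main

import "fmt"

// appender is a trimmed-down stand-in for Prometheus' storage.Appender;
// the label set is simplified to a string for this sketch.
type appender interface {
	Append(lset string, t int64, v float64) error
}

type printAppender struct{ name string }

func (p printAppender) Append(lset string, t int64, v float64) error {
	fmt.Printf("%s <- %s %d %g\n", p.name, lset, t, v)
	return nil
}

// fanoutAppender mirrors the described behaviour: write to the primary
// first, then to each secondary in order.
type fanoutAppender struct {
	primary     appender
	secondaries []appender
}

func (f *fanoutAppender) Append(lset string, t int64, v float64) error {
	if err := f.primary.Append(lset, t, v); err != nil {
		return err // a primary failure fails the whole write
	}
	for _, s := range f.secondaries {
		if err := s.Append(lset, t, v); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	f := &fanoutAppender{
		primary:     printAppender{"primary"},
		secondaries: []appender{printAppender{"remote-1"}, printAppender{"remote-2"}},
	}
	_ = f.Append(`up{job="prometheus"}`, 1700000000000, 1)
}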
scrapeloop
Driven by a ticker, the scrape loop scrapes its target once per interval:
At each tick we take a scrape time via time.Now().
The scrape time is mainly used as the timestamp of the series appended in this scrape (samples without an explicit timestamp in the exposition, and the staleness markers).
However, Go's ticker is not necessarily precise; its ticks can jitter.
Irregular timestamps compress worse in the TSDB, which increases disk usage. The scrape time is therefore aligned so that consecutive timestamps stay as regular as possible:
// Temporary workaround for a jitter in go timers that causes disk space
// increase in TSDB.
// See https://github.com/prometheus/prometheus/issues/7846
// Calling Round ensures the time used is the wall clock, as otherwise .Sub
// and .Add on time.Time behave differently (see time package docs).
scrapeTime := time.Now().Round(0)
if AlignScrapeTimestamps && sl.interval > 100*ScrapeTimestampTolerance {
	// For some reason, a tick might have been skipped, in which case we
	// would call alignedScrapeTime.Add(interval) multiple times.
	for scrapeTime.Sub(alignedScrapeTime) >= sl.interval {
		alignedScrapeTime = alignedScrapeTime.Add(sl.interval)
	}
	// Align the scrape time if we are in the tolerance boundaries.
	if scrapeTime.Sub(alignedScrapeTime) <= ScrapeTimestampTolerance {
		scrapeTime = alignedScrapeTime
	}
}
When scheduling the ticker, an offset within the interval is derived from a seed, so that scrapes are spread out rather than all firing at once.
The seed is unique per Prometheus instance; the seed and the interval (together with the target's hash) are combined to compute each target's offset.
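A hedged sketch of how such an offset can be derived, loosely modelled on Target.offset; the FNV hash of the target's labels and the constant seed here are illustrative, not the exact Prometheus code.

package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// offset computes when, within the scrape interval, a target's first scrape
// should happen. Targets (and instances, via the seed) with different hashes
// land at different points of the interval.
func offset(now time.Time, interval time.Duration, targetHash, seed uint64) time.Duration {
	base := int64(interval) - now.UnixNano()%int64(interval)
	off := (targetHash ^ seed) % uint64(interval)

	next := base + int64(off)
	if next > int64(interval) {
		next -= int64(interval)
	}
	return time.Duration(next)
}

func main() {
	h := fnv.New64a()
	h.Write([]byte(`{job="node-exporter", instance="10.0.0.1:9100"}`))

	d := offset(time.Now(), 30*time.Second, h.Sum64(), 0xdeadbeef /* per-instance seed */)
	fmt.Println("first scrape in", d)
}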
Different targets within a single Prometheus instance
Within a single instance, different targets are not all scraped at the same moment; their scrape times are staggered. For example, with scrape_interval=30s, the targets might be scraped at:
- job=prometheus: scraped at :20 and :50 of each minute;
- job=node-exporter: scraped at :04 and :34;
- job=pvc-test: scraped at :07 and :37.
Multiple Prometheus instances scraping the same target
Because the seed differs per Prometheus instance, multiple instances scraping the same target will also hit it at different offsets within the interval rather than all at once.
For the scraped []byte payload, the scrape loop runs mime.ParseMediaType over the Content-Type to determine the exposition format, then uses the corresponding parser to decode the data. As an example of what ParseMediaType produces (taken from the Go mime package tests), a media type string is split into its type and its parameters:
{`form-data; name="file"; filename="C:\dev\go\robots.txt"`, "form-data", m("name", "file", "filename", `C:\dev\go\robots.txt`)},
The parser is an iterator: entries are pulled out by repeatedly calling Next() (see the sketch after this list).
When a series entry is encountered, check whether it is already in the cache:
- If the series already exists, take its label set (and storage reference) from the cache and append the sample through the appender.
- If the series does not exist yet, parse the label set out of the metric string, append it to get a new storage reference, and add the entry to the cache.
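A sketch of that Next()-driven loop, written against the textparse.Parser interface (import paths are from the 2.2x source tree this walkthrough quotes; newer releases moved these packages under model/); the cache and appender interactions are left as comments.

package scrapesketch

import (
	"fmt"
	"io"

	"github.com/prometheus/prometheus/pkg/labels"    // model/labels in newer releases
	"github.com/prometheus/prometheus/pkg/textparse" // model/textparse in newer releases
)

// drain iterates the parser until EOF. The parser itself was chosen from the
// scrape's Content-Type, as described above.
func drain(p textparse.Parser) error {
	for {
		et, err := p.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		switch et {
		case textparse.EntrySeries:
			met, ts, v := p.Series() // raw metric string, optional timestamp, value
			var lset labels.Labels
			p.Metric(&lset) // decode the label set (only done on a cache miss in the real loop)
			_ = ts          // nil unless the exposition carried an explicit timestamp
			fmt.Println(string(met), lset, v)
			// Real loop: look met up in scrapeCache; on a hit append with the
			// cached ref, on a miss append with ref 0 and cache the new ref.
		case textparse.EntryType, textparse.EntryHelp, textparse.EntryComment:
			// Metadata and comments feed the metadata cache; they are not appended.
		}
	}
}

The scrapeCache the loop consults is defined as follows: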
// scrapeCache tracks mappings of exposed metric strings to label sets and
// storage references. Additionally, it tracks staleness of series between
// scrapes.
type scrapeCache struct {
	iter uint64 // Current scrape iteration.

	// How many series and metadata entries there were at the last success.
	successfulCount int

	// Parsed string to an entry with information about the actual label set
	// and its storage reference.
	series map[string]*cacheEntry

	// Cache of dropped metric strings and their iteration. The iteration must
	// be a pointer so we can update it without setting a new entry with an unsafe
	// string in addDropped().
	droppedSeries map[string]*uint64

	// seriesCur and seriesPrev store the labels of series that were seen
	// in the current and previous scrape.
	// We hold two maps and swap them out to save allocations.
	seriesCur  map[uint64]labels.Labels
	seriesPrev map[uint64]labels.Labels

	metaMtx  sync.Mutex
	metadata map[string]*metaEntry
}
scrapeCache maintains seriesCur and seriesPrev, holding the series seen in the current scrape and the previous scrape respectively.
For series that appeared in the previous scrape but not in the current one, a special value (a staleness marker) is appended to record that the series is no longer exposed.
This makes it possible to distinguish a failed scrape from a scrape that succeeded but no longer returned a given series:
sl.cache.forEachStale(func(lset labels.Labels) bool {
	// Series no longer exposed, mark it stale.
	_, err = app.Append(0, lset, defTime, math.Float64frombits(value.StaleNaN))
	switch errors.Cause(err) {
	case storage.ErrOutOfOrderSample, storage.ErrDuplicateSampleForTimestamp:
		// Do not count these in logging, as this is expected if a target
		// goes away and comes back again with a new scrape loop.
		err = nil
	}
	return err == nil
})