[Prometheus] 04: Prometheus Eureka Service Discovery Made Easy


Prometheus Service Discovery Mechanisms: Eureka

Overview

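The Eureka service discovery protocol lets Prometheus retrieve the targets it should monitor through the Eureka REST API. Prometheus calls the Eureka REST API periodically and creates one target for each application instance.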

Eureka service discovery exposes the following meta labels for relabeling:

  • __meta_eureka_app_name: the name of the app
  • __meta_eureka_app_instance_id: the ID of the app instance
  • __meta_eureka_app_instance_hostname: the hostname of the instance
  • __meta_eureka_app_instance_homepage_url: the homepage url of the app instance
  • __meta_eureka_app_instance_statuspage_url: the status page url of the app instance
  • __meta_eureka_app_instance_healthcheck_url: the health check url of the app instance
  • __meta_eureka_app_instance_ip_addr: the IP address of the app instance
  • __meta_eureka_app_instance_vip_address: the VIP address of the app instance
  • __meta_eureka_app_instance_secure_vip_address: the secure VIP address of the app instance
  • __meta_eureka_app_instance_status: the status of the app instance
  • __meta_eureka_app_instance_port: the port of the app instance
  • __meta_eureka_app_instance_port_enabled: whether the port of the app instance is enabled
  • __meta_eureka_app_instance_secure_port: the secure port of the app instance
  • __meta_eureka_app_instance_secure_port_enabled: whether the secure port of the app instance is enabled
  • __meta_eureka_app_instance_country_id: the country ID of the app instance
  • __meta_eureka_app_instance_metadata_<metadataname>: app instance metadata
  • __meta_eureka_app_instance_datacenterinfo_name: the datacenter name of the app instance
  • __meta_eureka_app_instance_datacenterinfo_<metadataname>: the datacenter metadata

The configuration options for eureka_sd_configs are as follows:

# The URL to connect to the Eureka server.
server: <string>

# Sets the `Authorization` header on every request with the
# configured username and password.
# password and password_file are mutually exclusive.
basic_auth:
  [ username: <string> ]
  [ password: <secret> ]
  [ password_file: <string> ]

# Optional `Authorization` header configuration.
authorization:
  # Sets the authentication type.
  [ type: <string> | default: Bearer ]
  # Sets the credentials. It is mutually exclusive with
  # `credentials_file`.
  [ credentials: <secret> ]
  # Sets the credentials to the credentials read from the configured file.
  # It is mutually exclusive with `credentials`.
  [ credentials_file: <filename> ]

# Optional OAuth 2.0 configuration.
# Cannot be used at the same time as basic_auth or authorization.
oauth2:
  [ <oauth2> ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# Configure whether HTTP requests follow HTTP 3xx redirects.
[ follow_redirects: <bool> | default = true ]

# Refresh interval to re-read the app instance list.
[ refresh_interval: <duration> | default = 30s ]

Protocol Analysis

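Building on the earlier analysis of how Prometheus service discovery works and of the file-based service discovery implementation, Eureka service discovery works roughly as shown in the figure below: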

image-20210819234601299.png

The job that uses eureka_sd_configs is parsed into a Config, the NewDiscovery method then creates the corresponding Discoverer, and finally Discoverer.Run() is called to start discovering targets.

1. Parsing the Eureka service discovery configuration

Suppose we define the following job:

- job_name: 'eureka'
  eureka_sd_configs:
    - server: http://localhost:8761/eureka
    

It is parsed into the following eureka.SDConfig:

image-20210819001313270.png

eureka.SDConfig is defined as follows:

type SDConfig struct {
	// Address of the eureka-server
	Server           string                  `yaml:"server,omitempty"`
	// HTTP client configuration for the requests, e.g. authentication settings
	HTTPClientConfig config.HTTPClientConfig `yaml:",inline"`
	// Periodic refresh interval, 30s by default
	RefreshInterval  model.Duration          `yaml:"refresh_interval,omitempty"`
}

2. Creating the Discovery

func NewDiscovery(conf *SDConfig, logger log.Logger) (*Discovery, error) {
	rt, err := config.NewRoundTripperFromConfig(conf.HTTPClientConfig, "eureka_sd", config.WithHTTP2Disabled())
	if err != nil {
		return nil, err
	}

	d := &Discovery{
		client: &http.Client{Transport: rt},
		server: conf.Server,
	}
	d.Discovery = refresh.NewDiscovery(
		logger,
		"eureka",
		time.Duration(conf.RefreshInterval),
		d.refresh,
	)
	return d, nil
}

3. Once the Discovery has been created, Discovery.Run() is called to start service discovery:

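As with the file-based mechanism analyzed in the previous article, Run first executes tgs, err := d.refresh(ctx), then creates a periodic ticker that keeps executing tgs, err := d.refresh(ctx), and passes the resulting targets out through a channel.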

4. The core of the Run method above is the call to d.refresh(ctx), which fetches the targets. This is where the Eureka discovery protocol is actually implemented:

func (d *Discovery) refresh(ctx context.Context) ([]*targetgroup.Group, error) {
	// Pull registry metadata from Eureka through its REST API: http://ip:port/eureka/apps
	apps, err := fetchApps(ctx, d.server, d.client)
	if err != nil {
		return nil, err
	}

	tg := &targetgroup.Group{
		Source: "eureka",
	}

	for _, app := range apps.Applications { // iterate over the registered apps
		// targetsForApp() converts each instance of the app into a target
		targets := targetsForApp(&app)
		// append the targets to the target group
		tg.Targets = append(tg.Targets, targets...)
	}
	return []*targetgroup.Group{tg}, nil
}

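The refresh method has two main steps:

1. fetchApps(): pulls the registered service information from the /eureka/apps endpoint of eureka-server;

2. targetsForApp(): iterates over the instances of each app, turns every instance into a target, and attaches a set of meta labels.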

The following is the registry information pulled from the /eureka/apps endpoint of eureka-server:

<applications>
    <versions__delta>1</versions__delta>
    <apps__hashcode>UP_1_</apps__hashcode>
    <application>
        <name>SERVICE-PROVIDER-01</name>
        <instance>
            <instanceId>localhost:service-provider-01:8001</instanceId>
            <hostName>192.168.3.121</hostName>
            <app>SERVICE-PROVIDER-01</app>
            <ipAddr>192.168.3.121</ipAddr>
            <status>UP</status>
            <overriddenstatus>UNKNOWN</overriddenstatus>
            <port enabled="true">8001</port>
            <securePort enabled="false">443</securePort>
            <countryId>1</countryId>
            <dataCenterInfo class="com.netflix.appinfo.InstanceInfo$DefaultDataCenterInfo">
                <name>MyOwn</name>
            </dataCenterInfo>
            <leaseInfo>
                <renewalIntervalInSecs>30</renewalIntervalInSecs>
                <durationInSecs>90</durationInSecs>
                <registrationTimestamp>1629385562130</registrationTimestamp>
                <lastRenewalTimestamp>1629385682050</lastRenewalTimestamp>
                <evictionTimestamp>0</evictionTimestamp>
                <serviceUpTimestamp>1629385562132</serviceUpTimestamp>
            </leaseInfo>
            <metadata>
                <management.port>8001</management.port>
                <scrape__enable>true</scrape__enable>
                <scrape.port>8080</scrape.port>
            </metadata>
            <homePageUrl>http://192.168.3.121:8001/</homePageUrl>
            <statusPageUrl>http://192.168.3.121:8001/actuator/info</statusPageUrl>
            <healthCheckUrl>http://192.168.3.121:8001/actuator/health</healthCheckUrl>
            <vipAddress>service-provider-01</vipAddress>
            <secureVipAddress>service-provider-01</secureVipAddress>
            <isCoordinatingDiscoveryServer>false</isCoordinatingDiscoveryServer>
            <lastUpdatedTimestamp>1629385562132</lastUpdatedTimestamp>
            <lastDirtyTimestamp>1629385562039</lastDirtyTimestamp>
            <actionType>ADDED</actionType>
        </instance>
    </application>
</applications>
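
To show how a payload like this can be mapped onto Go types, here is a small sketch using encoding/xml. The type names are hypothetical and cover only a few fields. The interesting part is the metadata element, whose child tags are arbitrary, so the sketch collects them with the `,any` option and keeps each tag name in an xml.Name field.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// port captures both the element text and its "enabled" attribute.
type port struct {
	Port    int  `xml:",chardata"`
	Enabled bool `xml:"enabled,attr"`
}

// tag holds one arbitrary metadata element: its name and its text.
type tag struct {
	XMLName xml.Name
	Content string `xml:",chardata"`
}

// metadata collects every child element, whatever its name is.
type metadata struct {
	Items []tag `xml:",any"`
}

// instance is a reduced, hypothetical view of an <instance> element.
type instance struct {
	HostName string    `xml:"hostName"`
	Port     *port     `xml:"port"`
	Metadata *metadata `xml:"metadata"`
}

// sample is a shortened copy of the instance shown above.
const sample = `<instance>
  <hostName>192.168.3.121</hostName>
  <port enabled="true">8001</port>
  <metadata>
    <management.port>8001</management.port>
    <scrape__enable>true</scrape__enable>
    <scrape.port>8080</scrape.port>
  </metadata>
</instance>`

// decodeInstance parses a single <instance> document.
func decodeInstance(doc string) (instance, error) {
	var in instance
	err := xml.Unmarshal([]byte(doc), &in)
	return in, err
}

func main() {
	in, err := decodeInstance(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(in.HostName, in.Port.Port, in.Port.Enabled)
	for _, m := range in.Metadata.Items {
		fmt.Println(m.XMLName.Local, "=", m.Content)
	}
}
```

Using pointers for port and metadata keeps "element absent" distinguishable from "element present but empty", which is the same distinction the nil checks on t.Port and t.Metadata rely on.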

5. Converting instance information into targets

func targetsForApp(app *Application) []model.LabelSet {
	targets := make([]model.LabelSet, 0, len(app.Instances))

	// Gather info about the app's 'instances'. Each instance is considered a task.
	for _, t := range app.Instances {
		var targetAddress string
		// __address__ is built from instance.hostname and port; without a port it falls back to port 80
		if t.Port != nil {
			targetAddress = net.JoinHostPort(t.HostName, strconv.Itoa(t.Port.Port))
		} else {
			targetAddress = net.JoinHostPort(t.HostName, "80")
		}

		target := model.LabelSet{
			model.AddressLabel:  lv(targetAddress),
			model.InstanceLabel: lv(t.InstanceID),

			appNameLabel:                     lv(app.Name),
			appInstanceHostNameLabel:         lv(t.HostName),
			appInstanceHomePageURLLabel:      lv(t.HomePageURL),
			appInstanceStatusPageURLLabel:    lv(t.StatusPageURL),
			appInstanceHealthCheckURLLabel:   lv(t.HealthCheckURL),
			appInstanceIPAddrLabel:           lv(t.IPAddr),
			appInstanceVipAddressLabel:       lv(t.VipAddress),
			appInstanceSecureVipAddressLabel: lv(t.SecureVipAddress),
			appInstanceStatusLabel:           lv(t.Status),
			appInstanceCountryIDLabel:        lv(strconv.Itoa(t.CountryID)),
			appInstanceIDLabel:               lv(t.InstanceID),
		}

		if t.Port != nil {
			target[appInstancePortLabel] = lv(strconv.Itoa(t.Port.Port))
			target[appInstancePortEnabledLabel] = lv(strconv.FormatBool(t.Port.Enabled))
		}

		if t.SecurePort != nil {
			target[appInstanceSecurePortLabel] = lv(strconv.Itoa(t.SecurePort.Port))
			target[appInstanceSecurePortEnabledLabel] = lv(strconv.FormatBool(t.SecurePort.Enabled))
		}

		if t.DataCenterInfo != nil {
			target[appInstanceDataCenterInfoNameLabel] = lv(t.DataCenterInfo.Name)

			if t.DataCenterInfo.Metadata != nil {
				for _, m := range t.DataCenterInfo.Metadata.Items {
					ln := strutil.SanitizeLabelName(m.XMLName.Local)
					target[model.LabelName(appInstanceDataCenterInfoMetadataPrefix+ln)] = lv(m.Content)
				}
			}
		}

		if t.Metadata != nil {
			for _, m := range t.Metadata.Items {
				// Prometheus label names only allow [a-zA-Z0-9_]; any other character is replaced with an underscore
				ln := strutil.SanitizeLabelName(m.XMLName.Local)
				target[model.LabelName(appInstanceMetadataPrefix+ln)] = lv(m.Content)
			}
		}

		targets = append(targets, target)

	}
	return targets
}

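The conversion is straightforward, so we will not walk through it further. The resulting labels are shown in the figure below: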

image-20210819231535696.png

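Three points about these labels are worth calling out:

1. __address__: built from instance.hostname and port (default 80), so make sure the hostname registered in Eureka is correct, otherwise scraping may fail;

2. metadata-map entries are converted into labels of the form __meta_eureka_app_instance_metadata_<metadataname>. Relabeling in Prometheus usually works on metadata-map, for example to customize metrics_path or the scrape port;

3. Prometheus label names only allow [a-zA-Z0-9_]; any other character is replaced with an underscore, see strutil.SanitizeLabelName(m.XMLName.Local). Note, however, that when a metadata-map key contains an underscore, it shows up with a double underscore once registered in eureka-server, as with the following configuration: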

eureka:
  instance:
    metadata-map:
      scrape_enable: true
      scrape.port: 8080

Fetching /eureka/apps then returns:

image-20210820001740801.png
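
The combined effect of the double underscore and the sanitization step can be checked with a short sketch. It assumes the replacement rule described above (every character outside [a-zA-Z0-9_] becomes an underscore). The metadataLabel helper is a hypothetical name for this example, not a Prometheus function.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidLabelChar matches every character that is not allowed in a
// Prometheus label name.
var invalidLabelChar = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// metadataPrefix is the prefix of the metadata meta labels listed earlier.
const metadataPrefix = "__meta_eureka_app_instance_metadata_"

// metadataLabel builds the meta label name for one metadata tag name
// as it appears in the /eureka/apps response.
func metadataLabel(key string) string {
	return metadataPrefix + invalidLabelChar.ReplaceAllString(key, "_")
}

func main() {
	// Tag names exactly as they appear in the sample response.
	for _, k := range []string{"management.port", "scrape__enable", "scrape.port"} {
		fmt.Println(k, "->", metadataLabel(k))
	}
}
```

With these inputs, scrape.port ends up as a label ending in scrape_port, while scrape_enable from the configuration ends up as a label ending in scrape__enable, so relabeling rules have to reference the double-underscore form.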

Summary

The principle behind Eureka-based service discovery is shown in the figure below:

image-20210819234601299.png

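In brief: once started, the Discoverer periodically pulls the registered service metadata from the /eureka/apps endpoint of the eureka server. targetsForApp then iterates over the instances of each app, turns every instance into a target, and converts the remaining metadata into target meta labels that can be used for relabeling before the target is scraped.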
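
As an illustration of what relabeling on these meta labels can achieve, the sketch below emulates, in plain Go, a single rule with the replace action that swaps the port in __address__ for the value of the scrape.port metadata entry from the sample above. The replaceRule type and its methods are hypothetical and heavily reduced. They follow the documented relabeling behaviour (source label values joined with ";", a fully anchored regex, capture-group expansion) but are not the Prometheus relabel package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// replaceRule is a hypothetical, reduced model of a relabel rule
// with action "replace".
type replaceRule struct {
	SourceLabels []string
	Regex        *regexp.Regexp
	TargetLabel  string
	Replacement  string
}

// newReplaceRule anchors the pattern so it must match the whole value.
func newReplaceRule(src []string, pattern, target, replacement string) replaceRule {
	return replaceRule{
		SourceLabels: src,
		Regex:        regexp.MustCompile("^(?:" + pattern + ")$"),
		TargetLabel:  target,
		Replacement:  replacement,
	}
}

// apply joins the source label values with ";" and, if the regex
// matches, writes the expanded replacement into the target label.
func (r replaceRule) apply(labels map[string]string) {
	vals := make([]string, 0, len(r.SourceLabels))
	for _, name := range r.SourceLabels {
		vals = append(vals, labels[name])
	}
	joined := strings.Join(vals, ";")
	idx := r.Regex.FindStringSubmatchIndex(joined)
	if idx == nil {
		return // no match: the labels stay untouched
	}
	labels[r.TargetLabel] = string(r.Regex.ExpandString(nil, r.Replacement, joined, idx))
}

func main() {
	labels := map[string]string{
		"__address__": "192.168.3.121:8001",
		"__meta_eureka_app_instance_metadata_scrape_port": "8080",
	}
	rule := newReplaceRule(
		[]string{"__address__", "__meta_eureka_app_instance_metadata_scrape_port"},
		`([^:]+)(?::\d+)?;(\d+)`,
		"__address__",
		"$1:$2",
	)
	rule.apply(labels)
	fmt.Println(labels["__address__"])
}
```

Because the pattern requires at least one digit after the ";", instances that do not publish a scrape.port entry keep their original address.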
