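In a distributed system, a single operation often involves calls across multiple systems and processes, which makes problems hard to trace and diagnose. Jaeger provides a service that helps us trace how a call propagates end to end across multiple services and systems.
Trace and span concepts
To learn Jaeger, you first need to understand the concepts of trace and span.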
trace:
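Broadly speaking, a trace represents the entire execution of a transaction or workflow through a (distributed) system; it is the view of the whole call chain. A trace is a directed acyclic graph (DAG) of spans, where each span represents a named, timed, contiguous segment of execution within the trace.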
span:
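A span represents a logical unit of work in the system that has a start time and a duration. Spans establish logical causal relationships through nesting or sequential ordering. Each span is a view of what happens inside one service along the call chain, and all spans combined form the view of the whole trace.
See the referenced article for a more detailed explanation.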
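To make the relationship between a trace and its spans concrete, here is a minimal sketch in plain Go. The `Span` struct, `renderTrace`, and `exampleTrace` are hypothetical names invented for illustration; Jaeger's real data model carries more information (tags, logs, references, process details). The sketch only shows how parent/child links turn a flat list of timed spans into one trace view.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Span is a simplified, hypothetical model of a span (not Jaeger's real type).
type Span struct {
	TraceID   string
	SpanID    string
	ParentID  string // empty for the root span
	Operation string
	Start     time.Duration // offset from the start of the trace
	Duration  time.Duration
}

// renderTrace prints the spans of one trace as an indented tree,
// following parent/child links from the root span downwards.
func renderTrace(spans []Span) string {
	children := make(map[string][]Span)
	for _, s := range spans {
		children[s.ParentID] = append(children[s.ParentID], s)
	}
	var b strings.Builder
	var walk func(parentID string, depth int)
	walk = func(parentID string, depth int) {
		for _, s := range children[parentID] {
			fmt.Fprintf(&b, "%s%s start=%v duration=%v\n",
				strings.Repeat("  ", depth), s.Operation, s.Start, s.Duration)
			walk(s.SpanID, depth+1)
		}
	}
	walk("", 0)
	return b.String()
}

// exampleTrace returns one root span with two child spans.
func exampleTrace() []Span {
	return []Span{
		{TraceID: "t1", SpanID: "a", Operation: "say-hello", Start: 0, Duration: 30 * time.Millisecond},
		{TraceID: "t1", SpanID: "b", ParentID: "a", Operation: "formatString", Start: 1 * time.Millisecond, Duration: 10 * time.Millisecond},
		{TraceID: "t1", SpanID: "c", ParentID: "a", Operation: "printHello", Start: 12 * time.Millisecond, Duration: 15 * time.Millisecond},
	}
}

func main() {
	fmt.Print(renderTrace(exampleTrace()))
}
```

The two child spans are sequential siblings nested under the root, which covers both ways of expressing causality mentioned above: nesting and ordering.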
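Learning Jaeger through practice
Jaeger is a distributed tracing product developed by Uber. It is written in Go and consists mainly of jaeger-client, jaeger-agent, jaeger-collector, storage, and jaeger-query.
The basic flow is: the client reports spans to the agent, the collector gathers data from the agents (both push and pull modes exist), and the results are shown in the front-end web UI.
Let's start with the simplest possible example to get into end-to-end tracing with Jaeger.
1. Deploy a Jaeger instance
Start an all-in-one instance with docker (suitable for learning, not for direct production use):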
docker run -d -p 5775:5775/udp -p 16686:16686 -p 14250:14250 -p 14268:14268 jaegertracing/all-in-one:latest
Open http://localhost:16686/search to view the web UI.
2. Create a trace and report data
package main

import (
	"fmt"
	"io"

	"github.com/opentracing/opentracing-go"
	"github.com/opentracing/opentracing-go/log"
	"github.com/uber/jaeger-client-go"
	"github.com/uber/jaeger-client-go/config"
)

// Init creates a Jaeger tracer for the given service name.
func Init(service string) (opentracing.Tracer, io.Closer) {
	// Tracer configuration
	cfg := &config.Configuration{
		ServiceName: service,
		// A const sampler with Param 1 samples every trace
		Sampler: &config.SamplerConfig{
			Type:  jaeger.SamplerTypeConst,
			Param: 1,
		},
		Reporter: &config.ReporterConfig{
			LogSpans: true,
			// Collector endpoint; adjust the host to your own IP
			CollectorEndpoint: "http://127.0.0.1:14268/api/traces",
		},
	}
	// Create a tracer from the configuration above
	tracer, closer, err := cfg.NewTracer(config.Logger(jaeger.StdLogger))
	if err != nil {
		panic(fmt.Sprintf("ERROR: cannot init Jaeger: %v\n", err))
	}
	return tracer, closer
}

func main() {
	tracer, closer := Init("hello-world")
	helloTo := "rookie in jaeger"
	defer closer.Close()
	// Create a span and set a tag on it
	span := tracer.StartSpan("say-hello")
	span.SetTag("hello-to", helloTo)
	helloStr := fmt.Sprintf("Hello, %s!", helloTo)
	// Both LogFields and LogKV can attach logs to the span
	span.LogFields(
		log.String("event", "string-format"),
		log.String("value", helloStr),
	)
	println(helloStr)
	span.LogKV("event", "println")
	span.Finish()
}
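This code simply initializes a tracer and creates a trace containing a single span. After running it, you can see the trace in the web UI.
This demo is only the simplest call chain with a single span, which is not very meaningful on its own; Jaeger's strength shows when there are multiple spans.
Next, let's create a few more spans by modifying main and adding two functions (this also requires adding "context" to the imports).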
func main() {
	tracer, closer := Init("hello-world")
	helloTo := "rookie in jaeger"
	defer closer.Close()
	opentracing.SetGlobalTracer(tracer)
	span := tracer.StartSpan("say-hello")
	span.SetTag("hello-to", helloTo)
	ctx := opentracing.ContextWithSpan(context.Background(), span)
	helloStr := formatString(ctx, helloTo)
	printHello(ctx, helloStr)
	span.Finish()
}

func formatString(ctx context.Context, helloTo string) string {
	span, _ := opentracing.StartSpanFromContext(ctx, "formatString")
	defer span.Finish()
	helloStr := fmt.Sprintf("Hello, %s!", helloTo)
	span.LogFields(
		log.String("event", "string-format"),
		log.String("value", helloStr),
	)
	return helloStr
}

func printHello(ctx context.Context, helloStr string) {
	span, _ := opentracing.StartSpanFromContext(ctx, "printHello")
	defer span.Finish()
	println(helloStr)
	span.LogKV("event", "println")
}
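say-hello is the root span.
ContextWithSpan creates a context that carries the span, and this context is passed to the two functions formatString and printHello. StartSpanFromContext then creates a new span from ctx; the new span is a child of the span stored in the context.
This simulates a call chain within a single process.
Next, let's simulate a call chain across multiple processes: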
client/client.go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/url"

	"github.com/opentracing/opentracing-go"
	"github.com/opentracing/opentracing-go/ext"
	"github.com/opentracing/opentracing-go/log"
	"github.com/uber/jaeger-client-go"
	"github.com/uber/jaeger-client-go/config"
)

func Init(service string) (opentracing.Tracer, io.Closer) {
	cfg := &config.Configuration{
		ServiceName: service,
		Sampler: &config.SamplerConfig{
			Type:  jaeger.SamplerTypeConst,
			Param: 1,
		},
		Reporter: &config.ReporterConfig{
			LogSpans:          true,
			CollectorEndpoint: "http://127.0.0.1:14268/api/traces",
		},
	}
	tracer, closer, err := cfg.NewTracer(config.Logger(jaeger.StdLogger))
	if err != nil {
		panic(fmt.Sprintf("ERROR: cannot init Jaeger: %v\n", err))
	}
	return tracer, closer
}

func main() {
	tracer, closer := Init("hello-world")
	helloTo := "rookie in jaeger"
	defer closer.Close()
	opentracing.SetGlobalTracer(tracer)
	span := tracer.StartSpan("say-hello")
	span.SetTag("hello-to", helloTo)
	ctx := opentracing.ContextWithSpan(context.Background(), span)
	helloStr := formatString(ctx, helloTo)
	printHello(ctx, helloStr)
	span.Finish()
}

func formatString(ctx context.Context, helloTo string) string {
	span, _ := opentracing.StartSpanFromContext(ctx, "formatString")
	defer span.Finish()
	v := url.Values{}
	v.Set("helloTo", helloTo)
	reqURL := "http://localhost:8081/format?" + v.Encode()
	req, err := http.NewRequest("GET", reqURL, nil)
	if err != nil {
		panic(err.Error())
	}
	ext.SpanKindRPCClient.Set(span)
	ext.HTTPUrl.Set(span, reqURL)
	ext.HTTPMethod.Set(span, "GET")
	// Inject the span context into the outgoing HTTP headers
	span.Tracer().Inject(
		span.Context(),
		opentracing.HTTPHeaders,
		opentracing.HTTPHeadersCarrier(req.Header),
	)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		ext.LogError(span, err)
		panic(err.Error())
	}
	defer resp.Body.Close()
	// Read the formatted string from the response body
	// (marshalling the *http.Response itself would not yield the body)
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		ext.LogError(span, err)
		panic(err.Error())
	}
	helloStr := string(body)
	span.LogFields(
		log.String("event", "string-format"),
		log.String("value", helloStr),
	)
	return helloStr
}

func printHello(ctx context.Context, helloStr string) {
	span, _ := opentracing.StartSpanFromContext(ctx, "printHello")
	defer span.Finish()
	v := url.Values{}
	v.Set("helloStr", helloStr)
	reqURL := "http://localhost:8082/publish?" + v.Encode()
	req, err := http.NewRequest("GET", reqURL, nil)
	if err != nil {
		panic(err.Error())
	}
	ext.SpanKindRPCClient.Set(span)
	ext.HTTPUrl.Set(span, reqURL)
	ext.HTTPMethod.Set(span, "GET")
	span.Tracer().Inject(span.Context(), opentracing.HTTPHeaders, opentracing.HTTPHeadersCarrier(req.Header))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		ext.LogError(span, err)
		panic(err.Error())
	}
	resp.Body.Close()
}
Foam/formatter.go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"

	"github.com/opentracing/opentracing-go"
	"github.com/opentracing/opentracing-go/ext"
	otlog "github.com/opentracing/opentracing-go/log"
	"github.com/uber/jaeger-client-go"
	"github.com/uber/jaeger-client-go/config"
)

func Init(service string) (opentracing.Tracer, io.Closer) {
	cfg := &config.Configuration{
		ServiceName: service,
		Sampler: &config.SamplerConfig{
			Type:  jaeger.SamplerTypeConst,
			Param: 1,
		},
		Reporter: &config.ReporterConfig{
			LogSpans:          true,
			CollectorEndpoint: "http://127.0.0.1:14268/api/traces",
		},
	}
	tracer, closer, err := cfg.NewTracer(config.Logger(jaeger.StdLogger))
	if err != nil {
		panic(fmt.Sprintf("ERROR: cannot init Jaeger: %v\n", err))
	}
	return tracer, closer
}

func main() {
	tracer, closer := Init("formatter")
	defer closer.Close()
	http.HandleFunc("/format", func(w http.ResponseWriter, r *http.Request) {
		spanCtx, _ := tracer.Extract(opentracing.HTTPHeaders, opentracing.HTTPHeadersCarrier(r.Header))
		span := tracer.StartSpan("format", ext.RPCServerOption(spanCtx))
		defer span.Finish()
		helloTo := r.FormValue("helloTo")
		helloStr := fmt.Sprintf("Hello, %s!", helloTo)
		span.LogFields(
			otlog.String("event", "string-format"),
			otlog.String("value", helloStr),
		)
		w.Write([]byte(helloStr))
	})
	log.Fatal(http.ListenAndServe(":8081", nil))
}
Publish/publisher.go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"

	"github.com/opentracing/opentracing-go"
	"github.com/opentracing/opentracing-go/ext"
	"github.com/uber/jaeger-client-go"
	"github.com/uber/jaeger-client-go/config"
)

func Init(service string) (opentracing.Tracer, io.Closer) {
	cfg := &config.Configuration{
		ServiceName: service,
		Sampler: &config.SamplerConfig{
			Type:  jaeger.SamplerTypeConst,
			Param: 1,
		},
		Reporter: &config.ReporterConfig{
			LogSpans:          true,
			CollectorEndpoint: "http://127.0.0.1:14268/api/traces",
		},
	}
	tracer, closer, err := cfg.NewTracer(config.Logger(jaeger.StdLogger))
	if err != nil {
		panic(fmt.Sprintf("ERROR: cannot init Jaeger: %v\n", err))
	}
	return tracer, closer
}

func main() {
	tracer, closer := Init("publisher")
	defer closer.Close()
	http.HandleFunc("/publish", func(w http.ResponseWriter, r *http.Request) {
		spanCtx, _ := tracer.Extract(opentracing.HTTPHeaders, opentracing.HTTPHeadersCarrier(r.Header))
		span := tracer.StartSpan("publish", ext.RPCServerOption(spanCtx))
		defer span.Finish()
		helloStr := r.FormValue("helloStr")
		println(helloStr)
	})
	log.Fatal(http.ListenAndServe(":8082", nil))
}
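The formatter and the publisher are simply two HTTP services. When either one receives a request from the client, it extracts the span context from the request and starts a new child span.
After running the code, you can find the resulting trace in the Jaeger UI, which shows the execution order and duration of each span quite clearly.
Inject writes the trace information into the HTTP headers; Extract reads the context information back out. See the referenced article for a deeper understanding.
At this point we should have a rough understanding of how to use Jaeger for tracing. In real applications, the trace is passed along via ctx, and interceptors of various kinds extract and inject it; this is probably the most common and simplest practice.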