
Trying Out Service Discovery with an etcd Cluster

Why service discovery is needed

In a microservice architecture, the number and location of service instances can change at any time, for example when nodes are scaled out or replaced. Service discovery tracks these changes in real time, so a service can always find the currently available instances of the services it depends on.

It reduces reliance on hard-coded addresses: the calling side does not hold a service's address and port itself, but obtains them from the service registry.

Combined with node health monitoring, requests are no longer sent to failed or unavailable nodes, and load balancing also becomes possible.

etcd

etcd is a distributed key-value store built on the Raft consensus algorithm. (It reportedly grew out of the course project of MIT 6.824 Distributed Systems.)

Since I don't have a server cluster, I deployed etcd on my Windows laptop with Docker Compose. The docker-compose.yml is shown below:

services:
    node1:
      image: quay.io/coreos/etcd:v3.5.9-amd64
      volumes:
        - type: bind
          source: /d/Program Files/DockerImage/etcd/node1-data
          target: /etcd-data
      ports: 
        - "2379:2379"
        - "2380:2380"
      networks:
        cluster_net:
          ipv4_address: 172.16.238.100
      environment:
        - ETCDCTL_API=3
      command:
        - /usr/local/bin/etcd
        - --data-dir=/etcd-data
        - --name
        - node1
        - --initial-advertise-peer-urls
        - http://172.16.238.100:2380
        - --listen-peer-urls
        - http://0.0.0.0:2380
        - --advertise-client-urls
        - http://172.16.238.100:2379
        - --listen-client-urls
        - http://0.0.0.0:2379
        - --initial-cluster
        - node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
        - --initial-cluster-state
        - new
        - --initial-cluster-token
        - docker-etcd
  
    node2:
      image: quay.io/coreos/etcd:v3.5.9-amd64
      volumes:
        - type: bind
          source: /d/Program Files/DockerImage/etcd/node2-data
          target: /etcd-data
      networks:
        cluster_net:
          ipv4_address: 172.16.238.101
      environment:
        - ETCDCTL_API=3
      ports: 
        - "2369:2379"
        - "2370:2380"
      command:
        - /usr/local/bin/etcd
        - --data-dir=/etcd-data
        - --name
        - node2
        - --initial-advertise-peer-urls
        - http://172.16.238.101:2380
        - --listen-peer-urls
        - http://0.0.0.0:2380
        - --advertise-client-urls
        - http://172.16.238.101:2379
        - --listen-client-urls
        - http://0.0.0.0:2379
        - --initial-cluster
        - node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
        - --initial-cluster-state
        - new
        - --initial-cluster-token
        - docker-etcd
  
    node3:
      image: quay.io/coreos/etcd:v3.5.9-amd64
      volumes:
        - type: bind
          source: /d/Program Files/DockerImage/etcd/node3-data
          target: /etcd-data
      networks:
        cluster_net:
          ipv4_address: 172.16.238.102
      environment:
        - ETCDCTL_API=3
      ports: 
        - "2359:2379"
        - "2360:2380"
      command:
        - /usr/local/bin/etcd
        - --data-dir=/etcd-data
        - --name
        - node3
        - --initial-advertise-peer-urls
        - http://172.16.238.102:2380
        - --listen-peer-urls
        - http://0.0.0.0:2380
        - --advertise-client-urls
        - http://172.16.238.102:2379
        - --listen-client-urls
        - http://0.0.0.0:2379
        - --initial-cluster
        - node1=http://172.16.238.100:2380,node2=http://172.16.238.101:2380,node3=http://172.16.238.102:2380
        - --initial-cluster-state
        - new
        - --initial-cluster-token
        - docker-etcd
  
networks:
  cluster_net:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.16.238.0/24
          gateway: 172.16.238.1

Start the cluster: docker compose up -d (the -d flag runs it in detached, i.e. daemon, mode).

Shut down the cluster: docker compose down
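
Before going further, it's worth checking that the three nodes actually formed one cluster. The container name below follows the compose project naming used with docker exec later in this post:

docker exec -it etcd-docker-compose-node1-1 etcdctl member list
docker exec -it etcd-docker-compose-node1-1 etcdctl endpoint health --cluster

member list should report node1, node2 and node3 as started.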

Check the container status in Docker Desktop (screenshot: the three etcd containers running).

Use the official etcd v3 Go client API to test from the host that the cluster is reachable:

import (
	"context"
	"fmt"
	"log"
	"testing"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func TestEtcdConnection(t *testing.T) {
	cli, err := clientv3.New(clientv3.Config{
		// The three host ports mapped in docker-compose.yml above.
		Endpoints:   []string{"localhost:2379", "localhost:2369", "localhost:2359"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		t.Fatal(err)
	}
	defer cli.Close()

	log.Println("putting")
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	_, err = cli.Put(ctx, "test", "test")
	cancel()
	if err != nil {
		t.Fatal(err)
	}

	log.Println("getting")
	ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
	resp, err := cli.Get(ctx, "test")
	cancel()
	if err != nil {
		t.Fatal(err)
	}

	for _, ev := range resp.Kvs {
		fmt.Println(string(ev.Key), string(ev.Value))
	}
}
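
Before handing registration over to Kitex, it helps to see what a registry does under the hood with plain clientv3 calls: write the instance address under a service key, bind the key to a lease, and keep the lease alive so the key disappears automatically when the process dies. A minimal sketch, reusing the clientv3 client from the test above; the demo/registry prefix and the 5-second TTL here are illustrative choices of mine, not what kitex's registry-etcd actually uses:

// register binds an instance key to a lease so that etcd deletes the
// key automatically once the process stops renewing the lease.
func register(cli *clientv3.Client, service, addr string) error {
	// Illustrative 5s TTL: the instance vanishes ~5s after it stops renewing.
	lease, err := cli.Grant(context.Background(), 5)
	if err != nil {
		return err
	}
	key := "demo/registry/" + service + "/" + addr
	if _, err = cli.Put(context.Background(), key, addr, clientv3.WithLease(lease.ID)); err != nil {
		return err
	}
	// KeepAlive renews the lease in the background; drain the ack channel.
	ch, err := cli.KeepAlive(context.Background(), lease.ID)
	if err != nil {
		return err
	}
	go func() {
		for range ch {
		}
	}()
	return nil
}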

Writing the IDL

Thrift is used as the IDL protocol here, so thriftgo needs to be installed first:

go install github.com/cloudwego/thriftgo@latest

Check that it works:

thriftgo --version

Following CloudWeGo's advanced tutorial, write base.thrift, item.thrift and stock.thrift:

// base
namespace go example.base

struct BaseResp {
    1: string code
    2: string msg
}
// item
namespace go example.item

include "base.thrift"

struct Item {
    1: i64 id
    2: string title
    3: string description
    4: i64 stock
}

struct GetItemReq {
    1: required i64 id
}

struct GetItemResp {
    1: Item item

    255: base.BaseResp baseResp
}

service ItemService {
    GetItemResp GetItem(1: GetItemReq req)
}
// stock
namespace go example.stock

include "base.thrift"

struct GetItemStockReq {
    1: required i64 item_id
}

struct GetItemStockResp {
    1: i64 stock

    255: base.BaseResp BaseResp
}

service StockService {
    GetItemStockResp GetItemStock(1: GetItemStockReq req)
}

Generating code with Kitex

Install kitex:

go install github.com/cloudwego/kitex/tool/cmd/kitex@latest

Check that it works:

kitex --version

Then use kitex to generate the corresponding RPC code:

kitex -module example ./idl/item.thrift
kitex -module example ./idl/stock.thrift

The code is generated under kitex_gen/example. Next, generate the client and server scaffolding code:

# after creating rpc/item, run this inside rpc/item
kitex -module example -service example.item -use example/kitex_gen ../../idl/item.thrift

The RPC business logic goes into func (s *ItemServiceImpl) GetItem(ctx context.Context, req *item.GetItemReq) (resp *item.GetItemResp, err error); the method name corresponds to GetItemResp GetItem(1: GetItemReq req) in the IDL file item.thrift.
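
For reference, a minimal placeholder body for that handler in the generated handler.go. The constructors and field names (NewGetItemResp, NewItem, Id, Title, Description, Stock) follow what thriftgo generates from the Item struct above; the hard-coded values are only there so the RPC path can be tested end to end:

func (s *ItemServiceImpl) GetItem(ctx context.Context, req *item.GetItemReq) (resp *item.GetItemResp, err error) {
	// Return a fixed item for any id, just to exercise the RPC path.
	resp = item.NewGetItemResp()
	resp.Item = item.NewItem()
	resp.Item.Id = req.GetId()
	resp.Item.Title = "example title"
	resp.Item.Description = "example description"
	resp.Item.Stock = 100
	return resp, nil
}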

The stock service is implemented in the same way.

Registering the service

It only takes setting the server.WithRegistry() option in the NewServer call. The service name on the server side (here, example.item) must match the name used on the client side.

package main

import (
	"log"
	"net"

	"github.com/cloudwego/kitex/pkg/rpcinfo"
	"github.com/cloudwego/kitex/server"
	etcd "github.com/kitex-contrib/registry-etcd"
	item "example/kitex_gen/example/item/itemservice"
)

func main() {
	// The registry talks to all three etcd client ports mapped by docker compose.
	etcdRegistry, err := etcd.NewEtcdRegistry([]string{
		"localhost:2379", "localhost:2369", "localhost:2359"})
	if err != nil {
		log.Fatal(err)
	}
	addr, err := net.ResolveTCPAddr("tcp", "127.0.0.1:8888")
	if err != nil {
		log.Fatal(err)
	}
	svr := item.NewServer(new(ItemServiceImpl),
		server.WithServiceAddr(addr),
		server.WithRegistry(etcdRegistry),
		// The registered service name; it must match the client side.
		server.WithServerBasicInfo(&rpcinfo.EndpointBasicInfo{
			ServiceName: "example.item",
		}),
	)
	if err := svr.Run(); err != nil {
		log.Fatal(err)
	}
}

Check the result with etcdctl on one of the etcd nodes:

docker exec -it etcd-docker-compose-node1-1 etcdctl get --prefix "kitex"
# result
kitex/registry-etcd/example.item/127.0.0.1:8888
{"network":"tcp","address":"127.0.0.1:8888","weight":10,"tags":null}

One problem came up along the way:

After commenting out server.WithServiceAddr(addr), or setting addr to 0.0.0.0:8888, the address recorded in etcd changed from 127.0.0.1:8888 to 169.254.64.52:8888, and fetching the service from the client then failed with: [happened in biz handler, method=ItemService.GetItem, please check the panic at the server side] service discovery error: no instance remains for exmaple.item.

What I found online is that when a computer tries and fails to obtain an IP address from a DHCP server, the operating system automatically generates a link-local (APIPA) address in the 169.254.x.x range. This ensures the machine can still communicate on the local network even without an IP address assigned by a network administrator.

I then also ran etcd directly on the host and retried, but the result was the same; I don't know why. If anyone can explain this, I'd be very grateful.
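
One way to narrow it down (a guess at the mechanism, not a confirmed diagnosis): with no concrete address supplied, the registry has to pick one of the host's own interface addresses, so listing the non-loopback IPv4 addresses shows which candidates it could have seen. A small standalone program:

package main

import (
	"fmt"
	"net"
)

func main() {
	ifaces, err := net.Interfaces()
	if err != nil {
		panic(err)
	}
	for _, iface := range ifaces {
		addrs, err := iface.Addrs()
		if err != nil {
			continue
		}
		for _, a := range addrs {
			// Only show IPv4 addresses; a 169.254.x.x entry here is a
			// link-local (APIPA) address on that interface.
			if ipnet, ok := a.(*net.IPNet); ok && ipnet.IP.To4() != nil {
				fmt.Printf("%-16s %s\n", iface.Name, ipnet.IP)
			}
		}
	}
}

If a 169.254.x.x entry shows up here, that would explain where the advertised address came from.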

Client side

r, err := etcd.NewEtcdResolver([]string{"localhost:2379", "localhost:2369", "localhost:2359"})
if err != nil {
    log.Fatal(err)
}
// The service name must match the one the server registered.
c, err := itemservice.NewClient("example.item", client.WithResolver(r))
if err != nil {
    log.Fatal(err)
}

req := item.NewGetItemReq()
// populate the request
// ...

resp, err := c.GetItem(context.Background(), req, callopt.WithRPCTimeout(3*time.Second))
if err != nil {
    log.Fatal(err)
}

// handle the response
// ...

The stock client works the same way.