kube-ovn vpc-dns testing

  1. Deploy the vpc-dns dependencies


#  cat 01-pre-vpc-dns.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:vpc-dns
rules:
  - apiGroups:
    - ""
    resources:
    - endpoints
    - services
    - pods
    - namespaces
    verbs:
    - list
    - watch
  - apiGroups:
    - discovery.k8s.io
    resources:
    - endpointslices
    verbs:
    - list
    - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: vpc-dns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpc-dns
subjects:
- kind: ServiceAccount
  name: vpc-dns
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpc-dns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpc-dns-corefile
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          prefer_udp
        }
        cache 30
        loop
        reload
        loadbalance
    }
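
To verify this step, apply the manifest and confirm the RBAC objects and the Corefile ConfigMap exist (a minimal check, using the same `k` alias for kubectl as the rest of this post):

#  k apply -f 01-pre-vpc-dns.yml
#  k get clusterrole system:vpc-dns
#  k get sa,cm -n kube-system | grep vpc-dns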


  2. Create the NAD (NetworkAttachmentDefinition)

#  cat 02-ovn-nad.yml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ovn-nad
  namespace: default
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "kube-ovn",
      "server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
      "provider": "ovn-nad.default.ovn"
    }'
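
After applying, the NAD should be queryable through the Multus CRD (a quick sanity check, nothing version-specific):

#  k apply -f 02-ovn-nad.yml
#  k get network-attachment-definitions -n default ovn-nad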



  3. Change the provider field of the ovn-default subnet to point at the NAD

# k edit subnet ovn-default
## change the provider field under spec

provider: ovn-nad.default.ovn
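
Instead of k edit, the same change can be made non-interactively; a sketch using a merge patch (the provider value must match the provider string defined in the NAD above):

#  k patch subnet ovn-default --type=merge -p '{"spec":{"provider":"ovn-nad.default.ovn"}}'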

  4. Specify the VIP

vpc-dns was originally designed to mirror the localdns functionality, so that DNS for the entire cluster can be unified behind a single IP.


# read the coredns service IP
#  k get svc -n kube-system    coredns
NAME      TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
coredns   ClusterIP   10.96.0.3    <none>        53/UDP,53/TCP,9153/TCP   23h

#  cat 04-vpc-dns-cm.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpc-dns-config
  namespace: kube-system
data:
  coredns-vip: 10.96.0.3
  enable-vpc-dns: "true"
  nad-name: ovn-nad
  nad-provider: ovn-nad.default.ovn
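
Apply it and confirm the controller picks the new config up (assuming the controller pods carry the app=kube-ovn-controller label, which is what the standard kube-ovn manifests use; the exact log wording varies by version):

#  k apply -f 04-vpc-dns-cm.yml
#  k logs -n kube-system -l app=kube-ovn-controller --tail=50 | grep -i "vpc.dns"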

  5. Enable DNS for the custom VPC

#  cat 05-deploy-vpc-dns-in-vpc-subnet.yml
kind: VpcDns
apiVersion: kubeovn.io/v1
metadata:
  name: zbb-test-vpc1-dns
spec:
  vpc: vpc1
  subnet: vpc1-subnet1 # the switch load balancer is applied on this logical switch
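
Once the VpcDns object exists, the controller should render a coredns Deployment named vpc-dns-<VpcDns name> in kube-system; a quick way to confirm:

#  k apply -f 05-deploy-vpc-dns-in-vpc-subnet.yml
#  k get vpc-dns
#  k get deployment -n kube-system | grep vpc-dns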
  



Problem 1: the controller cannot fetch the coredns template file

#  k logs -f -n kube-system    kube-ovn-controller-7c554b984-pshgl


E0330 09:16:55.037036       7 vpc_dns.go:336] failed to get coredns template file, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:16:55.037068       7 vpc_dns.go:270] failed to generate vpc-dns deployment, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:16:55.040092       7 vpc_nat_gateway.go:198] process: addOrUpdateVpcDns. err: error syncing 'zbb-test-vpc1-dns': Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused, requeuing
E0330 09:17:15.082054       7 vpc_dns.go:336] failed to get coredns template file, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:17:15.082081       7 vpc_dns.go:270] failed to generate vpc-dns deployment, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:17:15.085384       7 vpc_nat_gateway.go:198] process: addOrUpdateVpcDns. err: error syncing 'zbb-test-vpc1-dns': Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused, requeuing


Solution: go back to step 4 and add the coredns-template key to the ConfigMap

#  cat 04-vpc-dns-cm.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpc-dns-config
  namespace: kube-system
data:
  coredns-vip: 10.96.0.3
  enable-vpc-dns: "true"
  nad-name: ovn-nad
  nad-provider: ovn-nad.default.ovn
  coredns-template: https://raw.githubusercontent.com/kubeovn/kube-ovn/master/yamls/coredns-template.yaml

# Since the master branch I am running has not been released yet, the v1.12.0 directory does not exist; point to the master directory instead

# Even after confirming the URL is valid, the download may still fail; in that case check the node DNS and make sure the first nameserver is a public DNS server

cat /etc/resolv.conf
# Generated by NetworkManager
search default.svc.cluster.local svc.cluster.local
nameserver 223.5.5.5


# Then restart kube-ovn-controller; after switching the DNS, the service works normally
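
A minimal way to confirm both points from the node running kube-ovn-controller, then force the controller to retry:

#  curl -sI https://raw.githubusercontent.com/kubeovn/kube-ovn/master/yamls/coredns-template.yaml | head -n1
#  k rollout restart deployment/kube-ovn-controller -n kube-system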

Problem 2:

The image referenced by vpc-dns does not match the release version.

My main version is master (i.e. 1.12), but the referenced image is v1.11.0.



#  k describe po -n kube-system    vpc-dns-zbb-test-vpc1-dns-7468fbb848-44d8c


  Normal   Scheduled       12m                   default-scheduler  Successfully assigned kube-system/vpc-dns-zbb-test-vpc1-dns-7468fbb848-44d8c to wrk
  Normal   AddedInterface  12m                   multus             Add eth0 [192.168.0.2/24] from kube-ovn
  Normal   AddedInterface  12m                   multus             Add net1 [10.16.0.6/16 fd00:10:16::6/64] from default/ovn-nad
  Normal   Pulling         12m                   kubelet            Pulling image "kubeovn/vpc-nat-gateway:v1.11.0"
  Normal   Pulled          12m                   kubelet            Successfully pulled image "kubeovn/vpc-nat-gateway:v1.11.0" in 26.259659739s (26.259666438s including waiting)
  Normal   Pulled          10m (x4 over 12m)     kubelet            Container image "kubeovn/vpc-nat-gateway:v1.11.0" already present on machine
  Normal   Created         10m (x5 over 12m)     kubelet            Created container init-route
  Normal   Started         10m (x5 over 12m)     kubelet            Started container init-route
  Warning  BackOff         2m25s (x47 over 12m)  kubelet            Back-off restarting failed container



#  k get deployment -n kube-system    vpc-dns-zbb-test-vpc1-dns -o yaml | grep "image:"
        image: registry.lank8s.cn/coredns/coredns:v1.9.3
        image: kubeovn/vpc-nat-gateway:v1.11.0 # should follow the release version
        
# This is configurable, again through the step-4 ConfigMap
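
To see where the v1.11.0 image actually comes from, compare the template the controller fetches with the Deployment it generated (plain inspection, nothing version-specific):

#  curl -s https://raw.githubusercontent.com/kubeovn/kube-ovn/master/yamls/coredns-template.yaml | grep "image:"
#  k get deployment -n kube-system vpc-dns-zbb-test-vpc1-dns -o jsonpath='{.spec.template.spec.initContainers[*].image}{"\n"}'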






Problem 3:


# The scenario where ovn-default is dual-stack is not supported; only IPv4 works

  initContainers:
  - command:
    - sh
    - -c
    - ip route add ${KUBERNETES_SERVICE_HOST} via 10.16.0.1,fd00:10:16::1 dev net1;ip
      route add 223.5.5.5 via 10.16.0.1,fd00:10:16::1 dev net1;

# Patched the code to handle a dual-stack ovn-default; the change has been merged
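
The fix is essentially to add one route per address family instead of passing both gateways to a single ip route add; roughly like this (a sketch of the idea, not the exact merged code; <v6-service-host> is a placeholder):

# IPv4 destinations go via the IPv4 gateway of the attachment subnet
ip route add ${KUBERNETES_SERVICE_HOST} via 10.16.0.1 dev net1
ip route add 223.5.5.5 via 10.16.0.1 dev net1
# IPv6 destinations (if any) go via the IPv6 gateway
ip -6 route add <v6-service-host> via fd00:10:16::1 dev net1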

Problem 4: with the security group enabled by default, traffic does not get through

Disable the security group or configure it properly, then recreate the VPC.

Test results


#  k ko nbctl lr-route-list vpc1
IPv4 Routes
Route Table <main>:
                0.0.0.0/0             192.168.0.254 dst-ip


#  k get po -A -o wide | grep dns
kube-system    coredns-757c7c5698-4b5b8                     1/1     Running       0          35m     10.16.0.7      mst    <none>           <none>
kube-system    coredns-757c7c5698-v5f4n                     1/1     Running       0          35m     10.16.0.15     wrk    <none>           <none>
kube-system    dns-autoscaler-75895d54f-hwnw4               1/1     Running       0          35m     10.16.0.23     wrk    <none>           <none>
kube-system    nodelocaldns-r4bhh                           1/1     Running       0          31h     172.20.10.9    wrk    <none>           <none>
kube-system    nodelocaldns-whzpt                           1/1     Running       0          32h     172.20.10.16   mst    <none>           <none>
kube-system    vpc-dns-zbb-test-vpc1-dns-7c89f7567f-cbf5p   1/1     Terminating   0          5s      192.168.0.4    mst    <none>           <none>
kube-system    vpc-dns-zbb-test-vpc1-dns-8577c6f655-8g9hr   1/1     Running       0          5s      192.168.0.5    wrk    <none>           <none>
kube-system    vpc-dns-zbb-test-vpc1-dns-8577c6f655-wqngs   1/1     Running       0          5s      192.168.0.6    mst    <none>           <none>
#  k ko nbctl lb-list
UUID                                    LB                  PROTO      VIP                       IPs
74e398cd-7dd7-4039-b4cc-bda33640b20b    cluster-tcp-load    tcp        10.102.81.60:10661        172.20.10.16:10661
                                                            tcp        10.103.127.234:6642       172.20.10.16:6642
                                                            tcp        10.106.133.236:9402       10.16.0.6:9402
                                                            tcp        10.111.147.101:6643       172.20.10.16:6643
                                                            tcp        10.111.152.5:10660        172.20.10.9:10660
                                                            tcp        10.96.0.1:443             172.20.10.16:6443
                                                            tcp        10.96.194.157:8080        10.16.0.16:8080,10.16.0.9:8080
                                                            tcp        10.97.243.43:443          10.16.0.22:10250
                                                            tcp        10.99.186.164:6641        172.20.10.16:6641
                                                            tcp        10.99.29.126:10665        172.20.10.16:10665,172.20.10.9:10665
                                                            tcp        [fd00:10:96::45e]:8080    [fd00:10:16::10]:8080,[fd00:10:16::9]:8080
894e091b-56b5-4066-8f8a-fb0a08072eb4    vpc-vpc1-tcp-loa    tcp        10.96.0.3:53              192.168.0.5:53,192.168.0.6:53
                                                            tcp        10.96.0.3:9153            192.168.0.5:9153,192.168.0.6:9153
d2034a50-3e8f-43d7-8c87-2972a880aad7    vpc-vpc1-udp-loa    udp        10.96.0.3:53              192.168.0.5:53,192.168.0.6:53

#  k exec -it -n vpc1           vpc-1-busybox01 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ #
/ #
/ # nslookup kubernetes.default.svc.cluster.local 10.96.0.3
Server:		10.96.0.3
Address:	10.96.0.3:53


Name:	kubernetes.default.svc.cluster.local
Address: 10.96.0.1

/ #
# As shown above, resolution through 10.96.0.3 works for pods in the same subnet


Next, test access across subnets; this also works as expected.


[root@mst #  kgpoaw | grep busy
vpc1           vpc-1-busybox01                              1/1     Running            0               126m   192.168.0.2     wrk    <none>           <none>
vpc1           vpc-1-busybox02                              1/1     Running            0               126m   192.168.0.5     wrk    <none>           <none>
vpc1           vpc-subnet2-busybox01                        1/1     Running            0               18s    192.168.10.2    wrk    <none>           <none>
vpc1           vpc-subnet2-busybox02                        1/1     Running            0               11s    192.168.10.3    wrk    <none>           <none>
[root@mst #
[root@mst #
[root@mst #  k exec -it -n vpc1           vpc-1-busybox01 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ # ping 192.168.10.2
PING 192.168.10.2 (192.168.10.2): 56 data bytes
64 bytes from 192.168.10.2: seq=0 ttl=63 time=1.464 ms
64 bytes from 192.168.10.2: seq=1 ttl=63 time=0.647 ms
^C
--- 192.168.10.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.647/1.055/1.464 ms
/ # ping 192.168.10.3
PING 192.168.10.3 (192.168.10.3): 56 data bytes
64 bytes from 192.168.10.3: seq=0 ttl=63 time=0.925 ms
^C
--- 192.168.10.3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.925/0.925/0.925 ms
/ #  nslookup kubernetes.default.svc.cluster.local 10.96.0.3
Server:		10.96.0.3
Address:	10.96.0.3:53

Name:	kubernetes.default.svc.cluster.local
Address: 10.96.0.1


[root@mst #  k exec -it -n vpc1           vpc-subnet2-busybox02 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ #
/ #
/ # nslookup kubernetes.default.svc.cluster.local 10.96.0.3
Server:		10.96.0.3
Address:	10.96.0.3:53


Name:	kubernetes.default.svc.cluster.local
Address: 10.96.0.1

/ #


Finally, the complete test manifests are attached below.


[root@mst #  cat ../nat-gw-cm.yml
kind: ConfigMap
apiVersion: v1
data:
  enable-vpc-nat-gw: "true"
metadata:
  name: ovn-vpc-nat-gw-config
  namespace: kube-system


[root@mst #  cat 01-vpc-route.yml
kind: Vpc
apiVersion: kubeovn.io/v1
metadata:
  name: vpc1
spec:
  namespaces:
  - vpc1
  staticRoutes:
  - cidr: 0.0.0.0/0
    nextHopIP: 192.168.0.254
    policy: policyDst
  - cidr: 192.168.0.0/24
    nextHopIP: 192.168.0.1
    policy: policySrc
  - cidr: 192.168.10.0/24
    nextHopIP: 192.168.10.1
    policy: policySrc
[root@mst #

[root@mst #  cat 02-vpc-subnet.yml
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: vpc1-subnet1
spec:
  cidrBlock: 192.168.0.0/24
  default: false
  disableGatewayCheck: false
  disableInterConnection: true
  gatewayNode: ""
  gatewayType: distributed
  natOutgoing: false
  private: false
  protocol: IPv4
  provider: ovn
  vpc: vpc1
  namespaces:
  - vpc1
---
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: vpc1-subnet2
spec:
  cidrBlock: 192.168.10.0/24
  default: false
  disableGatewayCheck: false
  disableInterConnection: true
  gatewayNode: ""
  gatewayType: distributed
  natOutgoing: false
  private: false
  protocol: IPv4
  provider: ovn
  vpc: vpc1
  namespaces:
  - vpc1


[root@mst #  cat 04-nat-gw.yaml
kind: VpcNatGateway
apiVersion: kubeovn.io/v1
metadata:
  name: gw1
spec:
  vpc: vpc1
  subnet: vpc1-subnet1
  lanIp: 192.168.0.254
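
The eip/fip manifests are not listed here; the objects shown below were created from CRs roughly like the following (a sketch based on my understanding of the kube-ovn IptablesEIP/IptablesFIP CRDs; the field names natGwDp, eipName and internalIp are assumptions, double-check them against the CRDs of your release):

kind: IptablesEIP
apiVersion: kubeovn.io/v1
metadata:
  name: eip-vpc1-01
spec:
  natGwDp: gw1
---
kind: IptablesFIP
apiVersion: kubeovn.io/v1
metadata:
  name: fip-vpc1
spec:
  eipName: eip-vpc1-01
  internalIp: 192.168.0.2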


[root@mst #  k get eip
NAME          IP              MAC                 NAT   NATGWDP   READY
eip-vpc1-01   172.20.20.103   00:00:00:F8:C0:1E   fip   gw1       true
eip-vpc2-01   172.20.20.104   00:00:00:1F:15:8F   fip   gw2       true
[root@mst #  k get fip
NAME       EIP           V4IP            INTERNALIP    V6IP   READY   NATGWDP
fip-vpc1   eip-vpc1-01   172.20.20.103   192.168.0.2          true    gw1
fip-vpc2   eip-vpc2-01   172.20.20.104   192.168.1.3          true    gw2
[root@mst #  k get vpc-dns
NAME                ACTIVE   VPC    SUBNET
zbb-test-vpc1-dns   true     vpc1   vpc1-subnet1
[root@mst #


[root@mst #  k ko nbctl lb-list
UUID                                    LB                  PROTO      VIP                       IPs
74e398cd-7dd7-4039-b4cc-bda33640b20b    cluster-tcp-load    tcp        10.103.127.234:6642       172.20.10.16:6642
                                                            tcp        10.106.133.236:9402       10.16.0.11:9402
                                                            tcp        10.111.147.101:6643       172.20.10.16:6643
                                                            tcp        10.111.152.5:10660        172.20.10.9:10660
                                                            tcp        10.96.0.1:443             172.20.10.16:6443
                                                            tcp        10.96.0.3:53              10.16.0.17:53,10.16.0.24:53
                                                            tcp        10.96.0.3:9153            10.16.0.17:9153,10.16.0.24:9153
                                                            tcp        10.96.194.157:8080        10.16.0.18:8080,10.16.0.4:8080
                                                            tcp        10.97.243.43:443          10.16.0.8:10250
                                                            tcp        10.99.186.164:6641        172.20.10.16:6641
                                                            tcp        10.99.29.126:10665        172.20.10.16:10665,172.20.10.9:10665
                                                            tcp        [fd00:10:96::45e]:8080    [fd00:10:16::12]:8080,[fd00:10:16::4]:8080
c429dc34-6b8c-4f80-9306-02d75a0f75d0    cluster-udp-load    udp        10.96.0.3:53              10.16.0.17:53,10.16.0.24:53
894e091b-56b5-4066-8f8a-fb0a08072eb4    vpc-vpc1-tcp-loa    tcp        10.96.0.3:53              192.168.0.3:53,192.168.0.6:53
                                                            tcp        10.96.0.3:9153            192.168.0.3:9153,192.168.0.6:9153
d2034a50-3e8f-43d7-8c87-2972a880aad7    vpc-vpc1-udp-loa    udp        10.96.0.3:53              192.168.0.3:53,192.168.0.6:53