- Deploy the vpc-dns dependencies
# cat 01-pre-vpc-dns.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:vpc-dns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: vpc-dns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpc-dns
subjects:
- kind: ServiceAccount
  name: vpc-dns
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpc-dns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpc-dns-corefile
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            prefer_udp
        }
        cache 30
        loop
        reload
        loadbalance
    }
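Apply the manifest to create the RBAC objects, ServiceAccount, and Corefile (a minimal sketch, assuming `k` is an alias for `kubectl` as in the transcripts below):
# k apply -f 01-pre-vpc-dns.yml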
- Create the NAD (NetworkAttachmentDefinition)
# cat 02-ovn-nad.yml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: ovn-nad
namespace: default
spec:
config: '{
"cniVersion": "0.3.0",
"type": "kube-ovn",
"server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
"provider": "ovn-nad.default.ovn"
}'
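Apply it and confirm the NAD is registered (`net-attach-def` is the short name Multus registers for NetworkAttachmentDefinition):
# k apply -f 02-ovn-nad.yml
# k get net-attach-def -n default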
- Modify the provider field of the ovn-default subnet to associate it with the NAD
# k edit subnet ovn-default
## change the provider field under spec
provider: ovn-nad.default.ovn
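If you prefer a non-interactive change, the same edit can be expressed as a one-line merge patch (equivalent to the `k edit` above):
# k patch subnet ovn-default --type=merge -p '{"spec":{"provider":"ovn-nad.default.ovn"}}'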
- Specify the VIP
vpc-dns was originally built to mirror the localdns feature, so that DNS for the entire cluster could be unified behind a single IP.
# read the coredns service IP
# k get svc -n kube-system coredns
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.96.0.3 <none> 53/UDP,53/TCP,9153/TCP 23h
# cat 04-vpc-dns-cm.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpc-dns-config
  namespace: kube-system
data:
  coredns-vip: 10.96.0.3
  enable-vpc-dns: "true"
  nad-name: ovn-nad
  nad-provider: ovn-nad.default.ovn
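Apply the ConfigMap; kube-ovn-controller reads it when reconciling VpcDns resources, and nothing happens until enable-vpc-dns is "true":
# k apply -f 04-vpc-dns-cm.yml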
- Enable DNS in the custom VPC
# cat 05-deploy-vpc-dns-in-vpc-subnet.yml
kind: VpcDns
apiVersion: kubeovn.io/v1
metadata:
  name: zbb-test-vpc1-dns
spec:
  vpc: vpc1
  subnet: vpc1-subnet1 # the switch LB takes effect on this logical switch
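Apply the CR and check that it becomes active (the expected output format appears in the `k get vpc-dns` listing near the end of this post):
# k apply -f 05-deploy-vpc-dns-in-vpc-subnet.yml
# k get vpc-dns zbb-test-vpc1-dns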
Problem 1:
k logs -f -n kube-system kube-ovn-controller-7c554b984-pshgl
E0330 09:16:55.037036 7 vpc_dns.go:336] failed to get coredns template file, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:16:55.037068 7 vpc_dns.go:270] failed to generate vpc-dns deployment, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:16:55.040092 7 vpc_nat_gateway.go:198] process: addOrUpdateVpcDns. err: error syncing 'zbb-test-vpc1-dns': Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused, requeuing
E0330 09:17:15.082054 7 vpc_dns.go:336] failed to get coredns template file, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:17:15.082081 7 vpc_dns.go:270] failed to generate vpc-dns deployment, Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused
E0330 09:17:15.085384 7 vpc_nat_gateway.go:198] process: addOrUpdateVpcDns. err: error syncing 'zbb-test-vpc1-dns': Get "https://raw.githubusercontent.com/kubeovn/kube-ovn/v1.12.0/yamls/coredns-template.yaml": dial tcp 0.0.0.0:443: connect: connection refused, requeuing
Solution: go back to step 4
# cat 04-vpc-dns-cm.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpc-dns-config
  namespace: kube-system
data:
  coredns-vip: 10.96.0.3
  enable-vpc-dns: "true"
  nad-name: ovn-nad
  nad-provider: ovn-nad.default.ovn
  coredns-template: https://raw.githubusercontent.com/kubeovn/kube-ovn/master/yamls/coredns-template.yaml
# Since the master branch I run has not been released yet, there is no v1.12.0 path; replacing it with the master path fixes the URL
# Even after confirming the link is valid, the download may still fail; in that case check DNS and make sure the first nameserver in /etc/resolv.conf is a public DNS server
cat /etc/resolv.conf
# Generated by NetworkManager
search default.svc.cluster.local svc.cluster.local
nameserver 223.5.5.5
# Then restart kube-ovn-controller so it picks up the DNS change, and the service comes up normally
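kube-ovn-controller runs as a Deployment in kube-system, so a rollout restart is enough:
# k rollout restart deployment -n kube-system kube-ovn-controller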
Problem 2:
The image referenced by vpc-dns does not match the running version.
My cluster is on master (effectively 1.12), but the referenced image is v1.11.0.
# k describe po -n kube-system vpc-dns-zbb-test-vpc1-dns-7468fbb848-44d8c
Normal Scheduled 12m default-scheduler Successfully assigned kube-system/vpc-dns-zbb-test-vpc1-dns-7468fbb848-44d8c to wrk
Normal AddedInterface 12m multus Add eth0 [192.168.0.2/24] from kube-ovn
Normal AddedInterface 12m multus Add net1 [10.16.0.6/16 fd00:10:16::6/64] from default/ovn-nad
Normal Pulling 12m kubelet Pulling image "kubeovn/vpc-nat-gateway:v1.11.0"
Normal Pulled 12m kubelet Successfully pulled image "kubeovn/vpc-nat-gateway:v1.11.0" in 26.259659739s (26.259666438s including waiting)
Normal Pulled 10m (x4 over 12m) kubelet Container image "kubeovn/vpc-nat-gateway:v1.11.0" already present on machine
Normal Created 10m (x5 over 12m) kubelet Created container init-route
Normal Started 10m (x5 over 12m) kubelet Started container init-route
Warning BackOff 2m25s (x47 over 12m) kubelet Back-off restarting failed container
# k get deployment -n kube-system vpc-dns-zbb-test-vpc1-dns -o yaml | grep "image:"
image: registry.lank8s.cn/coredns/coredns:v1.9.3
image: kubeovn/vpc-nat-gateway:v1.11.0 # should track the running release
# This is configurable, again via the step-4 ConfigMap
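As a sketch only: assuming the ConfigMap accepts an image override key (the key name `image` below is hypothetical; verify the actual key against vpc_dns.go in the controller source), the override would look like:
apiVersion: v1
kind: ConfigMap
metadata:
  name: vpc-dns-config
  namespace: kube-system
data:
  coredns-vip: 10.96.0.3
  enable-vpc-dns: "true"
  nad-name: ovn-nad
  nad-provider: ovn-nad.default.ovn
  coredns-template: https://raw.githubusercontent.com/kubeovn/kube-ovn/master/yamls/coredns-template.yaml
  image: kubeovn/vpc-nat-gateway:v1.12.0 # hypothetical key name; check the controller source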
Problem 3:
# A dual-stack ovn-default is not supported; only IPv4 works
initContainers:
- command:
  - sh
  - -c
  - ip route add ${KUBERNETES_SERVICE_HOST} via 10.16.0.1,fd00:10:16::1 dev net1;ip
    route add 223.5.5.5 via 10.16.0.1,fd00:10:16::1 dev net1;
# I patched the code to handle a dual-stack ovn-default; the fix has been merged
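The root cause is visible above: both gateways get joined with a comma ("via 10.16.0.1,fd00:10:16::1"), which `ip route` rejects. The fix has to emit one route per address family, roughly like this sketch (the merged patch may differ in detail):
ip route add ${KUBERNETES_SERVICE_HOST} via 10.16.0.1 dev net1
ip route add 223.5.5.5 via 10.16.0.1 dev net1
# IPv6 destinations need separate routes via the v6 gateway, e.g.
# ip -6 route add <v6 destination> via fd00:10:16::1 dev net1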
Problem 4: the security group is enabled by default, and traffic is blocked until it is handled.
Disable the security group, or configure it to allow the traffic, then recreate the VPC.
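For reference, a minimal allow-all SecurityGroup sketch using the kubeovn.io/v1 schema (the name is illustrative; tighten the rules for real use):
apiVersion: kubeovn.io/v1
kind: SecurityGroup
metadata:
  name: sg-allow-all # illustrative name
spec:
  allowSameGroupTraffic: true
  ingressRules:
  - ipVersion: ipv4
    policy: allow
    protocol: all
    priority: 1
    remoteType: address
    remoteAddress: 0.0.0.0/0
  egressRules:
  - ipVersion: ipv4
    policy: allow
    protocol: all
    priority: 1
    remoteType: address
    remoteAddress: 0.0.0.0/0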
Test results
# k ko nbctl lr-route-list vpc1
IPv4 Routes
Route Table <main>:
0.0.0.0/0 192.168.0.254 dst-ip
# k get po -A -o wide | grep dns
kube-system coredns-757c7c5698-4b5b8 1/1 Running 0 35m 10.16.0.7 mst <none> <none>
kube-system coredns-757c7c5698-v5f4n 1/1 Running 0 35m 10.16.0.15 wrk <none> <none>
kube-system dns-autoscaler-75895d54f-hwnw4 1/1 Running 0 35m 10.16.0.23 wrk <none> <none>
kube-system nodelocaldns-r4bhh 1/1 Running 0 31h 172.20.10.9 wrk <none> <none>
kube-system nodelocaldns-whzpt 1/1 Running 0 32h 172.20.10.16 mst <none> <none>
kube-system vpc-dns-zbb-test-vpc1-dns-7c89f7567f-cbf5p 1/1 Terminating 0 5s 192.168.0.4 mst <none> <none>
kube-system vpc-dns-zbb-test-vpc1-dns-8577c6f655-8g9hr 1/1 Running 0 5s 192.168.0.5 wrk <none> <none>
kube-system vpc-dns-zbb-test-vpc1-dns-8577c6f655-wqngs 1/1 Running 0 5s 192.168.0.6 mst <none> <none>
# k ko nbctl lb-list
UUID LB PROTO VIP IPs
74e398cd-7dd7-4039-b4cc-bda33640b20b cluster-tcp-load tcp 10.102.81.60:10661 172.20.10.16:10661
tcp 10.103.127.234:6642 172.20.10.16:6642
tcp 10.106.133.236:9402 10.16.0.6:9402
tcp 10.111.147.101:6643 172.20.10.16:6643
tcp 10.111.152.5:10660 172.20.10.9:10660
tcp 10.96.0.1:443 172.20.10.16:6443
tcp 10.96.194.157:8080 10.16.0.16:8080,10.16.0.9:8080
tcp 10.97.243.43:443 10.16.0.22:10250
tcp 10.99.186.164:6641 172.20.10.16:6641
tcp 10.99.29.126:10665 172.20.10.16:10665,172.20.10.9:10665
tcp [fd00:10:96::45e]:8080 [fd00:10:16::10]:8080,[fd00:10:16::9]:8080
894e091b-56b5-4066-8f8a-fb0a08072eb4 vpc-vpc1-tcp-loa tcp 10.96.0.3:53 192.168.0.5:53,192.168.0.6:53
tcp 10.96.0.3:9153 192.168.0.5:9153,192.168.0.6:9153
d2034a50-3e8f-43d7-8c87-2972a880aad7 vpc-vpc1-udp-loa udp 10.96.0.3:53 192.168.0.5:53,192.168.0.6:53
# k exec -it -n vpc1 vpc-1-busybox01 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ #
/ #
/ # nslookup kubernetes.default.svc.cluster.local 10.96.0.3
Server: 10.96.0.3
Address: 10.96.0.3:53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
/ #
# The output above shows that a pod in the same subnet resolves fine through the VIP
Next, test cross-subnet access; it works as well.
[root@mst # kgpoaw | grep busy
vpc1 vpc-1-busybox01 1/1 Running 0 126m 192.168.0.2 wrk <none> <none>
vpc1 vpc-1-busybox02 1/1 Running 0 126m 192.168.0.5 wrk <none> <none>
vpc1 vpc-subnet2-busybox01 1/1 Running 0 18s 192.168.10.2 wrk <none> <none>
vpc1 vpc-subnet2-busybox02 1/1 Running 0 11s 192.168.10.3 wrk <none> <none>
[root@mst #
[root@mst #
[root@mst # k exec -it -n vpc1 vpc-1-busybox01 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ # ping 192.168.10.2
PING 192.168.10.2 (192.168.10.2): 56 data bytes
64 bytes from 192.168.10.2: seq=0 ttl=63 time=1.464 ms
64 bytes from 192.168.10.2: seq=1 ttl=63 time=0.647 ms
^C
--- 192.168.10.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.647/1.055/1.464 ms
/ # ping 192.168.10.3
PING 192.168.10.3 (192.168.10.3): 56 data bytes
64 bytes from 192.168.10.3: seq=0 ttl=63 time=0.925 ms
^C
--- 192.168.10.3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.925/0.925/0.925 ms
/ # nslookup kubernetes.default.svc.cluster.local 10.96.0.3
Server: 10.96.0.3
Address: 10.96.0.3:53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
[root@mst # k exec -it -n vpc1 vpc-subnet2-busybox02 sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ #
/ #
/ #
/ # nslookup kubernetes.default.svc.cluster.local 10.96.0.3
Server: 10.96.0.3
Address: 10.96.0.3:53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
/ #
Finally, the complete set of test manifests is attached for reference.
[root@mst # cat ../nat-gw-cm.yml
kind: ConfigMap
apiVersion: v1
data:
  enable-vpc-nat-gw: "true"
metadata:
  name: ovn-vpc-nat-gw-config
  namespace: kube-system
[root@mst # cat 01-vpc-route.yml
kind: Vpc
apiVersion: kubeovn.io/v1
metadata:
  name: vpc1
spec:
  namespaces:
  - vpc1
  staticRoutes:
  - cidr: 0.0.0.0/0
    nextHopIP: 192.168.0.254
    policy: policyDst
  - cidr: 192.168.0.0/24
    nextHopIP: 192.168.0.1
    policy: policySrc
  - cidr: 192.168.10.0/24
    nextHopIP: 192.168.10.1
    policy: policySrc
[root@mst #
[root@mst # cat 02-vpc-subnet.yml
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: vpc1-subnet1
spec:
  cidrBlock: 192.168.0.0/24
  default: false
  disableGatewayCheck: false
  disableInterConnection: true
  gatewayNode: ""
  gatewayType: distributed
  natOutgoing: false
  private: false
  protocol: IPv4
  provider: ovn
  vpc: vpc1
  namespaces:
  - vpc1
---
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: vpc1-subnet2
spec:
  cidrBlock: 192.168.10.0/24
  default: false
  disableGatewayCheck: false
  disableInterConnection: true
  gatewayNode: ""
  gatewayType: distributed
  natOutgoing: false
  private: false
  protocol: IPv4
  provider: ovn
  vpc: vpc1
  namespaces:
  - vpc1
[root@mst # cat 04-nat-gw.yaml
kind: VpcNatGateway
apiVersion: kubeovn.io/v1
metadata:
  name: gw1
spec:
  vpc: vpc1
  subnet: vpc1-subnet1
  lanIp: 192.168.0.254
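The eip/fip objects listed next were created from manifests along these lines (a sketch reconstructed from the outputs below, using the IptablesEIP/IptablesFIPRule CRDs; only the vpc1 pair is shown):
kind: IptablesEIP
apiVersion: kubeovn.io/v1
metadata:
  name: eip-vpc1-01
spec:
  natGwDp: gw1
---
kind: IptablesFIPRule
apiVersion: kubeovn.io/v1
metadata:
  name: fip-vpc1
spec:
  eip: eip-vpc1-01
  internalIp: 192.168.0.2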
[root@mst # k get eip
NAME IP MAC NAT NATGWDP READY
eip-vpc1-01 172.20.20.103 00:00:00:F8:C0:1E fip gw1 true
eip-vpc2-01 172.20.20.104 00:00:00:1F:15:8F fip gw2 true
[root@mst # k get fip
NAME EIP V4IP INTERNALIP V6IP READY NATGWDP
fip-vpc1 eip-vpc1-01 172.20.20.103 192.168.0.2 true gw1
fip-vpc2 eip-vpc2-01 172.20.20.104 192.168.1.3 true gw2
[root@mst # k get vpc-dns
NAME ACTIVE VPC SUBNET
zbb-test-vpc1-dns true vpc1 vpc1-subnet1
[root@mst #
[root@mst # k ko nbctl lb-list
UUID LB PROTO VIP IPs
74e398cd-7dd7-4039-b4cc-bda33640b20b cluster-tcp-load tcp 10.103.127.234:6642 172.20.10.16:6642
tcp 10.106.133.236:9402 10.16.0.11:9402
tcp 10.111.147.101:6643 172.20.10.16:6643
tcp 10.111.152.5:10660 172.20.10.9:10660
tcp 10.96.0.1:443 172.20.10.16:6443
tcp 10.96.0.3:53 10.16.0.17:53,10.16.0.24:53
tcp 10.96.0.3:9153 10.16.0.17:9153,10.16.0.24:9153
tcp 10.96.194.157:8080 10.16.0.18:8080,10.16.0.4:8080
tcp 10.97.243.43:443 10.16.0.8:10250
tcp 10.99.186.164:6641 172.20.10.16:6641
tcp 10.99.29.126:10665 172.20.10.16:10665,172.20.10.9:10665
tcp [fd00:10:96::45e]:8080 [fd00:10:16::12]:8080,[fd00:10:16::4]:8080
c429dc34-6b8c-4f80-9306-02d75a0f75d0 cluster-udp-load udp 10.96.0.3:53 10.16.0.17:53,10.16.0.24:53
894e091b-56b5-4066-8f8a-fb0a08072eb4 vpc-vpc1-tcp-loa tcp 10.96.0.3:53 192.168.0.3:53,192.168.0.6:53
tcp 10.96.0.3:9153 192.168.0.3:9153,192.168.0.6:9153
d2034a50-3e8f-43d7-8c87-2972a880aad7 vpc-vpc1-udp-loa udp 10.96.0.3:53 192.168.0.3:53,192.168.0.6:53