Analysis of OVN chassis-related test cases


An analysis of how chassis are used, based on the test cases in ovn/tests/ovn.at

1. OVN gw chassis probe each other with BFD


   # ovn_wait_for_bfd_up HV
   # BFD might be quite slow. While BFD is not up, all chassis will fight to claim the port
   # Wait for BFD between different chassis to be up
   ovn_wait_for_bfd_up() {
     for hv; do
       as $hv
       for chassis; do
         if test $hv != $chassis; then
             echo "checking bfd_status for $hv -> $chassis"
             OVS_WAIT_UNTIL([
                 bfd_status=$(as $hv ovs-vsctl get interface ovn-$chassis-0 bfd_status:state)
                 echo "bfd status = $bfd_status"
                 test "$bfd_status" = "up"
             ])
         fi
       done
     done
   }
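
The helper above relies on the OVS_WAIT_UNTIL Autotest macro to poll until the BFD state becomes "up". As a rough illustration of that polling pattern outside the test suite, here is a minimal sketch in plain shell. The function name wait_until, its arguments, and the interface name ovn-gw1-0 in the commented usage are assumptions made for this example; they are not part of the OVN test harness.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or the tries run out.
# Usage: wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    max_tries=$1
    delay=$2
    shift 2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$@"; then
            return 0              # condition met
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1                      # gave up: condition never became true
}

# Possible usage (assumes ovs-vsctl and a tunnel interface named ovn-gw1-0):
# bfd_is_up() { test "$(ovs-vsctl get interface "$1" bfd_status:state)" = "up"; }
# wait_until 30 1 bfd_is_up ovn-gw1-0
```

The bounded retry count matters for the reason given in the test comment: BFD can be slow to come up, so the caller has to wait, but it should still fail eventually instead of hanging.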

The same PR also covers test cases for a number of other features:

tests: fix multiple flaky test cases

- ovn-controller incremental processing
- ovn-ic -- port-bindings deletion upon TS deletion
- IP packet buffering
- ovn-controller-vtep - hv flows
- external logical port
- ovn-northd pause and resume
- ACLs on Port Groups
- DSCP marking and meter check
- Stateless Floating IP
- ovn-controller - check meters update
- conflict ACLs with address
- 1 LR with HA distributed router gateway port
- controller event
- basic connectivity with multiple requested-chassis

github.com/ovn-org/ovn…

2. A single switch can connect to two physical networks through two localnet ports

# In this test case we create a single switch connected to two physical
# networks via two localnet ports. Then we create two hypervisors, with 2
# ports on each. The test validates no interconnectivity between VIF ports
# located on chassis plugged to different physical networks.

# create the single switch with two localnet ports
ovn-nbctl ls-add ls1
for tag in 10 20; do
    ln_port_name=ln-$tag
    ovn-nbctl lsp-add ls1 $ln_port_name "" $tag
    ovn-nbctl lsp-set-addresses $ln_port_name unknown
    ovn-nbctl lsp-set-type $ln_port_name localnet
    ovn-nbctl lsp-set-options $ln_port_name network_name=phys-$tag
done

3. When an lrp has its gateway chassis set on a node, that node answers ARP requests itself and suppresses them instead of forwarding them to the lrp

# ARP request from VTEP to LRP should be responded by ARP responder.
sha=f00000000003
spa=`ip_to_hex 192 168 1 2`
tpa=`ip_to_hex 192 168 1 1`
request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa}
as hv3 ovs-appctl netdev-dummy/receive vif3 $request
echo $request >> 1.expected
echo $request >> 2.expected

lrpmac=f000000000f1
response=${sha}${lrpmac}08060001080006040002${lrpmac}${tpa}${sha}${spa}
# since lrp1 has gateway chassis set on hv1, hv1 will suppress arp request and
# answer with arp reply by OVS directly to vtep lport. all other lports,
# except lport from which this request was initiated, will receive arp request.
# we expect arp reply packet on hv3
echo $response >> 3.expected
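
The request and response strings above are plain Ethernet + ARP frames written as hex. To make the field layout easier to see, here is a small sketch that rebuilds both strings from their parts. The arp_frame function is a hypothetical helper written for this note; ip_to_hex is redefined here with the same intent as the helper used in the excerpt, so that the block runs on its own.

```shell
#!/bin/sh
# Four decimal octets in, eight hex digits out.
ip_to_hex() {
    printf '%02x%02x%02x%02x' "$@"
}

# Hypothetical helper: build an Ethernet + ARP frame as a hex string.
# Args: eth_dst eth_src opcode(1=request, 2=reply) sha spa tha tpa
arp_frame() {
    # 0806 = EtherType ARP, 0001 = hardware type Ethernet, 0800 = IPv4,
    # 06 and 04 = hardware and protocol address lengths
    printf '%s%s0806000108000604%04x%s%s%s%s' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

sha=f00000000003
lrpmac=f000000000f1
spa=$(ip_to_hex 192 168 1 2)
tpa=$(ip_to_hex 192 168 1 1)

# Broadcast request from the VTEP side asking for 192.168.1.1
request=$(arp_frame ffffffffffff "$sha" 1 "$sha" "$spa" ffffffffffff "$tpa")
# Unicast reply sent back on behalf of the lrp
response=$(arp_frame "$sha" "$lrpmac" 2 "$lrpmac" "$tpa" "$sha" "$spa")

echo "$request"
echo "$response"
```

The only differences between the two frames are the opcode (0001 versus 0002) and the swapped sender and target fields, which is what the expected files in the test compare.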



4. The hypervisors' ARP tables are pre-populated


# Pre-populate the hypervisors' ARP tables so that we don't lose any
# packets for ARP resolution (native tunneling doesn't queue packets
# for ARP resolution).
OVN_POPULATE_ARP

ovn-nbctl create Logical_Router name=R1
ovn-nbctl create Logical_Router name=R2 options:chassis="hv2"

5. GARPs for NAT IPs are sent on the localnet



OVN_FOR_EACH_NORTHD([
AT_SETUP([send gratuitous arp for nat ips in localnet])
ovn_start
# Create logical switch
ovn-nbctl ls-add ls0
# Create gateway router
ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
# Add router port to gateway router
ovn-nbctl lrp-add lr0 lrp0 f0:00:00:00:00:01 192.168.0.1/24
ovn-nbctl lsp-add ls0 lrp0-rp -- set Logical_Switch_Port lrp0-rp \
    type=router options:router-port=lrp0 addresses='"f0:00:00:00:00:01"'
# Add nat-address option
ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="f0:00:00:00:00:01 192.168.0.2"

# Interesting: the lsp that is patched to the lrp can be configured with a NAT address (nat-addresses). I am not sure why.

# I also do not quite understand why the NAT address can lead to a race condition.

# Temporarily remove nat-addresses option to avoid race conditions
# due to GARP backoff
ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=""
# Let's use gw router port now
hv1_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv1)
ovn-nbctl remove logical_router lr0 options chassis
ovn-nbctl lrp-set-gateway-chassis lrp0 hv1 20
OVS_WAIT_UNTIL([
    cr_lrp0_ch=$(ovn-sbctl --bare --columns chassis list port_binding cr-lrp0)
    test "$cr_lrp0_ch" = $hv1_uuid
])
ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="f0:00:00:00:00:03 192.168.0.3"

reset_pcap_file snoopvif hv1/snoopvif
OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 140])

3. An l3 gateway sends GARP (gratuitous ARP) only from the selected chassis (on the localnet)


OVN_FOR_EACH_NORTHD([
AT_SETUP([send gratuitous arp for l3gateway only on selected chassis])
AT_SKIP_IF([test $HAVE_TCPDUMP = no])
ovn_start

# Create logical switch
ovn-nbctl ls-add ls0
# Create gateway router
ovn-nbctl lr-add lr0
# Add router port to gateway router
ovn-nbctl lrp-add lr0 lr0-ls0 f0:00:00:00:00:01 192.168.0.1/24
ovn-nbctl lsp-add ls0 ls0-lr0 -- set Logical_Switch_Port ls0-lr0 \
    type=router options:router-port=lr0-ls0 addresses='"f0:00:00:00:00:01"'

# Create a localnet port.
ovn-nbctl lsp-add ls0 ln_port
ovn-nbctl lsp-set-addresses ln_port unknown
ovn-nbctl lsp-set-type ln_port localnet
ovn-nbctl --wait=hv lsp-set-options ln_port network_name=physnet1

# Prepare packets
touch empty_expected
echo "fffffffffffff0000000000108060001080006040001f00000000001c0a80001000000000000c0a80001" > arp_expected

4. SNAT can reuse the lrp IP

OVN_FOR_EACH_NORTHD([
AT_SETUP([send gratuitous arp with nat-addresses router in localnet])
ovn_start
# Create logical switch
ovn-nbctl ls-add ls0
# Create gateway router
ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
# Add router port to gateway router
ovn-nbctl lrp-add lr0 lrp0 f0:00:00:00:00:01 192.168.0.1/24
# The lrp IP 192.168.0.1 here is reused by the snat rule below
ovn-nbctl lsp-add ls0 lrp0-rp -- set Logical_Switch_Port lrp0-rp \
    type=router options:router-port=lrp0 addresses='"f0:00:00:00:00:01"'
# Add nat-address option
ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"

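# Another open question: why can nat-addresses be something other than an IP?
# (As far as I can tell from the ovn-nb documentation, the special value "router"
# asks OVN to send GARPs for the NAT external IPs and load balancer VIPs defined
# on the router, which would match the NAT and LB rules that follow.)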

# Add NAT rules
# The lrp's underlay IP 192.168.0.1 is used here
AT_CHECK([ovn-nbctl lr-nat-add lr0 snat 192.168.0.1 10.0.0.0/24])
AT_CHECK([ovn-nbctl lr-nat-add lr0 dnat 192.168.0.2 10.0.0.1])
# Add load balancers
AT_CHECK([ovn-nbctl lb-add lb0 192.168.0.3:80 10.0.0.2:80,10.0.0.3:80])
AT_CHECK([ovn-nbctl lr-lb-add lr0 lb0])
AT_CHECK([ovn-nbctl lb-add lb1 192.168.0.3:8080 10.0.0.2:8080,10.0.0.3:8080])
AT_CHECK([ovn-nbctl lr-lb-add lr0 lb1])

5. The gw chassis of an lrp are maintained through an HA chassis group


# Wait for BFD to be up, then for ovn-controller to handle that change
ovn_wait_for_bfd_up hv1 gw1 gw2
check ovn-nbctl --wait=hv sync

test_ip_packet gw1 gw2 0

AT_CHECK(
  [ovn-nbctl --wait=hv \
    --id=@gc0 create Gateway_Chassis name=alice_gw1 \
                                     chassis_name=gw1 \
                                     priority=10 -- \
    --id=@gc1 create Gateway_Chassis name=alice_gw2 \
                                     chassis_name=gw2 \
                                     priority=20 -- \
    set Logical_Router_Port alice 'gateway_chassis=[@gc0,@gc1]' | uuidfilt], 0,
  [<0>
<1>
])

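An lrp can map to a list of chassis; each chassis in the list is configured manually with a priority.

BFD sessions are established between ordinary nodes and the gateway chassis; the BFD link-detection result can be used to exclude a failed gateway chassis.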
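
To make the interaction between priority and BFD concrete, here is a deliberately simplified model in shell. It is not ovn-controller's actual selection code: the function name pick_active_gw and the "<chassis> <priority> <bfd_state>" input format are assumptions made for this sketch. It only captures the idea that a higher priority wins among the chassis still seen as up.

```shell
#!/bin/sh
# Hypothetical model: read lines of "<chassis> <priority> <bfd_state>" on
# stdin and print the chassis with the highest priority among those whose
# BFD state is "up". Prints an empty line if no chassis is up.
pick_active_gw() {
    best_name=""
    best_prio=-1
    while read -r name prio state; do
        [ "$state" = "up" ] || continue       # skip chassis seen as failed
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    printf '%s\n' "$best_name"
}

# Priorities as in the excerpt: gw1=10, gw2=20. gw2 wins while it is up.
printf 'gw1 10 up\ngw2 20 up\n' | pick_active_gw
# If BFD to gw2 goes down, the choice falls back to gw1.
printf 'gw1 10 up\ngw2 20 down\n' | pick_active_gw
```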

6. Using an HA chassis group


AS_BOX([Test the GARP for the router port ip - 192.168.1.1])
ovn-nbctl --wait=sb ha-chassis-group-add hagrp1

as hv1 reset_pcap_file hv1-vif1 hv1/vif1
as hv2 reset_pcap_file br-ex_n2 hv2/br-ex_n2
as hv4 reset_pcap_file br-ex_n2 hv4/br-ex_n2

ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 hv2 30
ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp1 hv4 20

hagrp1_uuid=`ovn-nbctl --bare --columns _uuid find ha_chassis_group name=hagrp1`
ovn-nbctl lrp-del-gateway-chassis alice hv2
ovn-nbctl --wait=sb set logical_router_port alice ha_chassis_group=$hagrp1_uuid

# When hv2 claims the gw router port cr-alice, it should send out
# GARP for 192.168.1.1 and it should be received by foo1 on hv1.

# XXX: hv2 has bound this lrp, so the ovn-controller on that node sends the GARP
# XXX: sent by hv2, received on hv1
AS_BOX([foo1 (on hv1) should receive GARP without VLAN tag])
exp_garp_on_foo1="ffffffffffff00000101020308060001080006040001000001010203c0a80101000000000000c0a80101"
echo $exp_garp_on_foo1 > foo1.expout

# XXX: the packet sent out by hv2 carries a VLAN tag
AS_BOX([ovn-controller on hv2 should send garp with VLAN tag])
sent_garp="ffffffffffff0000010102038100000208060001080006040001000001010203c0a80101000000000000c0a80101"
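
The two hex strings above differ only by the 802.1Q tag that sits right after the two MAC addresses. The sketch below shows that relationship by inserting the tag into the untagged frame. add_vlan_tag is a hypothetical helper for this note, and VLAN ID 2 is simply read off the tagged string in the excerpt.

```shell
#!/bin/sh
# Hypothetical helper: insert an 802.1Q tag into an untagged Ethernet frame
# given as a hex string. Args: frame_hex vlan_id (decimal, 0-4095)
add_vlan_tag() {
    macs=$(printf '%s\n' "$1" | cut -c1-24)   # dst MAC + src MAC (12 bytes)
    rest=$(printf '%s\n' "$1" | cut -c25-)    # EtherType and payload
    # 8100 = 802.1Q TPID; the next 16 bits are the TCI (priority bits left 0)
    printf '%s8100%04x%s\n' "$macs" "$2" "$rest"
}

untagged="ffffffffffff00000101020308060001080006040001000001010203c0a80101000000000000c0a80101"
add_vlan_tag "$untagged" 2
```

The result should match the sent_garp value in the excerpt, while foo1 on hv1 sees the same frame with the tag removed.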


7. A port on a VLAN underlay switch can also be bound directly to an HCG (HA chassis group)


ovn_start
net_add n1

check ovs-vsctl add-br br-phys
check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
ovn_attach n1 br-phys 192.168.0.1

check ovn-nbctl ls-add ls

# create topology to allow to talk from localport through localnet to external port

check ovn-nbctl lsp-add ls lp
check ovn-nbctl lsp-set-addresses lp "00:00:00:00:00:01 10.0.0.1"
check ovn-nbctl lsp-set-type lp localport
check ovs-vsctl add-port br-int lp -- set Interface lp external-ids:iface-id=lp

check ovn-nbctl --wait=sb ha-chassis-group-add hagrp
check ovn-nbctl --wait=sb ha-chassis-group-add-chassis hagrp main 10
check ovn-nbctl lsp-add ls lext
check ovn-nbctl lsp-set-addresses lext "00:00:00:00:00:02 10.0.0.2"
check ovn-nbctl lsp-set-type lext external
hagrp_uuid=`ovn-nbctl --bare --columns _uuid find ha_chassis_group name=hagrp`
check ovn-nbctl set logical_switch_port lext ha_chassis_group=$hagrp_uuid

# Here the lsp is bound directly to the HCG


Clearing the HCG from an lsp


# now disown both external ports, one by moving to another (non-existing)
# chassis, another by removing the port from any ha groups
check ovn-nbctl --wait=sb ha-chassis-group-add fake_hagrp
fake_hagrp_uuid=`ovn-nbctl --bare --columns _uuid find ha_chassis_group name=fake_hagrp`
check ovn-nbctl set logical_switch_port lext ha_chassis_group=$fake_hagrp_uuid
check ovn-nbctl clear logical_switch_port lext2 ha_chassis_group
check ovn-nbctl --wait=hv sync

8. Creating an HA chassis group


OVN_POPULATE_ARP

ovn-nbctl create Logical_Router name=R1

# Connect inside to R1
ovn-nbctl lrp-add R1 inside 00:00:01:01:02:03 192.168.1.1/24
ovn-nbctl lsp-add inside rp-inside -- set Logical_Switch_Port rp-inside \
    type=router options:router-port=inside \
    -- lsp-set-addresses rp-inside router

# Connect outside to R1 as distributed router gateway port on gw1+gw2
ovn-nbctl lrp-add R1 outside 00:00:02:01:02:04 192.168.0.101/24

ovn-nbctl --id=@gc0 create Gateway_Chassis \
                    name=outside_gw1 chassis_name=gw1 priority=20 -- \
          --id=@gc1 create Gateway_Chassis \
                    name=outside_gw2 chassis_name=gw2 priority=10 -- \
          set Logical_Router_Port outside 'gateway_chassis=[@gc0,@gc1]'

ovn-nbctl lsp-add outside rp-outside -- set Logical_Switch_Port rp-outside \
    type=router options:router-port=outside \
    -- lsp-set-addresses rp-outside router

# Create localnet port in outside
ovn-nbctl lsp-add outside ln-outside
ovn-nbctl lsp-set-addresses ln-outside unknown
ovn-nbctl lsp-set-type ln-outside localnet
ovn-nbctl lsp-set-options ln-outside network_name=phys

# Allow some time for ovn-northd and ovn-controller to catch up.
wait_for_ports_up
check ovn-nbctl --wait=hv sync

echo "---------NB dump-----"
ovn-nbctl show
echo "---------------------"
ovn-nbctl list logical_router
echo "---------------------"
ovn-nbctl list logical_router_port
echo "---------------------"

echo "---------SB dump-----"
ovn-sbctl list datapath_binding
echo "---------------------"
ovn-sbctl list port_binding
echo "---------------------"
ovn-sbctl dump-flows
echo "---------------------"
ovn-sbctl list chassis
ovn-sbctl list encap
echo "---------------------"
echo "------ ha_Chassis_Group dump (SBDB) -------"
ovn-sbctl list HA_Chassis_Group
echo "------ ha_Chassis dump (SBDB) -------"
ovn-sbctl list HA_Chassis
echo "------ Port_Binding chassisredirect -------"
ovn-sbctl find Port_Binding type=chassisredirect
echo "-------------------------------------------"

# There should be one ha_chassis_group with the name "outside"
check_row_count HA_Chassis_Group 1 name=outside

# There should be 2 ha_chassis rows in SB DB.
check_row_count HA_Chassis 2 'chassis!=[[]]'

ha_ch=$(fetch_column HA_Chassis_Group ha_chassis)
check_column "$ha_ch" HA_Chassis _uuid

# Inspect BFD between the hv and gw chassis

for chassis in gw1 gw2 hv1 hv2; do
    as $chassis
    echo "------ $chassis dump ----------"
    ovs-ofctl show br-int
    ovs-ofctl dump-flows br-int
    echo "--------------------------"
done
bfd_dump() {
    for chassis in gw1 gw2 hv1 hv2; do
        as $chassis
        echo "------ $chassis dump (BFD)----"
        echo "BFD (from $chassis):"
        # dump BFD config and status to the other chassis
        for chassis2 in gw1 gw2 hv1 hv2; do
            if [[ "$chassis" != "$chassis2" ]]; then
                echo " -> $chassis2:"
                echo "   $(ovs-vsctl --bare --columns bfd,bfd_status find Interface name=ovn-$chassis2-0)"
            fi
        done
        echo "--------------------------"
    done
}

9. Changing the global chassis probe (BFD) intervals



ovn-nbctl --wait=hv set NB_Global . options:"bfd-min-rx"=2000
ovn-nbctl remove NB_Global . options "bfd-min-rx"
ovn-nbctl remove NB_Global . options "bfd-min-tx"
ovn-nbctl remove NB_Global . options "bfd-mult"
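
The three options touched above (bfd-min-rx, bfd-min-tx, bfd-mult) are BFD timer parameters, so they determine how quickly a dead gateway chassis is noticed. In BFD asynchronous mode (RFC 5880), the detection time is the remote detect multiplier times the larger of the local required min RX interval and the remote desired min TX interval. The sketch below only does that arithmetic; the function name is hypothetical, and apart from the 2000 ms value taken from the excerpt, the multiplier 3 and the 1000 ms TX interval are example inputs, not values read from OVN.

```shell
#!/bin/sh
# Hypothetical helper: BFD detection time in milliseconds.
# Args: remote_mult local_min_rx_ms remote_min_tx_ms
bfd_detect_time_ms() {
    interval=$2
    if [ "$3" -gt "$interval" ]; then
        interval=$3               # the slower of the two intervals applies
    fi
    echo $(( $1 * interval ))
}

# With bfd-min-rx=2000 as in the excerpt and example values for the rest:
bfd_detect_time_ms 3 2000 1000
```

Raising bfd-min-rx therefore makes failover slower but reduces probe traffic and the chance of false positives on a busy link.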

10. The priority of a chassis in an HA chassis group can be adjusted manually


# at this point, we invert the priority of the gw chassis between hv2 and hv3

ovn-nbctl --wait=hv \
          --id=@gc0 create Gateway_Chassis \
                    name=outside_gw1 chassis_name=hv2 priority=1 -- \
          --id=@gc1 create Gateway_Chassis \
                    name=outside_gw2 chassis_name=hv3 priority=10 -- \
          set Logical_Router_Port lrp0 'gateway_chassis=[@gc0,@gc1]'

# We expect not to receive garp on hv2 after inverting the priority.
# Hence  reset hv2 after inverting priority as otherwise a garp might
# be received on hv2 between the reset and the priority change.

as hv2 reset_pcap_file br-phys_n1 hv2/br-phys_n1
as hv3 reset_pcap_file br-phys_n1 hv3/br-phys_n1
as hv1 reset_pcap_file snoopvif hv1/snoopvif


After the priorities are swapped, the chassis that the lrp is bound to should switch over.

11. Binding an lsp directly to a chassis


# Allow only chassis hv1 to bind logical port lsp0.
check ovn-nbctl lsp-set-options lsp0 requested-chassis=hv1

as hv1 check ovs-vsctl -- add-port br-int lsp0 -- \
    set Interface lsp0 external-ids:iface-id=lsp0
as hv2 check ovs-vsctl -- add-port br-int lsp0 -- \
    set Interface lsp0 external-ids:iface-id=lsp0

wait_row_count Chassis 1 name=hv1
wait_row_count Chassis 1 name=hv2
hv1_uuid=$(fetch_column Chassis _uuid name=hv1)
hv2_uuid=$(fetch_column Chassis _uuid name=hv2)

wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
wait_column "$hv1_uuid" Port_Binding requested_chassis logical_port=lsp0
wait_column "" Port_Binding additional_chassis logical_port=lsp0
wait_column "" Port_Binding requested_additional_chassis logical_port=lsp0

# Request port binding at an additional chassis
check ovn-nbctl lsp-set-options lsp0 \
    requested-chassis=hv1,hv2

wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
wait_column "$hv1_uuid" Port_Binding requested_chassis logical_port=lsp0
wait_column "$hv2_uuid" Port_Binding additional_chassis logical_port=lsp0
wait_column "$hv2_uuid" Port_Binding requested_additional_chassis logical_port=lsp0

# Check ovn-installed updated for both chassis
wait_for_ports_up

for hv in hv1 hv2; do
    OVS_WAIT_UNTIL([test `as $hv ovs-vsctl get Interface lsp0 external_ids:ovn-installed` = '"true"'])
done

# Check that setting iface:encap-ip populates Port_Binding:additional_encap
wait_row_count Encap 2 chassis_name=hv1
wait_row_count Encap 2 chassis_name=hv2
encap_hv1_uuid=$(fetch_column Encap _uuid chassis_name=hv1 type=geneve)
encap_hv2_uuid=$(fetch_column Encap _uuid chassis_name=hv2 type=geneve)

wait_column "" Port_Binding encap logical_port=lsp0
wait_column "" Port_Binding additional_encap logical_port=lsp0

as hv1 check ovs-vsctl -- \
    set Interface lsp0 external-ids:encap-ip=192.168.0.11
as hv2 check ovs-vsctl -- \
    set Interface lsp0 external-ids:encap-ip=192.168.0.12

wait_column "$encap_hv1_uuid" Port_Binding encap logical_port=lsp0
wait_column "$encap_hv2_uuid" Port_Binding additional_encap logical_port=lsp0

# Complete moving the binding to the new location
check ovn-nbctl lsp-set-options lsp0 requested-chassis=hv2

wait_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
wait_column "$hv2_uuid" Port_Binding requested_chassis logical_port=lsp0
wait_column "" Port_Binding additional_chassis logical_port=lsp0
wait_column "" Port_Binding requested_additional_chassis logical_port=lsp0

# Check ovn-installed updated for main chassis and removed from additional chassis
wait_for_ports_up
OVS_WAIT_UNTIL([test `as hv2 ovs-vsctl get Interface lsp0 external_ids:ovn-installed` = '"true"'])
OVS_WAIT_UNTIL([test x`as hv1 ovs-vsctl get Interface lsp0 external_ids:ovn-installed` = x])

# Check that additional_encap is cleared
wait_column "" Port_Binding additional_encap logical_port=lsp0

# Check that abrupted port migration clears additional_encap
check ovn-nbctl lsp-set-options lsp0 \
    requested-chassis=hv2,hv1
wait_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
wait_column "$hv2_uuid" Port_Binding requested_chassis logical_port=lsp0
wait_column "$hv1_uuid" Port_Binding additional_chassis logical_port=lsp0
wait_column "$hv1_uuid" Port_Binding requested_additional_chassis logical_port=lsp0
check ovn-nbctl lsp-set-options lsp0 requested-chassis=hv2
wait_column "" Port_Binding additional_encap logical_port=lsp0

# Migration with some race conditions
check ovn-nbctl lsp-set-options lsp0 \
    requested-chassis=hv2,hv1
wait_column "$hv2_uuid" Port_Binding chassis logical_port=lsp0
wait_column "$hv2_uuid" Port_Binding requested_chassis logical_port=lsp0
wait_column "$hv1_uuid" Port_Binding additional_chassis logical_port=lsp0
wait_column "$hv1_uuid" Port_Binding requested_additional_chassis logical_port=lsp0

# Check ovn-installed updated for both chassis
wait_for_ports_up

for hv in hv1 hv2; do
    OVS_WAIT_UNTIL([test `as $hv ovs-vsctl get Interface lsp0 external_ids:ovn-installed` = '"true"'])
done

# Complete moving the binding to the new location
sleep_controller hv2
check ovn-nbctl lsp-set-options lsp0 requested-chassis=hv1

wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
wait_column "$hv1_uuid" Port_Binding requested_chassis logical_port=lsp0
wait_column "" Port_Binding additional_chassis logical_port=lsp0
wait_column "" Port_Binding requested_additional_chassis logical_port=lsp0
wake_up_controller hv2
# Check ovn-installed updated for main chassis and removed from additional chassis
wait_for_ports_up
OVS_WAIT_UNTIL([test `as hv1 ovs-vsctl get Interface lsp0 external_ids:ovn-installed` = '"true"'])
OVS_WAIT_UNTIL([test x`as hv2 ovs-vsctl get Interface lsp0 external_ids:ovn-installed` = x])

OVN_CLEANUP([hv1],[hv2])

AT_CLEANUP
])

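An lsp can be requested on more than one chassis at a time. My reading is that this is for redundancy and high availability, although the comments in the test itself describe it in terms of port migration: the first chassis in the list is the main one and the rest are additional chassis.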
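
The wait_column checks in the test show how a value such as requested-chassis=hv1,hv2 ends up in the Port_Binding: the first entry becomes chassis/requested_chassis and the remainder becomes additional_chassis/requested_additional_chassis. The sketch below mimics only that split on the option string. The two function names are hypothetical and exist just for this example.

```shell
#!/bin/sh
# Hypothetical helpers: split a requested-chassis value such as "hv1,hv2".
main_chassis() {
    printf '%s\n' "${1%%,*}"              # text before the first comma
}

additional_chassis() {
    case $1 in
        *,*) printf '%s\n' "${1#*,}" ;;   # everything after the first comma
        *)   printf '\n' ;;               # single chassis: nothing additional
    esac
}

main_chassis "hv1,hv2"
additional_chassis "hv1,hv2"
main_chassis "hv2"
```

Note that the order in the list matters: "hv2,hv1" makes hv2 the main chassis, which is exactly what the later steps of the test rely on.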

The requested-chassis of an lsp can of course also be switched manually.


# The test scenario will migrate Migrator port between hv1 and hv2 and check
# that connectivity to and from the port is functioning properly for both
# chassis locations. Connectivity will be checked for resources located at hv1
# (First) and hv2 (Second) as well as for hv3 (Third) that does not take part
# in port migration.
check ovn-nbctl lsp-set-options first requested-chassis=hv1
check ovn-nbctl lsp-set-options second requested-chassis=hv2
check ovn-nbctl lsp-set-options third requested-chassis=hv3

This is as far as I have gotten for now; to be continued.
