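1. In one environment, a virtual machine cannot send or receive packets once it uses DPDK
Symptoms:
Pinging the gateway from inside the VM fails.
For the port that ovs-vsctl list interface shows as backing the DPDK vhost Unix socket, ovs-tcpdump captures no packets at all, even though the OVS service is running normally and the flow tables are correct.
While the ping to the gateway keeps running (and failing) inside the VM, ifconfig reports no errors: the error counters stay at 0.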
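The image currently in use is: docker pull 599230270/kube-ovn:v1.12.0-dpdk-no-avx512
The image is built on Fedora and, as the no-avx512 tag indicates, it is built without AVX-512.
KubeVirt also comes from Red Hat, so my guess is that AVX-512 was left out to avoid known pitfalls.
Fedora is the upstream source of the commercial Red Hat Enterprise Linux distribution.
The image was built in a Fedora virtual machine with multi-queue NIC support, whereas my host runs CentOS. The two are fairly close.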
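The relationship between CentOS, Fedora, and Red Hat falls into three phases (diagrams omitted):
- before Red Hat Enterprise Linux was released;
- after Red Hat Enterprise Linux was released, from version 4.0 onward;
- after CentOS was discontinued and CentOS Stream was launched.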
Fedora is upstream of CentOS Stream, so binaries built on Fedora should also work on CentOS.
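1.1 Two environments run the same kube-ovn ovs-dpdk version, but the network fails in one of them.
Comparing the domain XML of the two VMs showed that the failing VM has no shared memory configured. DPDK vhost forwards packets through memory shared between the VM and OVS-DPDK, so without it the network does not work.
While debugging, I also found that a DPDK vhost Unix socket file behaves differently from an ordinary kernel-mode Unix socket. With vhost-user, the socket carries only control messages, and the packet data moves through shared memory, so kernel-based packet capture cannot see the data flowing over a DPDK vhost Unix socket.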
# XML of the working VM
<domain type='kvm' id='1'>
<name>default_vm-fedora-dpdk-1</name>
<uuid>83d8713e-a51a-5d06-ab4e-f11a670b5e98</uuid>
<metadata>
<kubevirt xmlns="http://kubevirt.io">
<uid/>
</kubevirt>
</metadata>
<memory unit='KiB'>4194304</memory>
<currentMemory unit='KiB'>4194304</currentMemory>
<memoryBacking>
<hugepages/>
<source type='memfd'/> # shared memory is configured here
</memoryBacking>
<vcpu placement='static'>4</vcpu>
<iothreads>1</iothreads>
<resource>
<partition>/machine</partition>
</resource>
<sysinfo type='smbios'>
<system>
<entry name='manufacturer'>KubeVirt</entry>
<entry name='product'>None</entry>
<entry name='uuid'>83d8713e-a51a-5d06-ab4e-f11a670b5e98</entry>
<entry name='family'>KubeVirt</entry>
</system>
</sysinfo>
<os>
<type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
<boot dev='hd'/>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
</features>
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>Broadwell</model>
<vendor>Intel</vendor>
<topology sockets='1' dies='1' cores='4' threads='1'/>
<feature policy='require' name='vme'/>
<feature policy='require' name='ss'/>
<feature policy='require' name='vmx'/>
<feature policy='require' name='f16c'/>
<feature policy='require' name='rdrand'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='arat'/>
<feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='umip'/>
<feature policy='require' name='arch-capabilities'/>
<feature policy='require' name='xsaveopt'/>
<feature policy='require' name='pdpe1gb'/>
<feature policy='require' name='abm'/>
<feature policy='require' name='skip-l1dfl-vmentry'/>
<feature policy='require' name='pschange-mc-no'/>
<numa>
<cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
</numa>
</cpu>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='file' device='disk' model='virtio-non-transitional'>
<driver name='qemu' type='qcow2' cache='none' error_policy='stop' discard='unmap'/>
<source file='/var/run/kubevirt-ephemeral-disks/disk-data/containerdisk/disk.qcow2' index='1'/>
<backingStore type='file' index='2'>
<format type='qcow2'/>
<source file='/var/run/kubevirt/container-disks/disk_0.img'/>
</backingStore>
<target dev='vda' bus='virtio'/>
<alias name='ua-containerdisk'/>
<address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
</disk>
<controller type='usb' index='0' model='none'>
<alias name='usb'/>
</controller>
<controller type='scsi' index='0' model='virtio-non-transitional'>
<alias name='scsi0'/>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</controller>
<controller type='virtio-serial' index='0' model='virtio-non-transitional'>
<alias name='virtio-serial0'/>
<address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
</controller>
<controller type='sata' index='0'>
<alias name='ide'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pcie-root'>
<alias name='pcie.0'/>
</controller>
<controller type='pci' index='1' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='1' port='0x10'/>
<alias name='pci.1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
</controller>
<controller type='pci' index='2' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='2' port='0x11'/>
<alias name='pci.2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
</controller>
<controller type='pci' index='3' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='3' port='0x12'/>
<alias name='pci.3'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
</controller>
<controller type='pci' index='4' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='4' port='0x13'/>
<alias name='pci.4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
</controller>
<controller type='pci' index='5' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='5' port='0x14'/>
<alias name='pci.5'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
</controller>
<controller type='pci' index='6' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='6' port='0x15'/>
<alias name='pci.6'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
</controller>
<controller type='pci' index='7' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='7' port='0x16'/>
<alias name='pci.7'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
</controller>
<controller type='pci' index='8' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='8' port='0x17'/>
<alias name='pci.8'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
</controller>
<controller type='pci' index='9' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='9' port='0x18'/>
<alias name='pci.9'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</controller>
<interface type='vhostuser'>
<mac address='00:00:00:2c:ac:b6'/>
<source type='unix' path='/var/run/vm/dpdk/pod6c270ef2f25' mode='server'/>
<model type='virtio-non-transitional'/>
<driver name='vhost' queues='4' rx_queue_size='1024' tx_queue_size='1024'/>
<alias name='ua-net1'/>
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
<serial type='unix'>
<source mode='bind' path='/var/run/kubevirt-private/5950bc0a-fde5-4b96-9958-e7ad9b87e11e/virt-serial0'/>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
<alias name='serial0'/>
</serial>
<console type='unix'>
<source mode='bind' path='/var/run/kubevirt-private/5950bc0a-fde5-4b96-9958-e7ad9b87e11e/virt-serial0'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<channel type='unix'>
<source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-default_vm-fedora-dp/org.qemu.guest_agent.0'/>
<target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
<alias name='channel0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
<input type='mouse' bus='ps2'>
<alias name='input0'/>
</input>
<input type='keyboard' bus='ps2'>
<alias name='input1'/>
</input>
<graphics type='vnc' socket='/var/run/kubevirt-private/5950bc0a-fde5-4b96-9958-e7ad9b87e11e/virt-vnc' sharePolicy='ignore'>
<listen type='socket' socket='/var/run/kubevirt-private/5950bc0a-fde5-4b96-9958-e7ad9b87e11e/virt-vnc'/>
</graphics>
<audio id='1' type='none'/>
<video>
<model type='vga' vram='16384' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
</video>
<memballoon model='virtio-non-transitional' freePageReporting='off'>
<stats period='10'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+107:+107</label>
<imagelabel>+107:+107</imagelabel>
</seclabel>
</domain>
# XML of the failing VM
<domain type='kvm' id='1'>
<name>default_vm-dpdk</name>
<uuid>662f44f6-ba16-5d08-95e7-75b1ca2ff10e</uuid>
<metadata>
<kubevirt xmlns="http://kubevirt.io">
<uid/>
</kubevirt>
</metadata>
<memory unit='KiB'>2097152</memory>
<currentMemory unit='KiB'>2097152</currentMemory>
<memoryBacking>
<hugepages/> # no shared memory is configured, so packet forwarding cannot use shared memory
</memoryBacking>
<vcpu placement='static'>2</vcpu>
<iothreads>1</iothreads>
<resource>
<partition>/machine</partition>
</resource>
<sysinfo type='smbios'>
<system>
<entry name='manufacturer'>KubeVirt</entry>
<entry name='product'>None</entry>
<entry name='uuid'>662f44f6-ba16-5d08-95e7-75b1ca2ff10e</entry>
<entry name='family'>KubeVirt</entry>
</system>
</sysinfo>
<os>
<type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
<boot dev='hd'/>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
</features>
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>Cascadelake-Server</model>
<vendor>Intel</vendor>
<topology sockets='1' dies='1' cores='2' threads='1'/>
<feature policy='require' name='ss'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='umip'/>
<feature policy='require' name='pku'/>
<feature policy='require' name='md-clear'/>
<feature policy='require' name='stibp'/>
<feature policy='require' name='arch-capabilities'/>
<feature policy='require' name='xsaves'/>
<feature policy='require' name='rdctl-no'/>
<feature policy='require' name='ibrs-all'/>
<feature policy='require' name='skip-l1dfl-vmentry'/>
<feature policy='require' name='mds-no'/>
<feature policy='require' name='pschange-mc-no'/>
<feature policy='disable' name='erms'/>
<feature policy='disable' name='avx512vnni'/>
<feature policy='disable' name='xgetbv1'/>
<feature policy='disable' name='pdpe1gb'/>
<feature policy='disable' name='mpx'/>
</cpu>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='block' device='disk' model='virtio-non-transitional'>
<driver name='qemu' type='raw' cache='none' error_policy='stop' io='native' discard='unmap'/>
<source dev='/dev/root-device' index='2'/>
<backingStore/>
<target dev='vda' bus='virtio'/>
<alias name='ua-root-device'/>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</disk>
<disk type='file' device='disk' model='virtio-non-transitional'>
<driver name='qemu' type='raw' cache='none' error_policy='stop' discard='unmap'/>
<source file='/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/vm-dpdk/configdrive.iso' index='1'/>
<backingStore/>
<target dev='vdb' bus='virtio'/>
<alias name='ua-cloudinitdisk'/>
<address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
</disk>
<controller type='usb' index='0' model='none'>
<alias name='usb'/>
</controller>
<controller type='scsi' index='0' model='virtio-non-transitional'>
<alias name='scsi0'/>
<address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</controller>
<controller type='virtio-serial' index='0' model='virtio-non-transitional'>
<alias name='virtio-serial0'/>
<address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</controller>
<controller type='sata' index='0'>
<alias name='ide'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
</controller>
<controller type='pci' index='0' model='pcie-root'>
<alias name='pcie.0'/>
</controller>
<controller type='pci' index='1' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='1' port='0x10'/>
<alias name='pci.1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
</controller>
<controller type='pci' index='2' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='2' port='0x11'/>
<alias name='pci.2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
</controller>
<controller type='pci' index='3' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='3' port='0x12'/>
<alias name='pci.3'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
</controller>
<controller type='pci' index='4' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='4' port='0x13'/>
<alias name='pci.4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
</controller>
<controller type='pci' index='5' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='5' port='0x14'/>
<alias name='pci.5'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
</controller>
<controller type='pci' index='6' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='6' port='0x15'/>
<alias name='pci.6'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
</controller>
<controller type='pci' index='7' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='7' port='0x16'/>
<alias name='pci.7'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
</controller>
<controller type='pci' index='8' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='8' port='0x17'/>
<alias name='pci.8'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x7'/>
</controller>
<interface type='vhostuser'>
<mac address='00:00:00:9d:24:bf'/>
<source type='unix' path='/var/run/kubevirt-private/dpdk01/sock' mode='server'/>
<model type='virtio-non-transitional'/>
<driver>
<host mrg_rxbuf='off'/>
</driver>
<alias name='ua-dpdk01'/>
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
<interface type='vhostuser'>
<mac address='00:00:00:fa:29:11'/>
<source type='unix' path='/var/run/kubevirt-private/dpdk02/sock' mode='server'/>
<model type='virtio-non-transitional'/>
<driver>
<host mrg_rxbuf='off'/>
</driver>
<alias name='ua-dpdk02'/>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</interface>
<serial type='unix'>
<source mode='bind' path='/var/run/kubevirt-private/ca990405-9b47-4c8f-bc3a-cb63a4700980/virt-serial0'/>
<log file='/var/run/kubevirt-private/ca990405-9b47-4c8f-bc3a-cb63a4700980/virt-consolelog0.log' append='off'/>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
<alias name='serial0'/>
</serial>
<console type='unix'>
<source mode='bind' path='/var/run/kubevirt-private/ca990405-9b47-4c8f-bc3a-cb63a4700980/virt-serial0'/>
<log file='/var/run/kubevirt-private/ca990405-9b47-4c8f-bc3a-cb63a4700980/virt-consolelog0.log' append='off'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<channel type='unix'>
<source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-default_vm-dpdk/org.qemu.guest_agent.0'/>
<target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
<alias name='channel0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
<input type='mouse' bus='ps2'>
<alias name='input0'/>
</input>
<input type='keyboard' bus='ps2'>
<alias name='input1'/>
</input>
<graphics type='vnc' socket='/var/run/kubevirt-private/ca990405-9b47-4c8f-bc3a-cb63a4700980/virt-vnc'>
<listen type='socket' socket='/var/run/kubevirt-private/ca990405-9b47-4c8f-bc3a-cb63a4700980/virt-vnc'/>
</graphics>
<audio id='1' type='none'/>
<video>
<model type='vga' vram='16384' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
</video>
<memballoon model='virtio-non-transitional'>
<stats period='10'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+107:+107</label>
<imagelabel>+107:+107</imagelabel>
</seclabel>
</domain>
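Comparing the two dumps, the difference that matters for vhost-user is confined to the memoryBacking element. As a minimal sketch, assuming the failing VM should simply mirror the working one, its block would look like the following. Only elements already present in the working dump are used; how KubeVirt is made to emit the source line is not covered here.

```xml
<!-- Sketch: memoryBacking of the failing VM after adding the shared-memory
     source, copied from the working VM's dump. -->
<memoryBacking>
  <hugepages/>
  <!-- memfd-backed guest memory can be handed to the vhost-user backend
       (OVS-DPDK) as a file descriptor -->
  <source type='memfd'/>
</memoryBacking>
```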
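libvirt supports three memory backing source types: file, anonymous, and memfd. They differ as follows:
- file: guest memory is backed by a file. The memory can be shared through that file.
- anonymous: the default type. Guest memory is backed by anonymous memory, which cannot be shared directly with another process.
- memfd: guest memory is created with the Linux memfd mechanism and can be shared by passing the file descriptor.
Which type to choose depends on the requirements and the environment. For a vhost-user interface, the guest memory has to be shareable with the OVS-DPDK process, which is why the working VM uses memfd.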
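For comparison, libvirt also lets the access mode be stated explicitly. The fragment below is a sketch based on the libvirt domain XML format, not a configuration taken from either environment. It assumes file-backed memory combined with an explicit shared access mode, and it has not been tested against the kube-ovn setup described here.

```xml
<!-- Sketch (assumption, untested in this setup): file-backed guest memory
     with shared access requested explicitly. -->
<memoryBacking>
  <hugepages/>
  <source type='file'/>
  <access mode='shared'/>
</memoryBacking>
```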