KubeVirt VM Live Migration LSP Options Fix
Issue Reference
Problem Statement
KubeVirt VM live migration fails with error:
ovs interface xxx is not ready after 30s
Root Cause: During consecutive migrations (e.g., A→B, then B→A), the code may read stale migration state from vmi.Status.MigrationState that belongs to the previous migration. This causes incorrect node detection and skipping of LSP migration options setup.
Evidence from Issue #6220 Logs
# Controller log shows node info from PREVIOUS migration:
status Scheduling, target Node k8s-worker-02, source Node k8s-worker-01
# Code incorrectly determines source == target:
VM pod migration setup skipped, source node: k8s-worker-01, target node: k8s-worker-01
User @SkalaNetworks confirmed: "Kube-OVN thinks the source pod is the destination and the destination pod the source. So it decides to do nothing."
Root Cause Analysis (CORRECTED)
Key Understanding: Where Does Stale Data Come From?
Misconception: I initially thought stale data came from vmiMigration.Status.MigrationState. This is INCORRECT.
Truth: Each migration creates a NEW VirtualMachineInstanceMigration object. The stale data comes from vmi.Status.MigrationState (the VMI object).
KubeVirt Data Flow
| Object | Field | When Populated |
|---|---|---|
| vmiMigration.Status.MigrationState | SourceNode/TargetNode | ONLY after migration completes (copied from vmi when IsFinal) |
| vmi.Status.MigrationState | SourceNode/TargetNode | During MigrationScheduled phase (handleTargetPodHandoff) |
| vmi.Status.MigrationState.MigrationUID | UID | Set when migration controller starts processing |
The Bug: Missing MigrationUID Validation
Master branch code reads from vmi.Status.MigrationState (correct source), BUT it doesn't validate that the MigrationUID matches the current migration:
// Master branch - Missing UID validation
if vmi.Status.MigrationState != nil {
	srcNodeName = vmi.Status.MigrationState.SourceNode // May be STALE from previous migration!
	targetNodeName = vmi.Status.MigrationState.TargetNode
}
Timeline of Bug:
- First migration A→B succeeds, vmi.Status.MigrationState contains {Source: A, Target: B, MigrationUID: uid1}
- Second migration B→A starts, new vmiMigration object created with uid2
- Before KubeVirt updates vmi.Status.MigrationState with new info...
- Kube-OVN reads vmi.Status.MigrationState → gets stale data from first migration!
- Code sees stale {Source: A, Target: B}, but VM is now on B!
- The stale source node (A) is the same node as the new migration's real target (A), so the code detects source == target → SKIP!
The Fix: MigrationUID Validation
// Fixed code - Validate MigrationUID before using state
if vmi.Status.MigrationState != nil && vmi.Status.MigrationState.MigrationUID == vmiMigration.UID {
	// Only use vmi.Status.MigrationState if MigrationUID matches current migration
	srcNodeName = vmi.Status.MigrationState.SourceNode
	targetNodeName = vmi.Status.MigrationState.TargetNode
}
This ensures we only use migration state that belongs to the current migration, not stale data from a previous one.
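The check can be isolated into a small helper that is easy to unit test. The following is a minimal sketch, assuming simplified stand-in types (MigrationState, VMI) and a hypothetical helper name (currentMigrationNodes); none of these are the actual KubeVirt or Kube-OVN definitions.

```go
package main

// MigrationState is a stand-in for KubeVirt's
// VirtualMachineInstanceMigrationState; only the fields used here are modelled.
type MigrationState struct {
	SourceNode   string
	TargetNode   string
	MigrationUID string
}

// VMI is a stand-in for the parts of vmi.Status this sketch reads.
type VMI struct {
	NodeName       string          // vmi.Status.NodeName
	MigrationState *MigrationState // vmi.Status.MigrationState
}

// currentMigrationNodes (hypothetical helper) returns the source and target
// node recorded on the VMI, but only when that state belongs to the migration
// identified by migrationUID. ok is false when the state is absent or stale.
func currentMigrationNodes(vmi *VMI, migrationUID string) (src, target string, ok bool) {
	state := vmi.MigrationState
	if state == nil || state.MigrationUID != migrationUID {
		return "", "", false
	}
	return state.SourceNode, state.TargetNode, true
}
```

Returning an explicit ok flag, rather than two empty strings, lets the caller tell "state is stale" apart from "state is current but the nodes are not filled in yet".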
Key Facts Summary
Fact 1️⃣: Each Migration Creates a NEW Migration Object ✅
- Every migration creates a NEW VirtualMachineInstanceMigration object
- The new object's Status.MigrationState is initially nil
- Does NOT inherit data from previous migrations
Fact 2️⃣: Source of Stale Data 🎯
Stale data comes from vmi.Status.MigrationState (the VMI object), NOT from vmiMigration.Status.MigrationState (the Migration object)!
Why?
- After the first migration succeeds, vmi.Status.MigrationState still contains {Source: A, Target: B, MigrationUID: uid1}
- If Kube-OVN reads vmi.Status.MigrationState after the second migration (B→A) starts but before KubeVirt updates the VMI status, it reads stale data from the previous migration!
Fact 3️⃣: Master Branch Already Reads from Correct Data Source ✅
Master branch already reads from vmi.Status.MigrationState (VMI object), not from vmiMigration.Status.MigrationState.
The Actual Bug 🐛
Master branch lacks MigrationUID validation:
// Master branch - Missing UID validation
if vmi.Status.MigrationState != nil {
	srcNodeName = vmi.Status.MigrationState.SourceNode // May be STALE from previous migration!
}
The Fix ✅
The fix-migration branch adds MigrationUID validation:
// Fixed - Validate MigrationUID
if vmi.Status.MigrationState != nil &&
	vmi.Status.MigrationState.MigrationUID == vmiMigration.UID {
	// Only use when UID matches
}
This ensures we only use state from the current migration, not stale data from a previous one.
Background
LSP Migration Options in OVN
OVN supports VM live migration through special LSP options:
- requested-chassis: Specifies which chassis (one or more) the port should be bound to
- activation-strategy=rarp: Triggers port activation on the target chassis when a RARP packet is received
Kube-OVN LSP Operations
| Function | Purpose | LSP Changes |
|---|---|---|
| SetLogicalSwitchPortMigrateOptions | Migration start | requested-chassis=src,target, activation-strategy=rarp |
| ResetLogicalSwitchPortMigrateOptions(failed=false) | Migration succeeded | requested-chassis=target (removes activation-strategy) |
| ResetLogicalSwitchPortMigrateOptions(failed=true) | Migration failed | requested-chassis=src (removes activation-strategy) |
| CleanLogicalSwitchPortMigrateOptions | Cleanup | Removes all migration options |
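The option values in the table can be written down as pure functions. This is only an illustrative sketch: migrateOptions and resetOptions are hypothetical helpers that compute the resulting option map, whereas the real Kube-OVN functions update the port in the OVN northbound database.

```go
package main

const (
	optRequestedChassis   = "requested-chassis"
	optActivationStrategy = "activation-strategy"
)

// migrateOptions (hypothetical) returns the LSP options for the start of a
// migration: the port may bind on source or target, and is activated on the
// target once a RARP packet is seen.
func migrateOptions(src, target string) map[string]string {
	return map[string]string{
		optRequestedChassis:   src + "," + target,
		optActivationStrategy: "rarp",
	}
}

// resetOptions (hypothetical) returns the LSP options once a migration has
// ended: the port is pinned to the target on success or back to the source on
// failure, and activation-strategy is no longer present.
func resetOptions(src, target string, failed bool) map[string]string {
	chassis := target
	if failed {
		chassis = src
	}
	return map[string]string{optRequestedChassis: chassis}
}
```

For example, migrateOptions("node1", "node2") yields requested-chassis=node1,node2 with activation-strategy=rarp, and resetOptions("node1", "node2", true) yields requested-chassis=node1 only.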
KubeVirt Migration Internals
Migration Phase Flow
MigrationPhaseUnset
↓
MigrationPending ← Migration object created
↓
MigrationScheduling ← Target pod scheduling starts
↓
MigrationScheduled ← Target pod scheduled, handleTargetPodHandoff() called
↓ *** vmi.Status.MigrationState.SourceNode/TargetNode SET HERE ***
MigrationPreparingTarget ← Target virt-handler preparing
↓
MigrationTargetReady ← Target ready for migration
↓
MigrationRunning ← QEMU migration in progress
↓
MigrationSucceeded/Failed ← Final state
*** vmiMigration.Status.MigrationState COPIED FROM vmi HERE ***
Key Data Structures
VirtualMachineInstance (VMI)
type VirtualMachineInstanceStatus struct {
	NodeName       string // Current node where VMI runs
	MigrationState *VirtualMachineInstanceMigrationState
}
type VirtualMachineInstanceMigrationState struct {
	SourceNode     string // Set in handleTargetPodHandoff()
	TargetNode     string // Set in handleTargetPodHandoff()
	TargetPod      string
	MigrationUID   types.UID
	StartTimestamp *metav1.Time
	EndTimestamp   *metav1.Time
	Completed      bool
	Failed         bool
	// ...
}
VirtualMachineInstanceMigration (VMIMigration)
type VirtualMachineInstanceMigrationStatus struct {
	Phase          VirtualMachineInstanceMigrationPhase
	MigrationState *VirtualMachineInstanceMigrationState // Copied from VMI when IsFinal()
}
Critical Timing Analysis
When is vmi.Status.MigrationState.SourceNode/TargetNode set?
In handleTargetPodHandoff() (migration.go#L1183-L1201):
func (c *Controller) handleTargetPodHandoff(migration, vmi, pod) error {
	vmiCopy.Status.MigrationState.TargetNode = pod.Spec.NodeName
	vmiCopy.Status.MigrationState.SourceNode = vmi.Status.NodeName
	// ...
}
This happens during the MigrationScheduled phase, once the target pod has been scheduled.
When is vmiMigration.Status.MigrationState populated?
In updateStatus() (migration.go#L548-L563):
func (c *Controller) updateStatus(migration, vmi, pods, syncError) error {
	if migration.IsFinal() {
		if vmi.IsMigrationSynchronized(migration) &&
			migration.UID == vmi.Status.MigrationState.MigrationUID {
			// ONLY copied when migration is FINAL (Succeeded/Failed)
			migrationCopy.Status.MigrationState = vmi.Status.MigrationState
		}
	}
}
Key Finding: vmiMigration.Status.MigrationState is ONLY populated after migration completes!
Data Availability Matrix
| Phase | vmi.Status.MigrationState | vmiMigration.Status.MigrationState |
|---|---|---|
| Pending | nil or stale | nil |
| Scheduling | nil or stale | nil |
| Scheduled | SET (handleTargetPodHandoff) | nil |
| PreparingTarget | SET | nil |
| TargetReady | SET | nil |
| Running | SET | nil |
| Succeeded/Failed | SET | COPIED from vmi |
Current Implementation Analysis
Current Code (Simplified Version)
func (c *Controller) handleAddOrUpdateVMIMigration(key string) error {
	vmiMigration, _ := c.config.KubevirtClient.VirtualMachineInstanceMigration(namespace).Get(...)
	if vmiMigration.Status.MigrationState == nil {
		return nil // Wait for next event
	}
	srcNodeName := vmiMigration.Status.MigrationState.SourceNode
	targetNodeName := vmiMigration.Status.MigrationState.TargetNode
	if srcNodeName == "" || targetNodeName == "" {
		return nil // Wait for next event
	}
	switch vmiMigration.Status.Phase {
	case MigrationScheduling:
		SetLogicalSwitchPortMigrateOptions(portName, srcNodeName, targetNodeName)
	case MigrationSucceeded:
		ResetLogicalSwitchPortMigrateOptions(portName, srcNodeName, targetNodeName, false)
	case MigrationFailed:
		ResetLogicalSwitchPortMigrateOptions(portName, srcNodeName, targetNodeName, true)
	}
}
Problem with Current Code
MigrationScheduling Phase Issue:
- At MigrationScheduling, vmiMigration.Status.MigrationState is nil (not yet copied from vmi)
- Even if not nil, SourceNode/TargetNode would be empty
- Code returns nil waiting for next event, but the data will NEVER be available until migration completes
- SetLogicalSwitchPortMigrateOptions will never be called during MigrationScheduling!
MigrationSucceeded/Failed Phase:
- At final phases, vmiMigration.Status.MigrationState IS available (copied from vmi)
- Reset operations will work correctly
Why PR #6066 Introduced Pod-based Approach
PR #6066 recognized that during early migration phases, node information is not available in vmiMigration.Status.MigrationState. The solution was to:
- Get sourceNode from vmi.Status.NodeName (VMI's current location)
- Get targetNode from the target Pod's Spec.NodeName (via label selector)
// PR #6066 approach (simplified)
case MigrationScheduling:
	vmi, _ := c.config.KubevirtClient.VirtualMachineInstance(namespace).Get(vmiName)
	sourceNode := vmi.Status.NodeName
	pods, _ := c.config.KubeClient.CoreV1().Pods(namespace).List(
		ListOptions{LabelSelector: "kubevirt.io/migration-job-name=" + migrationName})
	targetNode := pods[0].Spec.NodeName
	SetLogicalSwitchPortMigrateOptions(portName, sourceNode, targetNode)
Proposed Solutions
Option 1: Restore Pod-based Approach for MigrationScheduling
Pros:
- Proven to work (was in production via PR #6066)
- Gets correct node info at the right time
Cons:
- Additional API calls (get vmi, list pods)
- More complex code
Option 2: Use Different Phases
Instead of MigrationScheduling, use later phases where vmiMigration.Status.MigrationState is available:
| Phase | Action |
|---|---|
| MigrationScheduled or later | Set (if not already set) |
| MigrationSucceeded | Reset(false) |
| MigrationFailed | Reset(true) |
Problem: By MigrationScheduled, traffic should already be able to flow to target. Setting LSP options too late may cause packet loss.
Option 3: Query vmi.Status.MigrationState with MigrationUID Validation (Implemented)
This is the approach implemented in the current fix:
func (c *Controller) handleAddOrUpdateVMIMigration(key string) error {
	vmiMigration, _ := Get(...)
	vmi, _ := Get(vmiName)
	var srcNodeName, targetNodeName string
	// CRITICAL: Only use vmi.Status.MigrationState if MigrationUID matches current migration
	if vmi.Status.MigrationState != nil &&
		vmi.Status.MigrationState.MigrationUID == vmiMigration.UID {
		srcNodeName = vmi.Status.MigrationState.SourceNode
		targetNodeName = vmi.Status.MigrationState.TargetNode
	}
	switch vmiMigration.Status.Phase {
	case MigrationScheduling:
		// For early phases, get target from Pod if MigrationState not yet populated
		if srcNodeName == "" {
			srcNodeName = vmi.Status.NodeName // VMI's current location
		}
		pods, _ := List(LabelSelector: MigrationJobLabel=vmiMigration.UID)
		targetNode := pods[0].Spec.NodeName
		SetLogicalSwitchPortMigrateOptions(portName, srcNodeName, targetNode)
	case MigrationSucceeded:
		ResetLogicalSwitchPortMigrateOptions(..., false)
	case MigrationFailed:
		ResetLogicalSwitchPortMigrateOptions(..., true)
	}
}
Key Points:
- MigrationUID validation prevents using stale state from previous migrations
- During MigrationScheduling, uses Pod label selector to get target node
- Falls back to vmi.Status.NodeName if SourceNode not yet set
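The three key points combine into a single node-resolution step for the MigrationScheduling phase. The sketch below models that step in isolation, using hypothetical names (SchedulingInput, resolveSchedulingNodes) and plain strings in place of the KubeVirt and Kubernetes objects; returning an error when a node is still unknown is an assumption about how the caller would want to retry.

```go
package main

import "fmt"

// MigrationState is a stand-in for vmi.Status.MigrationState.
type MigrationState struct {
	SourceNode   string
	TargetNode   string
	MigrationUID string
}

// SchedulingInput (hypothetical) gathers what the controller has looked up
// before deciding which nodes to pass to SetLogicalSwitchPortMigrateOptions.
type SchedulingInput struct {
	MigrationUID  string          // vmiMigration.UID
	VMINodeName   string          // vmi.Status.NodeName
	VMIState      *MigrationState // vmi.Status.MigrationState, may be nil or stale
	TargetPodNode string          // Spec.NodeName of the target pod, "" if unknown
}

// resolveSchedulingNodes (hypothetical) picks the source and target node
// during MigrationScheduling.
func resolveSchedulingNodes(in SchedulingInput) (src, target string, err error) {
	// Trust the recorded source only when the state belongs to this migration.
	if s := in.VMIState; s != nil && s.MigrationUID == in.MigrationUID {
		src = s.SourceNode
	}
	// Otherwise fall back to where the VMI currently runs.
	if src == "" {
		src = in.VMINodeName
	}
	// The target comes from the target pod found via the label selector.
	target = in.TargetPodNode
	if src == "" || target == "" {
		return "", "", fmt.Errorf("migration %s: node info incomplete (source=%q, target=%q)",
			in.MigrationUID, src, target)
	}
	return src, target, nil
}
```

With a VMI running on node2, a stale state of {node1, node2, uid1}, a current migration uid2 and a target pod on node1, this resolves to source node2 and target node1, which is the B→A case that previously went wrong.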
Implementation Summary
The fix implements a multi-layer defense approach:
Layer 1: MigrationUID Validation (kubevirt.go)
Validate that vmi.Status.MigrationState.MigrationUID matches the current migration before using the state:
if vmi.Status.MigrationState != nil && vmi.Status.MigrationState.MigrationUID == vmiMigration.UID {
	// MigrationUID matches - safe to use this state
	migrationStateValid = true
	srcNodeName = vmi.Status.MigrationState.SourceNode
	targetNodeName = vmi.Status.MigrationState.TargetNode
} else {
	// MigrationUID mismatch - state is STALE from previous migration
	// Clean up any residual migrate options immediately
	CleanLogicalSwitchPortMigrateOptions(portName)
}
Layer 2: Stale State Cleanup (kubevirt.go)
When stale state is detected, immediately clean LSP migrate options to prevent inconsistencies:
if vmiMigrationUID != migrationUID {
	klog.Warningf("Migration %s - VMI MigrationState is STALE (UID mismatch), cleaning residual migrate options", key)
	c.OVNNbClient.CleanLogicalSwitchPortMigrateOptions(portName)
}
Layer 3: Conflict Detection (ovn-nb-logical_switch_port.go)
SetLogicalSwitchPortMigrateOptions now validates that no conflicting migration is in progress:
// Check 1: If requested-chassis has two nodes (migration in progress)
if src != "" && target != "" {
	if src != srcNodeName || target != targetNodeName {
		return fmt.Errorf("conflicting migrate options: current=%s,%s but trying to set %s,%s",
			src, target, srcNodeName, targetNodeName)
	}
}
// Check 2: If requested-chassis has single node (previous migration completed)
// The single node must equal the new migration's source
if currentChassis != "" && currentChassis != srcNodeName {
	return fmt.Errorf("inconsistent state: current requested-chassis=%s but new migration source=%s",
		currentChassis, srcNodeName)
}
Layer 4: Pod Deletion Cleanup (pod.go - existing)
When VM Pod is deleted, LSP migrate options are cleaned:
if isVMPod && c.config.EnableKeepVMIP {
	for _, port := range ports {
		c.OVNNbClient.CleanLogicalSwitchPortMigrateOptions(port.Name)
	}
}
Migration Lifecycle Markers
Clear log markers for debugging:
| Marker | Meaning |
|---|---|
| >>> [MIGRATION START] | New migration started (MigrationPending) |
| --- [MIGRATION PROGRESS] | Migration in progress (intermediate phases) |
| >>> [MIGRATION LSP SET] | Setting LSP migrate options |
| <<< [MIGRATION LSP RESET] | Resetting LSP migrate options |
| <<< [MIGRATION END] | Migration completed (Succeeded/Failed) |
Validation Matrix
| Scenario | requested-chassis | Action | Result |
|---|---|---|---|
| First migration | (empty) | Set node1,node2 | ✅ Success |
| Idempotent | node1,node2 | Set node1,node2 | ✅ Skip (already set) |
| Migration in progress | node1,node2 | Set node2,node1 | ❌ Error: conflicting |
| After migration (consistent) | node2 | Set node2,node1 | ✅ Success (source matches) |
| After migration (inconsistent) | node2 | Set node1,node3 | ❌ Error: inconsistent state |
Testing Considerations
- Unit Tests: Mock KubeVirt client responses for different phases
- E2E Tests:
- Single migration: A→B
- Consecutive migrations: A→B→A (key scenario for this bug)
- Failed migration recovery
- VM Pod force deletion and reschedule
How This Fix Solves Issue #6220
The Bug Scenario
1. Migration A (node1 → node2) succeeds
- vmi.Status.MigrationState = {Source: node1, Target: node2, UID: uid1}
- LSP requested-chassis = "node2"
2. Migration B (node2 → node1) starts immediately
- New vmiMigration created with uid2
- KubeVirt hasn't updated vmi.Status.MigrationState yet
- Kube-OVN reads STALE state: {Source: node1, Target: node2, UID: uid1}
3. OLD CODE: Uses stale data without validation
- source=node1, target=node2 (WRONG! should be node2, node1)
- Detects "source == target" incorrectly → SKIPS LSP setup!
4. RESULT: OVS interface not ready → migration fails
The Fix
1. MigrationUID Validation
- vmi.Status.MigrationState.MigrationUID (uid1) != vmiMigration.UID (uid2)
- Stale state detected!
2. Immediate Cleanup
- CleanLogicalSwitchPortMigrateOptions() called
- Removes stale requested-chassis
3. Continue with Current Migration
- Get source from vmi.Status.NodeName = node2 (correct!)
- Get target from Pod label selector = node1 (correct!)
- SetLogicalSwitchPortMigrateOptions(node2, node1) ← CORRECT!
4. RESULT: LSP options set correctly → migration succeeds
References
- KubeVirt migration controller: pkg/virt-controller/watch/migration/migration.go
- KubeVirt VMI types: staging/src/kubevirt.io/api/core/v1/types.go
- OVN LSP migration: OVN documentation on live migration
- PR #6066: Original Pod-based approach for timing issues