The Traditional Image Matching Pipeline


The core of the image matching task is to recover the geometric relationship (rotation and translation) between the two cameras that took the pictures.

Detecting feature points (using SIFT as an example)

num_features is the number of features to detect; create the feature detector sift_detector.

contrastThreshold is the contrast threshold used to filter out weak features in semi-uniform (low-contrast) regions: the larger the threshold, the fewer features the detector produces. edgeThreshold filters out edge-like features; its meaning differs from contrastThreshold in that a larger edgeThreshold filters out fewer features (i.e., retains more). Sometimes the detection thresholds need to be lowered, otherwise small images may not yield the expected number of features.

sift_detector = cv2.SIFT_create(num_features, contrastThreshold=-10000, edgeThreshold=-10000)

Extract the features. Each local feature consists of a keypoint (x, y, and possibly scale and orientation) in keypoints and a descriptor vector (128-D for SIFT) in descriptors.

keypoints, descriptors = ExtractSiftFeatures(images_dict[keys[0]], sift_detector, num_features)

Matching and filtering outliers

Correspondences can be found by brute-force matching the local features of the two images. Taking a simple image pair as an example: for each descriptor in one image, find the most similar descriptor in the other image. With crossCheck=True, we only keep bidirectional matches (i.e., the two features are each other's nearest neighbors, both from A to B and from B to A):

bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)

Compute the matching pairs

cv_matches = bf.match(descriptors_1, descriptors_2)

Brute-force matching includes many outliers. We can filter them with the RANSAC algorithm: OpenCV returns the fundamental matrix estimated after RANSAC, along with an inlier mask over the input matches. The result is visibly much cleaner, although it may still contain outliers.

cv2.findFundamentalMat: computes the fundamental matrix from corresponding points in the two images

F, inlier_mask = cv2.findFundamentalMat(cur_kp_1[matches[:, 0]], cur_kp_2[matches[:, 1]], cv2.USAC_MAGSAC, ransacReprojThreshold=0.25, confidence=0.99999, maxIters=10000)

F is the prediction we want.
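Before using F downstream, a quick structural sanity check can be worthwhile (this check is not part of the original pipeline, just a sketch): a valid fundamental matrix is 3×3, rank 2 (its smallest singular value is ~0), and defined only up to scale.

```python
import numpy as np

def FundamentalSanityCheck(F, tol=1e-6):
    '''Sketch of an optional sanity check: a valid fundamental matrix is
    3x3 and rank 2, i.e. its smallest singular value is ~0 relative to
    its largest.'''
    if F is None or F.shape != (3, 3):
        return False
    s = np.linalg.svd(F, compute_uv=False)
    return bool(s[2] / s[0] < tol)
```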

Evaluation method

Before computing the error, we first decompose the estimated fundamental matrix F into E, R, and T, where E is the essential matrix, R the rotation matrix, and T the translation vector.

E, R, T = ComputeEssentialMatrix(F, calib_dict[image_id_1].K, calib_dict[image_id_2].K, inlier_kp_1, inlier_kp_2)

ComputeEssentialMatrix recovers the relative rotation and translation between the two cameras from the estimated fundamental matrix and the corresponding points in the two images, using the cheirality check (cv2.recoverPose also reports the number of inliers that pass the check).

def ComputeEssentialMatrix(F, K1, K2, kp1, kp2):
    '''Compute the Essential matrix from the Fundamental matrix, given the calibration matrices. Note that we ask participants to estimate F, i.e., without relying on known intrinsics.'''

    # Warning! Old versions of OpenCV's RANSAC could return multiple F matrices, encoded as a single matrix size 6x3 or 9x3, rather than 3x3.
    # We do not account for this here, as the modern RANSACs do not do this:
    # https://opencv.org/evaluating-opencvs-new-ransacs
    assert F.shape[0] == 3, 'Malformed F?'

    # Use OpenCV's recoverPose to solve the cheirality check:
    # https://docs.opencv.org/4.5.4/d9/d0c/group__calib3d.html#gadb7d2dfcc184c1d2f496d8639f4371c0
    E = np.matmul(np.matmul(K2.T, F), K1).astype(np.float64)

    kp1n = NormalizeKeypoints(kp1, K1)
    kp2n = NormalizeKeypoints(kp2, K2)
    num_inliers, R, T, mask = cv2.recoverPose(E, kp1n, kp2n)

    return E, R, T
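The snippet above also calls a NormalizeKeypoints helper that is not listed in this post. A plausible sketch, assuming keypoints are pixel coordinates of shape (N, 2): it maps them into normalized camera coordinates by applying the inverse calibration matrix, which is what cv2.recoverPose expects when given an essential matrix.

```python
import numpy as np

def NormalizeKeypoints(keypoints, K):
    '''Hypothetical sketch: map pixel coordinates into normalized camera
    coordinates by applying K^-1 to homogeneous points.'''
    kp = np.asarray(keypoints, dtype=np.float64)
    kp_h = np.concatenate([kp, np.ones((kp.shape[0], 1))], axis=1)  # (N, 3)
    kp_n = (np.linalg.inv(K) @ kp_h.T).T
    return kp_n[:, :2] / kp_n[:, 2:]
```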

QuaternionFromMatrix converts a rotation matrix into a quaternion

q = QuaternionFromMatrix(R)
# Flatten T into a one-dimensional array
T = T.flatten()
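QuaternionFromMatrix is also not shown here. One common implementation is the symmetric-eigenvector method popularized by transformations.py, which returns a unit quaternion in (w, x, y, z) order; treat this as a sketch, not necessarily the author's exact helper:

```python
import numpy as np

def QuaternionFromMatrix(matrix):
    '''Sketch: rotation matrix -> unit quaternion (w, x, y, z), via the
    symmetric-eigenvector method.'''
    m = np.asarray(matrix, dtype=np.float64)
    m00, m01, m02 = m[0]
    m10, m11, m12 = m[1]
    m20, m21, m22 = m[2]
    # Symmetric 4x4 matrix whose dominant eigenvector is the quaternion.
    A = np.array([
        [m00 - m11 - m22, 0.0, 0.0, 0.0],
        [m01 + m10, m11 - m00 - m22, 0.0, 0.0],
        [m02 + m20, m12 + m21, m22 - m00 - m11, 0.0],
        [m21 - m12, m02 - m20, m10 - m01, m00 + m11 + m22]]) / 3.0
    w, V = np.linalg.eigh(A)  # eigh reads the lower triangle
    q = V[[3, 0, 1, 2], np.argmax(w)]  # reorder (x, y, z, w) -> (w, x, y, z)
    return -q if q[0] < 0.0 else q
```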

Obtain the ground-truth relative pose between this image pair

R1_gt, T1_gt = calib_dict[image_id_1].R, calib_dict[image_id_1].T.reshape((3, 1))
R2_gt, T2_gt = calib_dict[image_id_2].R, calib_dict[image_id_2].T.reshape((3, 1))
dR_gt = np.dot(R2_gt, R1_gt.T)
dT_gt = (T2_gt - np.dot(dR_gt, T1_gt)).flatten()
q_gt = QuaternionFromMatrix(dR_gt)
q_gt = q_gt / (np.linalg.norm(q_gt) + eps)

Given the ground truth and the prediction, compute the error for the example above

err_q, err_t = ComputeErrorForOneExample(q_gt, dT_gt, q, T, scaling_dict[scene])

ComputeErrorForOneExample returns two errors: a rotation error and a translation error. ComputeMaa combines them at different thresholds to compute the mean Average Accuracy.

def ComputeErrorForOneExample(q_gt, T_gt, q, T, scale):
    '''Compute the error metric for a single example.

    The function returns two errors, over rotation and translation. These are combined at different thresholds by ComputeMaa in order to compute the mean Average Accuracy.'''

    q_gt_norm = q_gt / (np.linalg.norm(q_gt) + eps)
    q_norm = q / (np.linalg.norm(q) + eps)

    loss_q = np.maximum(eps, (1.0 - np.sum(q_norm * q_gt_norm) ** 2))
    err_q = np.arccos(1 - 2 * loss_q)

    # Apply the scaling factor for this scene.
    T_gt_scaled = T_gt * scale
    T_scaled = T * np.linalg.norm(T_gt) * scale / (np.linalg.norm(T) + eps)

    err_t = min(np.linalg.norm(T_gt_scaled - T_scaled), np.linalg.norm(T_gt_scaled + T_scaled))

    return err_q * 180 / np.pi, err_t
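ComputeMaa itself is not listed in this post. A hedged sketch of the metric as described above (the signature and paired thresholds are assumptions): a pair counts as accurate at a given threshold pair if BOTH its rotation error (degrees) and translation error (meters) fall below the thresholds, and mAA averages these accuracies over all threshold pairs.

```python
import numpy as np

def ComputeMaa(err_q, err_t, thresholds_q, thresholds_t):
    '''Sketch: per-threshold accuracy requires both the rotation and the
    translation error to be under the threshold; mAA is their mean.'''
    err_q = np.asarray(err_q)
    err_t = np.asarray(err_t)
    acc = [np.mean((err_q < th_q) & (err_t < th_t))
           for th_q, th_t in zip(thresholds_q, thresholds_t)]
    return np.mean(acc), acc
```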