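This post tries image registration as a way to handle the large camera-viewpoint changes that occur in multi-object tracking on UAV platforms, as shown in the figure below.
Dataset sequence: uav0000249_00001_v
| Frame 35 of the sequence | Frame 36 of the sequence |
|---|---|
Because of the UAV's motion, the white car in frame 36 is shifted to the right relative to the previous frame. If we only use a Kalman filter to predict the white car's bbox in frame 36 and then associate by IoU, the association fails. It is therefore natural to use registration to remove the effect of the camera motion.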
Registration model
Since it is the camera that moves, a perspective transform is used for registration. It works as follows:
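Let a point in the source image be $(x, y)$, the corresponding point in the target image be $(x', y')$, and the perspective transform matrix be $M$ (3×3). Then

$$
\begin{bmatrix} w x' \\ w y' \\ w \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
x' = \frac{m_{11} x + m_{12} y + m_{13}}{m_{31} x + m_{32} y + m_{33}},
\quad
y' = \frac{m_{21} x + m_{22} y + m_{23}}{m_{31} x + m_{32} y + m_{33}}
$$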
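As a small illustration of the mapping above, the sketch below lifts 2D points to homogeneous coordinates, multiplies by a 3×3 matrix, and divides by the third component. The function name `apply_homography` and the example matrices are made up for this illustration and do not come from the original code.

```python
import numpy as np

def apply_homography(M, points):
    """Map an (N, 2) array of (x, y) points through the 3x3 matrix M."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # (x, y) -> (x, y, 1)
    mapped = homog @ M.T                              # rows are (w*x', w*y', w)
    return mapped[:, :2] / mapped[:, 2:3]             # divide by w

# Pure translation: the last row of M is (0, 0, 1), so w stays 1
M_shift = np.array([[1.0, 0.0, 5.0],
                    [0.0, 1.0, -3.0],
                    [0.0, 0.0, 1.0]])
shifted = apply_homography(M_shift, [[0, 0], [10, 20]])

# A non-trivial last row makes w depend on the point position
M_persp = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.001, 0.0, 1.0]])
warped = apply_homography(M_persp, [[100, 50]])
```

With a pure translation the division is a no-op. With a perspective term the same point is scaled by a position-dependent factor, which is why the division cannot be skipped when M is a full homography.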
M is estimated with OpenCV as follows:

    import os

    import cv2
    import numpy as np

    # MIN_MATCH_COUNT is not defined in the post; 10 is an assumed placeholder.
    MIN_MATCH_COUNT = 10

    # Load two consecutive frames and convert them to grayscale
    image0 = cv2.imread(os.path.join(seq_dir, seq, seq_files[i]))
    image1 = cv2.imread(os.path.join(seq_dir, seq, seq_files[i + 1]))
    image0 = cv2.cvtColor(image0, cv2.COLOR_BGR2GRAY)
    image1 = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)

    # SURF lives in opencv-contrib and needs a build with the non-free modules enabled
    surf = cv2.xfeatures2d.SURF_create()
    kp0, des0 = surf.detectAndCompute(image0, None)
    kp1, des1 = surf.detectAndCompute(image1, None)

    # FLANN matcher with a KD-tree index (the KD-tree id is 1; 0 selects the linear index)
    FLANN_INDEX_KDTREE = 1
    index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
    search_params = dict(checks=50)
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(des0, des1, k=2)

    # Keep the good matches according to Lowe's ratio test
    good = []
    for m, n in matches:
        if m.distance < 0.7 * n.distance:
            good.append(m)

    # Fall back to the identity when M cannot be estimated
    M = np.eye(3)
    if len(good) > MIN_MATCH_COUNT:
        src_pts = np.float32([kp0[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst_pts = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
        if H is not None:
            M = H
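M is solved with RANSAC. The coordinates of kp0 and kp1 take the top-left corner of the image as their origin, which is the same convention the bbox coordinates use. The tracker therefore does not need to preprocess the bbox coordinates first (one might be tempted to move the origin to the image center); M can be applied to them directly. Note that M is a full perspective transform (homography), even though the tracking code below refers to it as an affine matrix.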
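To make the quantity being solved for more concrete, here is a minimal direct linear transform (DLT) sketch in NumPy. It is not what `cv2.findHomography` does internally: it has no RANSAC and no outlier handling, and it skips the coordinate normalization a robust implementation would use. The function name `estimate_homography_dlt` and the synthetic points are assumptions made for this illustration.

```python
import numpy as np

def estimate_homography_dlt(src_pts, dst_pts):
    """Homography from >= 4 exact point pairs via plain DLT (no RANSAC)."""
    rows = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # Each correspondence (x, y) -> (u, v) gives two linear equations in
        # the nine entries of H
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale so that H[2, 2] == 1

# Synthetic check: project points with a known matrix, then recover it
H_true = np.array([[1.0, 0.02, 12.0],
                   [-0.01, 1.0, -3.0],
                   [1e-5, 2e-5, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 50], [0, 50], [30, 20], [70, 10]], dtype=float)
proj = np.hstack([src, np.ones((len(src), 1))]) @ H_true.T
dst = proj[:, :2] / proj[:, 2:3]
H_est = estimate_homography_dlt(src, dst)
```

On real feature matches some correspondences are wrong, which is why the post relies on the RANSAC variant in OpenCV rather than a plain fit like this one.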
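Kalman filtering with registration
I can think of two ways to use registration to compensate for the UAV's camera motion:
- Register image i to image i+1 first, then run object detection on both. Use the Kalman filter to predict where the objects detected in frame i will be in frame i+1, and associate by IoU.
- Run object detection on image i and image i+1 first. Use the Kalman filter to predict where the objects of frame i will be in frame i+1, use M to map the predicted positions to their actual positions in frame i+1, and associate by IoU.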
I use the second approach, associating after the Kalman prediction. The logic is as follows:

    # Excerpt from the tracker's update step. u_track and u_detection are
    # assumed to come from an earlier association stage that is not shown.
    strack_pool = joint_stracks(tracked_stracks, self.lost_stracks)

    # Predict the current location with the Kalman filter
    for strack in strack_pool:
        strack.predict()

    # Map the predicted boxes into frame i+1 with the registration matrix M
    # (passed in as affine_mat, although it is a full homography)
    affine_transform(strack_pool, affine_mat)

    # Association with IoU
    detections = [detections[i] for i in u_detection]
    r_tracked_stracks = [strack_pool[i] for i in u_track
                         if strack_pool[i].state == TrackState.Tracked]
    dists = matching.iou_distance(r_tracked_stracks, detections)
    matches, u_track, u_detection = matching.linear_assignment(dists, thresh=0.5)

    for itracked, idet in matches:
        track = r_tracked_stracks[itracked]
        det = detections[idet]
        if track.state == TrackState.Tracked:
            track.update(det, self.frame_id)
            activated_starcks.append(track)
        else:
            track.re_activate(det, self.frame_id, new_id=False)
            refind_stracks.append(track)

    # Tracks that are still unmatched are marked as lost
    for it in u_track:
        track = r_tracked_stracks[it]
        if track.state != TrackState.Lost:
            track.mark_lost()
            lost_stracks.append(track)
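The effect of the compensation step in the second approach can be seen on a toy example. The sketch below uses made-up numbers (a 40-pixel horizontal camera shift, not values from the dataset) and hypothetical helper names `iou` and `warp_box`; it only illustrates why mapping the predicted box through M before computing IoU rescues the association.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def warp_box(box, M):
    """Map the four corners through M and return the enclosing axis-aligned box."""
    x1, y1, x2, y2 = box
    corners = np.array([[x1, x2, x2, x1],
                        [y1, y1, y2, y2],
                        [1.0, 1.0, 1.0, 1.0]])
    warped = M @ corners
    warped = warped[:2] / warped[2]  # perspective divide
    return (warped[0].min(), warped[1].min(), warped[0].max(), warped[1].max())

# Toy registration matrix: the scene content moves 40 px to the right
M = np.array([[1.0, 0.0, 40.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
predicted = (100.0, 100.0, 130.0, 120.0)  # Kalman prediction, frame-i coordinates
detection = (141.0, 100.0, 171.0, 120.0)  # detection in frame i+1

iou_raw = iou(predicted, detection)                       # no overlap at all
iou_compensated = iou(warp_box(predicted, M), detection)  # large overlap
```

Without compensation the two boxes do not overlap, so any IoU threshold rejects the match; after warping, the predicted box nearly coincides with the detection.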
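The results are shown below. They are the distribution of the IoU between the bbox predicted by the Kalman filter from frame i and the corresponding bbox in frame i+1.
| Kalman filter | Kalman filter + registration |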
|---|---|
Implementation of affine_transform
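A bbox is represented by its tl (top-left) and br (bottom-right) coordinates. Because the bbox may end up rotated after the transform, the tr and bl corners are added as well. All four points are transformed, and the axis-aligned box enclosing the transformed points is taken as the result. The code is as follows:

    import numpy as np

    def affine_transform(stracks, affine_mat):
        """Warp each track's predicted bbox with the 3x3 matrix affine_mat (in place)."""
        for strack in stracks:
            mean = strack.mean.copy()
            # tlwh() and tlwh_to_xyah() are helper functions not shown in this post
            bbox_infer = tlwh(mean)               # (x, y, w, h), top-left based
            bbox_infer[2:] += bbox_infer[:2]      # -> (x1, y1, x2, y2)

            # Four corners in homogeneous coordinates, one per column
            bbox_expand = np.ones((3, 4))
            bbox_expand[:2, 0] = bbox_infer[:2]                  # tl
            bbox_expand[:2, 1] = bbox_infer[2:]                  # br
            bbox_expand[:2, 2] = bbox_infer[2], bbox_infer[1]    # tr
            bbox_expand[:2, 3] = bbox_infer[0], bbox_infer[3]    # bl

            bbox_expand = np.dot(affine_mat, bbox_expand)
            bbox_expand[:2] /= bbox_expand[2]     # perspective divide

            # Axis-aligned box enclosing the four transformed corners
            bbox_infer[0] = bbox_expand[0].min()
            bbox_infer[1] = bbox_expand[1].min()
            bbox_infer[2] = bbox_expand[0].max()
            bbox_infer[3] = bbox_expand[1].max()
            bbox_infer[2:] -= bbox_infer[:2]      # back to (x, y, w, h)

            mean[:4] = tlwh_to_xyah(bbox_infer)
            strack.mean = mean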
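The helpers `tlwh` and `tlwh_to_xyah` are not defined in the post. Assuming the Kalman state follows the DeepSORT-style layout, in which the first four entries are (center x, center y, aspect ratio w/h, height), they might look like the sketch below. Both bodies are a guess at the intended behavior, not the author's code.

```python
import numpy as np

def tlwh(mean):
    """Assumed helper: (cx, cy, w/h, h) state -> (top-left x, top-left y, w, h)."""
    ret = np.asarray(mean[:4], dtype=float).copy()
    ret[2] *= ret[3]        # aspect ratio * height -> width
    ret[:2] -= ret[2:] / 2  # center -> top-left corner
    return ret

def tlwh_to_xyah(box):
    """Assumed helper: (top-left x, top-left y, w, h) -> (cx, cy, w/h, h)."""
    ret = np.asarray(box, dtype=float).copy()
    ret[:2] += ret[2:] / 2  # top-left corner -> center
    ret[2] /= ret[3]        # width -> aspect ratio
    return ret

# Round trip on a made-up 8-dimensional state (position part + velocities)
mean = np.array([50.0, 40.0, 0.5, 20.0, 0.0, 0.0, 0.0, 0.0])
box = tlwh(mean)
state = tlwh_to_xyah(box)
```

Both functions return copies, so calling them does not modify the track state passed in.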