I. Model source code and pretrained weights:
Yolov5-5.0:
Yolov5-6.0:
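II. Problems
1. Yolov5-6.0 tries to download Arial.ttf automatically; when the download fails, an error is raised.
Solution:
In the file yolov5/utils/plots.py, change the following code: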
class Annotator:
    if RANK in (-1, 0):
        check_font()  # download TTF if necessary

    # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
    def __init__(self, im, line_width=None, font_size=None, font='Arial.ttf', pil=False, example='abc'):
to:
class Annotator:
    # if RANK in (-1, 0):
    #     check_font()  # download TTF if necessary

    # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
    def __init__(self, im, line_width=None, font_size=None, font='', pil=False, example='abc'):
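2. YOLOv5: running detect.py raises AttributeError: 'Upsample' object has no attribute 'recompute_scale_factor'.
The traceback points to the return statement of Upsample.forward in PyTorch's torch/nn/modules/upsampling.py. Change that statement to: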
return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
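If you would rather not edit the installed PyTorch source, a possible alternative is to add the missing attribute to the loaded model instead. The sketch below is an assumption and not part of the original fix: patch_upsample is a hypothetical helper name, and it relies only on model.modules(), which every torch.nn.Module provides. The error usually shows up when weights saved under an older PyTorch are loaded under a newer one whose Upsample.forward reads recompute_scale_factor.

```python
def patch_upsample(model):
    """Add the attribute that newer Upsample.forward reads, where it is missing.

    Hypothetical helper (not from YOLOv5 or PyTorch). Returns the number of
    modules that were patched.
    """
    patched = 0
    for m in model.modules():
        # Match by class name so the helper itself does not need to import torch.
        if type(m).__name__ == "Upsample" and not hasattr(m, "recompute_scale_factor"):
            m.recompute_scale_factor = None  # None is the default of F.interpolate
            patched += 1
    return patched
```

It would be called once, right after the weights are loaded and before the first forward pass, for example patch_upsample(model).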
III. Improvements
1. Add soft-NMS
Add the following code to utils/general.py:
def my_soft_nms(bboxes, scores, iou_thresh=0.5, sigma=0.5, score_threshold=0.25):
    """Gaussian soft-NMS. bboxes: (n, 4) tensor in xyxy format; scores: (n,) tensor.

    Relies on torch, which utils/general.py already imports.
    Note: scores is decayed in place.
    """
    bboxes = bboxes.contiguous()
    x1 = bboxes[:, 0]
    y1 = bboxes[:, 1]
    x2 = bboxes[:, 2]
    y2 = bboxes[:, 3]
    # Area of each box (no +1 offset, so it is consistent with the intersection below)
    areas = (x2 - x1) * (y2 - y1)
    # Sort all scores in descending order once; order holds the sorted indices
    _, order = scores.sort(0, descending=True)
    # Indices of the boxes that survive NMS
    keep = []
    while order.numel() > 0:
        if order.numel() == 1:  # only one box index is left
            keep.append(order.item())
            break
        i = order[0].item()  # index into scores of the highest-scoring remaining box
        keep.append(i)
        rest = order[1:]  # all remaining boxes except the current one
        # clamp() yields the corners of the intersection of each remaining box with box i
        xx1 = x1[rest].clamp(min=x1[i])
        yy1 = y1[rest].clamp(min=y1[i])
        xx2 = x2[rest].clamp(max=x2[i])
        yy2 = y2[rest].clamp(max=y2[i])
        # Intersection area of each remaining box with the current box
        inter = (xx2 - xx1).clamp(min=0) * (yy2 - yy1).clamp(min=0)
        # IoU of each remaining box with the current box (order.numel() - 1 values)
        iou = inter / (areas[i] + areas[rest] - inter)
        # Positions in rest whose IoU exceeds the threshold
        # (flatten() keeps the result 1-D even when there is a single match)
        idx = (iou > iou_thresh).nonzero().flatten()
        if idx.numel() > 0:
            decay = torch.exp(-torch.pow(iou[idx], 2) / sigma)  # Gaussian score decay
            scores[rest[idx]] *= decay  # update the scores of the overlapping boxes
        # Drop the boxes whose score fell to or below score_threshold
        new_order = (scores[rest] > score_threshold).nonzero().flatten()
        if new_order.numel() == 0:
            break
        # Move the highest-scoring remaining box to the front instead of re-sorting
        max_idx = torch.argmax(scores[rest[new_order]]).item()
        if max_idx != 0:
            new_order[[0, max_idx]] = new_order[[max_idx, 0]]
        # Update order
        order = rest[new_order]
    # Indices of all kept boxes, as a long tensor on the same device as the boxes
    return torch.tensor(keep, dtype=torch.long, device=bboxes.device)
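To make the score decay in the function above concrete, here is a small hand-checkable sketch in plain Python. It is only an illustration under stated assumptions: iou_xyxy and gaussian_decay are hypothetical helper names, boxes are (x1, y1, x2, y2) tuples, and areas are plain width × height.

```python
import math


def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def gaussian_decay(score, iou, sigma=0.5):
    """Gaussian soft-NMS penalty: score * exp(-iou^2 / sigma)."""
    return score * math.exp(-(iou ** 2) / sigma)


top = (0, 0, 10, 10)    # highest-scoring box
other = (0, 0, 10, 5)   # covers exactly half of top, so IoU = 50 / 100 = 0.5
iou = iou_xyxy(top, other)
decayed = gaussian_decay(0.8, iou)  # 0.8 * exp(-0.5), roughly 0.485
print(iou, decayed)
```

With the defaults used above (sigma=0.5, score_threshold=0.25), this overlapping box is kept with a reduced score of about 0.485, whereas hard NMS at the same IoU threshold would treat it differently depending only on whether the IoU exceeds the cutoff.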
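In non_max_suppression, replace the NMS call (i = torchvision.ops.nms(boxes, scores, iou_thres)) with the following:
i = my_soft_nms(boxes, scores, iou_thres) # SOFT NMS
As shown in the figure below: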
2. Add the SE attention mechanism
(1) First, paste the following attention module code into common.py:
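class SE(nn.Module):
    # Squeeze-and-Excitation channel attention; c1 = input channels, r = reduction ratio
    def __init__(self, c1, r=16):
        super().__init__()  # the original super(SELayer, self) raised NameError: the class is named SE
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.l1 = nn.Linear(c1, c1 // r, bias=False)
        self.relu = nn.ReLU(inplace=True)
        self.l2 = nn.Linear(c1 // r, c1, bias=False)
        self.sig = nn.Sigmoid()

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avgpool(x).view(b, c)  # squeeze: one value per channel
        y = self.l1(y)
        y = self.relu(y)
        y = self.l2(y)
        y = self.sig(y)  # excitation: per-channel weight in (0, 1)
        y = y.view(b, c, 1, 1)
        return x * y.expand_as(x)  # rescale each channel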
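For readers who want to see what the SE block computes without PyTorch, the following is a minimal plain-Python sketch of the same three steps (squeeze, excitation, scale) for a single image. It is an illustration only: se_forward is a hypothetical name, the feature map is a nested list of shape [C][H][W], and the two weight matrices stand in for l1 and l2 (no biases, as in the class above).

```python
import math


def se_forward(x, w1, w2):
    """x: [C][H][W] feature map; w1: [C//r][C] weights; w2: [C][C//r] weights."""
    # Squeeze: global average pooling gives one value per channel
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: linear -> ReLU -> linear -> sigmoid
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: multiply every value of channel c by its weight s[c]
    return [[[v * s[c] for v in row] for row in ch] for c, ch in enumerate(x)]


x = [[[1.0, 2.0], [3.0, 4.0]],   # channel 0
     [[2.0, 2.0], [2.0, 2.0]]]   # channel 1
w1 = [[0.0, 0.0]]                # C=2 reduced to 1 hidden unit
w2 = [[0.0], [0.0]]              # back to C=2
out = se_forward(x, w1, w2)      # all-zero weights give sigmoid(0) = 0.5 per channel
print(out)
```

With all-zero weights every channel weight is exactly 0.5, so the output is the input halved; trained weights would instead emphasize some channels and suppress others.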
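(2) In yolo.py, find the parse_model function and add the class name (SE) to it.
(3) Modify the configuration file (yolov5s.yaml is used as the example here) and add the attention layer wherever you want it. Common choices are the last layer of the backbone or inside C3; here it is added as the last layer of the backbone.
After a new layer is added to the network, the indices of all layers that follow it change. As the figure below shows, Detect originally takes its inputs from layers [17, 20, 23], so after adding the SE attention layer Detect must be updated as well: the original layer 17 becomes layer 18, layer 20 becomes layer 21, and layer 23 becomes layer 24. The from field of Detect therefore has to be changed to [18, 21, 24].
Likewise, the from fields of Concat must be updated so that each Concat still receives the same feature maps as in the original network. Because the SE layer was added at layer 9, every layer index after layer 9 increases by 1, so the from fields of the last two Concat layers change from [-1, 14] and [-1, 10] to [-1, 15] and [-1, 11], respectively.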
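The renumbering rule can be checked mechanically. The sketch below is a hypothetical helper (shift_from is not part of YOLOv5) and assumes the new layer takes index 10, that is, it sits immediately after layer 9. Absolute indices at or above the insertion point move up by one, while negative values such as -1 are relative references and stay unchanged.

```python
def shift_from(from_spec, insert_at):
    """Return from_spec with absolute layer indices >= insert_at increased by 1.

    from_spec is an int or a list of ints, as in the 'from' column of the yaml.
    Negative values are relative references (e.g. -1 = previous layer) and are kept.
    """
    if isinstance(from_spec, int):
        return from_spec + 1 if from_spec >= insert_at else from_spec
    return [shift_from(f, insert_at) for f in from_spec]


INSERT_AT = 10  # assumption: the SE layer becomes layer 10, right after layer 9

print(shift_from([-1, 14], INSERT_AT))      # last-but-one Concat
print(shift_from([-1, 10], INSERT_AT))      # last Concat
print(shift_from([17, 20, 23], INSERT_AT))  # Detect
print(shift_from([-1, 6], INSERT_AT))       # Concat that references the backbone
```

The two Concat layers that reference backbone layers 6 and 4 come before the insertion point, which is why only the last two Concat layers and Detect need to be edited.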
3. Add counting
Change this code in detect.py to:
Finally, add the following code inside if save_img:
The complete added code is as follows:
person_count = 0
tie_count = 0
person_color = (0, 0, 0)
tie_color = (0, 0, 0)
# Write results + count
# count = 0
for *xyxy, conf, cls in reversed(det):
    if save_txt:  # Write to file
        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
        line = (cls, *xywh, conf) if opt.save_conf else (cls, *xywh)  # label format
        with open(txt_path + '.txt', 'a') as f:
            f.write(('%g ' * len(line)).rstrip() % line + '\n')
    if save_img or view_img:  # Add bbox to image
        label = f'{names[int(cls)]} {conf:.2f}'
        plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)
    ########################## Per-class counting (coco.yaml) ##########################
    # Index of person in names
    if int(cls) == 0:
        person_count += 1
        person_color = colors[int(cls)]
    # Index of tie in names
    if int(cls) == 27:
        tie_count += 1
        tie_color = colors[int(cls)]
    # count = count + 1

# Draw the counts on the image (this part goes inside if save_img:)
text = 'person_num: %d ' % (person_count)
cv2.putText(im0, text, (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 1, person_color, 2)
text = 'tie_num: %d ' % (tie_count)
cv2.putText(im0, text, (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 1, tie_color, 2)
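If more than two classes need to be counted, the chain of if statements grows quickly. One possible generalization, sketched here as an assumption rather than as part of the original detect.py, is to count by class name with collections.Counter. count_classes is a hypothetical helper; names can be the list (or dict) that maps class indices to labels.

```python
from collections import Counter


def count_classes(class_ids, names, wanted=("person", "tie")):
    """Count detections per class name and return the counts for the wanted classes."""
    counts = Counter(names[int(c)] for c in class_ids)
    return {name: counts.get(name, 0) for name in wanted}


# Toy example: a partial name table and the class ids of five detections
names = {0: "person", 2: "car", 27: "tie"}
class_ids = [0, 0, 27, 2, 0.0]  # class ids may arrive as floats, hence int(c)
result = count_classes(class_ids, names)
print(result)  # {'person': 3, 'tie': 1}
```

Each entry of the returned dictionary can then be drawn with its own cv2.putText call, moving the y coordinate down by a fixed step per line.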