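Preface
This post continues the reproduction of the paper [1] described in the previous post, "[Paper Reproduction] Rebuilding the YoloBody of yoloV5-L (Slim-neck by GSConv)", this time for yoloV4. The two posts rely on essentially the same building-block functions and no new ones are introduced, so the function-by-function breakdown is omitted here. If you want more detail, please see the previous post. Thank you for your understanding!
As in the previous post, I will not go over the theory. The focus is on reproducing the main YoloBody structure from the paper. YoloBody is taken from bubbliiiing's yoloV4 network, so for the original yoloV4 please refer to the official yoloV4 explanation and to bubbliiiing's write-up.
Keywords: small-object detection, edge devices, lightweight design, efficiency and accuracy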
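Parameter flow
Let's get straight to the point. Pages 5 and 6 of the paper give a parameter flow table (Figure 1) that compares the Slim-neck by GSConv version of the yoloV4 network with the original version; the Slim-neck by GSConv version is on the left. Matching the parameters of each stage in that table against the flow diagram on page 13 of the paper (Figure 2) yields the parameter flow diagram in Figure 3.
Figure 1
Figure 2
Figure 3
Figure 3 assumes an input image of shape [640, 640]. Compared with V5, V4 has more layers and a more complex structure, so take care when working through the convolution layers and the places where feature maps are merged.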
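To make the shape bookkeeping easier to follow, here is a minimal sketch of how the base channel count, the base depth, and the backbone feature-map sizes come out for a 640x640 input. The multiplier tables are the ones used in the YoloBody constructor in the next section. The strides of 8, 16 and 32 and the channel multiples 4, 8 and 16 for P3, P4 and P5 are assumptions inferred from the shape comments in that code, and the helper names `scale` and `feature_maps` are hypothetical.

```python
# Width/depth multipliers, copied from the YoloBody constructor.
DEPTH = {'n': 0.33, 's': 0.33, 'm': 0.67, 'l': 1.00, 'x': 1.33}
WIDTH = {'n': 0.25, 's': 0.50, 'm': 0.75, 'l': 1.00, 'x': 1.25}


def scale(phi):
    """Return (base_channels, base_depth) the same way __init__ computes them."""
    base_channels = int(WIDTH[phi] * 64)
    base_depth = max(round(DEPTH[phi] * 3), 1)
    return base_channels, base_depth


def feature_maps(phi, size=640, strides=(8, 16, 32)):
    """Assumed backbone outputs as (channels, height, width) for P3, P4, P5."""
    base_channels, _ = scale(phi)
    multiples = (4, 8, 16)  # assumption: P3/P4/P5 carry 4x/8x/16x base channels
    return [(base_channels * m, size // s, size // s)
            for m, s in zip(multiples, strides)]


if __name__ == "__main__":
    print(scale('l'))         # (64, 3)
    print(feature_maps('l'))  # [(256, 80, 80), (512, 40, 40), (1024, 20, 20)]
```

For phi = 'l' this gives the 256, 512 and 1024 channel inputs at 80x80, 40x40 and 20x20 that the neck layers below expect.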
YoloBody
import torch
import torch.nn as nn

# CSPDarknet, Concat, SPPF, Conv, GSConv and VoVGSCSP are expected to come from nets
from nets import *


class YoloBody(nn.Module):
    def __init__(self, anchors_mask, num_classes, phi, pretrained=False, input_shape=[640, 640]):
        super(YoloBody, self).__init__()
        depth_dict = {'n': 0.33, 's': 0.33, 'm': 0.67, 'l': 1.00, 'x': 1.33, }
        width_dict = {'n': 0.25, 's': 0.50, 'm': 0.75, 'l': 1.00, 'x': 1.25, }
        dep_mul, wid_mul = depth_dict[phi], width_dict[phi]

        base_channels = int(wid_mul * 64)  # 64 for phi='l'
        base_depth = max(round(dep_mul * 3), 1)  # 3 for phi='l'
        # -----------------------------------------------#
        #   Input image is 640, 640, 3
        #   Initial base channel count is 64
        #   Shape comments below assume phi='l'
        # -----------------------------------------------#
        self.backbone = CSPDarknet(base_channels, base_depth, phi, pretrained)

        self.upsample = nn.Upsample(scale_factor=2, mode="nearest")
        self.concat = Concat(dimension=1)
        self.SPPF = SPPF(base_channels * 16, base_channels * 8)  # 1024 ---> 512
        self.P4Conv = Conv(base_channels * 8, base_channels * 4)  # 1,512,40,40 ---> 1,256,40,40
        self.P3Conv = Conv(base_channels * 4, base_channels * 2)  # 1,256,80,80 ---> 1,128,80,80

        self.P5GSConv = GSConv(base_channels * 8, base_channels * 4)  # 1,512,20,20 ---> 1,256,20,20
        self.P4VoV = VoVGSCSP(base_channels * 8, base_channels * 4)  # 1,512,40,40 ---> 1,256,40,40
        self.P4GSConv = GSConv(base_channels * 4, base_channels * 2)  # 1,256,40,40 ---> 1,128,40,40
        self.P3VoV = VoVGSCSP(base_channels * 4, base_channels * 2)  # 1,256,80,80 ---> 1,128,80,80

        self.P3GSConvH = GSConv(base_channels * 2, base_channels * 4)  # 1,128,80,80 ---> 1,256,80,80
        self.P3GSConv = GSConv(base_channels * 2, base_channels * 4, 3, 2)  # 1,128,80,80 ---> 1,256,40,40
        self.Head2VoV = VoVGSCSP(base_channels * 8, base_channels * 4)  # 1,512,40,40 ---> 1,256,40,40
        self.Head2GSConv = GSConv(base_channels * 4, base_channels * 8)  # 1,256,40,40 ---> 1,512,40,40
        self.Head3GSConv1 = GSConv(base_channels * 4, base_channels * 8, 3, 2)  # 1,256,40,40 ---> 1,512,20,20
        self.Head3VoV = VoVGSCSP(base_channels * 16, base_channels * 8)  # 1,1024,20,20 ---> 1,512,20,20
        self.Head3GSConv2 = GSConv(base_channels * 8, base_channels * 16)  # 1,512,20,20 ---> 1,1024,20,20

        self.yolo_head_P3 = nn.Conv2d(base_channels * 4, len(anchors_mask[2]) * (5 + num_classes), 1)
        self.yolo_head_P4 = nn.Conv2d(base_channels * 8, len(anchors_mask[1]) * (5 + num_classes), 1)
        self.yolo_head_P5 = nn.Conv2d(base_channels * 16, len(anchors_mask[0]) * (5 + num_classes), 1)

    def forward(self, x):
        P3, P4, P5 = self.backbone(x)

        # Top-down path: P5 -> P4 -> P3
        P5SPPF = self.SPPF(P5)
        P5 = self.P5GSConv(P5SPPF)
        P5P5SPPF_Up = self.upsample(P5)
        P4 = self.P4Conv(P4)
        P4 = self.concat([P4, P5P5SPPF_Up])
        P4VoV = self.P4VoV(P4)
        P4 = self.P4GSConv(P4VoV)
        P4_Up = self.upsample(P4)
        P3 = self.P3Conv(P3)
        P3 = self.concat([P3, P4_Up])
        P3 = self.P3VoV(P3)

        # Bottom-up path: P3 -> P4 -> P5
        Head1 = self.P3GSConvH(P3)
        P3G = self.P3GSConv(P3)
        P3C = self.concat([P3G, P4VoV])
        Head2VoV = self.Head2VoV(P3C)
        Head2 = self.Head2GSConv(Head2VoV)
        Head3G1 = self.Head3GSConv1(Head2VoV)
        Head3C = self.concat([Head3G1, P5SPPF])
        Head3V = self.Head3VoV(Head3C)
        Head3 = self.Head3GSConv2(Head3V)

        # Detection heads, one 1x1 convolution per scale
        Out1 = self.yolo_head_P3(Head1)
        Out2 = self.yolo_head_P4(Head2)
        Out3 = self.yolo_head_P5(Head3)
        return Out3, Out2, Out1


if __name__ == "__main__":
    anchors_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
    num_classes = 80
    phi = 'l'
    model = YoloBody(anchors_mask, num_classes, phi, pretrained=False)
    x = torch.ones((1, 3, 640, 640))
    Out3, Out2, Out1 = model(x)
    print(Out3.shape, Out2.shape, Out1.shape)
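The channel and resolution bookkeeping in the listing above can be checked without building the model. The sketch below is only a symbolic trace, not the network itself: every Conv, GSConv, VoVGSCSP and SPPF layer is replaced by a hypothetical `layer` helper that tracks (channels, height, width) and asserts that the incoming channel count matches the constructor argument. It assumes the backbone returns P3, P4 and P5 at strides 8, 16 and 32 with 4x, 8x and 16x the base channels.

```python
def layer(x, c_in, c_out, stride=1):
    """Stand-in for Conv/GSConv/VoVGSCSP/SPPF: checks c_in, then maps the shape."""
    c, h, w = x
    assert c == c_in, f"expected {c_in} input channels, got {c}"
    return (c_out, h // stride, w // stride)


def up(x):
    """Stand-in for nn.Upsample(scale_factor=2)."""
    c, h, w = x
    return (c, h * 2, w * 2)


def cat(a, b):
    """Stand-in for Concat(dimension=1): channels add, spatial sizes must match."""
    assert a[1:] == b[1:], f"spatial mismatch: {a} vs {b}"
    return (a[0] + b[0], a[1], a[2])


def trace(base=64, size=640, num_classes=80, anchors_per_scale=3):
    """Follow forward() step by step and return the (C, H, W) of Out3, Out2, Out1."""
    b = base
    p3 = (b * 4, size // 8, size // 8)      # assumed backbone outputs
    p4 = (b * 8, size // 16, size // 16)
    p5 = (b * 16, size // 32, size // 32)

    # Top-down path
    p5_sppf = layer(p5, b * 16, b * 8)                  # SPPF
    p5 = layer(p5_sppf, b * 8, b * 4)                   # P5GSConv
    p4 = cat(layer(p4, b * 8, b * 4), up(p5))           # P4Conv + concat
    p4_vov = layer(p4, b * 8, b * 4)                    # P4VoV
    p4 = layer(p4_vov, b * 4, b * 2)                    # P4GSConv
    p3 = cat(layer(p3, b * 4, b * 2), up(p4))           # P3Conv + concat
    p3 = layer(p3, b * 4, b * 2)                        # P3VoV

    # Bottom-up path
    head1 = layer(p3, b * 2, b * 4)                     # P3GSConvH
    p3c = cat(layer(p3, b * 2, b * 4, 2), p4_vov)       # P3GSConv (stride 2) + concat
    head2_vov = layer(p3c, b * 8, b * 4)                # Head2VoV
    head2 = layer(head2_vov, b * 4, b * 8)              # Head2GSConv
    head3c = cat(layer(head2_vov, b * 4, b * 8, 2), p5_sppf)  # Head3GSConv1 + concat
    head3 = layer(layer(head3c, b * 16, b * 8), b * 8, b * 16)  # Head3VoV, Head3GSConv2

    # Detection heads: anchors_per_scale * (5 + num_classes) output channels
    out_c = anchors_per_scale * (5 + num_classes)
    out1 = layer(head1, b * 4, out_c)                   # yolo_head_P3
    out2 = layer(head2, b * 8, out_c)                   # yolo_head_P4
    out3 = layer(head3, b * 16, out_c)                  # yolo_head_P5
    return [out3, out2, out1]


if __name__ == "__main__":
    print(trace())  # [(255, 20, 20), (255, 40, 40), (255, 80, 80)]
```

With 80 classes and 3 anchors per scale, each head has 3 * (5 + 80) = 255 output channels. If the real modules follow the same bookkeeping, the test script above should report the same three shapes with a leading batch dimension of 1.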
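Results:
The figure below is taken from the paper. The left side shows the results of the official source code; the right side shows the results after the Slim-neck by GSConv improvement.
Closing
My abilities are limited, so if there are mistakes in this post, please point them out. Thank you for reading, and I hope this post is helpful. If you need the code, you can get it from my repository: combine bubbliiiing's code with my rebuilt YoloBody.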
[1] "Slim-neck by GSConv: A better design paradigm of detector architectures for autonomous vehicles"
[2] github.com/alanli1997/…
[3] link.juejin.cn/?target=htt…