08 - Convolution Operations


Convolution Operations

Note the difference between torch.nn and torch.nn.functional in the official docs:

torch.nn wraps torch.nn.functional and is more convenient to use.

torch.nn.functional is more cumbersome to use, but exposes the finer-grained operations. A short sketch of the contrast follows.
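A minimal sketch of the difference (the layer sizes and tensor shapes here are illustrative, not from the tutorial): the module API in torch.nn owns its weight and bias, while the functional API in torch.nn.functional expects you to pass the weight yourself.

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 32, 32)               # (N, C, H, W)

# torch.nn: the Conv2d module creates and stores its own weight and bias
layer = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
y1 = layer(x)

# torch.nn.functional: the weight tensor is created and passed in explicitly
weight = torch.randn(8, 3, 3, 3)            # (out_channels, in_channels, kH, kW)
y2 = F.conv2d(x, weight)

print(y1.shape, y2.shape)                   # both torch.Size([1, 8, 30, 30])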

Convolution Layers

Convolution Layers | Description
nn.Conv1d | Applies a 1D convolution over an input signal composed of several input planes.
nn.Conv2d | Applies a 2D convolution over an input signal composed of several input planes.
nn.Conv3d | Applies a 3D convolution over an input signal composed of several input planes.
nn.ConvTranspose1d | Applies a 1D transposed convolution operator over an input image composed of several input planes.
nn.ConvTranspose2d | Applies a 2D transposed convolution operator over an input image composed of several input planes.
nn.ConvTranspose3d | Applies a 3D transposed convolution operator over an input image composed of several input planes.
nn.LazyConv1d | A torch.nn.Conv1d module with lazy initialization of the in_channels argument of the Conv1d that is inferred from the input.size(1).
nn.LazyConv2d | A torch.nn.Conv2d module with lazy initialization of the in_channels argument of the Conv2d that is inferred from the input.size(1).
nn.LazyConv3d | A torch.nn.Conv3d module with lazy initialization of the in_channels argument of the Conv3d that is inferred from the input.size(1).
nn.LazyConvTranspose1d | A torch.nn.ConvTranspose1d module with lazy initialization of the in_channels argument of the ConvTranspose1d that is inferred from the input.size(1).
nn.LazyConvTranspose2d | A torch.nn.ConvTranspose2d module with lazy initialization of the in_channels argument of the ConvTranspose2d that is inferred from the input.size(1).
nn.LazyConvTranspose3d | A torch.nn.ConvTranspose3d module with lazy initialization of the in_channels argument of the ConvTranspose3d that is inferred from the input.size(1).
nn.Unfold | Extracts sliding local blocks from a batched input tensor.
nn.Fold | Combines an array of sliding local blocks into a large containing tensor.
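For example, the Lazy variants defer the in_channels argument until the first forward pass. A small sketch (shapes chosen arbitrarily):

import torch
import torch.nn as nn

lazy = nn.LazyConv2d(out_channels=16, kernel_size=3)  # in_channels not specified
x = torch.randn(4, 3, 28, 28)
y = lazy(x)                       # in_channels is inferred from x.size(1) == 3
print(y.shape)                    # torch.Size([4, 16, 26, 26])
print(lazy.weight.shape)          # torch.Size([16, 3, 3, 3])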

TORCH.NN.FUNCTIONAL.CONV2D

torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor

08-1.png

input: note that it must have 4 dimensions -> torch.reshape(input, (N, C, H, W))

N: batch_size (number of input images)

C: number of channels, i.e. how many values are stored at each spatial position

H, W: height and width

weight: the convolution kernel

bias: the bias

stride: the stride; either a single number or a pair (sH, sW)

padding: the padding; either a single number or a pair (padH, padW); a short sketch of these arguments follows below
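A small sketch of these arguments (random data, shapes chosen for illustration): a plain 2D tensor has to be reshaped to (N, C, H, W) first, and stride/padding accept either an int or an (H, W) pair.

import torch
import torch.nn.functional as F

img = torch.randn(5, 5)                     # a single-channel 5x5 "image"
img = torch.reshape(img, (1, 1, 5, 5))      # (N=1, C=1, H=5, W=5)
kernel = torch.randn(1, 1, 3, 3)            # (out_channels, in_channels, kH, kW)

out = F.conv2d(img, kernel, stride=(1, 2), padding=(0, 1))
print(out.shape)                            # torch.Size([1, 1, 3, 3])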

How the Convolution Is Computed

08-2.png

08-3.png

When stride=1:

1. Multiply the overlapping elements and sum them: 1+4+0+0+1+0+2+2+0 = 10

2. Slide the kernel one step and repeat: 2+0+3+0+2+0+4+1+0 = 12

3. ... (a quick check of these first two values follows below)
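A quick check of the two sums above (a small sketch; the 3x3 patches are copied by hand from the 5x5 input used in the example further down):

import torch

kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])
patch1 = torch.tensor([[1, 2, 0],
                       [0, 1, 2],
                       [1, 2, 1]])   # top-left 3x3 window of the input
patch2 = torch.tensor([[2, 0, 3],
                       [1, 2, 3],
                       [2, 1, 0]])   # window after sliding one step to the right

print((patch1 * kernel).sum())  # tensor(10)
print((patch2 * kernel).sum())  # tensor(12)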

08-4.png

With padding added:

08-5.png
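The output size follows the usual formula output = floor((size + 2*padding - kernel_size) / stride) + 1. A tiny helper (my own, not from the tutorial) reproduces the three output shapes seen in the example below:

def conv_out_size(size, kernel_size, stride=1, padding=0):
    # standard output-size formula for a convolution without dilation
    return (size + 2 * padding - kernel_size) // stride + 1

print(conv_out_size(5, 3, stride=1))             # 3  -> 3x3 output
print(conv_out_size(5, 3, stride=2))             # 2  -> 2x2 output
print(conv_out_size(5, 3, stride=1, padding=1))  # 5  -> 5x5 output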

Example

import torch
import torch.nn.functional as F

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])

kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])  # the 3x3 kernel

input = torch.reshape(input, (1, 1, 5, 5))    # reshape to (N, C, H, W)
kernel = torch.reshape(kernel, (1, 1, 3, 3))

print(input.shape)   # torch.Size([1, 1, 5, 5])
print(kernel.shape)  # torch.Size([1, 1, 3, 3])

output = F.conv2d(input, weight=kernel, stride=1)
print(output)
'''
tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])
'''

output2 = F.conv2d(input, weight=kernel, stride=2)
print(output2)
'''
tensor([[[[10, 12],
          [13,  3]]]])
'''

output3 = F.conv2d(input, weight=kernel, stride=1, padding=1)
print(output3)
'''
tensor([[[[ 1,  3,  4, 10,  8],
          [ 5, 10, 12, 12,  6],
          [ 7, 18, 16, 16,  8],
          [11, 13,  9,  3,  4],
          [14, 13,  9,  7,  4]]]])
'''
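The same result can also be reproduced with the nn.Conv2d module from torch.nn. This is a small sketch that reuses the input and kernel tensors defined above and copies the kernel into the module's weight (the module works in floating point, so the values come back as floats):

import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=1, bias=False)
with torch.no_grad():
    conv.weight.copy_(kernel.float())   # weight shape (1, 1, 3, 3) matches the kernel
print(conv(input.float()))              # same values as output above, as floats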