Convolution Operations
Note the difference between torch.nn and torch.nn.functional in the official docs:
the former is a wrapper around the latter and is more convenient to use;
the latter is more cumbersome to call, but exposes finer-grained control.
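The relationship between the two APIs can be sketched with a small example: nn.Conv2d creates and stores its own weight and bias, while F.conv2d takes them as explicit arguments. The tensor shapes below are chosen arbitrarily for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)

# Module style: nn.Conv2d creates and manages weight/bias internally
conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3)
y1 = conv(x)

# Functional style: the weight/bias tensors are passed explicitly
y2 = F.conv2d(x, weight=conv.weight, bias=conv.bias, stride=1)

print(torch.equal(y1, y2))  # True: both calls run the same convolution
```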
Convolution Layers
| Convolution Layers | Notes |
|---|---|
| nn.Conv1d | Applies a 1D convolution over an input signal composed of several input planes. |
| nn.Conv2d | Applies a 2D convolution over an input signal composed of several input planes. |
| nn.Conv3d | Applies a 3D convolution over an input signal composed of several input planes. |
| nn.ConvTranspose1d | Applies a 1D transposed convolution operator over an input image composed of several input planes. |
| nn.ConvTranspose2d | Applies a 2D transposed convolution operator over an input image composed of several input planes. |
| nn.ConvTranspose3d | Applies a 3D transposed convolution operator over an input image composed of several input planes. |
| nn.LazyConv1d | A torch.nn.Conv1d module with lazy initialization of the in_channels argument of the Conv1d that is inferred from the input.size(1). |
| nn.LazyConv2d | A torch.nn.Conv2d module with lazy initialization of the in_channels argument of the Conv2d that is inferred from the input.size(1). |
| nn.LazyConv3d | A torch.nn.Conv3d module with lazy initialization of the in_channels argument of the Conv3d that is inferred from the input.size(1). |
| nn.LazyConvTranspose1d | A torch.nn.ConvTranspose1d module with lazy initialization of the in_channels argument of the ConvTranspose1d that is inferred from the input.size(1). |
| nn.LazyConvTranspose2d | A torch.nn.ConvTranspose2d module with lazy initialization of the in_channels argument of the ConvTranspose2d that is inferred from the input.size(1). |
| nn.LazyConvTranspose3d | A torch.nn.ConvTranspose3d module with lazy initialization of the in_channels argument of the ConvTranspose3d that is inferred from the input.size(1). |
| nn.Unfold | Extracts sliding local blocks from a batched input tensor. |
| nn.Fold | Combines an array of sliding local blocks into a large containing tensor. |
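The Lazy* variants in the table defer creating their weight until the first forward pass, at which point in_channels is inferred from input.size(1). A small sketch with nn.LazyConv2d (the shapes here are arbitrary):

```python
import torch
import torch.nn as nn

# in_channels is omitted; it is inferred from input.size(1) on the first forward
lazy = nn.LazyConv2d(out_channels=4, kernel_size=3)
x = torch.randn(2, 7, 10, 10)  # 7 input channels
y = lazy(x)

print(y.shape)            # torch.Size([2, 4, 8, 8])
print(lazy.weight.shape)  # torch.Size([4, 7, 3, 3]) -- in_channels=7 was inferred
```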
TORCH.NN.FUNCTIONAL.CONV2D
torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) → Tensor
input: note that the shape has 4 dimensions -> torch.reshape(input, (N, C, H, W))
N: batch_size (number of input images)
C: number of channels (the number of values stored at each spatial position)
H, W: height and width
weight: the convolution kernel
bias: the bias
stride: the step size; can be a single number or a 2-tuple (sH, sW)
padding: the padding; can be a single number or a 2-tuple (padH, padW)
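Stride and padding determine the output spatial size; per the torch.nn.Conv2d docs, H_out = floor((H + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1), and likewise for W_out. A quick check of that formula against F.conv2d (the helper name conv_out_size is our own):

```python
import torch
import torch.nn.functional as F

def conv_out_size(h, kernel, stride=1, padding=0, dilation=1):
    # H_out = floor((H + 2*padding - dilation*(kernel - 1) - 1) / stride + 1)
    return (h + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

x = torch.randn(1, 1, 5, 5)
w = torch.randn(1, 1, 3, 3)
for stride, padding in [(1, 0), (2, 0), (1, 1)]:
    out = F.conv2d(x, w, stride=stride, padding=padding)
    expected = conv_out_size(5, 3, stride, padding)
    print(out.shape, expected)  # spatial size matches the formula: 3, 2, 5
```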
Convolution computation
With stride=1:
1. Multiply corresponding positions and sum: 1+4+0+0+1+0+2+2+0=10
2. Multiply corresponding positions and sum: 2+0+3+0+2+0+4+1+0=12
3. ...
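The two sums above can be reproduced in plain Python by multiplying the kernel element-wise with each 3x3 window of the input (window_sum is a helper name of our own; the matrices are the same ones used in the example below):

```python
input_mat = [[1, 2, 0, 3, 1],
             [0, 1, 2, 3, 1],
             [1, 2, 1, 0, 0],
             [5, 2, 3, 1, 1],
             [2, 1, 0, 1, 1]]
kernel = [[1, 2, 1],
          [0, 1, 0],
          [2, 1, 0]]

def window_sum(row, col):
    # element-wise product of the kernel with the 3x3 window at (row, col)
    return sum(input_mat[row + i][col + j] * kernel[i][j]
               for i in range(3) for j in range(3))

print(window_sum(0, 0))  # 10  (step 1 above)
print(window_sum(0, 1))  # 12  (step 2: the window has shifted right by stride=1)
```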
With padding, the input is first surrounded by a border of zeros (padding=1 adds one ring of zeros on every side), and the kernel then slides over the enlarged input, so the output stays larger.
Example
```python
import torch
import torch.nn.functional as F

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])  # the convolution kernel

# reshape to (N, C, H, W) as required by F.conv2d
input = torch.reshape(input, (1, 1, 5, 5))
kernel = torch.reshape(kernel, (1, 1, 3, 3))
print(input.shape)   # torch.Size([1, 1, 5, 5])
print(kernel.shape)  # torch.Size([1, 1, 3, 3])

output = F.conv2d(input, weight=kernel, stride=1)
print(output)
'''
tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])
'''

output2 = F.conv2d(input, weight=kernel, stride=2)
print(output2)
'''
tensor([[[[10, 12],
          [13,  3]]]])
'''

output3 = F.conv2d(input, weight=kernel, stride=1, padding=1)
print(output3)
'''
tensor([[[[ 1,  3,  4, 10,  8],
          [ 5, 10, 12, 12,  6],
          [ 7, 18, 16, 16,  8],
          [11, 13,  9,  3,  4],
          [14, 13,  9,  7,  4]]]])
'''
```
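The same stride=1 computation can also be expressed with the module API from the table above: create an nn.Conv2d without bias and copy the example kernel into its weight. Note that nn.Conv2d requires a floating-point dtype, so the values print as floats here.

```python
import torch
import torch.nn as nn

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32).reshape(1, 1, 5, 5)
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]], dtype=torch.float32).reshape(1, 1, 3, 3)

conv = nn.Conv2d(1, 1, kernel_size=3, stride=1, bias=False)
with torch.no_grad():
    conv.weight.copy_(kernel)  # replace the random init with our kernel

print(conv(input))  # same values as the F.conv2d stride=1 output, as floats
```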