Custom Loss Functions & How Convolution and Pooling Are Computed
I. [Conv][Pool] Implementation Principles
I used to assume that convolution and pooling layer output sizes were computed as follows:

- Convolution layer: $o = \frac{i + 2p - k}{s} + 1$
- Pooling layer: $o = \frac{i - k}{s} + 1$

where:

- $o$ is the number of features along an output dimension
- $i$ is the number of features along an input dimension
- $k$ is the kernel size
- $p$ is the zero-padding size
- $s$ is the stride

In practice, however, PyTorch does not implement the formulas above exactly. Its implementation follows sections 2.4 and 3 of *A guide to convolution arithmetic for deep learning* (arxiv.org/pdf/1603.07…):

- Convolution layer: $o = \left\lfloor \frac{i + 2p - k}{s} \right\rfloor + 1$
- Pooling layer: $o = \left\lfloor \frac{i - k}{s} \right\rfloor + 1$

where $\lfloor \cdot \rfloor$ denotes the floor operation. Because of the floor, different input sizes (for example 224 and 227) can produce outputs of the same size in PyTorch, as the sketch below demonstrates.
- PyTorch documents the same output-size formulas for its convolution and pooling layers in the nn.Conv2d and nn.MaxPool2d reference pages.
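To see the floor behaviour in action, here is a minimal sketch (my own example; the kernel size 11 / stride 4 / padding 2 stem mirrors torchvision's AlexNet, an assumption on my part rather than something from the original text):

import torch
import torch.nn as nn

# First two layers of an AlexNet-style stem
features = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
    nn.MaxPool2d(kernel_size=3, stride=2),
)

for size in (224, 227):
    x = torch.randn(1, 3, size, size)
    # conv: floor((224 + 4 - 11) / 4) + 1 = 55,  floor((227 + 4 - 11) / 4) + 1 = 56
    # pool: floor((55 - 3) / 2) + 1 = 27,        floor((56 - 3) / 2) + 1 = 27
    print(size, features(x).shape)  # both print torch.Size([1, 64, 27, 27])

The conv outputs differ (55 vs. 56), but after one floored pooling step both inputs yield the same 27×27 feature map.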
II. Custom Loss Functions
PyTorch provides many common loss functions in the torch.nn module, and it also supports writing your own. Cross-entropy loss (CrossEntropyLoss) serves as the example below.

- Reference: How to write a custom loss function (Loss Function) in PyTorch? www.zhihu.com/question/66…
1. Cross-Entropy Loss
- References:
- Cross-entropy loss: blog.zhujian.life/posts/2626b…
- CrossEntropyLoss: pytorch.org/docs/stable…
The loss function is:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{e^{x_{i,y_i}}}{\sum_{j=1}^{C} e^{x_{i,j}}}$$

where:

- $N$ is the batch size,
- $C$ is the number of classes,
- $y_i$ is the correct class (label) of sample $i$,
- $1\{\cdot\}$ is the indicator function (1 when the condition holds, 0 otherwise)

The derivative with respect to the scores is:

$$\frac{\partial L}{\partial x_{i,j}} = \frac{1}{N}\left(\frac{e^{x_{i,j}}}{\sum_{k=1}^{C} e^{x_{i,k}}} - 1\{j = y_i\}\right) = \frac{1}{N}\left(p_{i,j} - 1\{j = y_i\}\right)$$
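For completeness, a short derivation of that gradient (my own addition; it is the standard softmax cross-entropy chain rule):

$$\begin{aligned} L_i &= \log \sum_{k=1}^{C} e^{x_{i,k}} - x_{i,y_i} \\ \frac{\partial L_i}{\partial x_{i,j}} &= \frac{e^{x_{i,j}}}{\sum_{k=1}^{C} e^{x_{i,k}}} - 1\{j = y_i\} = p_{i,j} - 1\{j = y_i\} \end{aligned}$$

Averaging over the batch contributes the $\frac{1}{N}$ factor, which is exactly the "subtract 1 at the correct class, then divide by N" step in the numpy backward pass below.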
2. PyTorch Implementation
- Reference: PyTorch: custom loss functions (Loss) blog.csdn.net/xholes/arti…
If the forward pass is built from PyTorch's own tensor operations, autograd derives the backward pass automatically, so there is no need to override the backward method:
import torch
import torch.nn as nn


# Custom loss module, subclassing nn.Module
class CustomCrossEntropyLoss(nn.Module):

    def forward(self, inputs, targets):
        # Assert the expected shapes: inputs is (N, C) and targets holds
        # one label per sample; a failed assertion raises an exception
        assert len(inputs.shape) == 2 and len(targets) == inputs.shape[0]
        # Cross-entropy: log-sum-exp of the scores minus the score of the correct class
        total = torch.log(torch.sum(torch.exp(inputs), dim=1))
        batch = inputs.shape[0]
        # Sum the per-sample losses into a scalar, then average over the batch
        loss = torch.sum(total - inputs[torch.arange(batch), targets])
        return loss / batch
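A quick sanity check (my own snippet) against the built-in loss:

criterion = CustomCrossEntropyLoss()
inputs = torch.randn(3, 5, requires_grad=True)
targets = torch.randint(0, 5, (3,))
# The two values should agree up to floating-point error
print(criterion(inputs, targets))
print(nn.CrossEntropyLoss()(inputs, targets))

Design note: torch.exp can overflow for large scores; torch.logsumexp(inputs, dim=1) is the numerically stable equivalent of the log-sum-exp line above.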
3. Numpy Implementation
- Reference: Creating Extensions Using numpy and scipy pytorch.org/tutorials/a…
Implementing cross-entropy loss in numpy means writing both the forward and the backward pass by hand:
import torch
import torch.nn as nn
from torch.autograd import Function
import numpy as np


class NumpyCrossEntropyLossFunction(Function):

    @staticmethod
    def forward(ctx, inputs, labels):
        # Save the tensors the backward pass will need
        ctx.save_for_backward(inputs.detach(), labels.detach())
        # Note the data-format conversion
        # scores = inputs.detach().numpy()
        # labels = labels.detach().numpy()
        # Go through .cpu() first to avoid a TypeError on CUDA tensors
        scores = inputs.detach().cpu().numpy()
        labels = labels.detach().cpu().numpy()
        assert len(scores.shape) == 2
        assert len(labels.shape) == 1
        # Shift scores for numerical stability (softmax is shift-invariant).
        # Careful: on CPU tensors, .numpy() shares memory with the tensor, so this
        # in-place shift also mutates the caller's inputs; .numpy().copy() avoids that
        scores -= np.max(scores, axis=1, keepdims=True)
        expscores = np.exp(scores)
        probs = expscores / np.sum(expscores, axis=1, keepdims=True)
        N = labels.shape[0]
        correct_probs = probs[range(N), labels]
        loss = -1.0 / N * np.sum(np.log(correct_probs))
        return torch.as_tensor(loss, dtype=inputs.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        # Incoming gradient from downstream (1.0 when the loss is the graph's final
        # node); strictly, the returned gradient should also be scaled by it
        grad_output = grad_output.detach().numpy()
        inputs, labels = ctx.saved_tensors
        # scores = inputs.numpy()
        # labels = labels.numpy()
        # Go through .cpu() first to avoid a TypeError on CUDA tensors
        scores = inputs.cpu().numpy()
        labels = labels.cpu().numpy()
        scores -= np.max(scores, axis=1, keepdims=True)
        expscores = np.exp(scores)
        probs = expscores / np.sum(expscores, axis=1, keepdims=True)
        # Gradient w.r.t. the scores: probs - one_hot(labels)
        grad_out = probs
        N = labels.shape[0]
        grad_out[range(N), labels] -= 1
        # Return a CUDA tensor so the gradient matches GPU inputs
        # return torch.from_numpy(grad_out / N), None
        return torch.from_numpy(grad_out / N).cuda(), None


class NumpyCrossEntropyLoss(nn.Module):

    def forward(self, inputs, labels):
        return NumpyCrossEntropyLossFunction.apply(inputs, labels)
Notes on the forward pass:
- The output must be a torch.Tensor,
- Input tensors must be converted to numpy by calling detach() first, then numpy()
- Inputs can be stashed with save_for_backward for use during backpropagation
Notes on the backward pass (verified numerically in the sketch below):
- The input argument is the gradient flowing in from the next layer and likewise needs the detach().numpy() conversion; if this node is the last one in the graph, it equals 1.0
- The data saved in forward is retrieved through saved_tensors
- The number of return values must match the number of forward inputs (here: a gradient for inputs, and None for labels)
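The hand-written backward can also be checked with torch.autograd.gradcheck. The sketch below is my own; it assumes the device-agnostic return torch.from_numpy(grad_out / N).to(inputs.device) suggested in the code comments (so it runs on CPU), and clones the input so the Function's in-place score shift cannot disturb gradcheck's buffers:

import torch
from torch.autograd import gradcheck

# Wrap apply() so the Function sees a clone; its in-place numpy shift then
# mutates the clone instead of gradcheck's own input buffer
def loss_fn(x, y):
    return NumpyCrossEntropyLossFunction.apply(x.clone(), y)

# gradcheck compares the analytic backward against finite differences
# and needs double-precision inputs for stable numerics
inputs = torch.randn(4, 5, dtype=torch.double, requires_grad=True)
labels = torch.randint(0, 5, (4,))
print(gradcheck(loss_fn, (inputs, labels), eps=1e-6, atol=1e-4))  # True if they match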
4. Test 1
# Define the data
inputs = torch.Tensor([[1.1785, -0.0969, 0.5756, -1.2113, -0.1120],
                       [-0.5199, -0.8051, 1.0953, 0.1480, 0.2879],
                       [2.3401, 0.6403, 1.4306, 0.0982, -0.7363]])
inputs.requires_grad = True
targets = torch.Tensor([4, 3, 1]).type(torch.long)

# Built-in cross-entropy loss
criterion = nn.CrossEntropyLoss()
loss = criterion(inputs, targets)
print(loss)
loss.backward()
print(inputs.grad)

# Clear the gradient so the second run starts fresh
inputs.grad = None

# Custom numpy cross-entropy loss
criterion2 = NumpyCrossEntropyLoss()
loss = criterion2(inputs, targets)
print(loss)
loss.backward()
print(inputs.grad)
tensor(2.0187, grad_fn=<NllLossBackward0>)
tensor([[ 0.1520, 0.0424, 0.0832, 0.0139, -0.2915],
[ 0.0304, 0.0228, 0.1528, -0.2741, 0.0681],
[ 0.1918, -0.2983, 0.0772, 0.0204, 0.0088]])
tensor(2.0187, grad_fn=<NumpyCrossEntropyLossFunctionBackward>)
tensor([[ 0.1520, 0.0424, 0.0832, 0.0139, -0.2915],
[ 0.0304, 0.0228, 0.1528, -0.2741, 0.0681],
[ 0.1918, -0.2983, 0.0772, 0.0204, 0.0088]])
5. Test 2
Train a LeNet-5 model on the Fashion-MNIST dataset:
# -*- coding: utf-8 -*-

"""
@author: zj
@file: loss_lenet-5.py
@time: 2020-01-16
"""

import logging

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torch.utils.data as data
from torchvision.datasets import FashionMNIST
import torchvision.transforms as transforms

logging.basicConfig(format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s', level=logging.DEBUG)


def load_data():
    transform = transforms.Compose([
        transforms.Grayscale(),
        transforms.Resize(size=(32, 32)),
        transforms.ToTensor(),
        transforms.Normalize(mean=(0.5,), std=(0.5,))
    ])

    # Download the dataset on first use; retry if the download is interrupted
    try:
        train_dataset = FashionMNIST('./data/fashionmnist/', train=True, download=True, transform=transform)
        test_dataset = FashionMNIST('./data/fashionmnist', train=False, download=True, transform=transform)
    except Exception:
        return load_data()

    train_dataloader = data.DataLoader(train_dataset, batch_size=128, shuffle=True, num_workers=4)
    test_dataloader = data.DataLoader(test_dataset, batch_size=128, shuffle=True, num_workers=4)
    return train_dataloader, test_dataloader


class LeNet5(nn.Module):

    def __init__(self):
        super(LeNet5, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, stride=1, padding=0, bias=True)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5, stride=1, padding=0, bias=True)
        self.conv3 = nn.Conv2d(in_channels=16, out_channels=120, kernel_size=5, stride=1, padding=0, bias=True)
        self.pool = nn.MaxPool2d((2, 2), stride=2)
        self.fc1 = nn.Linear(in_features=120, out_features=84, bias=True)
        self.fc2 = nn.Linear(84, 10, bias=True)

    def forward(self, input):
        x = self.pool(F.relu(self.conv1(input)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.conv3(x)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        return self.fc2(x)

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


def compute_accuracy(loader, net, device):
    total_accuracy = 0
    num = 0
    # No gradients are needed for evaluation
    with torch.no_grad():
        for item in loader:
            data, labels = item
            data = data.to(device)
            labels = labels.to(device)

            scores = net(data)
            predicted = torch.argmax(scores, dim=1)
            total_accuracy += torch.mean((predicted == labels).float()).item()
            num += 1
    return total_accuracy / num


if __name__ == '__main__':
    train_dataloader, test_dataloader = load_data()
    device = torch.device('cuda:0' if torch.cuda.is_available() else "cpu")

    net = LeNet5().to(device)
    criterion = nn.CrossEntropyLoss().to(device)
    optimizer = optim.Adam(net.parameters(), lr=1e-3)

    logging.info("start training")
    epochs = 5
    for i in range(epochs):
        num = 0
        total_loss = 0
        for j, item in enumerate(train_dataloader, 0):
            data, labels = item
            data = data.to(device)
            labels = labels.to(device)

            scores = net(data)
            loss = criterion(scores, labels)
            total_loss += loss.item()

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            num += 1
        logging.info('epoch: %d loss: %.6f' % (i + 1, total_loss / num))

        train_accuracy = compute_accuracy(train_dataloader, net, device)
        test_accuracy = compute_accuracy(test_dataloader, net, device)
        logging.info('train accuracy: %f test accuracy: %f' % (train_accuracy, test_accuracy))
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw\train-images-idx3-ubyte.gz
0%| | 0/26421880 [00:00<?, ?it/s]
Extracting ./data/fashionmnist/FashionMNIST\raw\train-images-idx3-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw\train-labels-idx1-ubyte.gz
0%| | 0/29515 [00:00<?, ?it/s]
Extracting ./data/fashionmnist/FashionMNIST\raw\train-labels-idx1-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw\t10k-images-idx3-ubyte.gz
0%| | 0/4422102 [00:00<?, ?it/s]
Extracting ./data/fashionmnist/FashionMNIST\raw\t10k-images-idx3-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw\t10k-labels-idx1-ubyte.gz
0%| | 0/5148 [00:00<?, ?it/s]
Extracting ./data/fashionmnist/FashionMNIST\raw\t10k-labels-idx1-ubyte.gz to ./data/fashionmnist/FashionMNIST\raw
2022-06-15 12:18:17,910 3628225910.py[line:95] INFO start training
2022-06-15 12:18:25,811 3628225910.py[line:115] INFO epoch: 1 loss: 0.675206
2022-06-15 12:18:30,125 3628225910.py[line:118] INFO train accuracy: 0.825599 test accuracy: 0.815170
2022-06-15 12:18:33,067 3628225910.py[line:115] INFO epoch: 2 loss: 0.431103
2022-06-15 12:18:37,182 3628225910.py[line:118] INFO train accuracy: 0.850024 test accuracy: 0.836234
2022-06-15 12:18:39,962 3628225910.py[line:115] INFO epoch: 3 loss: 0.364287
2022-06-15 12:18:44,236 3628225910.py[line:118] INFO train accuracy: 0.879820 test accuracy: 0.868572
2022-06-15 12:18:47,027 3628225910.py[line:115] INFO epoch: 4 loss: 0.328362
2022-06-15 12:18:51,157 3628225910.py[line:118] INFO train accuracy: 0.889614 test accuracy: 0.879252
2022-06-15 12:18:53,874 3628225910.py[line:115] INFO epoch: 5 loss: 0.307533
2022-06-15 12:18:57,997 3628225910.py[line:118] INFO train accuracy: 0.897116 test accuracy: 0.884691
- Note: if the download fails with the error "[WinError 10054] An existing connection was forcibly closed by the remote host.", you can
- switch to a different Wi-Fi network
- wrap the download in try-except and retry, as load_data() above does
- use a proxy to reach the download server
Swapping in the custom NumpyCrossEntropyLoss is a one-line change; the rest of the script is identical to Test 2 above:

# from custom_cross_entropy_loss import NumpyCrossEntropyLoss
criterion = NumpyCrossEntropyLoss().to(device)
# criterion = nn.CrossEntropyLoss().to(device)

Training with the custom loss produces comparable results:
2022-06-15 12:25:25,690 931105398.py[line:97] INFO start training
2022-06-15 12:25:28,552 931105398.py[line:117] INFO epoch: 1 loss: 0.642837
2022-06-15 12:25:32,755 931105398.py[line:120] INFO train accuracy: 0.841229 test accuracy: 0.831982
2022-06-15 12:25:35,449 931105398.py[line:117] INFO epoch: 2 loss: 0.402902
2022-06-15 12:25:39,547 931105398.py[line:120] INFO train accuracy: 0.864483 test accuracy: 0.855024
2022-06-15 12:25:42,293 931105398.py[line:117] INFO epoch: 3 loss: 0.350990
2022-06-15 12:25:46,423 931105398.py[line:120] INFO train accuracy: 0.881313 test accuracy: 0.871341
2022-06-15 12:25:49,279 931105398.py[line:117] INFO epoch: 4 loss: 0.323117
2022-06-15 12:25:53,596 931105398.py[line:120] INFO train accuracy: 0.887554 test accuracy: 0.874802
2022-06-15 12:25:56,486 931105398.py[line:117] INFO epoch: 5 loss: 0.305991
2022-06-15 12:26:00,656 931105398.py[line:120] INFO train accuracy: 0.895200 test accuracy: 0.878461
- Note: the earlier run with the built-in nn loss trained on an Nvidia GPU, where the tensor operations live in GPU memory. A CUDA tensor must be copied to the CPU before converting to numpy, otherwise the following error occurs:
TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
So the tensor conversions inside the NumpyCrossEntropyLoss classes must go through .cpu(), and the result returned by backward must be converted back to a CUDA tensor. The modified parts:
class NumpyCrossEntropyLossFunction(Function):

    @staticmethod
    def forward(ctx, inputs, labels):
        ...
        # Note the data-format conversion
        scores = inputs.detach().cpu().numpy()
        labels = labels.detach().cpu().numpy()
        ...
        ...
        return torch.as_tensor(loss, dtype=inputs.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        ...
        ...
        scores = inputs.cpu().numpy()
        labels = labels.cpu().numpy()
        ...
        ...
        return torch.from_numpy(grad_out / N).cuda(), None
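This version is hard-wired to CUDA. A device-agnostic alternative (my own suggestion, not from the original) sends the gradient back to whatever device the inputs live on:

# Works on both CPU and GPU: return the gradient on the inputs' device
return torch.from_numpy(grad_out / N).to(inputs.device), None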
III. Summary
When writing a custom loss function, prefer building the forward pass from PyTorch's own tensor operations whenever possible; autograd then provides the backward pass automatically, and there is no need to implement the differentiation by hand.