Deep Learning Notes - CIFAR-10 Image Classification with PyTorch


Today I continue consolidating CNNs and the functions used for network training.

1. Experimental Environment

  • Language: Python 3.8
  • Editor: Jupyter Notebook
  • Deep learning framework: PyTorch
  • Everything is written in a notebook (.ipynb), which makes it easy to test against intermediate results

Check whether a GPU is available:

import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
# cuda:0

2. Dataset Preparation

The experiment uses the CIFAR-10 dataset packaged by torchvision; the test code is below. The pitfall I hit this time: mind the capitalization of torch.utils.data.DataLoader; the all-lowercase spelling will not be recognized.

import torchvision
from torch.utils.data import DataLoader

train_ds = torchvision.datasets.CIFAR10('data', 
                                      train=True, 
                                      transform=torchvision.transforms.ToTensor(), # convert the data to Tensor
                                      download=True)

test_ds  = torchvision.datasets.CIFAR10('data', 
                                      train=False, 
                                      transform=torchvision.transforms.ToTensor(), # convert the data to Tensor
                                      download=True)

batch_size = 32
train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
test_dl = DataLoader(test_ds, batch_size=batch_size, shuffle=False)

data, label = next(iter(train_dl))
print(data.shape)
print(label.shape)

# torch.Size([32, 3, 32, 32])
# torch.Size([32])
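
For reference, the integer labels can be mapped back to class names through the dataset's classes attribute:

print(train_ds.classes)
# ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']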

3. Building the Network

General patterns for handling image data:

  • A convolution layer is usually followed by a pooling layer
  • Before the classification head, flatten the tensor with tensor.view or torch.flatten (see the sketch right after this list)
  • Add an activation after each convolution or MLP layer; ReLU is the common choice: F.relu(data)
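
A quick sketch of the two flattening options; the shapes match the feature map this post's network eventually produces:

import torch

x = torch.randn(32, 64, 6, 6)         # fake feature map: [batch, C, H, W]
flat_view    = x.view(x.size(0), -1)  # keep the batch dim, merge the rest
flat_flatten = torch.flatten(x, 1)    # equivalent: flatten from dim 1 onward
print(flat_view.shape, flat_flatten.shape)
# torch.Size([32, 2304]) torch.Size([32, 2304])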

Convolution layer:

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)  

Pooling layer:

torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
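
If you do want to compute the sizes by hand, Conv2d and MaxPool2d share the same output-size formula; this helper is my own illustration, not part of the original post:

import math

def conv_out_size(h, kernel_size, stride=1, padding=0, dilation=1):
    # H_out = floor((H_in + 2*padding - dilation*(kernel_size-1) - 1) / stride + 1)
    return math.floor((h + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)

h = conv_out_size(32, 3)           # conv1: 32 -> 30
h = conv_out_size(h, 2, stride=2)  # pool1: 30 -> 15
h = conv_out_size(h, 3)            # conv2: 15 -> 13
h = conv_out_size(h, 2, stride=2)  # pool2: 13 -> 6
print(h * h * 64)                  # 2304, the in_features of the final Linear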

If you don't feel like computing the intermediate conv dimensions, just print the intermediate results to check:

import torch.nn as nn

class CNN(nn.Module):
    def __init__(self, num_class) -> None:
        super().__init__()
        self.num_class = num_class
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, dilation=1)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, dilation=1)
        self.pool2 = nn.MaxPool2d(2)
        self.MLP = nn.Linear(64, num_class)  # placeholder; the correct in_features (2304) is set in the final version below

    def forward(self, data):
        # deliberately stop after conv1 to inspect the intermediate shape
        return self.conv1(data)

net = CNN(num_class=10)
output = net(data)
print(output.shape)
# torch.Size([32, 32, 30, 30])

The network structure built by hand is below; the repeated structure could later be refactored with nn.Sequential() (a sketch follows the shape check below).

import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, num_class) -> None:
        super().__init__()
        self.num_class = num_class
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, dilation=1)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, dilation=1)
        self.pool2 = nn.MaxPool2d(2)
        self.dropout = nn.Dropout(0.1)
        self.MLP = nn.Linear(2304, self.num_class)  # 64 channels * 6 * 6 spatial = 2304

    def forward(self, data):
        data = self.pool1(F.relu(self.conv1(data)))
        data = self.pool2(F.relu(self.conv2(data)))

        b, _, _, _ = data.shape
        data = data.view(b, -1)    # flatten to [batch, 2304] before the classifier

        data = self.dropout(data)  # apply dropout to the features, not to the logits
        data = self.MLP(data)

        return data

net = CNN(num_class=10).to(device)
output = net(data.to(device))  # the input must be on the same device as the model
print(output.shape)
# torch.Size([32, 10])
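
As mentioned above, the repeated conv-relu-pool pattern can be collapsed with nn.Sequential. A sketch of an equivalent model (same layers, just regrouped; the CNNSeq name is mine):

class CNNSeq(nn.Module):
    def __init__(self, num_class) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),      # replaces the manual view(b, -1)
            nn.Dropout(0.1),
            nn.Linear(2304, num_class),
        )

    def forward(self, data):
        return self.classifier(self.features(data))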

4. Training and Test Functions

Before training, a few hyperparameters must be defined: the learning rate, the optimizer, and the loss function.

from torch.optim import Adam
from torch.nn import CrossEntropyLoss

learning_rate = 0.001
optimizer = Adam(net.parameters(), lr=learning_rate)
loss = CrossEntropyLoss()
epochs = 30

Pitfalls I've hit:

  • When creating the optimizer, pass model.parameters() into its constructor.
  • Mind the device: before training, call model.to(device) and move every batch with data.to(device).
  • How to check which device a tensor or a model is on?
    • print(next(model.parameters()).device)
    • print(data.device)
  • model.train() puts Batch Normalization and Dropout into training mode.
  • model.eval() puts Batch Normalization and Dropout into evaluation mode.
  • nn.CrossEntropyLoss() needs no softmax: pass it the raw network output [batch_size, num_class] and the labels [batch] (see the sketch after this list).
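
A minimal sketch of that last point, on random data:

import torch
from torch.nn import CrossEntropyLoss

loss_fn = CrossEntropyLoss()
logits = torch.randn(32, 10)          # raw network outputs, no softmax applied
labels = torch.randint(0, 10, (32,))  # integer class indices
print(loss_fn(logits, labels))        # softmax + negative log-likelihood handled internally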

I took a shortcut here and reused the train and test functions I had written before.

  • train takes the dataloader, model, loss, and optimizer
  • test takes the dataloader, model, and loss
  • Both return the recorded accuracy and loss; later I may add a validation set plus tensorboard/wandb visualization (a minimal plotting sketch appears at the end of this post)

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)  # size of the training set: 50000 images
    num_batches = len(dataloader)   # number of batches: 1563 (50000/32 = 1562.5, rounded up)

    train_loss, train_acc = 0, 0  # running loss and number of correct predictions
    
    for X, y in dataloader:  # fetch a batch of images and labels
        X, y = X.to(device), y.to(device)
        
        # compute the prediction error
        pred = model(X)          # network output
        loss = loss_fn(pred, y)  # gap between the network output and the ground-truth targets
        
        # backpropagation
        optimizer.zero_grad()  # reset the gradients
        loss.backward()        # backpropagate
        optimizer.step()       # update the parameters
        
        # record acc and loss
        train_acc  += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()
            
    train_acc  /= size
    train_loss /= num_batches

    return train_acc, train_loss
    
    
def test(dataloader, model, loss_fn):
    size        = len(dataloader.dataset)  # size of the test set: 10000 images
    num_batches = len(dataloader)          # number of batches: 313 (10000/32 = 312.5, rounded up)
    test_loss, test_acc = 0, 0
    
    # no gradients are tracked during evaluation, which saves memory and compute
    with torch.no_grad():
        for imgs, target in dataloader:
            imgs, target = imgs.to(device), target.to(device)
            
            # compute the loss
            target_pred = model(imgs)
            loss        = loss_fn(target_pred, target)
            
            test_loss += loss.item()
            test_acc  += (target_pred.argmax(1) == target).type(torch.float).sum().item()

    test_acc  /= size
    test_loss /= num_batches

    return test_acc, test_loss  

The biggest pitfall: when defining the model, I created an nn.Linear inside the forward function, so when calling model.to(device), that module was never moved to the GPU.
Fix: define network modules only inside the model's __init__ method.
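
A minimal sketch of what goes wrong (the BadCNN name and sizes are mine for illustration):

class BadCNN(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv = nn.Conv2d(3, 32, kernel_size=3)

    def forward(self, data):
        # Bug: this layer is re-created (on the CPU) at every forward pass;
        # it is never registered as a submodule, so model.to(device) cannot
        # move it and optimizer.step() never updates its weights.
        mlp = nn.Linear(32 * 30 * 30, 10)
        return mlp(self.conv(data).view(data.size(0), -1))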

5. Main Script

from torch.optim import Adam
from torch.nn import CrossEntropyLoss

learning_rate = 0.001
optimizer = Adam(net.parameters(), lr=learning_rate)
loss = CrossEntropyLoss()
epochs = 30

train_acc = []
train_loss = []
test_loss = []
test_acc = []

for i in range(epochs):
    net.train()
    acc_train, loss_train = train(train_dl, net, loss, optimizer)
    
    net.eval()
    acc_test, loss_test = test(test_dl, net, loss)

    # record the per-epoch metrics for plotting later
    train_acc.append(acc_train)
    train_loss.append(loss_train)
    test_acc.append(acc_test)
    test_loss.append(loss_test)

    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}')
    print(template.format(i + 1, acc_train * 100, loss_train, acc_test * 100, loss_test))

print("done")

Results: (screenshot of the per-epoch training log omitted)
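
As a stand-in for the tensorboard/wandb visualization mentioned earlier, a minimal matplotlib sketch of the recorded curves:

import matplotlib.pyplot as plt

epochs_range = range(1, epochs + 1)
plt.figure(figsize=(10, 4))

plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='train')
plt.plot(epochs_range, test_acc, label='test')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='train')
plt.plot(epochs_range, test_loss, label='test')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()

plt.show()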