Common Accuracy-Boosting Techniques in Computer Vision / Machine Learning: TTA, Focal Loss, Cross Validation + Ensemble


Contents

  • TTA
  • FocalLoss()
  • Cross Validation + Ensemble

1 TTA

Use Test Time Augmentation (TTA). Keep the testloader built with the original test_tfm, and build 5 extra testloaders with train_tfm;
the 6 testloaders are weighted 0.6, 0.1, 0.1, 0.1, 0.1, 0.1 respectively.
As its name suggests, test time augmentation (TTA) augments the data at inference time, and it can raise accuracy by a few percentage points.
Multiple versions of each original image are created (different crops, different scalings, and so on) and fed into the model; the outputs for all versions are then averaged to give the final score for that image.
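The averaging step can be sketched in a few lines of NumPy. Everything here is a made-up stand-in (a toy linear "model" and random perturbations in place of real crop/scale transforms), just to show the shape of the computation: score every augmented copy, average over copies, then take the argmax.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x):
    # toy stand-in "model": fixed linear map from 8 features to 3 class scores
    W = np.arange(24).reshape(8, 3) / 10.0
    return x @ W

image = rng.normal(size=8)
# augmented versions: the original plus 5 small random perturbations
versions = [image] + [image + rng.normal(scale=0.1, size=8) for _ in range(5)]

scores = np.stack([model(v) for v in versions])  # shape (6, 3): one row per version
tta_score = scores.mean(axis=0)                  # average the 6 score vectors
prediction = int(np.argmax(tta_score))           # final class for this image
```

The weighted variant used in this post simply replaces the plain mean with a weighted sum (0.6 for the un-augmented version, 0.1 for each augmented one).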
  • Method 1: step-by-step

1. Build a testloader with the original test_tfm

test_set = FoodDataset(test_dir, tfm=test_tfm)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)

2. Build 5 more testloaders with train_tfm

# Build the other 5 testloaders with train_tfm; train_tfm applies random
# augmentations, so each loader yields a differently augmented copy of the test set
test_loaders = []
for i in range(5):
    test_set_i = FoodDataset(test_dir, tfm=train_tfm)
    test_loader_i = DataLoader(test_set_i, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)
    test_loaders.append(test_loader_i)
  • Method 2: use the ttach package
# Alternatively, use the ttach library
!pip install ttach
import ttach as tta
tta_model = tta.ClassificationTTAWrapper(model, transforms=tta.aliases.d4_transform(), merge_mode='mean')

2 FocalLoss()

Focal Loss is a loss function designed to address class imbalance, and it is especially useful in object detection and image classification.
When training a deep learning model, class imbalance means that some classes have far more training samples than others,
which can cause the model to perform poorly on the minority classes during training and prediction.
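The standard definition of focal loss (Lin et al., "Focal Loss for Dense Object Detection"), which the class below implements, is

```latex
FL(p_t) = -\alpha_t \, (1 - p_t)^{\gamma} \, \log(p_t)
```

where \(p_t\) is the model's predicted probability for the true class, \(\alpha_t\) is an optional per-class weight, and \(\gamma \ge 0\) controls how strongly easy (high-\(p_t\)) examples are down-weighted. With \(\gamma = 0\) and no \(\alpha\), this reduces to ordinary cross-entropy.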
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    def __init__(self, gamma=0, alpha=None, size_average=True):
        super(FocalLoss, self).__init__()
        self.gamma = gamma
        self.alpha = alpha
        # alpha can be a scalar (binary class weighting) or a per-class list
        if isinstance(alpha, (float, int)): self.alpha = torch.Tensor([alpha, 1 - alpha])
        if isinstance(alpha, list): self.alpha = torch.Tensor(alpha)
        self.size_average = size_average

    def forward(self, input, target):
        if input.dim() > 2:
            input = input.view(input.size(0), input.size(1), -1)  # N,C,H,W => N,C,H*W
            input = input.transpose(1, 2)    # N,C,H*W => N,H*W,C
            input = input.contiguous().view(-1, input.size(2))   # N,H*W,C => N*H*W,C
        target = target.view(-1, 1)

        logpt = F.log_softmax(input, dim=1)  # log-probability of every class
        logpt = logpt.gather(1, target)      # log p_t of the true class
        logpt = logpt.view(-1)
        pt = logpt.detach().exp()            # p_t, detached from the graph

        if self.alpha is not None:
            if self.alpha.device != input.device or self.alpha.dtype != input.dtype:
                self.alpha = self.alpha.to(input)
            at = self.alpha.gather(0, target.view(-1))
            logpt = logpt * at

        # the (1 - p_t)^gamma factor down-weights well-classified examples
        loss = -1 * (1 - pt) ** self.gamma * logpt
        if self.size_average: return loss.mean()
        else: return loss.sum()


criterion = FocalLoss()
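To see what the gamma factor actually does, here is a minimal NumPy sketch (independent of the torch class above, with made-up logits): with gamma=0 focal loss is plain cross-entropy, and raising gamma shrinks the loss contribution of confidently classified examples.

```python
import numpy as np

def log_softmax(x):
    # numerically stable log-softmax over the class axis
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def focal_loss(logits, targets, gamma=0.0):
    # log p_t of the true class for each sample
    logpt = log_softmax(logits)[np.arange(len(targets)), targets]
    pt = np.exp(logpt)
    # mean of -(1 - p_t)^gamma * log p_t over the batch
    return float(np.mean(-((1 - pt) ** gamma) * logpt))

logits = np.array([[2.0, 0.5, -1.0],   # sample confidently correct on class 0
                   [0.1, 1.2, 0.3]])   # sample less confident on class 2
targets = np.array([0, 2])

ce = focal_loss(logits, targets, gamma=0.0)  # plain cross-entropy
fl = focal_loss(logits, targets, gamma=2.0)  # easy examples down-weighted
```

Since 0 < p_t < 1, every per-sample term is multiplied by (1 - p_t)^gamma < 1, so the gamma=2 loss is strictly smaller, with the biggest reduction on the easiest samples.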

3 Cross Validation + Ensemble

  1. Train with 4-fold cross validation, obtaining 4 models.
  2. At inference time each image therefore has 4 outputs; sum the 4 outputs and take the argmax to get the predicted class.
  3. Finally, fuse this result with TTA.
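Step 1's 4-fold split can be sketched without any extra library; the dataset size below is made up for illustration. Each fold serves as the validation set exactly once, and one model is trained per split.

```python
import numpy as np

n_samples, k = 12, 4                 # hypothetical dataset size, 4 folds
indices = np.arange(n_samples)
folds = np.array_split(indices, k)   # 4 roughly equal index chunks

splits = []
for i in range(k):
    val_idx = folds[i]               # fold i held out for validation
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    splits.append((train_idx, val_idx))
# train one model per (train_idx, val_idx) pair, saving Fold_{i+1}_best.ckpt
```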
import numpy as np

device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)

models = []
for i in range(0, 4):
    fold = i + 1
    model_best = Classifier(Residual_Block, num_layers).to(device)
    model_best.load_state_dict(torch.load(f"Fold_{fold}_best.ckpt"))
    model_best.eval()
    models.append(model_best)

preds = [[], [], [], [], [], []]  # one list per testloader: original + 5 augmented
with torch.no_grad():
    for data, _ in test_loader:
        batch_preds = [] 
        for model_best in models:
            batch_preds.append(model_best(data.to(device)).cpu().data.numpy())
        batch_preds = sum(batch_preds)
        preds[0].extend(batch_preds.squeeze().tolist())

    for i, loader in enumerate(test_loaders):
        for data, _ in loader:
            batch_preds = []
            for model_best in models:
                batch_preds.append(model_best(data.to(device)).cpu().data.numpy())
            batch_preds = sum(batch_preds)
            preds[i+1].extend(batch_preds.squeeze().tolist())

preds = np.array(preds)
print(preds.shape)
preds = 0.6 * preds[0] + 0.1 * preds[1] + 0.1 * preds[2] + 0.1 * preds[3] + 0.1 * preds[4] + 0.1 * preds[5]
print(preds.shape)
prediction = np.argmax(preds, axis=1)
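The fusion at the end reduces to one weighted sum over the loader axis plus an argmax; here is the same computation on made-up score arrays (5 images, 3 classes) so the shapes are easy to follow.

```python
import numpy as np

rng = np.random.default_rng(1)
preds = rng.normal(size=(6, 5, 3))       # (loaders, images, classes): fake scores

weights = np.array([0.6, 0.1, 0.1, 0.1, 0.1, 0.1])
fused = np.tensordot(weights, preds, axes=1)  # weighted sum over loaders -> (5, 3)
prediction = np.argmax(fused, axis=1)         # one class label per image
```

Note that argmax is unaffected by an overall positive rescaling, so it does no harm that these weights sum to 1.1 rather than exactly 1.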