Table of Contents
- TTA
- FocalLoss()
- Cross Validation + Ensemble
1 TTA
Test Time Augmentation (TTA) is used here. Besides the test loader built with the original test_tfm, five additional test loaders are built with train_tfm; the six loaders are weighted 0.6, 0.1, 0.1, 0.1, 0.1, 0.1 respectively.
As the name suggests, TTA augments images at test time, and it can raise accuracy by a few percentage points.
Several versions of each original image are produced (different crops, scalings, and so on) and fed to the model; the outputs for all versions are then averaged to give the image's final score.
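The weighted-averaging step can be sketched in plain Python. The score vectors below are hypothetical stand-ins for the model outputs on the original (test_tfm) view and the five augmented (train_tfm) views; the real pipeline appears later in this section:

```python
def tta_merge(scores, weights):
    """Combine per-version class-score vectors with the given weights."""
    num_classes = len(scores[0])
    return [sum(w * s[c] for w, s in zip(weights, scores))
            for c in range(num_classes)]

scores = [
    [0.7, 0.2, 0.1],  # original (test_tfm) version
    [0.6, 0.3, 0.1],  # augmented version 1
    [0.5, 0.4, 0.1],  # augmented version 2
    [0.8, 0.1, 0.1],  # augmented version 3
    [0.6, 0.2, 0.2],  # augmented version 4
    [0.4, 0.5, 0.1],  # augmented version 5
]
weights = [0.6, 0.1, 0.1, 0.1, 0.1, 0.1]
merged = tta_merge(scores, weights)
pred = max(range(len(merged)), key=merged.__getitem__)  # argmax over classes
```

Note that the weights need not sum to 1: the final argmax is unaffected by an overall scale factor.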
- Method 1: concrete approach
1. Build a test loader with the original test_tfm
test_set = FoodDataset(test_dir, tfm=test_tfm)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)
2. Build five more test loaders with train_tfm
# Build five additional test loaders with train_tfm (random augmentation)
test_loaders = []
for i in range(5):
    test_set_i = FoodDataset(test_dir, tfm=train_tfm)
    test_loader_i = DataLoader(test_set_i, batch_size=batch_size, shuffle=False, num_workers=0, pin_memory=True)
    test_loaders.append(test_loader_i)
- Method 2: use the ttach library
# Alternatively, install and use the ttach package
!pip install ttach
import ttach as tta
tta_model = tta.ClassificationTTAWrapper(model, transforms=tta.aliases.d4_transform(), merge_mode='mean')
2 FocalLoss()
Focal Loss is a loss function designed to address class imbalance, and is especially useful in object detection and image classification.
When training deep learning models, class imbalance means that some classes in the training set have far more samples than others,
which can make the model perform poorly on the minority classes during training and prediction.
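Focal loss reshapes cross entropy with a modulating factor so that well-classified (easy) examples contribute less to the loss. With $p_t$ the model's predicted probability for the true class, $\alpha_t$ a per-class weight, and $\gamma \ge 0$ the focusing parameter:

```latex
\mathrm{FL}(p_t) = -\,\alpha_t \,(1 - p_t)^{\gamma} \,\log(p_t)
```

With $\gamma = 0$ this reduces to ($\alpha$-weighted) cross entropy; larger $\gamma$ down-weights easy examples more strongly.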
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    def __init__(self, gamma=0, alpha=None, size_average=True):
        super(FocalLoss, self).__init__()
        self.gamma = gamma
        self.alpha = alpha
        if isinstance(alpha, (float, int)):
            self.alpha = torch.Tensor([alpha, 1 - alpha])
        if isinstance(alpha, list):
            self.alpha = torch.Tensor(alpha)
        self.size_average = size_average

    def forward(self, input, target):
        if input.dim() > 2:
            input = input.view(input.size(0), input.size(1), -1)  # N,C,H,W => N,C,H*W
            input = input.transpose(1, 2)                         # N,C,H*W => N,H*W,C
            input = input.contiguous().view(-1, input.size(2))    # N,H*W,C => N*H*W,C
        target = target.view(-1, 1)

        logpt = F.log_softmax(input, dim=1)  # log-probabilities per class
        logpt = logpt.gather(1, target)      # keep only the true-class log-probability
        logpt = logpt.view(-1)
        pt = logpt.detach().exp()            # p_t, detached (Variable is deprecated)

        if self.alpha is not None:
            if self.alpha.type() != input.type():
                self.alpha = self.alpha.type_as(input)
            at = self.alpha.gather(0, target.view(-1))
            logpt = logpt * at

        loss = -1 * (1 - pt) ** self.gamma * logpt
        if self.size_average:
            return loss.mean()
        else:
            return loss.sum()
criterion = FocalLoss()  # note: gamma=0 is plain cross entropy; gamma=2 is a common choice
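The per-example computation can be checked numerically without torch. This pure-Python sketch mirrors the class above on hypothetical logits for a single 3-class example: softmax, pick the true-class probability $p_t$, then apply the modulating factor:

```python
import math

def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example from raw logits: -(1-p_t)^gamma * log(p_t)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]  # numerically stable softmax
    pt = exps[target] / sum(exps)
    return -((1 - pt) ** gamma) * math.log(pt)

logits = [2.0, 0.5, 0.1]                       # hypothetical logits, true class = 0
ce = focal_loss(logits, target=0, gamma=0.0)   # gamma=0: plain cross entropy
fl = focal_loss(logits, target=0, gamma=2.0)   # gamma=2 shrinks this easy example's loss
```

Since class 0 is already predicted confidently, `fl` comes out much smaller than `ce`, which is exactly the down-weighting of easy examples that focal loss is for.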
3 Cross Validation + Ensemble
- 4-fold cross validation is used, producing 4 models.
- At inference time each image gets 4 outputs; the 4 outputs are summed, and argmax over the sum gives the classification result.
- Finally, these ensemble outputs are fused with the TTA loaders' outputs using the weights described above.
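The sum-then-argmax step can be sketched with toy numbers (hypothetical per-fold logits for one image, pure Python):

```python
# Outputs of the 4 fold models for one image (3 classes in this toy example).
fold_outputs = [
    [1.2, 0.3, -0.5],   # fold 1
    [0.8, 0.9, -0.1],   # fold 2
    [1.5, -0.2, 0.0],   # fold 3
    [0.9, 0.4, 0.2],    # fold 4
]
# Sum the four output vectors elementwise, then take the argmax.
summed = [sum(col) for col in zip(*fold_outputs)]
pred_class = max(range(len(summed)), key=summed.__getitem__)
```

Summing raw outputs before the argmax is equivalent (up to a constant factor) to averaging them, so no normalization by the number of folds is needed.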
import numpy as np

device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)

# Load the best checkpoint from each of the 4 folds
models = []
for i in range(0, 4):
    fold = i + 1
    model_best = Classifier(Residual_Block, num_layers).to(device)
    model_best.load_state_dict(torch.load(f"Fold_{fold}_best.ckpt"))
    model_best.eval()
    models.append(model_best)

# preds[0]: original test_tfm loader; preds[1..5]: the five train_tfm loaders
preds = [[], [], [], [], [], []]
with torch.no_grad():
    for data, _ in test_loader:
        batch_preds = []
        for model_best in models:
            batch_preds.append(model_best(data.to(device)).cpu().numpy())
        batch_preds = sum(batch_preds)  # elementwise sum over the 4 fold models
        preds[0].extend(batch_preds.squeeze().tolist())
    for i, loader in enumerate(test_loaders):
        for data, _ in loader:
            batch_preds = []
            for model_best in models:
                batch_preds.append(model_best(data.to(device)).cpu().numpy())
            batch_preds = sum(batch_preds)
            preds[i + 1].extend(batch_preds.squeeze().tolist())

preds = np.array(preds)
print(preds.shape)  # (6, num_test_images, num_classes)
# Weighted TTA fusion: 0.6 for the original view, 0.1 for each augmented view
# (the weights need not sum to 1, since argmax is scale-invariant)
preds = 0.6 * preds[0] + 0.1 * preds[1] + 0.1 * preds[2] + 0.1 * preds[3] + 0.1 * preds[4] + 0.1 * preds[5]
print(preds.shape)  # (num_test_images, num_classes)
prediction = np.argmax(preds, axis=1)