Data augmentation, also known as data expansion, applies transformations to the training set to make it richer, giving the model better generalization.
transforms - Cropping
transforms.CenterCrop
Purpose: crops a patch from the center of the image.
- size: desired size of the cropped patch
transforms.RandomCrop
Purpose: crops a patch of the given size from a random location in the image.
- size: desired size of the cropped patch
- padding: amount of padding
  - a single value a: pad all four sides by a pixels
  - a pair (a, b): pad left/right by a pixels and top/bottom by b pixels
  - a 4-tuple (a, b, c, d): pad the left, top, right, and bottom by a, b, c, and d pixels respectively
- pad_if_needed: pad the image if it is smaller than the requested size
- padding_mode: one of 4 padding modes:
  - 1. constant: pad with the pixel value given by fill
  - 2. edge: pad with the values of the edge pixels
  - 3. reflect: mirror padding that does not repeat the edge pixel, e.g. [1, 2, 3, 4] → [3, 2, 1, 2, 3, 4, 3, 2]
  - 4. symmetric: mirror padding that repeats the edge pixel, e.g. [1, 2, 3, 4] → [2, 1, 1, 2, 3, 4, 4, 3]
- fill: pixel value to pad with when padding_mode is constant
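The reflect vs. symmetric distinction above can be verified independently of torchvision with NumPy, whose np.pad implements the same two mirror modes (a minimal sketch):

```python
import numpy as np

row = np.array([1, 2, 3, 4])

# reflect: mirror around the edge pixel, the edge pixel itself is not repeated
print(np.pad(row, 2, mode="reflect"))    # [3 2 1 2 3 4 3 2]

# symmetric: mirror including the edge pixel, so it appears twice
print(np.pad(row, 2, mode="symmetric"))  # [2 1 1 2 3 4 4 3]
```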
RandomResizedCrop
Purpose: crops a patch of random area and random aspect ratio, then resizes it to the given size.
- size: desired size of the output patch
- scale: range of the crop area as a proportion of the original image, default (0.08, 1.0)
- ratio: range of the aspect ratio, default (3/4, 4/3)
- interpolation: interpolation method
  - PIL.Image.NEAREST
  - PIL.Image.BILINEAR
  - PIL.Image.BICUBIC
FiveCrop and TenCrop
Purpose: FiveCrop crops the four corners and the center of the image, producing 5 patches of the given size; TenCrop flips those 5 patches horizontally (or vertically), producing 10 patches.
- size: desired size of the cropped patches
- vertical_flip: use vertical instead of horizontal flipping (TenCrop)
transforms - Flipping and Rotation
RandomHorizontalFlip and RandomVerticalFlip
Purpose: flips the image horizontally (left-right) or vertically (top-bottom) with a given probability.
- p: probability of flipping (default 0.5)
RandomRotation
Purpose: rotates the image by a random angle.
- degrees: range of rotation angles
  - a single value a: the angle is sampled from (-a, a)
  - a pair (a, b): the angle is sampled from (a, b)
- resample: resampling method
- expand: whether to enlarge the output image so the rotated image is not cropped
- center: center of rotation (defaults to the center of the image)
transforms - Image Transformations
transforms.Pad
Purpose: pads the borders of the image.
- padding: amount of padding
  - a single value a: pad all four sides by a pixels
  - a pair (a, b): pad left/right by a pixels and top/bottom by b pixels
  - a 4-tuple (a, b, c, d): pad the left, top, right, and bottom by a, b, c, and d pixels respectively
- padding_mode: one of 4 modes: constant, edge, reflect, and symmetric
- fill: pixel value to pad with when padding_mode is constant, (R, G, B) or (Gray)
transforms.ColorJitter
Purpose: randomly adjusts brightness, contrast, saturation, and hue.
- brightness: brightness jitter factor
  - a single value a: the factor is sampled from [max(0, 1-a), 1+a]
  - a pair (a, b): the factor is sampled from [a, b]
- contrast: contrast jitter factor, same convention as brightness
- saturation: saturation jitter factor, same convention as brightness
- hue: hue jitter factor
  - a single value a: the factor is sampled from [-a, a]; requires 0 <= a <= 0.5
  - a pair (a, b): the factor is sampled from [a, b]; requires -0.5 <= a <= b <= 0.5
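The single-value rule above can be written out explicitly; jitter_range is a hypothetical helper, not part of torchvision:

```python
def jitter_range(a, center=1.0, lower_bound=0.0):
    """Range a single jitter value a expands to: [max(lower_bound, center - a), center + a]."""
    return (max(lower_bound, center - a), center + a)

print(jitter_range(0.5))  # (0.5, 1.5): brightness=0.5 samples factors in [0.5, 1.5]
print(jitter_range(1.5))  # (0.0, 2.5): the lower end is clipped at 0
```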
Grayscale and RandomGrayscale
Purpose: converts the image to grayscale (RandomGrayscale does so with probability p).
- num_output_channels: number of output channels, either 1 or 3
- p: probability that the image is converted to grayscale
RandomAffine
Purpose: applies a random affine transformation, the two-dimensional linear mapping composed of five elementary transforms: rotation, translation, scaling, shear, and flipping.
- degrees: range of rotation angles
- translate: range of translations, e.g. (a, b): the horizontal shift dx is sampled from -img_width * a < dx < img_width * a, and the vertical shift dy from -img_height * b < dy < img_height * b
- scale: scaling factor interval; a factor is sampled from the given (a, b)
- shear: range of shear angles
- fillcolor: fill color for the area outside the transformed image (renamed fill in recent torchvision releases)
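A small sketch of the translate rule; max_shift is a hypothetical helper that simply evaluates the bounds stated above:

```python
def max_shift(img_width, img_height, translate):
    """Maximum absolute shift in pixels for RandomAffine's translate=(a, b)."""
    a, b = translate
    return img_width * a, img_height * b

dx_max, dy_max = max_shift(224, 224, (0.25, 0.125))
print(dx_max, dy_max)  # 56.0 28.0: dx is sampled from (-56, 56), dy from (-28, 28)
```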
RandomErasing
Purpose: randomly erases a rectangular region of the image. Note that it operates on tensors, so it must be placed after ToTensor.
- p: probability of performing the erasure
- scale: range of the erased area as a proportion of the image
- ratio: range of the aspect ratio of the erased region
- value: pixel value used for the erased region, (R, G, B) or (Gray)
Reference: "Random Erasing Data Augmentation"
Default arguments: RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False)
transforms.Lambda
Purpose: wraps a user-defined lambda as a transform.
- lambd: the lambda (anonymous) function
Other transforms operations
transforms.RandomChoice
Purpose: randomly picks one transform from a list of transforms.
transforms.RandomApply
Purpose: applies a group of transforms with a given probability.
transforms.RandomOrder
Purpose: applies a group of transforms in a random order.
Custom transforms
Requirements for a custom transform: 1. it accepts exactly one argument and returns exactly one value; 2. its input must match the output of the upstream transform, and its output must match the input of the downstream one.
transforms are invoked inside Compose.
A custom transform should implement the following structure:
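A sketch of that structure (YourTransform and param are placeholder names): hyperparameters are stored in __init__, and __call__ receives one image and returns one image, so the transform slots into a Compose chain.

```python
class YourTransform(object):
    """Skeleton of a custom transform usable inside transforms.Compose."""

    def __init__(self, param):
        # store hyperparameters at construction time
        self.param = param

    def __call__(self, img):
        # receive exactly one argument (the image) and return exactly one value,
        # so the output can feed the next transform in the Compose chain
        return img  # apply the actual operation here
```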
Example
Salt-and-pepper noise
Salt-and-pepper noise, also called impulse noise, consists of randomly occurring white and black pixels; the white pixels are salt noise and the black pixels are pepper noise.
The signal-to-noise ratio (Signal-Noise Rate, SNR) measures the proportion of noise; here it is the fraction of image pixels that keep their original (signal) value.
import os
import numpy as np
import torch
import random
import torchvision.transforms as transforms
from PIL import Image
from matplotlib import pyplot as plt
from torch.utils.data import DataLoader
from tools.my_dataset import RMBDataset
from tools.common_tools import transform_invert

def set_seed(seed=1):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)

set_seed(1)  # set random seed

# parameter settings
MAX_EPOCH = 10
BATCH_SIZE = 1
LR = 0.01
log_interval = 10
val_interval = 1
rmb_label = {"1": 0, "100": 1}
class AddPepperNoise(object):
    """Add salt-and-pepper noise.

    Args:
        snr (float): signal-to-noise ratio, the fraction of pixels kept as signal
        p (float): probability of applying the transform
    """

    def __init__(self, snr, p=0.9):
        assert isinstance(snr, float) and isinstance(p, float)
        self.snr = snr
        self.p = p

    def __call__(self, img):
        """
        Args:
            img (PIL Image): PIL Image
        Returns:
            PIL Image: PIL Image
        """
        if random.uniform(0, 1) < self.p:
            img_ = np.array(img).copy()
            h, w, c = img_.shape
            signal_pct = self.snr
            noise_pct = 1 - self.snr
            # 0 keeps the pixel; 1 and 2 mark salt and pepper noise
            mask = np.random.choice((0, 1, 2), size=(h, w, 1),
                                    p=[signal_pct, noise_pct / 2., noise_pct / 2.])
            mask = np.repeat(mask, c, axis=2)
            img_[mask == 1] = 255  # salt noise (white)
            img_[mask == 2] = 0    # pepper noise (black)
            return Image.fromarray(img_.astype('uint8')).convert('RGB')
        else:
            return img
# ============================ step 1/5 Data ============================
split_dir = os.path.join("..", "..", "data", "rmb_split")
train_dir = os.path.join(split_dir, "train")
valid_dir = os.path.join(split_dir, "valid")

norm_mean = [0.485, 0.456, 0.406]
norm_std = [0.229, 0.224, 0.225]

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    AddPepperNoise(0.9, p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std),
])

valid_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std)
])

# build MyDataset instances
train_data = RMBDataset(data_dir=train_dir, transform=train_transform)
valid_data = RMBDataset(data_dir=valid_dir, transform=valid_transform)

# build DataLoaders
train_loader = DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
valid_loader = DataLoader(dataset=valid_data, batch_size=BATCH_SIZE)

# ============================ step 5/5 Training ============================
for epoch in range(MAX_EPOCH):
    for i, data in enumerate(train_loader):
        inputs, labels = data  # B C H W
        img_tensor = inputs[0, ...]  # C H W
        img = transform_invert(img_tensor, train_transform)
        plt.imshow(img)
        plt.show()
        plt.pause(0.5)
        plt.close()
Appendix
import os
import numpy as np
import torch
import random
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
from tools.my_dataset import RMBDataset
from PIL import Image
from matplotlib import pyplot as plt

def set_seed(seed=1):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)

set_seed(1)  # set random seed

# parameter settings
MAX_EPOCH = 10
BATCH_SIZE = 1
LR = 0.01
log_interval = 10
val_interval = 1
rmb_label = {"1": 0, "100": 1}

def transform_invert(img_, transform_train):
    """
    Invert the transforms applied to a tensor so it can be visualized.
    :param img_: tensor
    :param transform_train: torchvision.transforms
    :return: PIL image
    """
    if 'Normalize' in str(transform_train):
        norm_transform = list(filter(lambda x: isinstance(x, transforms.Normalize), transform_train.transforms))
        mean = torch.tensor(norm_transform[0].mean, dtype=img_.dtype, device=img_.device)
        std = torch.tensor(norm_transform[0].std, dtype=img_.dtype, device=img_.device)
        img_.mul_(std[:, None, None]).add_(mean[:, None, None])

    img_ = img_.transpose(0, 2).transpose(0, 1)  # C*H*W --> H*W*C
    img_ = np.array(img_) * 255

    if img_.shape[2] == 3:
        img_ = Image.fromarray(img_.astype('uint8')).convert('RGB')
    elif img_.shape[2] == 1:
        img_ = Image.fromarray(img_.astype('uint8').squeeze())
    else:
        raise Exception("Invalid img shape, expected 1 or 3 in axis 2, but got {}!".format(img_.shape[2]))

    return img_

# ============================ step 1/5 Data ============================
split_dir = os.path.join("..", "..", "data", "rmb_split")
train_dir = os.path.join(split_dir, "train")
valid_dir = os.path.join(split_dir, "valid")

norm_mean = [0.485, 0.456, 0.406]
norm_std = [0.229, 0.224, 0.225]

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),

    # 1 CenterCrop
    # transforms.CenterCrop(512),  # 512
    # 2 RandomCrop
    # transforms.RandomCrop(224, padding=16),
    # transforms.RandomCrop(224, padding=(16, 64)),
    # transforms.RandomCrop(224, padding=16, fill=(255, 0, 0)),
    # transforms.RandomCrop(512, pad_if_needed=True),  # pad_if_needed=True
    # transforms.RandomCrop(224, padding=64, padding_mode='edge'),
    # transforms.RandomCrop(224, padding=64, padding_mode='reflect'),
    # transforms.RandomCrop(1024, padding=1024, padding_mode='symmetric'),
    # 3 RandomResizedCrop
    # transforms.RandomResizedCrop(size=224, scale=(0.5, 0.5)),
    # 4 FiveCrop
    # transforms.FiveCrop(112),
    # transforms.Lambda(lambda crops: torch.stack([(transforms.ToTensor()(crop)) for crop in crops])),
    # 5 TenCrop
    # transforms.TenCrop(112, vertical_flip=False),
    # transforms.Lambda(lambda crops: torch.stack([(transforms.ToTensor()(crop)) for crop in crops])),

    # 1 Horizontal Flip
    # transforms.RandomHorizontalFlip(p=1),
    # 2 Vertical Flip
    # transforms.RandomVerticalFlip(p=0.5),
    # 3 RandomRotation
    # transforms.RandomRotation(90),
    # transforms.RandomRotation((90), expand=True),
    # transforms.RandomRotation(30, center=(0, 0)),
    # transforms.RandomRotation(30, center=(0, 0), expand=True),  # expand only works for rotation about the center

    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std),
])

valid_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std)
])

# build MyDataset instances
train_data = RMBDataset(data_dir=train_dir, transform=train_transform)
valid_data = RMBDataset(data_dir=valid_dir, transform=valid_transform)

# build DataLoaders
train_loader = DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
valid_loader = DataLoader(dataset=valid_data, batch_size=BATCH_SIZE)

# ============================ step 5/5 Training ============================
for epoch in range(MAX_EPOCH):
    for i, data in enumerate(train_loader):
        inputs, labels = data  # B C H W
        img_tensor = inputs[0, ...]  # C H W
        img = transform_invert(img_tensor, train_transform)
        plt.imshow(img)
        plt.show()
        plt.pause(0.5)
        plt.close()

        # when using FiveCrop/TenCrop, visualize each crop instead:
        # bs, ncrops, c, h, w = inputs.shape
        # for n in range(ncrops):
        #     img_tensor = inputs[0, n, ...]  # C H W
        #     img = transform_invert(img_tensor, train_transform)
        #     plt.imshow(img)
        #     plt.show()
        #     plt.pause(1)
import os
import numpy as np
import torch
import random
from matplotlib import pyplot as plt
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
from tools.my_dataset import RMBDataset
from tools.common_tools import transform_invert

def set_seed(seed=1):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)

set_seed(1)  # set random seed

# parameter settings
MAX_EPOCH = 10
BATCH_SIZE = 1
LR = 0.01
log_interval = 10
val_interval = 1
rmb_label = {"1": 0, "100": 1}

# ============================ step 1/5 Data ============================
split_dir = os.path.join("..", "..", "data", "rmb_split")
train_dir = os.path.join(split_dir, "train")
valid_dir = os.path.join(split_dir, "valid")

norm_mean = [0.485, 0.456, 0.406]
norm_std = [0.229, 0.224, 0.225]

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),

    # 1 Pad
    # transforms.Pad(padding=32, fill=(255, 0, 0), padding_mode='constant'),
    # transforms.Pad(padding=(8, 64), fill=(255, 0, 0), padding_mode='constant'),
    # transforms.Pad(padding=(8, 16, 32, 64), fill=(255, 0, 0), padding_mode='constant'),
    # transforms.Pad(padding=(8, 16, 32, 64), fill=(255, 0, 0), padding_mode='symmetric'),

    # 2 ColorJitter
    # transforms.ColorJitter(brightness=0.5),
    # transforms.ColorJitter(contrast=0.5),
    # transforms.ColorJitter(saturation=0.5),
    # transforms.ColorJitter(hue=0.3),

    # 3 Grayscale
    # transforms.Grayscale(num_output_channels=3),

    # 4 Affine
    # transforms.RandomAffine(degrees=30),
    # transforms.RandomAffine(degrees=0, translate=(0.2, 0.2), fillcolor=(255, 0, 0)),
    # transforms.RandomAffine(degrees=0, scale=(0.7, 0.7)),
    # transforms.RandomAffine(degrees=0, shear=(0, 0, 0, 45)),
    # transforms.RandomAffine(degrees=0, shear=90, fillcolor=(255, 0, 0)),

    # 5 Erasing (operates on tensors, hence after ToTensor)
    # transforms.ToTensor(),
    # transforms.RandomErasing(p=1, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=(254/255, 0, 0)),
    # transforms.RandomErasing(p=1, scale=(0.02, 0.33), ratio=(0.3, 3.3), value='1234'),

    # 1 RandomChoice
    # transforms.RandomChoice([transforms.RandomVerticalFlip(p=1), transforms.RandomHorizontalFlip(p=1)]),
    # 2 RandomApply
    # transforms.RandomApply([transforms.RandomAffine(degrees=0, shear=45, fillcolor=(255, 0, 0)),
    #                         transforms.Grayscale(num_output_channels=3)], p=0.5),
    # 3 RandomOrder
    # transforms.RandomOrder([transforms.RandomRotation(15),
    #                         transforms.Pad(padding=32),
    #                         transforms.RandomAffine(degrees=0, translate=(0.01, 0.1), scale=(0.9, 1.1))]),

    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std),
])

valid_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std)
])

# build MyDataset instances
train_data = RMBDataset(data_dir=train_dir, transform=train_transform)
valid_data = RMBDataset(data_dir=valid_dir, transform=valid_transform)

# build DataLoaders
train_loader = DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
valid_loader = DataLoader(dataset=valid_data, batch_size=BATCH_SIZE)

# ============================ step 5/5 Training ============================
for epoch in range(MAX_EPOCH):
    for i, data in enumerate(train_loader):
        inputs, labels = data  # B C H W
        img_tensor = inputs[0, ...]  # C H W
        img = transform_invert(img_tensor, train_transform)
        plt.imshow(img)
        plt.show()
        plt.pause(0.5)
        plt.close()