Basics of the PIL library

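When using torchvision.transforms, the data augmentations typically operate on images of type PIL.Image. Images read with imageio or skimage are of type np.ndarray and therefore need to be converted first. This section gives a brief summary of the PIL library.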

Basic information about Image

Shape and type

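The most commonly used module in PIL is Image. There are two ways to obtain an image object: Image.open() reads a file from disk, and Image.fromarray() wraps an array that is already in memory.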

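import imageio
from PIL import Image

img = imageio.imread("0.jpg")   # np.ndarray (HWC)
img1 = Image.fromarray(img)     # wrap the array as a PIL image

img2 = Image.open("0.jpg")      # read directly with PIL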

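Both img1 and img2 are of type <class 'PIL.Image.Image'>, not np.ndarray. A PIL image has no shape attribute: its size is (width, height), and converting it to an array gives the HWC layout. The augmentations in torchvision.transforms work on PIL images by default, so an image read as np.ndarray must be converted to PIL.Image.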
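To make the difference between size and shape concrete, here is a minimal sketch. It builds a synthetic image with Image.new instead of reading "0.jpg", so the dimensions and color are assumptions chosen only for illustration.

```python
import numpy as np
from PIL import Image

# Synthetic 320x223 RGB image standing in for "0.jpg" (assumption for illustration)
pil_img = Image.new("RGB", (320, 223), color=(255, 0, 0))

arr = np.array(pil_img)        # PIL -> ndarray
back = Image.fromarray(arr)    # ndarray -> PIL

print(type(pil_img))
print(pil_img.size)   # PIL reports (width, height)
print(arr.shape)      # NumPy reports (height, width, channels)
print(back.size)
```

The width and height swap places between the two views, which is easy to trip over when cropping or resizing.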

Attributes

image = Image.open("0.jpg")
print('width: ', image.width)
print('height: ', image.height)
print('size: ', image.size)
print('mode: ', image.mode)
print('format: ', image.format)
print('category: ', image.category)  # removed in newer Pillow releases; drop this line if it raises AttributeError
print('readonly: ', image.readonly)
print('info: ', image.info)

#output
width:  320
height:  223
size:  (320, 223)
mode:  RGB
format:  JPEG
category:  0
readonly:  1
info:  {'jfif': 257, 'jfif_version': (1, 1), 'dpi': (72, 72), 'jfif_unit': 1, 'jfif_density': (72, 72)}

Data format

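For the common modes (L, RGB, RGBA), PIL stores each channel as uint8, an 8-bit unsigned integer in the range 0 to 255. Image.fromarray() raises an error when the array's dtype and shape do not correspond to a mode it supports, for example a float64 array with three channels.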
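As a quick check of how the mode determines the dtype, the sketch below creates small blank images in a few modes and inspects the arrays NumPy builds from them. The 4x3 size is arbitrary, and the list of modes is only a sample, not everything PIL supports.

```python
import numpy as np
from PIL import Image

# Map a few PIL modes to the dtype and shape NumPy sees
dtypes = {}
for mode in ("L", "RGB", "RGBA", "I", "F"):
    im = Image.new(mode, (4, 3))      # width=4, height=3
    a = np.array(im)
    dtypes[mode] = a.dtype
    print(mode, a.dtype, a.shape)
```

The 8-bit modes come back as uint8, while "I" and "F" are 32-bit integer and 32-bit float respectively, which is why uint8 is the safe choice for ordinary RGB data.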

Basic PIL functions

To be added....

Converting between PIL and NumPy

Why conversion is needed

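Commonly used libraries for reading images include imageio, skimage, and cv2; the types they return are shown below. Because torchvision.transforms uses PIL as its default image-processing library, an np.ndarray needs to be converted to PIL before processing. PIL.Image and np.ndarray can be converted to each other, and the layout stays HWC throughout. Note that cv2.imread returns the channels in BGR order, whereas PIL expects RGB.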

import cv2
import imageio
from skimage import io

img_ski = io.imread("0.jpg")
img_cv2 = cv2.imread("0.jpg")
img_imgIO = imageio.imread("0.jpg")

print(type(img_ski))
print(type(img_cv2))
print(type(img_imgIO))

#output:
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'imageio.core.util.Array'>

Note: imageio.core.util.Array is a subclass of np.ndarray, as explained in this Stack Overflow answer:

imageio.core.util.Array is a subclass of the NumPy array, so it is correct to say that imread returns a NumPy array. You can find the source code for Array at github.com/imageio/ima….
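One detail matters when the array comes from cv2: cv2.imread returns channels in BGR order, so passing the array straight to Image.fromarray swaps red and blue. The sketch below uses a hand-built array as a stand-in for cv2 output (an assumption, so that it runs without cv2 or an image file) and reverses the channel axis with plain NumPy.

```python
import numpy as np
from PIL import Image

# Stand-in for cv2.imread output: a 2x2 pure-blue image in BGR order
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255                      # channel 0 is blue in BGR

# Reverse the last axis (BGR -> RGB) and make the result contiguous
rgb = np.ascontiguousarray(bgr[..., ::-1])
img = Image.fromarray(rgb)

print(img.mode, img.getpixel((0, 0)))
```

With cv2 available, cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB) does the same reordering.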

Converting between Image and np.ndarray

Image to np.ndarray

import numpy as np
from PIL import Image

image = Image.open("0.jpg")
img = np.array(image)
print(img.shape)

#output:
(223, 320, 3)

np.ndarray to Image

img = imageio.imread("0.jpg")
img1 = Image.fromarray(img)
img1.show()

Example of a dtype error:

img = np.ones((448, 448, 3))
print(img.dtype)
img1 = Image.fromarray(img)

#output:
float64

TypeError: Cannot handle this data type: (1, 1, 3), <f8

As the output shows, img has dtype float64, which Image.fromarray() cannot map to an RGB image; it expects uint8 here, hence the error. The fix is:

img = np.ones((448, 448, 3))
print(img.dtype)
img2 = Image.fromarray(np.uint8(img))
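Casting with np.uint8 removes the error but does not rescale the values: a float image whose values lie in [0, 1] ends up almost black. The sketch below contrasts the direct cast with a scaled conversion. The helper float_to_pil is hypothetical, and it assumes the float data really is in the [0, 1] range.

```python
import numpy as np
from PIL import Image

def float_to_pil(arr):
    """Hypothetical helper: convert a float HWC array in [0, 1] to an 8-bit PIL image."""
    scaled = np.clip(arr * 255.0, 0, 255).round().astype(np.uint8)
    return Image.fromarray(scaled)

img = np.ones((448, 448, 3))               # float64, every value is 1.0
direct = Image.fromarray(np.uint8(img))    # values stay at 1, so the image is almost black
scaled = float_to_pil(img)                 # values become 255, so the image is white

print(direct.getpixel((0, 0)))
print(scaled.getpixel((0, 0)))
```

If the float array already holds values in 0 to 255, the plain cast is the right choice and the scaling step should be skipped.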

PIL and transforms

transforms is the data augmentation module of torchvision; the classes most relevant to PIL are ToPILImage and ToTensor.

ToPILImage

ToPILImage converts an np.ndarray or a tensor to PIL.Image. For tensor input it transposes the channels from CHW, so the result is laid out as HWC.

# the channel transpose is applied only when pic is a tensor
npimg = pic
if isinstance(pic, torch.FloatTensor) and mode != 'F':
    pic = pic.mul(255).byte()
if isinstance(pic, torch.Tensor):
    npimg = np.transpose(pic.numpy(), (1, 2, 0))
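The tensor branch above can be imitated with NumPy alone, which makes the two steps (scale to bytes, then move channels last) easy to inspect. This is only a sketch: chw_float_to_pil is a hypothetical stand-in, not the torchvision implementation, and it assumes a float CHW array with values in [0, 1].

```python
import numpy as np
from PIL import Image

def chw_float_to_pil(chw):
    """Hypothetical NumPy stand-in for the tensor branch of the excerpt."""
    as_bytes = (chw * 255).astype(np.uint8)      # plays the role of pic.mul(255).byte()
    hwc = np.transpose(as_bytes, (1, 2, 0))      # CHW -> HWC
    return Image.fromarray(np.ascontiguousarray(hwc))

chw = np.zeros((3, 223, 320), dtype=np.float32)
chw[0] = 1.0                                     # red channel fully on
img = chw_float_to_pil(chw)

print(img.size, img.mode, img.getpixel((0, 0)))
```

A CHW input of shape (3, 223, 320) yields a PIL image whose size is (320, 223), matching the width-first convention of Image.size.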

ToTensor

An image read by PIL is laid out as HWC, while tensors in torch default to CHW. ToTensor() already performs this conversion, so no extra step is needed.

When displaying the image, however, the tensor is CHW while matplotlib expects HWC, so it has to be transposed back.

def to_tensor(pic):
    """Convert a ``PIL Image`` or ``numpy.ndarray`` to tensor.
    Args:
        pic (PIL Image or numpy.ndarray): Image to be converted to tensor.
    Returns:
        Tensor: Converted image.
    """
    ......
    # ---------# handle numpy array------------
    if isinstance(pic, np.ndarray):
        if pic.ndim == 2:
            pic = pic[:, :, None]

        img = torch.from_numpy(pic.transpose((2, 0, 1)))  # HWC -> CHW transpose happens here
        # backward compatibility
        if isinstance(img, torch.ByteTensor):
            return img.float().div(255)
        else:
            return img
    ......

    # -----------handle PIL Image----------------
    # This shows that converting a PIL image to a torch tensor still goes through np.array
    # (except in the final branch, which reads the raw bytes).
    if pic.mode == 'I':
        img = torch.from_numpy(np.array(pic, np.int32, copy=False))
    elif pic.mode == 'I;16':
        img = torch.from_numpy(np.array(pic, np.int16, copy=False))
    elif pic.mode == 'F':
        img = torch.from_numpy(np.array(pic, np.float32, copy=False))
    elif pic.mode == '1':
        img = 255 * torch.from_numpy(np.array(pic, np.uint8, copy=False))
    else:
        img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))

    img = img.view(pic.size[1], pic.size[0], len(pic.getbands()))
    # put it from HWC to CHW format
    img = img.permute((2, 0, 1)).contiguous()  # HWC -> CHW conversion happens here
    if isinstance(img, torch.ByteTensor):
        return img.float().div(255)
    else:
        return img
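The round trip described in this section (HWC to CHW for the tensor, then back to HWC for display) can be sketched with NumPy arrays in place of tensors. The synthetic image and its color are assumptions for illustration, and the division by 255 mirrors the img.float().div(255) step in the excerpt.

```python
import numpy as np
from PIL import Image

pil_img = Image.new("RGB", (320, 223), color=(0, 128, 255))

hwc = np.array(pil_img)                                    # (223, 320, 3), uint8
chw = hwc.transpose((2, 0, 1)).astype(np.float32) / 255    # CHW, float values in [0, 1]

hwc_again = chw.transpose((1, 2, 0))                       # CHW -> HWC, the layout matplotlib expects

print(chw.shape, chw.dtype)
print(hwc_again.shape)
```

With a real torch tensor the last step would be tensor.permute(1, 2, 0) before handing the data to plt.imshow.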

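Summary: this note covered three topics: the basics of PIL, conversion between PIL and np.ndarray, and the two PIL-related classes in transforms. The basics include the HWC layout, the uint8 data type, and some common attributes. Images are often read as np.ndarray and therefore need to be converted to PIL.Image. transforms provides two corresponding classes, ToPILImage and ToTensor: the former converts an np.ndarray or a tensor to PIL and rearranges tensor channels to HWC, while the latter converts PIL to a tensor and rearranges the channels to CHW.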