Introduction

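There is relatively little material online about image normalization, and most of it simply calls a resize function. This post introduces moment normalization, a method whose distinguishing feature is that it aligns the center of the image content with the center of the canvas.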

Moment normalization

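Moment normalization is an image preprocessing method. Compared with plain linear normalization, its advantage is that it uses the image's moments: while scaling the original image to the canvas size, it aligns the centroid of the original image with the center of the canvas and removes as much of the border region (margin) of the original image as possible. The coordinate mapping is: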

x'=\alpha(x-x_c)+x_c'\\ y'=\beta(y-y_c)+y_c'\tag{1}

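Here $(x_c,y_c)$ and $(x_c',y_c')$ are the centers of the original image and of the normalized image, respectively. For the normalized image, we want the content to sit at the center of the canvas, so $x_c'=\frac{W_2}{2},y_c'=\frac{H_2}{2}$, where $W_2$ and $H_2$ are the width and height of the canvas. The center of the original image is taken to be its centroid, computed from the first-order and zeroth-order moments, so that the gray values end up spread around the canvas center: $x_c=\frac{m_{10}}{m_{00}},y_c=\frac{m_{01}}{m_{00}}$. $\alpha$ and $\beta$ are the ratios of the normalized image's width and height to those of the original image: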

\alpha=\frac{W_2}{W_1}\\ \beta=\frac{H_2}{H_1}\tag{2}

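However, because the centroid is taken as the center of the image, the width $W_1$ and height $H_1$ are no longer the actual spatial size of the original image. They are determined by the image's second-order central moments instead: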

W_1=r\sqrt{\frac{\mu_{20}}{m_{00}}}\\ H_1=r\sqrt{\frac{\mu_{02}}{m_{00}}}\tag{3}
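To make Eq. (3) concrete, here is a minimal NumPy sketch that computes the centroid and the moment-based size of a gray image directly from the definitions of the moments. The helper name `moment_extent`, the toy image, and the default `r = 4.4` are assumptions for illustration; 4.4 is simply `2 * K` with `K = 2.2` as used in the example code later in this post.

```python
import numpy as np


def moment_extent(img, r=4.4):
    """Return the centroid (x_c, y_c) and moment-based size (W_1, H_1).

    r is the constant of Eq. (3); its default here is an illustrative choice.
    """
    img = np.asarray(img, dtype=np.float64)
    # ys holds the row index (y) and xs the column index (x) of every pixel.
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    m00 = img.sum()  # zeroth-order moment: total gray value
    if m00 == 0:
        raise ValueError("image has no foreground, moments are undefined")
    xc = (xs * img).sum() / m00  # m10 / m00
    yc = (ys * img).sum() / m00  # m01 / m00
    mu20 = ((xs - xc) ** 2 * img).sum()  # second-order central moment in x
    mu02 = ((ys - yc) ** 2 * img).sum()  # second-order central moment in y
    w1 = r * np.sqrt(mu20 / m00)
    h1 = r * np.sqrt(mu02 / m00)
    return (xc, yc), (w1, h1)


# Toy image: a 12 x 4 bright bar covering columns 10..21 and rows 5..8.
toy = np.zeros((20, 30))
toy[5:9, 10:22] = 1.0
(xc, yc), (w1, h1) = moment_extent(toy)
print(xc, yc)  # 15.5 6.5
print(w1, h1)  # moment-based width and height of the bar
```

The estimated size depends only on how the gray values are distributed around the centroid, so empty margins around the bar do not change it.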

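This completes the moment normalization. Taken as a whole, the procedure uses the image's moments to find the centroid and to frame the region that contains most of the gray values, then applies an affine transformation that carries points from the centroid-based coordinate system of the original image to the canvas-center coordinate system. Values at non-integer positions are filled in by interpolation.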
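The resampling step can be sketched by hand as follows. This is only an illustration of inverse mapping with bilinear interpolation under the mapping of Eq. (1), not a replacement for a library routine; the function name `warp_scale_translate` and the toy input are hypothetical.

```python
import numpy as np


def warp_scale_translate(src, alpha, beta, center_src, center_dst, dst_shape):
    """Resample src so that x' = alpha*(x - xc) + xc' and y' = beta*(y - yc) + yc'.

    Every target pixel is mapped back into the source image and its value is
    read with bilinear interpolation. Samples outside the source count as 0.
    """
    src = np.asarray(src, dtype=np.float64)
    h, w = src.shape
    xc, yc = center_src
    xc2, yc2 = center_dst
    ys2, xs2 = np.mgrid[0:dst_shape[0], 0:dst_shape[1]].astype(np.float64)
    # Inverse of Eq. (1): the source position of every target pixel.
    xs = (xs2 - xc2) / alpha + xc
    ys = (ys2 - yc2) / beta + yc
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    fx = xs - x0  # fractional offsets inside the source pixel cell
    fy = ys - y0

    def sample(yi, xi):
        # Read src[yi, xi], returning 0 where the index falls outside src.
        inside = (yi >= 0) & (yi < h) & (xi >= 0) & (xi < w)
        out = np.zeros(xi.shape)
        out[inside] = src[yi[inside], xi[inside]]
        return out

    top = (1 - fx) * sample(y0, x0) + fx * sample(y0, x0 + 1)
    bottom = (1 - fx) * sample(y0 + 1, x0) + fx * sample(y0 + 1, x0 + 1)
    return (1 - fy) * top + fy * bottom


# Toy input: a 2 x 2 bright block with centroid (2.5, 2.5), enlarged 2x and
# moved so that the centroid lands at (5.5, 5.5) on a 12 x 12 canvas.
src = np.zeros((6, 6))
src[2:4, 2:4] = 1.0
out = warp_scale_translate(src, 2.0, 2.0, (2.5, 2.5), (5.5, 5.5), (12, 12))
print(out.shape)  # (12, 12)
```

Looping over target pixels rather than source pixels guarantees that every canvas pixel receives a value, which is why the inverse of the mapping is the one evaluated.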

How the moment normalization mapping works

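As mentioned in the previous section, moment normalization is implemented as a coordinate mapping: coordinates in the target image are mapped to coordinates in the region of the original image that holds most of the gray values. In essence, the mapping is a change of coordinate system. The original image's coordinate system has its origin at $(x_c,y_c)$ and the target image's has its origin at $(x_c',y_c')$, so the coordinates are related by: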

\frac{(x-x_c)}{W}=\frac{(x'-x_c')}{W'}\\ \frac{(y-y_c)}{H}=\frac{(y'-y_c')}{H'}

This gives the relation between target-image and original-image coordinates:

x'=\frac{W'}{W}(x-x_c)+x_c'\\ y'=\frac{H'}{H}(y-y_c)+y_c'

where $W,H$ are determined by the image's second-order central moments.
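A small worked example of this change of coordinates, in plain Python. The function names and all numbers below are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def to_target(x, y, src_center, src_size, dst_center, dst_size):
    """Map a source point to the target image: x' = (W'/W)(x - x_c) + x_c'."""
    (xc, yc), (w, h) = src_center, src_size
    (xc2, yc2), (w2, h2) = dst_center, dst_size
    return (w2 / w) * (x - xc) + xc2, (h2 / h) * (y - yc) + yc2


def to_source(x2, y2, src_center, src_size, dst_center, dst_size):
    """Inverse map: the source position that a target pixel is filled from."""
    (xc, yc), (w, h) = src_center, src_size
    (xc2, yc2), (w2, h2) = dst_center, dst_size
    return (w / w2) * (x2 - xc2) + xc, (h / h2) * (y2 - yc2) + yc


# Assumed values: centroid (40, 30), moment-based size 50 x 20,
# target region 100 x 40 centered at (50, 20).
src_center, src_size = (40.0, 30.0), (50.0, 20.0)
dst_center, dst_size = (50.0, 20.0), (100.0, 40.0)

# The centroid lands on the target center.
print(to_target(40.0, 30.0, src_center, src_size, dst_center, dst_size))  # (50.0, 20.0)
# A point 25 px right of and 10 px below the centroid moves twice as far.
print(to_target(65.0, 40.0, src_center, src_size, dst_center, dst_size))  # (100.0, 40.0)
```

Applying `to_source` to the result of `to_target` returns the starting point, which is the relation used when each target pixel is filled from the original image.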

Implementation

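In practice, to keep the aspect ratio unchanged by the transformation, we use the same scale factor along both axes, namely $ratio=min(\alpha,\beta)$. Since moment normalization is just an affine transformation of the image, we can write down the transformation matrix and apply it with the warpAffine() function. The translation terms use $\frac{W_2-1}{2}$ and $\frac{H_2-1}{2}$, the canvas center in zero-based pixel coordinates, in place of $\frac{W_2}{2}$ and $\frac{H_2}{2}$. The transformation matrix is:

\begin{bmatrix} ratio&0&-x_c*ratio+\frac{W_2-1}{2}\\ 0&ratio&-y_c*ratio+\frac{H_2-1}{2} \end{bmatrix}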
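As a quick sanity check of this matrix, the sketch below builds it with NumPy from a centroid and a moment-based size, then applies it to the centroid in homogeneous coordinates. The function name `moment_affine_matrix` and the input values are assumptions for illustration, not values from a real image.

```python
import numpy as np


def moment_affine_matrix(xc, yc, w1, h1, dst_w, dst_h):
    """Build the 2x3 matrix that scales both axes by min(alpha, beta)
    and moves the centroid (xc, yc) to the center of the canvas."""
    alpha = dst_w / w1  # Eq. (2), width ratio
    beta = dst_h / h1   # Eq. (2), height ratio
    ratio = min(alpha, beta)  # one factor for both axes keeps the aspect ratio
    return np.array([
        [ratio, 0.0, -xc * ratio + (dst_w - 1) / 2],
        [0.0, ratio, -yc * ratio + (dst_h - 1) / 2],
    ])


# Hypothetical inputs: centroid (120, 80), moment-based size 200 x 60,
# canvas 242 x 170 (the default canvas size in the example code).
mat = moment_affine_matrix(xc=120.0, yc=80.0, w1=200.0, h1=60.0, dst_w=242, dst_h=170)

# Apply the matrix to the centroid written as (x, y, 1).
center = mat @ np.array([120.0, 80.0, 1.0])
# The centroid maps to ((dst_w - 1) / 2, (dst_h - 1) / 2) = (120.5, 84.5),
# up to floating-point rounding.
print(center)
```

Here the width ratio (1.21) is smaller than the height ratio, so it limits the scale and the content fills the canvas horizontally while leaving room vertically.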

Example code:

import cv2
import numpy as np
import matplotlib.pyplot as plt


def moment_preprocess(img,dst_w=220,dst_h=150):
    normalized_img = denoise(img)
    inverted_img = 255 - normalized_img
    resized_img=resize_img(inverted_img)
    cropped_img=crop_center(resized_img,(dst_h,dst_w))
    return cropped_img


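def denoise(img):
    # Otsu's method picks the threshold between dark strokes and light background.
    threshold, _ = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    denoised = img.copy()  # work on a copy so the caller's image is not modified
    denoised[denoised > threshold] = 255  # pixels above the Otsu threshold are treated as noise and set to 255
    return denoised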


def resize_img(img, img_size=(170,242), K=2.2):
    dst_h,dst_w=img_size
    moments = cv2.moments(img, True)  # binaryImage=True: every non-zero pixel counts as 1, so gray levels are ignored
    xc = moments['m10'] / moments['m00']
    yc = moments['m01'] / moments['m00']
    ratio = min(dst_w * np.sqrt(moments['m00']) / (2 * K * np.sqrt(moments['mu20'])),
                dst_h * np.sqrt(moments['m00']) / (2 * K * np.sqrt(moments['mu02'])))
    mat = np.array([[ratio, 0, -xc * ratio + (dst_w - 1) / 2], [0, ratio, -yc * ratio + (dst_h - 1) / 2]])
    trans_img = cv2.warpAffine(img, mat, (dst_w, dst_h),flags=cv2.INTER_LINEAR)
    return trans_img


def crop_center(img, input_shape):
    dst_h, dst_w = input_shape  # target size comes from the argument, not from img itself
    h_scale=float(img.shape[0])/dst_h
    w_scale=float(img.shape[1])/dst_w
    if w_scale>h_scale:
        resized_height=dst_h
        resized_width=int(round(img.shape[1]/h_scale))
    else:
        resized_width=dst_w
        resized_height=int(round(img.shape[0]/w_scale))
    img=cv2.resize(img.astype(np.float32),(resized_width,resized_height))
    if w_scale>h_scale:
        start = int(round((resized_width-dst_w)/2.0))
        return img[:, start:start+dst_w]
    else:
        start = int(round((resized_height-dst_h)/2.0))
        return img[start:start+dst_h, :]


if __name__ == "__main__":
    path = r'E:\temp\some_signature.png'  # raw string, so single backslashes are enough
    img = cv2.imread(path, 0)
    normalized = 255-denoise(img)
    resized = resize_img(normalized)
    cropped= crop_center(resized,(150,220))

    f, ax = plt.subplots(4,1, figsize=(6,15))
    ax[0].imshow(img, cmap='Greys_r')
    ax[1].imshow(normalized)
    ax[2].imshow(resized)
    ax[3].imshow(cropped)

    ax[0].set_title('Original')
    ax[1].set_title('Background removed/centered')
    ax[2].set_title('Resized')
    ax[3].set_title('Cropped center of the image')

    plt.show()  # display the four panels

The code denoises, inverts, and normalizes an image with a black foreground on a white background. The result is shown below:

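(Figure: the four panels produced by the code above: Original, Background removed/centered, Resized, Cropped center of the image.)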

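For the principles behind image moments and moment normalization, see my other post, "图像原点矩、二阶中心矩物理意义推导" (a derivation of the physical meaning of raw image moments and second-order central moments), on 掘金 (juejin.cn).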

References

"Handwritten digit recognition: investigation of normalization and feature extraction techniques"

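"[字符识别系列][一] 字符识别中的图像归一化算法简介" ([Character recognition series][1] A brief introduction to image normalization algorithms in character recognition)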