Generating Digits with an Autoencoder in Keras

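This article walks through a hands-on implementation of an autoencoder, which we will train and evaluate on MNIST, a well-known public benchmark dataset.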

Tejas Khare

May 14, 2021

Article

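From the previous article, "Understanding Autoencoders: An Unsupervised Learning Approach", you should now have a good understanding of what autoencoders are, where they are used, and how one is trained.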

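You are probably eager to build your own autoencoder and generate something with it. In this article we will therefore focus on loading the dataset, building the encoder model, building the decoder model, and finally testing the result by visualizing its output.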

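We will use the tf.keras library for this project. The dataset is MNIST, which contains handwritten digits from 0 to 9. It consists of 60,000 training images and a test set of 10,000 images; all of them are grayscale and 28×28 pixels in size.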

Let's get started with the code.

Note: The code was written and tested by the author. The output images are screenshots of Jupyter notebook cells.

1. Import the libraries needed for the whole project

from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.layers import Conv2D, Flatten
from tensorflow.keras.layers import Reshape, Conv2DTranspose
from tensorflow.keras.models import Model
# You can directly import inbuilt MNIST dataset from  tensorflow.keras.datasets
from tensorflow.keras.datasets import mnist
from tensorflow.keras import backend as K
import numpy as np
import matplotlib.pyplot as plt

2. Load the MNIST data with Keras's load_data() function

(x_train, _), (x_test, _) = mnist.load_data()

3. Reshape and normalize the data, scaling pixel values to [0, 1]

# reshape to (28, 28, 1) and normalize input images
image_size = x_train.shape[1]
x_train = np.reshape(x_train, [-1, image_size, image_size, 1])
x_test = np.reshape(x_test, [-1, image_size, image_size, 1])
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
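
To check the effect of this step without downloading MNIST, here is a minimal sketch that applies the same reshape and scaling to a small random array standing in for the images. The helper name `prepare_images` and the fake batch are illustrative assumptions, not part of the original code.

```python
import numpy as np


def prepare_images(images):
    """Add a channel axis and scale uint8 pixels from [0, 255] to [0, 1].

    Hypothetical helper that mirrors the reshape and normalization
    applied to x_train and x_test in this step.
    """
    image_size = images.shape[1]
    images = np.reshape(images, [-1, image_size, image_size, 1])
    return images.astype('float32') / 255


# Stand-in for MNIST (assumption): 5 random 28x28 grayscale images.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

prepared = prepare_images(fake_batch)
print(prepared.shape, prepared.dtype)
print(prepared.min() >= 0.0, prepared.max() <= 1.0)
```

The extra trailing axis is the single grayscale channel that Conv2D expects, and the division keeps every pixel in the range of the sigmoid output used later in the decoder.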

4. Initialize the network parameters

input_shape = (image_size, image_size, 1)
batch_size = 16
kernel_size = 3
latent_dim = 16
# encoder/decoder number of CNN layers and filters per layer
layer_filters = [32, 64]

5. Build the encoder model

If you want to learn more about the activations used here, see the article "Activation Functions in Neural Networks".

inputs = Input(shape=input_shape, name='encoder_input')
x = inputs

# Encoder model of conv2d(32) and conv2d(64) stacked together.
# where 32, and 64 are number of filters
for filters in layer_filters:    
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               activation='relu',
               strides=2,
               padding='same')(x)

Note: You may see warnings such as "Instructions for updating: Call initializer instance....". You can ignore them and continue running the code.

6. Save the shape that the decoder model will need

"""
This step is important because we want to pass 
a specific shape to our decoder. 
Implementing this step would ensure we don't do the layer wise
calculation manually 
""" 

shape = K.int_shape(x)

The shape that will be passed to the first layer of the decoder (i.e. Conv2DTranspose) is (7, 7, 64).
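
The (7, 7, 64) shape follows from the two stride-2 convolutions in the encoder: with padding='same', each one divides the spatial size by the stride and rounds up. The sketch below traces that arithmetic by hand; the helper name `same_conv_output_size` is an illustrative assumption.

```python
import math


def same_conv_output_size(size, stride):
    """Spatial output size of a convolution that uses padding='same'."""
    return math.ceil(size / stride)


size = 28                  # MNIST images are 28x28
layer_filters = [32, 64]   # same filter counts as in step 4
shapes = []
for filters in layer_filters:
    size = same_conv_output_size(size, stride=2)
    shapes.append((size, size, filters))

print(shapes)  # [(14, 14, 32), (7, 7, 64)]
```

K.int_shape(x) additionally reports the batch dimension as None in front of these three values, which is why the decoder code indexes shape[1], shape[2] and shape[3].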

7. Initialize the latent space

x = Flatten()(x)
latent = Dense(latent_dim, name='latent_vector')(x)

8. Instantiate the encoder model

# latent is the output of the Dense layer applied to the features flattened in the previous step
encoder = Model(inputs,
                latent,
                name='encoder')
encoder.summary()
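
encoder.summary() reports a parameter count for each layer. As a cross-check, the counts can be worked out by hand. The sketch below assumes the Keras defaults (one bias per filter or unit) and the (7, 7, 64) feature map saved in step 6; the helper names are illustrative.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter for a square-kernel Conv2D."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit for a Dense layer."""
    return (in_units + 1) * out_units


kernel_size, latent_dim = 3, 16
conv1 = conv2d_params(kernel_size, in_channels=1, filters=32)
conv2 = conv2d_params(kernel_size, in_channels=32, filters=64)
latent = dense_params(7 * 7 * 64, latent_dim)

print(conv1, conv2, latent)      # 320 18496 50192
print(conv1 + conv2 + latent)    # 69008
```

Most of the encoder's parameters sit in the Dense layer that maps the 3,136 flattened features down to the 16-dimensional latent vector.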

10. Build the decoder model

# latent_dim is a parameter which we defined initially in step 4
latent_inputs = Input(shape=(latent_dim,), name='decoder_input')

# use the shape (7, 7, 64) that was earlier saved
x = Dense(shape[1] * shape[2] * shape[3])(latent_inputs)

# from vector to suitable shape for transposed conv
x = Reshape((shape[1], shape[2], shape[3]))(x)

# stack of Conv2DTranspose(64) and Conv2DTranspose(32)
for filters in layer_filters[::-1]:    
    x = Conv2DTranspose(filters=filters,
                        kernel_size=kernel_size,
                        activation='relu',
                        strides=2,
                        padding='same')(x)

11. Define the decoder output

outputs = Conv2DTranspose(filters=1,
                          kernel_size=kernel_size,
                          activation='sigmoid',
                          padding='same',
                          name='decoder_output')(x)

# instantiate decoder model
decoder = Model(latent_inputs, outputs, name='decoder')
decoder.summary()

Calling decoder.summary() prints a layer-by-layer summary of the decoder.
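
The output shapes in that summary can be anticipated by hand. With padding='same', a transposed convolution multiplies the spatial size by its stride, so the decoder mirrors the encoder and ends at 28×28×1. The following is a minimal sketch of that trace; the helper name and the `trace` list are illustrative assumptions, and the batch dimension is left out.

```python
def same_conv_transpose_output_size(size, stride):
    """Spatial output size of a transposed convolution with padding='same'."""
    return size * stride


latent_dim = 16
shape = (7, 7, 64)  # feature-map shape saved from the encoder, without the batch axis

trace = [("decoder_input", (latent_dim,))]
trace.append(("dense", (shape[0] * shape[1] * shape[2],)))
trace.append(("reshape", shape))

size = shape[0]
for filters in [64, 32]:  # layer_filters reversed
    size = same_conv_transpose_output_size(size, stride=2)
    trace.append((f"conv2d_transpose_{filters}", (size, size, filters)))

# The output layer keeps the default stride of 1, so the size stays the same.
size = same_conv_transpose_output_size(size, stride=1)
trace.append(("decoder_output", (size, size, 1)))

for name, out_shape in trace:
    print(name, out_shape)
```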

12. Instantiate the full autoencoder

# autoencoder = encoder + decoder
autoencoder = Model(inputs,
                    decoder(encoder(inputs)), 
                    name='autoencoder')
autoencoder.summary()

Calling autoencoder.summary() prints a summary of the full autoencoder.

13. Train the autoencoder

If you want to learn more about the optimizers and loss functions used with autoencoders, see this article: Optimization Methods

# Mean Square Error (MSE) loss function, Adam optimizer
autoencoder.compile(loss='mse', optimizer='adam')
# train the autoencoder
autoencoder.fit(x_train,
                x_train,
                validation_data=(x_test, x_test),     
                epochs=1,               
                batch_size=batch_size)
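
For intuition about what the 'mse' loss measures, here is a minimal NumPy sketch of the mean squared error between a batch of images and its reconstruction. The function name and the synthetic arrays are assumptions for illustration; the value Keras prints during training is a running average over batches while the weights are still changing, so it will not match a single call like this exactly.

```python
import numpy as np


def mean_squared_error(x_true, x_pred):
    """Average of the squared pixel-wise differences over all elements."""
    return float(np.mean(np.square(x_true - x_pred)))


# Synthetic stand-ins (assumption): 4 random "images" and two reconstructions.
rng = np.random.default_rng(0)
x_true = rng.random((4, 28, 28, 1), dtype=np.float32)
perfect = x_true.copy()      # identical reconstruction
shifted = x_true + 0.1       # every pixel off by 0.1

print(mean_squared_error(x_true, perfect))   # 0.0
print(round(mean_squared_error(x_true, shifted), 4))
```

Because the same array is passed as both input and target in fit(), the loss directly penalizes the difference between each image and its reconstruction.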

14. Get the predictions

You can use the predict() function of the Model class from tensorflow.keras.models.

x_decoded = autoencoder.predict(x_test)

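Note: The argument passed to predict() should be the test set. If you pass training samples, the autoencoder may reproduce them almost exactly, which would only show that it has learned to copy data it has already seen into the decoder output.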

15. Finally, visualize the results

imgs = np.concatenate([x_test[:8], x_decoded[:8]])
imgs = imgs.reshape((4, 4, image_size, image_size))
imgs = np.vstack([np.hstack(i) for i in imgs])
plt.figure()
plt.axis('off')
plt.title('Input: 1st 2 rows, Decoded: last 2 rows')
plt.imshow(imgs, interpolation='none', cmap='gray')
plt.show()

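Here we build a single image that tiles 16 images in a 4×4 grid. The first two rows are inputs from the test set, and the last two rows are their corresponding reconstructions.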
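
The reshape, hstack and vstack calls can be hard to follow at first. The sketch below repeats the same tiling on synthetic stand-ins in which every "image" is filled with its own index, so the layout of the final grid can be read off directly. The arrays `inputs` and `decoded` are assumptions that take the place of x_test[:8] and x_decoded[:8].

```python
import numpy as np

image_size = 28

# Stand-in images (assumption): input i is filled with i, reconstruction i with 100 + i.
inputs = np.stack([np.full((image_size, image_size, 1), i, dtype=np.float32)
                   for i in range(8)])
decoded = np.stack([np.full((image_size, image_size, 1), 100 + i, dtype=np.float32)
                    for i in range(8)])

imgs = np.concatenate([inputs, decoded])                # (16, 28, 28, 1)
imgs = imgs.reshape((4, 4, image_size, image_size))     # 4 rows of 4 images
grid = np.vstack([np.hstack(row) for row in imgs])      # (112, 112)

print(grid.shape)
# Top-left pixel of each tile shows which image landed where.
print(grid[::image_size, ::image_size])
```

Each hstack joins four images side by side into one row, and vstack then stacks the four rows, which is why inputs 0 to 7 fill the top half and their reconstructions fill the bottom half.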

16. Observations

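From the visualization above, you can clearly see that the generated images are somewhat blurry. This suggests that a deeper encoder and decoder could extract more features. You can also see that the digit "9" is the blurriest prediction in this set, which suggests that the autoencoder did not learn the "9" class as well as the others.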
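
An observation like this can also be checked numerically rather than by eye, by averaging the reconstruction error per digit class. The sketch below is one possible way to do that. It assumes the test labels are available (the loading code above discards them with `_`, so they would have to be kept under a name such as y_test), and it runs on synthetic data in which class 9 is deliberately given a larger error.

```python
import numpy as np


def per_class_mse(x, x_decoded, labels, num_classes=10):
    """Mean squared reconstruction error for each digit class present in labels."""
    # One error value per image: average over height, width and channel.
    per_image = np.mean(np.square(x - x_decoded), axis=(1, 2, 3))
    return {digit: float(per_image[labels == digit].mean())
            for digit in range(num_classes)
            if np.any(labels == digit)}


# Synthetic stand-in data (assumption): three images per class.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 3)
x = rng.random((30, 28, 28, 1), dtype=np.float32)
# Reconstructions are off by 0.1 per pixel, except class 9, which is off by 0.3.
offset = np.where(labels == 9, 0.3, 0.1).astype(np.float32).reshape(-1, 1, 1, 1)
x_decoded = x + offset

errors = per_class_mse(x, x_decoded, labels)
worst = max(errors, key=errors.get)
print(worst, round(errors[worst], 3))
```

With the real model, the same function could be called as per_class_mse(x_test, autoencoder.predict(x_test), y_test) to see which digits are reconstructed worst.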

17. Conclusion

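After training, the autoencoder's loss is 0.0173 and its validation loss is 0.0097. We could improve the autoencoder by stacking more convolutional layers in the encoder and more transposed convolutional layers in the decoder. Experimenting with the network parameters can also improve performance.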

That's all for this article. I hope you can now build your own autoencoder, and I suggest you also try it on a different dataset, for example Fashion MNIST.

Thank you, and good luck :)
