This is day 24 of my participation in the Gengwen Challenge. For event details, see: 更文挑战
In Professor Hung-yi Lee's analysis of AlexNet, the filters of the first layer are shown to respond to simple patterns and colors. The first layer takes the raw image as input, while a second-layer filter takes the first layer's feature maps as its input, so filters in deeper layers respond to increasingly complex features.
from keras.applications import VGG16
from keras import backend as K
import matplotlib.pyplot as plt
Keras provides pretrained classic models such as VGG16, and most of the networks we train are built on top of these classic architectures.
We take one convolutional filter from block3 and feed in 150 × 150 images, looking for the image that the filter with index 0 in block3_conv1 responds to most strongly. For that we need a function that scores how strongly a feature map is activated: we take the mean over all pixels of the feature map.

So our object of study is filter 0 of the block3_conv1 layer of VGG16. What input image maximally activates this filter, i.e. what is this filter most sensitive to? To put this in mathematical terms: if the mean over all values produced by convolving the image with the filter is as large as possible, the filter is maximally sensitive to that image. That mean is our objective function, and our task is to find an input image that maximizes it. Writing x for the image we are looking for, we use gradient ascent instead of solving analytically: the input x is a tensor, we differentiate the objective with respect to every component of x, and we repeatedly step x in the direction of the gradient.
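Before the Keras version, the objective itself can be sketched in plain NumPy. Here the "layer output" is just a random stand-in tensor (not a real VGG16 activation); the point is only that the loss is the mean of one filter's feature map:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a conv layer's output: batch of 1, 7x7 spatial map, 256 channels
activation = rng.normal(size=(1, 7, 7, 256))
filter_index = 0

# The scalar objective: mean activation of the chosen filter's feature map.
# Gradient ascent on the *input image* pushes this value up.
loss = activation[:, :, :, filter_index].mean()
print(loss)
```

The Keras code below computes exactly this scalar, except that the activation tensor is symbolic and the gradient with respect to the input image comes for free.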
model = VGG16(weights='imagenet',include_top=False)
layer_name = 'block3_conv1'
filter_index = 0
Gradient ascent
We would like to take somewhat larger steps, but large steps can make the ascent unstable, so we L2-normalize the gradient vector: divide it by the square root of the mean of its squared entries.
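The normalization trick can be seen in isolation: dividing the gradient by its RMS (root mean square) keeps its direction but fixes its scale, so a constant step size behaves consistently no matter how large the raw gradient is. A NumPy sketch on a synthetic gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
# A wildly scaled synthetic gradient tensor
grads = rng.normal(scale=100.0, size=(1, 150, 150, 3))

# Divide by the RMS of the gradient (small epsilon avoids division by zero)
grads_normalized = grads / (np.sqrt(np.mean(np.square(grads))) + 1e-5)

# After normalization the RMS is ~1, regardless of the original scale
print(np.sqrt(np.mean(np.square(grads_normalized))))
```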
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:,:,:,filter_index])
layer_output
<tf.Tensor 'block3_conv1/Relu:0' shape=(?, ?, ?, 256) dtype=float32>
print(loss)
Tensor("Mean_1170:0", shape=(), dtype=float32)
grads = K.gradients(loss,model.input)[0]
grads
<tf.Tensor 'gradients_585/block1_conv1/convolution_grad/Conv2DBackpropInput:0' shape=(?, ?, ?, 3) dtype=float32>
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
iterate = K.function([model.input],[loss,grads])
import numpy as np
loss_value,grads_value = iterate([np.zeros((1,150,150,3))])
input_img_data = np.random.random((1,150,150,3)) * 20 + 128.
step = 1.
for i in range(40):
    loss_value, grads_value = iterate([input_img_data])
    input_img_data += grads_value * step
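The loop above can be sketched in one dimension. For a toy function whose gradient we know in closed form (a stand-in, not the Keras computation), repeated `x += step * grad` climbs toward the maximum:

```python
# Toy gradient ascent: maximize f(x) = -(x - 3)^2, whose gradient is -2 * (x - 3)
x = 0.0
step = 0.1
for i in range(40):
    grad = -2.0 * (x - 3.0)
    x += step * grad

print(x)  # approaches 3, the maximizer of f
```

The Keras loop is the same idea, except that x is a 150 × 150 × 3 image tensor and the gradient comes from backpropagation through the network.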
This function post-processes the raw image tensor into something displayable: it first centers the tensor and scales it to unit standard deviation, then squeezes it into [0, 1] around 0.5, clips it, and finally converts it to 8-bit values in [0, 255].
def depress_image(x):
    x -= x.mean()                # center at 0
    x /= (x.std() + 1e-5)        # unit standard deviation
    x *= 0.1                     # squeeze most values into [-0.3, 0.3]
    x += 0.5                     # shift to be centered at 0.5
    x = np.clip(x, 0, 1)         # clamp to the displayable [0, 1] range
    x *= 255
    x = np.clip(x, 0, 255).astype('uint8')
    return x
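A quick sanity check of what this post-processing guarantees, as a self-contained NumPy walk-through of the same steps on a random float image (the input here is synthetic, only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=128.0, scale=40.0, size=(150, 150, 3))

# Same steps as depress_image above:
x -= x.mean()              # center at 0
x /= (x.std() + 1e-5)      # unit standard deviation
x *= 0.1                   # squeeze most values into [-0.3, 0.3]
x += 0.5                   # shift to be centered at 0.5
x = np.clip(x, 0, 1)       # clamp to the displayable [0, 1] range
x *= 255
x = np.clip(x, 0, 255).astype('uint8')

print(x.dtype, x.min(), x.max())
```

Whatever the scale of the input, the output is always a uint8 image with values in [0, 255], centered near gray.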
This function ties the pieces together: given a layer name and a filter index, it builds the loss and the normalized gradient, runs 40 steps of gradient ascent starting from a gray image with a little noise, and returns the resulting pattern as a displayable image.
def generate_pattern(layer_name, filter_index, size=150):
    layer_output = model.get_layer(layer_name).output
    loss = K.mean(layer_output[:, :, :, filter_index])
    grads = K.gradients(loss, model.input)[0]
    grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
    iterate = K.function([model.input], [loss, grads])
    # start from a gray image with some noise
    input_img_data = np.random.random((1, size, size, 3)) * 20 + 128.
    step = 1.
    for i in range(40):
        loss_value, grads_value = iterate([input_img_data])
        input_img_data += grads_value * step
    img = input_img_data[0]
    return depress_image(img)
plt.imshow(generate_pattern('block1_conv1',0))
plt.show()
plt.imshow(generate_pattern('block3_conv1',0))
plt.show()
plt.imshow(generate_pattern('block4_conv1',1))
plt.show()
for layer_name in ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1']:
    size = 64
    margin = 5
    # 8 x 8 grid of the first 64 filter patterns, with a margin between tiles
    results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))
    for i in range(8):
        for j in range(8):
            filter_img = generate_pattern(layer_name, i + (j * 8), size=size)
            horizontal_start = i * size + i * margin
            horizontal_end = horizontal_start + size
            vertical_start = j * size + j * margin
            vertical_end = vertical_start + size
            results[horizontal_start:horizontal_end, vertical_start:vertical_end, :] = filter_img
    plt.figure(figsize=(20, 20))
    # cast back to uint8 so imshow does not misread 0-255 values stored as floats
    plt.imshow(results.astype('uint8'))
    plt.show()
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, None, None, 3) 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, None, None, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, None, None, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, None, None, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, None, None, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, None, None, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, None, None, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, None, None, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, None, None, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, None, None, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, None, None, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, None, None, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, None, None, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, None, None, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, None, None, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________