Code
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Embedding, Dense, Conv1D, GlobalMaxPooling1D, Concatenate, Dropout
from matrix import embedding_matrix
from trainf1 import metric_F1score
from totoken import tokenizer

class TextCNN(object):
    def __init__(self, maxlen, max_features, embedding_dims,
                 class_num=1,
                 last_activation='sigmoid'):
        self.maxlen = maxlen
        self.max_features = max_features
        self.embedding_dims = embedding_dims
        self.class_num = class_num
        self.last_activation = last_activation

    def get_model(self):
        input = Input((self.maxlen,))
        # Embedding part can try multichannel, as in the original paper
        embedding = Embedding(self.max_features, self.embedding_dims, input_length=self.maxlen,
                              weights=[embedding_matrix])(input)
        convs = []
        for kernel_size in [3, 4, 5]:
            c = Conv1D(128, kernel_size, activation='relu')(embedding)
            c = GlobalMaxPooling1D()(c)
            convs.append(c)
        x = Concatenate()(convs)
        output = Dense(self.class_num, activation=self.last_activation)(x)
        model = Model(inputs=input, outputs=output)
        return model

model = TextCNN(maxlen=30, max_features=len(tokenizer.word_index) + 1,
                embedding_dims=256, class_num=3, last_activation='softmax').get_model()
model.compile('adam', 'categorical_crossentropy', metrics=['accuracy', metric_F1score])
model.summary()
Console
Layer (type)                               Output Shape       Param #    Connected to
==================================================================================================
input_1 (InputLayer)                       [(None, 30)]       0          []
embedding (Embedding)                      (None, 30, 256)    8525824    ['input_1[0][0]']
conv1d (Conv1D)                            (None, 28, 128)    98432      ['embedding[0][0]']
conv1d_1 (Conv1D)                          (None, 27, 128)    131200     ['embedding[0][0]']
conv1d_2 (Conv1D)                          (None, 26, 128)    163968     ['embedding[0][0]']
global_max_pooling1d (GlobalMaxPooling1D)  (None, 128)        0          ['conv1d[0][0]']
global_max_pooling1d_1 (GlobalMaxPooling1D) (None, 128)       0          ['conv1d_1[0][0]']
global_max_pooling1d_2 (GlobalMaxPooling1D) (None, 128)       0          ['conv1d_2[0][0]']
concatenate (Concatenate)                  (None, 384)        0          ['global_max_pooling1d[0][0]',
                                                                          'global_max_pooling1d_1[0][0]',
                                                                          'global_max_pooling1d_2[0][0]']
dense (Dense)                              (None, 3)          1155       ['concatenate[0][0]']
==================================================================================================
Total params: 8,920,579
Trainable params: 8,920,579
Non-trainable params: 0
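As a sanity check, the per-layer Param # values reported above sum to the printed total:

```python
# Per-layer parameter counts, copied from the model summary above.
layer_params = {
    'embedding': 8525824,
    'conv1d': 98432,
    'conv1d_1': 131200,
    'conv1d_2': 163968,
    'dense': 1155,
}
total = sum(layer_params.values())
print(total)  # 8920579, matching "Total params: 8,920,579"
```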
Analysis
1. The first layer, InputLayer: this is the network's input layer, which feeds the externally supplied maximum text length (input_shape) into the model, so it has no weights and its Param is 0. From the code, maxlen in TextCNN is set to 30, which is why a 30 appears in the console output; the other value is None, which stands for the batch size, left unspecified until data is actually fed in.
2. The second layer, embedding: this is the embedding layer, which maps each of the max_features vocabulary indices to a dense vector of length embedding_dims. Its parameter count is therefore max_features × embedding_dims = 33,304 × 256 = 8,525,824 (there is no bias term), and its output shape is (None, 30, 256): one 256-dimensional word vector per token position.
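The embedding parameter count can be reproduced directly. Note that 33,304 is inferred here from 8,525,824 / 256; in the code it corresponds to len(tokenizer.word_index) + 1:

```python
max_features = 33304   # vocabulary size, len(tokenizer.word_index) + 1 (inferred from the summary)
embedding_dims = 256   # dimension of each word vector
# One weight per (vocabulary entry, vector component) pair; no bias term.
embedding_params = max_features * embedding_dims
print(embedding_params)  # 8525824
```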
3. The third, fourth, and fifth layers, conv1d, conv1d_1, conv1d_2: these are the convolutional layers, each with kernel size kernel_size and 128 filters. Hence Param = (kernel size × word-vector dimension + 1) × number of filters, where the +1 accounts for the bias of each filter. As the code shows, the loop builds three convolutional layers with kernel sizes 3, 4, and 5. Sliding a kernel of size kernel_size over a text of length 30 (default stride 1, 'valid' padding) yields an output of length 30 − kernel_size + 1, i.e. 28, 27, and 26 for the three layers, each producing 128 feature maps, so the output shapes are (None, 28, 128), (None, 27, 128), and (None, 26, 128).
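Both formulas can be checked against the summary for all three kernel sizes:

```python
maxlen, embedding_dims, filters = 30, 256, 128
for kernel_size in [3, 4, 5]:
    # (kernel_size x embedding_dims + 1) x filters; the +1 is the per-filter bias.
    params = (kernel_size * embedding_dims + 1) * filters
    # 'valid' padding with stride 1 shortens the sequence by kernel_size - 1.
    out_len = maxlen - kernel_size + 1
    print(kernel_size, params, out_len)
# 3 98432 28
# 4 131200 27
# 5 163968 26
```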
4. The sixth, seventh, and eighth layers, global_max_pooling1d, global_max_pooling1d_1, global_max_pooling1d_2: these are global max-pooling layers. Unlike MaxPooling1D (which defaults to a pool size of 2 with valid padding and halves the sequence length), GlobalMaxPooling1D keeps only the single maximum value over the entire time dimension for each channel, collapsing (None, 28, 128), (None, 27, 128), and (None, 26, 128) down to (None, 128) each. Pooling performs no learned transformation, so its parameter count is 0.
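A minimal NumPy sketch of what global max pooling computes, using random data in place of real conv1d activations:

```python
import numpy as np

# A fake batch of conv1d outputs: 4 samples, 28 time steps, 128 channels.
x = np.random.rand(4, 28, 128)
# GlobalMaxPooling1D takes the maximum over the time axis for each channel,
# collapsing (batch, 28, 128) to (batch, 128).
pooled = x.max(axis=1)
print(pooled.shape)  # (4, 128)
```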