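A multilayer perceptron (MLP) is a basic kind of artificial neural network (ANN), and the two terms are often used interchangeably. Between the input layer and the output layer it can contain any number of hidden layers. The simplest MLP has a single hidden layer, which gives a three-layer structure (the case used in this example).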
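Using the data in data.csv, build an MLP model, compute its accuracy on the test data, and visualize the model's predictions:
1. Split the data: test_size=0.33, random_state=10
2. Model structure: one hidden layer with 20 neurons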
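To make the three-layer structure concrete, here is a minimal NumPy sketch of one forward pass through a network with 2 inputs, 20 sigmoid hidden units and 1 sigmoid output. The weights are random placeholders rather than trained values, and the names `sigmoid`, `forward` and `X_demo` are illustrative assumptions, not part of the original code.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Forward pass: input -> hidden layer -> output layer."""
    hidden = sigmoid(X @ W1 + b1)      # shape (n_samples, 20)
    return sigmoid(hidden @ W2 + b2)   # shape (n_samples, 1)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 20)), np.zeros(20)   # placeholder hidden-layer weights
W2, b2 = rng.normal(size=(20, 1)), np.zeros(1)    # placeholder output-layer weights

X_demo = rng.random((5, 2))            # 5 made-up samples with features x1, x2
probs = forward(X_demo, W1, b1, W2, b2)
print(probs.shape)  # (5, 1)
```

Each output is a value between 0 and 1 that can be read as the probability of class 1.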
#load the data
import pandas as pd
import numpy as np
data = pd.read_csv('data.csv')
#define the X and y
X = data.drop(['y'],axis=1)
y = data.loc[:,'y']
#visualize the data
%matplotlib inline
from matplotlib import pyplot as plt
fig1 = plt.figure(figsize=(5,5))
passed=plt.scatter(X.loc[:,'x1'][y==1],X.loc[:,'x2'][y==1])
failed=plt.scatter(X.loc[:,'x1'][y==0],X.loc[:,'x2'][y==0])
plt.legend((passed,failed),('passed','failed'))
plt.xlabel('x1')
plt.ylabel('x2')
plt.title('raw data')
plt.show()
#split the data into a training set and a test set
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.33,random_state=10)
print(X_train.shape,X_test.shape,X.shape)
(275, 2) (136, 2) (411, 2)
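The split sizes printed above can be checked by hand. The sketch below assumes that a fractional `test_size` is rounded up to a whole number of samples and that the remainder goes to the training set, which matches the printed shapes.

```python
import math

n_samples = 411
test_size = 0.33

n_test = math.ceil(test_size * n_samples)   # 135.63 rounded up
n_train = n_samples - n_test                # the remainder is used for training
print(n_train, n_test)  # 275 136
```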
#set up the model
from keras.models import Sequential
from keras.layers import Dense
mlp = Sequential()
mlp.add(Dense(units=20, input_dim=2, activation='sigmoid')) #hidden layer
mlp.add(Dense(units=1,activation='sigmoid')) #output layer
mlp.summary() #show the model structure
Using TensorFlow backend.
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 20) 60
_________________________________________________________________
dense_2 (Dense) (None, 1) 21
=================================================================
Total params: 81
Trainable params: 81
Non-trainable params: 0
_________________________________________________________________
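The parameter counts in the summary follow from the layer sizes: a dense layer has one weight per input-unit pair plus one bias per unit. A small sketch (the helper name `dense_params` is hypothetical) reproduces the numbers:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(2, 20)    # 2 inputs -> 20 hidden units
output = dense_params(20, 1)    # 20 hidden units -> 1 output unit
print(hidden, output, hidden + output)  # 60 21 81
```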
#compile the model (choose the optimizer and the loss function)
mlp.compile(optimizer='adam',loss='binary_crossentropy')
#train the model (epochs sets the number of passes over the training data)
mlp.fit(X_train,y_train,epochs=3000)
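The `binary_crossentropy` loss chosen in the compile step penalizes predicted probabilities that are far from the true 0/1 label. The following plain-Python sketch shows the idea; the clipping constant `eps` is an assumption added for numerical safety and is not necessarily the value Keras uses internally.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)   # keep log() away from 0
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

confident = binary_crossentropy([1, 0], [0.9, 0.1])   # close to the labels
unsure = binary_crossentropy([1, 0], [0.5, 0.5])      # no better than guessing
print(confident < unsure)  # True
```

Training lowers this value by moving predicted probabilities toward the correct labels.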
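#make predictions on the training data and calculate the accuracy
#predict_classes was removed from newer Keras releases; thresholding the sigmoid output at 0.5 gives the same labels
y_train_predict = (mlp.predict(X_train) > 0.5).astype('int32')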
from sklearn.metrics import accuracy_score
accuracy_train = accuracy_score(y_train,y_train_predict)
print(accuracy_train)
0.9090909090909091
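#make predictions on the test data and calculate the accuracy
#threshold the sigmoid output at 0.5 (replaces predict_classes, which newer Keras releases no longer provide)
y_test_predict = (mlp.predict(X_test) > 0.5).astype('int32')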
accuracy_test = accuracy_score(y_test,y_test_predict)
print(accuracy_test)
0.9264705882352942
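Accuracy is simply the fraction of predictions that match the true labels. The sketch below computes it by hand on made-up labels (`y_true_demo` and `y_pred_demo` are illustrative, not taken from data.csv). By the same definition, the reported test accuracy is consistent with 126 correct predictions out of 136, and the training accuracy with 250 out of 275.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

y_true_demo = [1, 0, 1, 1, 0]
y_pred_demo = [1, 0, 0, 1, 0]   # one mistake out of five
print(accuracy(y_true_demo, y_pred_demo))  # 0.8
```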
#generate new data for plot
xx, yy = np.meshgrid(np.arange(0,1,0.01),np.arange(0,1,0.01))
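x_range = np.c_[xx.ravel(),yy.ravel()] #build the set of grid points
#threshold the sigmoid output at 0.5 (replaces predict_classes, which newer Keras releases no longer provide)
y_range_predict = (mlp.predict(x_range) > 0.5).astype('int32')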
# print(type(y_range_predict))
# print(y_range_predict)
#format the output: flatten the (n, 1) predictions into a pandas Series
y_range_predict_form = pd.Series(i[0] for i in y_range_predict)
# print(y_range_predict_form)
fig2 = plt.figure(figsize=(5,5))
passed_predict=plt.scatter(x_range[:,0][y_range_predict_form==1],x_range[:,1][y_range_predict_form==1])
failed_predict=plt.scatter(x_range[:,0][y_range_predict_form==0],x_range[:,1][y_range_predict_form==0])
passed=plt.scatter(X.loc[:,'x1'][y==1],X.loc[:,'x2'][y==1])
failed=plt.scatter(X.loc[:,'x1'][y==0],X.loc[:,'x2'][y==0])
plt.legend((passed,failed,passed_predict,failed_predict),('passed','failed','passed_predict','failed_predict'))
plt.xlabel('x1')
plt.ylabel('x2')
plt.title('prediction result')
plt.show()
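The grid used for the decision-region plot comes from `np.meshgrid` combined with `np.c_`. A tiny version with a step of 0.5 instead of 0.01 shows what the point set looks like. In the code above, the step of 0.01 yields 100 values per axis and therefore 10,000 grid points.

```python
import numpy as np

# Two values per axis instead of one hundred, to keep the output readable
xx, yy = np.meshgrid(np.arange(0, 1, 0.5), np.arange(0, 1, 0.5))
points = np.c_[xx.ravel(), yy.ravel()]   # one (x1, x2) pair per row

print(points.shape)     # (4, 2)
print(points.tolist())  # [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [0.5, 0.5]]
```

Predicting a class for every such point and coloring it accordingly is what paints the background regions in the final figure.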