[Deep Learning] Multilayer Perceptron (MLP) / Fully Connected Deep Neural Network (FCN), Hands-On Part 2: Classification

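We will proceed in the following steps:

Generate a simulated dataset: 100,000 rows, 200 features, and 1 label (a binary classification problem)
Preprocess the data: split it into training and test sets, then standardize the features
Build a fully connected neural network (FCN) with TensorFlow
Compile the model: set the optimizer, loss function, and evaluation metrics
Train the model
Evaluate model performance

Note: because the data is simulated, we generate it randomly with numpy. In a real application, replace it with your own data.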

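We will build a neural network with several fully connected layers. As a starting point, consider a simple structure:

Input layer (200 features) -> hidden layer 1 (128 neurons, ReLU activation)

-> hidden layer 2 (64 neurons, ReLU activation)

-> output layer (1 neuron, sigmoid activation, for binary classification)

The full code below widens this outline to three hidden layers (256/128/64 neurons) and adds BatchNormalization and Dropout after each one.

For binary classification we use the binary cross-entropy loss with the Adam optimizer.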
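To make the outline concrete, here is a minimal numpy sketch of a forward pass through the 200 -> 128 -> 64 -> 1 structure, followed by the binary cross-entropy loss. The weights are hypothetical random values (a trained network would learn them), and the helper names `forward` and `binary_cross_entropy` are chosen for illustration only; this is not how TensorFlow implements these operations internally.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes from the outline: 200 -> 128 -> 64 -> 1
sizes = [200, 128, 64, 1]
# Hypothetical small random weights and zero biases, one pair per layer
params = [(rng.normal(0.0, 0.05, (n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x):
    h = x
    for W, b in params[:-1]:          # hidden layers use ReLU
        h = relu(h @ W + b)
    W, b = params[-1]                 # output layer uses sigmoid
    return sigmoid(h @ W + b).ravel()

def binary_cross_entropy(y_true, p, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)      # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

x = rng.standard_normal((4, 200))     # 4 made-up samples
y = np.array([0, 1, 1, 0])            # made-up labels
p = forward(x)
loss = binary_cross_entropy(y, p)
n_params = sum(W.size + b.size for W, b in params)
print(p.shape, n_params)  # (4,) 34049
```

Each output is a probability in (0, 1); the loss penalizes confident predictions on the wrong side of the label more heavily than hesitant ones.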

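Code steps:

Below is a complete TensorFlow example that builds and trains a fully connected neural network (FCN) for binary classification. It covers simulated dataset generation, data preprocessing, model construction, training, and evaluation: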

import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
import matplotlib.pyplot as plt

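# 1. Create the simulated dataset (100,000 samples, 200 features)
def generate_dataset(num_samples=100000, num_features=200, seed=42):
    # Seeded generator so repeated runs produce the same data
    rng = np.random.default_rng(seed)

    # Random features drawn from a standard normal distribution
    features = rng.standard_normal((num_samples, num_features))

    # Build a meaningful label from a nonlinear combination
    # of the first 50 features
    meaningful_features = features[:, :50]
    # Nonlinear decision boundary
    labels = (
        np.sum(meaningful_features ** 2, axis=1) +
        2 * np.sin(0.5 * meaningful_features[:, 0]) +
        0.5 * meaningful_features[:, 1] * meaningful_features[:, 2] -
        np.exp(meaningful_features[:, 3])
    )

    # Add Gaussian noise
    noise = 0.5 * rng.standard_normal(num_samples)
    labels += noise

    # Convert to binary labels (0 and 1); thresholding at the median
    # yields two balanced classes
    labels = (labels > np.median(labels)).astype(int)

    return features, labels

# Generate the data
features, labels = generate_dataset()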

# 2. Data preprocessing
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42
)

# Standardize the features (fit on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# 3. Build the fully connected neural network
def build_fcn_model(input_shape):
    model = Sequential()
    
    # Input layer and first hidden layer
    model.add(Dense(256, activation='relu', input_shape=(input_shape,)))
    model.add(BatchNormalization())
    model.add(Dropout(0.3))
    
    # Second hidden layer
    model.add(Dense(128, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.3))
    
    # Third hidden layer
    model.add(Dense(64, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.2))
    
    # Output layer (binary classification)
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model
    model.compile(optimizer=Adam(learning_rate=0.001),
                 loss='binary_crossentropy',
                 metrics=['accuracy', 
                          tf.keras.metrics.Precision(name='precision'),
                          tf.keras.metrics.Recall(name='recall')])
    
    return model

# Create the model
model = build_fcn_model(X_train.shape[1])
model.summary()

# 4. Set up callbacks
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=1e-6)

# 5. Train the model
history = model.fit(
    X_train, y_train,
    epochs=100,
    batch_size=256,
    validation_split=0.2,
    callbacks=[early_stopping, reduce_lr],
    verbose=1
)

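# 6. Evaluate the model
# Results follow the order of the compiled metrics: loss, accuracy, precision, recall
test_results = model.evaluate(X_test, y_test, verbose=0)
print("\nTest set results:")
print(f"Loss: {test_results[0]:.4f}")
print(f"Accuracy: {test_results[1]*100:.2f}%")
print(f"Precision: {test_results[2]*100:.2f}%")
print(f"Recall: {test_results[3]*100:.2f}%")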

# 7. Plot the training curves
def plot_training_history(history):
    # One panel per metric: (key in history.history, axis label)
    panels = [('accuracy', 'Accuracy'), ('loss', 'Loss'),
              ('precision', 'Precision'), ('recall', 'Recall')]

    plt.figure(figsize=(12, 10))
    for i, (key, name) in enumerate(panels, start=1):
        plt.subplot(2, 2, i)
        plt.plot(history.history[key], label=f'Training {name.lower()}')
        plt.plot(history.history[f'val_{key}'], label=f'Validation {name.lower()}')
        plt.title(f'Training and validation {name.lower()}')
        plt.ylabel(name)
        plt.xlabel('Epoch')
        plt.legend()

    plt.tight_layout()
    plt.show()

plot_training_history(history)

# 8. Save the model (legacy HDF5 format; recent Keras versions also accept a '.keras' path)
model.save('fcn_classifier.h5')

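Code notes:

Dataset generation:
- Creates a simulated dataset of 100,000 rows × 200 features
- Uses the first 50 features to build a nonlinear decision boundary
- Adds Gaussian noise to make classification harder
- Produces binary labels (0/1)
Data preprocessing:
- Train/test split (80%/20%)
- Standardizes the features with StandardScaler, fitted on the training set only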
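The fit-on-train, transform-both pattern can be sketched in plain numpy. This is an illustrative sketch on made-up data, not the scikit-learn implementation; it uses the population standard deviation (ddof=0), which is the convention StandardScaler follows.

```python
import numpy as np

rng = np.random.default_rng(42)
# Made-up data with a non-zero mean and non-unit scale
X = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))
X_train, X_test = X[:800], X[800:]

# "fit": statistics come from the training split only
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)  # ddof=0

# "transform": the same statistics are applied to both splits
X_train_std = (X_train - mean) / std
X_test_std = (X_test - mean) / std

print(np.allclose(X_train_std.mean(axis=0), 0.0))  # True
print(np.allclose(X_train_std.std(axis=0), 1.0))   # True
```

The test split ends up close to, but not exactly at, zero mean and unit variance. That is expected: computing statistics on the test set would leak information from it into preprocessing.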
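Model architecture:
- Input layer: 200 inputs (one per feature)
- 3 hidden layers with 256/128/64 neurons
- ReLU activation in the hidden layers
- BatchNormalization (stabilizes and speeds up training) and Dropout (reduces overfitting) after each hidden layer
- Output layer: 1 neuron (sigmoid activation for binary classification)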
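As a sanity check on this architecture, the parameter counts can be worked out by hand. The sketch below assumes default BatchNormalization settings (trainable gamma and beta, plus a non-trainable moving mean and variance per unit); the helper names are hypothetical. The totals should agree with what `model.summary()` reports.

```python
def dense_params(n_in, n_out):
    # Weight matrix plus one bias per output unit
    return n_in * n_out + n_out

def batchnorm_params(n_units):
    # gamma and beta are trainable; moving mean and variance are not
    return {"trainable": 2 * n_units, "non_trainable": 2 * n_units}

layer_sizes = [200, 256, 128, 64]  # input width followed by the hidden layers
trainable = 0
non_trainable = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    trainable += dense_params(n_in, n_out)
    bn = batchnorm_params(n_out)
    trainable += bn["trainable"]
    non_trainable += bn["non_trainable"]

trainable += dense_params(64, 1)  # output layer; Dropout adds no parameters

print(trainable, non_trainable, trainable + non_trainable)  # 93569 896 94465
```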
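Training configuration:
- Optimizer: Adam (learning rate 0.001)
- Loss function: binary cross-entropy
- Evaluation metrics: accuracy, precision, recall
- Callbacks: early stopping (limits overfitting) and dynamic learning-rate adjustment
Model evaluation:
- Evaluates performance on the test set
- Plots the training curves
- Saves the trained model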
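To show what the three reported metrics measure, here is a small numpy sketch that derives them from predicted probabilities with a 0.5 threshold. The function name `binary_metrics` and the sample values are made up for illustration; Keras computes these metrics incrementally over batches, but the definitions are the same.

```python
import numpy as np

def binary_metrics(y_true, y_prob, threshold=0.5):
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_prob) > threshold).astype(int)

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # true negatives

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Made-up labels and predicted probabilities
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.7, 0.55]
metrics = binary_metrics(y_true, y_prob)
print(metrics)  # {'accuracy': 0.625, 'precision': 0.6, 'recall': 0.75}
```

Precision asks how many predicted positives were correct; recall asks how many actual positives were found. On a balanced dataset like the simulated one, accuracy is informative on its own, but the other two matter once classes are imbalanced.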

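Key components:

BatchNormalization: speeds up training and improves model stability
Dropout: randomly drops neurons to prevent overfitting (drop rates of 0.2 to 0.3)
EarlyStopping: stops training when the validation loss stops improving
ReduceLROnPlateau: lowers the learning rate when the validation loss plateaus
Learning rate: starts at 0.001 and can be adjusted dynamically during training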
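The interplay between the two callbacks is easier to see in a simulation. The following is a simplified pure-Python re-creation of the logic, not the Keras source: it ignores `min_delta` and `cooldown`, shares one "best loss" between both rules, and the name `simulate_callbacks` and the loss values are hypothetical. The defaults mirror the settings used in the code (factor 0.2, patience 5 and 10, min_lr 1e-6).

```python
def simulate_callbacks(val_losses, lr=1e-3, factor=0.2, lr_patience=5,
                       min_lr=1e-6, stop_patience=10):
    best = float("inf")
    best_epoch = -1
    lr_wait = 0      # epochs without improvement since the last LR change
    stop_wait = 0    # epochs without improvement since the best epoch
    lr_history = []

    for epoch, loss in enumerate(val_losses):
        lr_history.append(lr)  # learning rate in effect during this epoch
        if loss < best:
            best, best_epoch = loss, epoch
            lr_wait = stop_wait = 0
            continue
        lr_wait += 1
        stop_wait += 1
        if lr_wait >= lr_patience:         # plateau: shrink the learning rate
            lr = max(lr * factor, min_lr)
            lr_wait = 0
        if stop_wait >= stop_patience:     # no progress for too long: stop
            return {"stopped_epoch": epoch, "best_epoch": best_epoch,
                    "lr_history": lr_history}

    return {"stopped_epoch": None, "best_epoch": best_epoch,
            "lr_history": lr_history}

# Made-up validation losses: improvement for 3 epochs, then a plateau
val_losses = [0.60, 0.50, 0.45] + [0.46] * 12
result = simulate_callbacks(val_losses)
print(result["best_epoch"], result["stopped_epoch"])  # 2 12
```

In this run the learning rate is cut once after five stagnant epochs, and training stops after ten. With `restore_best_weights=True`, the real callback would then roll the model back to the weights from the best epoch.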

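Practical advice:

For real datasets:
- Replace the generate_dataset() function with your own data-loading code
- Make sure the data has shape (number of samples, number of features)
- Handle missing values and outliers
Model tuning:
- Adjust network depth (number of layers) and width (number of neurons) to match the complexity of the data
- Tune the Dropout rates to control overfitting
- Use hyperparameter tuning (for example, Keras Tuner)
Changes for multi-class tasks:
- Use a softmax activation in the output layer
- Set the number of output neurons to the number of classes
- Change the loss function to categorical_crossentropy (with one-hot labels; use sparse_categorical_crossentropy for integer labels)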
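For the multi-class case, a short numpy sketch shows how softmax and categorical cross-entropy fit together. The logits and labels are made-up values for a hypothetical 3-class problem, and the helper names are for illustration only.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_onehot, probs, eps=1e-7):
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_onehot * np.log(probs), axis=1)))

# Made-up raw outputs of a 3-neuron output layer for 2 samples
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
labels = np.array([0, 2])        # integer class labels
y_onehot = np.eye(3)[labels]     # one-hot form, as categorical_crossentropy expects

probs = softmax(logits)
loss = categorical_cross_entropy(y_onehot, probs)
print(probs.sum(axis=1))         # each row sums to 1
print(probs.argmax(axis=1))      # [0 2]
```

Because the one-hot vector zeroes out every class except the true one, the loss reduces to the negative log-probability assigned to the correct class, averaged over samples.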

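The author reports that training takes about 2 minutes on an RTX 3080 GPU (epochs=100; EarlyStopping may end the run sooner). Adjust batch_size and epochs to suit your hardware.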
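When adjusting batch_size, it helps to know how many gradient updates one epoch involves. The arithmetic below follows the splits used in the code (test_size=0.2, then validation_split=0.2) and is only a back-of-the-envelope sketch; the variable names are chosen for illustration.

```python
import math

num_samples = 100_000
test_percent = 20         # test_size=0.2 in train_test_split
validation_percent = 20   # validation_split=0.2 in model.fit
batch_size = 256
max_epochs = 100

n_train_full = num_samples - num_samples * test_percent // 100
n_val = n_train_full * validation_percent // 100
n_fit = n_train_full - n_val                   # samples actually used for gradient updates

steps_per_epoch = math.ceil(n_fit / batch_size)
max_steps = steps_per_epoch * max_epochs       # upper bound; early stopping may cut this short

print(n_fit, n_val, steps_per_epoch, max_steps)  # 64000 16000 250 25000
```

Doubling batch_size to 512 would halve the steps per epoch to 125, which usually shortens each epoch on a GPU but also means fewer weight updates per pass over the data.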