Focusing on fault-identification performance: a TensorFlow implementation of the deep residual shrinkage network


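In fault diagnosis of rotating machinery, the vibration signals collected by sensors are often contaminated by substantial environmental noise. Conventional deep residual networks (ResNets) mitigate the vanishing-gradient problem through identity shortcuts, but on highly noisy data they may pick up noise-related features that are irrelevant to the fault, which lowers diagnostic accuracy.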

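To improve feature learning under low signal-to-noise ratio (SNR) conditions, the deep residual shrinkage network (DRSN) was proposed. Its core idea is to integrate the soft-thresholding operator into the deep architecture as a nonlinear transformation layer. Soft thresholding is a classic denoising tool in signal processing, defined as y = sign(x) * max(|x| - τ, 0), where τ is a positive threshold. Under this transformation, "unimportant" features whose magnitude does not exceed τ are set to zero, while strong features are kept, shrunk toward zero by τ.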
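To make the formula concrete, here is a minimal sketch of soft thresholding on a handful of scalar values. It uses only the Python standard library; the function name `soft_threshold` and the sample values are illustrative assumptions, not part of the original script.

```python
def soft_threshold(x, tau):
    """Return sign(x) * max(|x| - tau, 0) for a scalar x and a threshold tau >= 0."""
    magnitude = abs(x) - tau
    if magnitude <= 0.0:
        # Values inside [-tau, tau] are treated as noise and zeroed out.
        return 0.0
    # Larger values keep their sign but are shrunk toward zero by tau.
    return magnitude if x > 0 else -magnitude


features = [-2.0, -0.3, 0.0, 0.4, 1.5]  # illustrative feature values
tau = 0.5                               # illustrative threshold
shrunk = [soft_threshold(v, tau) for v in features]
print(shrunk)  # [-1.5, 0.0, 0.0, 0.0, 1.0]
```

Unlike ReLU, which discards every negative value, this operator keeps strong negative features and removes only the band around zero.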

1. Residual shrinkage building unit with channel-wise thresholds (RSBU-CW)

The paper proposes two variants; this article focuses on the residual shrinkage building unit with channel-wise thresholds (RSBU-CW).

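As shown in the figure below, the unit adds a sub-network dedicated to learning thresholds on top of the basic building block of a conventional residual network. The sub-network first applies global average pooling (GAP) to the absolute values of the feature map, compressing each channel to a single value. The result then passes through two fully connected layers, with batch normalization (BN) and a rectified linear unit (ReLU) activation between them, and a final sigmoid outputs a scaling factor α between 0 and 1 for each channel. The threshold is then τ = α * average(|x|), computed per channel. This design lets the model adapt how much it shrinks each channel to that channel's noise level, which reduces the reliance on domain experts' experience.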
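The threshold rule τ = α * average(|x|) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the two fully connected layers are replaced by hypothetical pre-sigmoid values (`logits`), and the names `sigmoid` and `channel_thresholds` are made up for this example.

```python
import math


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def channel_thresholds(feature_map, logits):
    """Compute one threshold per channel.

    feature_map: list of channels, each a list of activations.
    logits: one hypothetical pre-sigmoid value per channel, standing in
            for the output of the fully connected sub-network.
    """
    thresholds = []
    for channel, logit in zip(feature_map, logits):
        abs_mean = sum(abs(v) for v in channel) / len(channel)  # GAP over |x|
        alpha = sigmoid(logit)                                  # scaling factor in (0, 1)
        thresholds.append(alpha * abs_mean)
    return thresholds


feature_map = [
    [1.0, -3.0, 2.0, -2.0],    # channel with large activations
    [0.25, -0.25, 0.5, -0.5],  # channel with small activations
]
logits = [0.0, 0.0]  # sigmoid(0) = 0.5 for both channels
taus = channel_thresholds(feature_map, logits)
print(taus)  # [1.0, 0.1875]
```

Because α lies strictly between 0 and 1, each threshold stays positive and below the mean absolute value of its channel, so the shrinkage cannot zero out an entire channel.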

[Figure 1: structure of the RSBU-CW unit]

2. End-to-end reproduction in TensorFlow

The code below implements the core RSBU-CW module and the overall model architecture in TensorFlow.

"""
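Project: Deep residual shrinkage network (DRSN) for fault diagnosis from industrial vibration signals
Description:
    This script implements an end-to-end fault-diagnosis system based on the deep residual
    shrinkage network (DRSN-CW variant). The core idea is to use the adaptive thresholding
    mechanism of the residual shrinkage building unit (RSBU-CW) to suppress the effect of
    strong background noise on feature extraction. The model targets time-domain vibration
    signals from rotating machinery such as bearings.

Main modules:
    1. Data engine: batch loading of the CWRU dataset, standardization, and sliding-window slicing.
    2. Core algorithm: the RSBU-CW layer, which applies channel-attention-based soft thresholding in feature space.
    3. Robustness check: built-in Gaussian white noise injection for evaluating generalization at low signal-to-noise ratio (SNR).
    4. Training strategy: online data augmentation (time shifting and impulse simulation) to improve noise robustness.

References (IEEE format):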
    M. Zhao, S. Zhong, X. Fu, B. Tang and M. Pecht, "Deep Residual Shrinkage Networks 
    for Fault Diagnosis," in IEEE Transactions on Industrial Informatics, 
    vol. 16, no. 7, pp. 4681-4690, July 2020.
==============================================================================
"""

import os
import sys
import logging

# Set the C++ log level before TensorFlow is imported, where it is most reliably picked up.
os.environ.setdefault('TF_CPP_MIN_LOG_LEVEL', '2')

import numpy as np
import scipy.io as matlab_io
import tensorflow as tf
from tensorflow.keras import layers as tf_layers
from tensorflow.keras import Model as tf_model
from tensorflow.keras import regularizers as tf_reg
from sklearn.model_selection import train_test_split

# =============================================================================
# Part 1: Runtime environment and hardware abstraction layer
# =============================================================================

logging.basicConfig(level=logging.INFO, format='[%(levelname)s] %(message)s')

class EnvironmentSetup:
    """
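    Initializes the compute backend and configures its resources.
    Responsibilities: dependency checks, log-level filtering, and on-demand GPU memory growth.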
    """
    @staticmethod
    def init_environment():
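        # Check that the third-party scientific computing dependencies can be imported
        try:
            import sklearn.model_selection
            import scipy.stats
        except ImportError as dependency_error:
            logging.critical("Missing runtime dependency: %s", str(dependency_error))
            sys.exit(1)

        # Suppress non-critical TensorFlow diagnostics. The C++ log level is most reliable
        # when set before TensorFlow is imported, so the Python-side logger is lowered as well.
        os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
        tf.get_logger().setLevel('ERROR')

        # Detect GPUs and enable on-demand memory growth
        physical_gpus = tf.config.list_physical_devices('GPU')
        if physical_gpus:
            try:
                for gpu in physical_gpus:
                    tf.config.experimental.set_memory_growth(gpu, True)
                logging.info("Found %d GPU(s); memory growth enabled.", len(physical_gpus))
            except RuntimeError as config_fault:
                logging.warning("GPU configuration failed: %s", str(config_fault))
        else:
            logging.info("No GPU available; falling back to CPU.")

# Apply the environment setup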
EnvironmentSetup.init_environment()

# =============================================================================
# Part 2: Signal acquisition and time-series processing
# =============================================================================

class CWRUDataloader:
    """
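    Loader for time-series vibration signals.
    Parses CWRU recordings and samples them with a fixed-stride sliding window
    (non-overlapping by default, since the stride equals the window size).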
    """
    def __init__(self, data_dir, window_size=1024):
        self.root_dir = os.path.abspath(data_dir)
        self.window_size = window_size
        self.stride = window_size

    def _load_mat_data(self, file_path):
        """
        Parse the .mat data dictionary and extract the raw drive-end (DE) vibration acceleration vector.
        """
        try:
            mat_data = matlab_io.loadmat(file_path)
            for key in mat_data.keys():
                if 'DE_time' in key:
                    return mat_data[key].flatten()
        except Exception:
            return None
        return None

    def load_and_slice_samples(self, label_map):
        """
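        Map the fault dictionary to a sample array and integer class labels for multi-class
        classification (one-hot encoding is applied later in the pipeline).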
        """
        X, y = [], []
        found_data = False
        
        for label, filenames in label_map.items():
            for filename in filenames:
                file_path = os.path.join(self.root_dir, "{}.mat".format(filename))
                if not os.path.exists(file_path):
                    continue
                
                signal = self._load_mat_data(file_path)
                if signal is None:
                    continue
                
                found_data = True
                # Non-overlapping sliding-window sampling (the stride equals the window size)
                for i in range(0, len(signal) - self.window_size + 1, self.stride):
                    sample = signal[i : i + self.window_size]
                    X.append(sample)
                    y.append(label)
        
        if not found_data:
            raise FileNotFoundError("No valid signal files found under: {}".format(self.root_dir))
            
        return np.array(X, dtype='float32'), np.array(y, dtype='int32')

def add_white_noise(signals, target_snr):
    """
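    Gaussian white noise injection in the signal domain.
    The noise variance is derived from the target signal-to-noise ratio (SNR) to simulate harsh industrial conditions.
    """
    batch_signals = np.array(signals)
    rng = np.random.default_rng()

    # Use a scalar SNR directly, or sample one uniformly from a (low, high) range
    snr_val = target_snr if isinstance(target_snr, (int, float)) else rng.uniform(target_snr[0], target_snr[1])

    # Power-domain relation: P_noise = P_signal / 10^(SNR/10)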
    signal_power = np.mean(batch_signals**2, axis=1, keepdims=True)
    noise_variance = signal_power / (10**(snr_val / 10.0))
    noise = rng.normal(0, np.sqrt(noise_variance), batch_signals.shape)
    
    return (batch_signals + noise).astype('float32')

# =============================================================================
# Part 3: Core architecture of the deep residual shrinkage network (DRSN)
# =============================================================================

class SoftThresholding(tf_layers.Layer):
    """
    Adaptive nonlinear shrinkage layer (soft thresholding) for the deep residual shrinkage network.
    Operator: y = sign(x) * max(|x| - τ, 0), where τ is the learned threshold vector.
    """
    def __init__(self, **kwargs):
        super(SoftThresholding, self).__init__(**kwargs)

    def call(self, inputs):
        x, threshold = inputs
        # Align dimensions: (Batch, Channels) -> (Batch, 1, Channels) so that τ broadcasts over time steps
        reshaped_thres = tf.expand_dims(threshold, axis=1)
        return tf.sign(x) * tf.maximum(tf.abs(x) - reshaped_thres, 0.0)

class RSBU_CW(tf_layers.Layer):
    """
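    Basic building unit of the deep residual shrinkage network (RSBU-CW).
    Combines the identity shortcut, the feature-extraction path, and a channel-attention-based threshold sub-network.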
    """
    def __init__(self, filters, kernel_size, strides=1, **kwargs):
        super(RSBU_CW, self).__init__(**kwargs)
        self.filters = filters
        self.strides = strides
        self.kernel_size = kernel_size
        self.l2_reg = tf_reg.l2(1e-4)

        self.shortcut = None
        
        # Convolutional path
        self.bn1 = tf_layers.BatchNormalization()
        self.relu1 = tf_layers.Activation('relu')
        self.conv1 = tf_layers.Conv1D(filters, kernel_size, strides=strides, padding='same', 
                                      kernel_initializer='he_normal', kernel_regularizer=self.l2_reg)
        
        self.bn2 = tf_layers.BatchNormalization()
        self.relu2 = tf_layers.Activation('relu')
        self.conv2 = tf_layers.Conv1D(filters, kernel_size, strides=1, padding='same', 
                                      kernel_initializer='he_normal', kernel_regularizer=self.l2_reg)
        
        # Sub-network that estimates the channel-wise thresholds
        self.gap = tf_layers.GlobalAveragePooling1D()
        self.fc1 = tf_layers.Dense(filters, kernel_initializer='he_normal')
        self.bn_fc = tf_layers.BatchNormalization()
        self.relu_fc = tf_layers.Activation('relu')
        self.fc2 = tf_layers.Dense(filters, activation='sigmoid')
        self.soft_thresholding = SoftThresholding()

    def build(self, input_shape):
        # Shape alignment: use a 1x1 convolution projection if the stride or channel count changes
        if self.strides != 1 or input_shape[-1] != self.filters:
            self.shortcut = tf.keras.Sequential([
                tf_layers.Conv1D(self.filters, 1, strides=self.strides, padding='same'),
            ])
        super(RSBU_CW, self).build(input_shape)

    def call(self, inputs):
        identity = inputs
        if self.shortcut is not None:
            identity = self.shortcut(inputs)

        # Main transformation path: BN -> ReLU -> Conv, twice
        out = self.bn1(inputs)
        out = self.relu1(out)
        out = self.conv1(out)
        out = self.bn2(out)
        out = self.relu2(out)
        out = self.conv2(out)

        # Channel-wise mean of absolute values, used as the threshold baseline
        abs_x = tf.abs(out)
        abs_mean = self.gap(abs_x)

        # Scaling parameter alpha in (0, 1) from the attention sub-network
        scales = self.fc1(abs_mean)
        scales = self.bn_fc(scales)
        scales = self.relu_fc(scales)
        scales = self.fc2(scales)

        # Threshold tau = alpha * mean(|x|), one value per channel
        thresholds = tf.multiply(scales, abs_mean)

        # Apply soft thresholding, then add the shortcut branch
        thres_x = self.soft_thresholding([out, thresholds])
        return thres_x + identity

class DRSN_CW(tf_model):
    """
    Main architecture of the deep residual shrinkage network.
    Stacks RSBUs in several stages to map noisy vibration signals end to end to fault classes.
    """
    def __init__(self, num_classes):
        super(DRSN_CW, self).__init__(name="DRSN_Bearing_Expert")
        
        # Initial feature-extraction layer
        self.conv1 = tf_layers.Conv1D(32, 15, strides=2, padding='same', kernel_initializer='he_normal')
        self.bn1 = tf_layers.BatchNormalization()
        self.relu1 = tf_layers.Activation('relu')
        
        # Residual shrinkage blocks
        self.blocks = [
            RSBU_CW(32, 5, strides=2),
            RSBU_CW(32, 5, strides=1),
            RSBU_CW(64, 5, strides=2),
            RSBU_CW(64, 5, strides=1),
            RSBU_CW(128, 5, strides=2),
            RSBU_CW(128, 5, strides=1)
        ]
        
        # Classification head
        self.bn_last = tf_layers.BatchNormalization()
        self.relu_last = tf_layers.Activation('relu')
        self.gap = tf_layers.GlobalAveragePooling1D()
        self.fc = tf_layers.Dense(num_classes, activation='softmax')

    def call(self, inputs):
        x = self.conv1(inputs)
        x = self.bn1(x)
        x = self.relu1(x)
        
        for block in self.blocks:
            x = block(x)
            
        x = self.bn_last(x)
        x = self.relu_last(x)
        x = self.gap(x)
        return self.fc(x)

# =============================================================================
# Part 4: Pipeline control and diagnostic workflow
# =============================================================================

def train_and_test_pipeline(data_dir, sequence_length=1024):
    """
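    Run the full DRSN-CW training and evaluation workflow.
    Covers data sampling, online augmentation, convergence control, and robustness testing under strong noise.
    """
    # Label map for the 10-class task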
    label_map = {
        0: ['Normal_0', 'Normal_1', 'Normal_2', 'Normal_3'],
        1: ['IR007_0', 'IR007_1', 'IR007_2', 'IR007_3'],
        2: ['IR014_0', 'IR014_1', 'IR014_2', 'IR014_3'],
        3: ['IR021_0', 'IR021_1', 'IR021_2', 'IR021_3'],
        4: ['B007_0', 'B007_1', 'B007_2', 'B007_3'],
        5: ['B014_0', 'B014_1', 'B014_2', 'B014_3'],
        6: ['B021_0', 'B021_1', 'B021_2', 'B021_3'],
        7: ['OR007@6_0', 'OR007@6_1', 'OR007@6_2', 'OR007@6_3'],
        8: ['OR014@6_0', 'OR014@6_1', 'OR014@6_2', 'OR014@6_3'],
        9: ['OR021@6_0', 'OR021@6_1', 'OR021@6_2', 'OR021@6_3']
    }
    
    data_loader = CWRUDataloader(data_dir=data_dir, window_size=sequence_length)
    
    try:
        X, y = data_loader.load_and_slice_samples(label_map)
    except Exception as process_error:
        logging.error("Fatal error while building the dataset: %s", str(process_error))
        return

    # Stratified random split into training, validation, and test sets (70% / 15% / 15%).
    X_train_raw, X_temp, y_train_raw, y_temp = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y
    )
    X_val_raw, X_test_raw, y_val_raw, y_test_raw = train_test_split(
        X_temp, y_temp, test_size=0.5, random_state=42, stratify=y_temp
    )
    
    # Standardize using statistics of the training set only
    data_mu, data_std = np.mean(X_train_raw), np.std(X_train_raw)

    def preprocess(nd_array):
        """Apply the z-score transform and reshape to a rank-3 tensor (Batch, Step, Channel)."""
        return ((nd_array - data_mu) / data_std).reshape(-1, sequence_length, 1)

    X_train = preprocess(X_train_raw)
    X_val = preprocess(X_val_raw)
    X_test = preprocess(X_test_raw)
    
    num_classes = len(label_map)
    y_train_onehot = tf.keras.utils.to_categorical(y_train_raw, num_classes).astype('float32')
    y_val_onehot = tf.keras.utils.to_categorical(y_val_raw, num_classes).astype('float32')
    y_test_onehot = tf.keras.utils.to_categorical(y_test_raw, num_classes).astype('float32')

    # Noise stress test: inject Gaussian white noise at -8 dB SNR
    X_val_noisy = add_white_noise(X_val, target_snr=-8)
    X_test_noisy = add_white_noise(X_test, target_snr=-8)

    def augment_data(x, y):
        """
        Runtime data augmentation to improve the generalization of the deep residual shrinkage network.
        Includes random circular time shifts, transient impulse simulation, and random-SNR noise mixing.
        """
        rng = np.random.default_rng()
        augmented_x = x.copy()
        batch_size, seq_len, _ = augmented_x.shape

        # Time-shift augmentation: random circular shift per sample
        for idx in range(batch_size):
            shift_amount = rng.integers(0, seq_len)
            augmented_x[idx, :, 0] = np.roll(augmented_x[idx, :, 0], shift_amount)

        # Sparse transient impulses: about 10% of batches, then about half of the samples in such a batch
        if rng.random() > 0.9:
            for idx in range(batch_size):
                if rng.random() > 0.5:
                    impulse_count = rng.integers(1, 3)
                    indices = rng.integers(0, seq_len, impulse_count)
                    peak_amp = np.std(augmented_x[idx]) * rng.uniform(1.5, 2.5)
                    augmented_x[idx, indices, 0] += peak_amp * rng.choice([-1, 1], size=impulse_count)

        # Random-SNR noise regularization: about 50% of batches
        if rng.random() > 0.5:
            augmented_x = add_white_noise(augmented_x, target_snr=(-8, 8))

        return augmented_x.astype(np.float32), y.astype(np.float32)

    def set_shapes(x, y):
        """Set static shapes explicitly, since tf.numpy_function outputs carry no shape information."""
        x.set_shape([None, sequence_length, 1])
        y.set_shape([None, num_classes])
        return x, y

    # Build the tf.data input pipeline with parallel augmentation and prefetching
    train_ds = tf.data.Dataset.from_tensor_slices((X_train.astype('float32'), y_train_onehot))
    train_ds = train_ds.shuffle(len(X_train)).batch(64)
    train_ds = train_ds.map(
        lambda x, y: tf.numpy_function(augment_data, [x, y], [tf.float32, tf.float32]),
        num_parallel_calls=tf.data.AUTOTUNE
    ).map(set_shapes).prefetch(tf.data.AUTOTUNE)

    # Instantiate the model and configure the optimization objective
    model = DRSN_CW(num_classes=num_classes)
    
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), 
        loss=tf.keras.losses.CategoricalCrossentropy(),
        metrics=['accuracy']
    )

    logging.info("DRSN initialized. Number of classes: %d, input length: %d", num_classes, sequence_length)
    
    # Training callbacks
    callbacks = [
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=7, min_lr=1e-6, verbose=1),
        tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True)
    ]

    # Start training
    model.fit(
        train_ds,
        epochs=100,
        validation_data=(X_val_noisy, y_val_onehot),
        callbacks=callbacks,
        verbose=2
    )

    # Final test under the -8 dB SNR condition
    _, final_accuracy = model.evaluate(X_test_noisy, y_test_onehot, verbose=0)
    print("\n[Report] Diagnostic accuracy under strong noise (-8 dB SNR): {:.2f}%".format(final_accuracy * 100))

# =============================================================================
# Entry point
# =============================================================================

if __name__ == "__main__":
    # Default data directory
    DEFAULT_DATA_DIR = os.path.join(os.getcwd(), 'bearing_data')

    if not os.path.exists(DEFAULT_DATA_DIR):
        logging.info("Default data directory not found: %s", DEFAULT_DATA_DIR)
        user_input_path = input("Enter the directory containing the CWRU .mat files: ").strip()
        if user_input_path:
            DEFAULT_DATA_DIR = user_input_path
        else:
            logging.error("No valid path entered; exiting.")
            sys.exit(1)

    # Run the pipeline
    train_and_test_pipeline(DEFAULT_DATA_DIR, sequence_length=1024)

3. Fault-diagnosis performance under strong noise

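This reproduction uses the Case Western Reserve University (CWRU) bearing dataset with added noise. The experiment covers 10 classes of condition: the normal state plus inner-race, outer-race, and rolling-element (ball) faults, each at three fault sizes (the 007, 014, and 021 file groups in the label map).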
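The individual samples come from cutting each recording into fixed-length windows, as the loader in the script does with a stride equal to the window size. A minimal sketch of that slicing on a toy sequence follows; the helper name `slice_windows` and the toy sizes are assumptions for illustration only.

```python
def slice_windows(signal, window_size, stride):
    """Cut a 1-D sequence into windows; trailing samples that cannot fill a window are dropped."""
    return [
        signal[start:start + window_size]
        for start in range(0, len(signal) - window_size + 1, stride)
    ]


signal = list(range(10))  # toy stand-in for a vibration recording

# Stride equal to the window size: windows do not overlap.
non_overlapping = slice_windows(signal, window_size=4, stride=4)
print(non_overlapping)  # [[0, 1, 2, 3], [4, 5, 6, 7]]

# A smaller stride yields more, overlapping windows.
overlapping = slice_windows(signal, window_size=4, stride=2)
print(len(overlapping))  # 4
```

With overlapping windows, neighbouring samples share data points, so a purely random train/test split could place near-duplicates on both sides. Non-overlapping windows avoid that particular problem.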

[Figure 2]

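To verify the model's robustness, Gaussian white noise was injected into the test data at an SNR of -8 dB. In this run, the validation loss decreased steadily over the 100 training epochs, and under the harsh -8 dB SNR condition the DRSN-CW model still reached a test accuracy of 95.73%.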
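To give a sense of how severe -8 dB is, the sketch below scales Gaussian noise to a target SNR using the relation P_noise = P_signal / 10^(SNR/10) and then measures the SNR that results. It is a standard-library illustration on a synthetic sine wave, not the script's NumPy routine; the function names, the sine test signal, and the seed are assumptions.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Add zero-mean Gaussian noise whose power matches the target SNR in dB."""
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(v * v for v in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
clean = [math.sin(2.0 * math.pi * 8.0 * i / 4096) for i in range(4096)]
noisy = add_noise_at_snr(clean, snr_db=-8.0, rng=rng)

# At -8 dB the noise carries about 6.31 times the power of the signal.
print(round(10.0 ** (8.0 / 10.0), 2))  # 6.31
print(measured_snr_db(clean, noisy))   # close to -8; the exact value depends on the seed
```

In other words, at this setting the fault signature is buried under noise with several times its power.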

[Figure 3]

Paper title: Deep residual shrinkage networks for fault diagnosis

Journal: IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.

DOI: 10.1109/TII.2019.2943898