IEEE Trans Neural Syst Rehabil Eng. 2024;32:2096-2105. doi: 10.1109/TNSRE.2024.3394738.
Sleep staging serves as a fundamental assessment for sleep quality measurement and sleep disorder diagnosis. Although current deep learning approaches have successfully integrated multimodal sleep signals and enhanced the accuracy of automatic sleep staging, several challenges remain: 1) optimizing the utilization of multimodal information complementarity, 2) effectively extracting both long- and short-range temporal features of sleep information, and 3) addressing the class imbalance problem in sleep data. To address these challenges, this paper proposes a two-stream encoder-decoder network, named TSEDSleepNet, inspired by the depth-sensitive attention and automatic multimodal fusion (DSA2F) framework. In TSEDSleepNet, a two-stream encoder extracts the multiscale features of electrooculogram (EOG) and electroencephalogram (EEG) signals, and a self-attention mechanism fuses these multiscale features into multimodal saliency features. Subsequently, a coarser-scale construction module (CSCM) extracts and constructs multi-resolution features from the multiscale and saliency features. A Transformer module then captures both long- and short-range temporal features from the multi-resolution features. Finally, the long- and short-range temporal features are restored with low-layer details and mapped to the predicted classification results. Additionally, the Lovász loss function is applied to alleviate the class imbalance problem in sleep datasets. The proposed method was tested on the Sleep-EDF-39 and Sleep-EDF-153 datasets, achieving classification accuracies of 88.9% and 85.2% and macro-F1 scores of 84.8% and 79.7%, respectively, outperforming conventional baseline models. These results highlight the efficacy of the proposed method in fusing multimodal information, and the method has potential for application as an adjunct tool for diagnosing sleep disorders.
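To orient readers on how the described stages fit together, the following is a minimal PyTorch sketch of the pipeline: two convolutional encoder streams for EEG and EOG, self-attention fusion into saliency features, and a Transformer encoder for temporal modeling. All module structures, kernel sizes, and hyperparameters here are illustrative assumptions, not the authors' implementation; in particular, the CSCM and the decoder-side restoration with low-layer details are simplified away in favor of pooling and a linear classification head.

```python
# Minimal sketch of the TSEDSleepNet pipeline (assumed shapes/hyperparameters).
import torch
import torch.nn as nn

class StreamEncoder(nn.Module):
    """One encoder stream: stacked 1-D convolutions over a 30-s signal epoch."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=50, stride=6, padding=25),
            nn.ReLU(),
            nn.MaxPool1d(8),
            nn.Conv1d(channels, channels, kernel_size=8, padding=4),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(64),  # fixed-length multiscale feature map
        )

    def forward(self, x):              # x: (batch, 1, samples)
        return self.net(x)             # (batch, channels, 64)

class TSEDSleepNetSketch(nn.Module):
    def __init__(self, channels=64, n_classes=5):
        super().__init__()
        self.eeg_enc = StreamEncoder(channels)
        self.eog_enc = StreamEncoder(channels)
        # Self-attention fuses the two streams into multimodal saliency features.
        self.fuse = nn.MultiheadAttention(channels, num_heads=4, batch_first=True)
        # Transformer encoder captures long- and short-range temporal context.
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=4,
                                           dim_feedforward=128, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(channels, n_classes)

    def forward(self, eeg, eog):       # each: (batch, 1, samples)
        f_eeg = self.eeg_enc(eeg).transpose(1, 2)  # (batch, 64, channels)
        f_eog = self.eog_enc(eog).transpose(1, 2)
        tokens = torch.cat([f_eeg, f_eog], dim=1)  # concatenated multiscale tokens
        sal, _ = self.fuse(tokens, tokens, tokens) # multimodal saliency features
        out = self.temporal(sal)                   # temporal feature extraction
        return self.head(out.mean(dim=1))          # per-epoch sleep-stage logits

model = TSEDSleepNetSketch()
eeg = torch.randn(2, 1, 3000)  # e.g. a 30-s epoch sampled at 100 Hz
eog = torch.randn(2, 1, 3000)
print(model(eeg, eog).shape)   # torch.Size([2, 5])
```

In a training setup matching the paper, the standard cross-entropy criterion would be replaced or augmented with the Lovász loss to mitigate class imbalance across sleep stages; that loss is not reproduced here.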