Li Wei Peng, Chuah Joon Huang, Tan Guo Jeng, Liu Chengyu, Ting Hua-Nong
Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia.
Department of Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia.
PeerJ Comput Sci. 2025 Jul 25;11:e3038. doi: 10.7717/peerj-cs.3038. eCollection 2025.
Synchronized electrocardiogram (ECG) and phonocardiogram (PCG) signals provide complementary diagnostic insights crucial for improving the accuracy of cardiovascular disease (CVD) detection. However, existing deep learning methods often utilize single-modal data or employ simplistic early or late fusion strategies, which inadequately capture the complex, hierarchical interdependencies between these modalities, thereby limiting detection performance. This study introduces PACFNet, a novel progressive attention-based cross-modal feature fusion network, for end-to-end CVD detection. PACFNet features a three-branch architecture: two modality-specific encoders for ECG and PCG, and a progressive selective attention-based cross-modal fusion encoder. A key innovation is its four-layer progressive fusion mechanism, which integrates multi-modal information from low-level morphological details to high-level semantic representations. This is achieved by selective attention-based cross-modal fusion (SACMF) modules at each progressive level, employing cascaded spatial and channel attention to dynamically emphasize salient feature contributions across modalities, thus significantly enhancing feature learning. Signals are pre-processed using a beat-to-beat segmentation approach to analyze individual cardiac cycles. Experimental validation on the public PhysioNet 2016 dataset demonstrates PACFNet's state-of-the-art performance, with an accuracy of 97.7%, sensitivity of 98%, specificity of 97.3%, and an F1-score of 99.7%. Notably, PACFNet not only excels in multi-modal settings but also maintains robust diagnostic capabilities even with missing modalities, underscoring its practical effectiveness and reliability. The source code is publicly available on Zenodo (https://zenodo.org/records/15450169).
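To make the fusion idea concrete: the SACMF module described above applies spatial attention followed by channel attention to each modality's features before combining them. The sketch below is a minimal, dependency-free illustration of that cascaded attention pattern on small per-beat feature maps (channels × time positions); it is a conceptual toy, not the paper's implementation, and the function names (`spatial_attention`, `channel_attention`, `sacmf_fuse`) and the elementwise-sum fusion are assumptions for illustration.

```python
import math

def sigmoid(x):
    """Logistic squashing used to turn pooled statistics into attention weights."""
    return 1.0 / (1.0 + math.exp(-x))

def spatial_attention(feat):
    """Weight each time position by a sigmoid of its mean across channels.

    feat: list of C channels, each a list of T feature values.
    """
    num_channels = len(feat)
    num_positions = len(feat[0])
    weights = [
        sigmoid(sum(ch[t] for ch in feat) / num_channels)
        for t in range(num_positions)
    ]
    return [[ch[t] * weights[t] for t in range(num_positions)] for ch in feat]

def channel_attention(feat):
    """Weight each channel by a sigmoid of its mean across time positions."""
    weights = [sigmoid(sum(ch) / len(ch)) for ch in feat]
    return [[v * w for v in ch] for ch, w in zip(feat, weights)]

def sacmf_fuse(ecg_feat, pcg_feat):
    """Cascade spatial then channel attention per modality, then sum elementwise.

    This mirrors the 'cascaded spatial and channel attention' idea at one
    fusion level; PACFNet repeats such fusion at four progressive levels.
    """
    e = channel_attention(spatial_attention(ecg_feat))
    p = channel_attention(spatial_attention(pcg_feat))
    return [[a + b for a, b in zip(ce, cp)] for ce, cp in zip(e, p)]

# Tiny example: one channel, two time positions per modality.
fused = sacmf_fuse([[1.0, 2.0]], [[0.5, 0.5]])
```

In the full network these attention weights would be learned (e.g. via small convolutional or pooling branches) rather than fixed sigmoids of means, and the fused maps would feed the next progressive level, from low-level morphological detail up to high-level semantic features.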