Xu Han, Lv Ruichan
State Key Laboratory of Electromechanical Integrated Manufacturing of High-performance Electronic Equipment, School of Electro-Mechanical Engineering, Xidian University, Xi'an, Shaanxi 710071, China.
Spectrochim Acta A Mol Biomol Spectrosc. 2025 Jul 5;335:125997. doi: 10.1016/j.saa.2025.125997. Epub 2025 Mar 6.
Lung cancer is a malignant tumor that poses a serious threat to human health. Existing lung cancer diagnostic techniques face the challenges of high cost and slow diagnosis, so early, rapid diagnosis and treatment are essential to improving outcomes. In this study, a deep learning-based multi-modal spectral information fusion (MSIF) network is proposed for lung adenocarcinoma cell detection. First, multi-modal data comprising Fourier transform infrared spectra, UV-vis absorbance spectra, and fluorescence spectra of normal and patient cells were collected. The one-dimensional spectral (text) data were then processed efficiently by a one-dimensional convolutional neural network, while a hybrid ResNet-Transformer model mined both the global and local features of the spectral images. An adaptive depth-wise convolution (ADConv) is introduced for feature extraction, overcoming limitations of conventional convolution. To enable feature learning across modalities, a cross-modal interaction fusion (CMIF) module is designed: it fuses the extracted spectral image and text features through multi-faceted interaction, fully exploiting the multi-modal features via feature sharing. The method demonstrated excellent performance on the Fourier transform infrared, UV-vis absorbance, and fluorescence spectral test sets, achieving accuracies of 95.83%, 97.92%, and 100%, respectively. In addition, experiments validated the superiority of multi-modal spectral data and the robustness and generalization capability of the model. This study not only provides strong technical support for the early diagnosis of lung cancer, but also opens a new chapter for the application of multi-modal data fusion in spectroscopy.
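The abstract does not give the internals of the CMIF module; the following is only a minimal NumPy sketch of the general cross-modal interaction idea it describes (each modality's features attend to the other's before fusion). All function names, dimensions, and the attention-style weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_fusion(img_feat, txt_feat):
    """Toy cross-modal interaction fusion (hypothetical stand-in for CMIF).

    img_feat: (n_img, d) features from an image branch (e.g. ResNet-Transformer)
    txt_feat: (n_txt, d) features from a 1D-CNN branch over spectral curves
    Each modality attends to the other; the enriched streams are pooled
    and concatenated into a single fused representation.
    """
    d = img_feat.shape[1]
    attn_i2t = softmax(img_feat @ txt_feat.T / np.sqrt(d))  # (n_img, n_txt)
    attn_t2i = softmax(txt_feat @ img_feat.T / np.sqrt(d))  # (n_txt, n_img)
    img_enriched = img_feat + attn_i2t @ txt_feat  # image tokens + text context
    txt_enriched = txt_feat + attn_t2i @ img_feat  # text tokens + image context
    # mean-pool each stream, then concatenate -> shared multi-modal feature
    return np.concatenate([img_enriched.mean(axis=0), txt_enriched.mean(axis=0)])

rng = np.random.default_rng(0)
fused = cross_modal_fusion(rng.normal(size=(16, 32)), rng.normal(size=(8, 32)))
print(fused.shape)  # (64,)
```

In a real model the fused vector would feed a classification head distinguishing normal from patient cells; here the point is only that both modalities contribute to every fused dimension through the bidirectional attention weights.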