Liu Yufang, Yang Yanjun, Lu Haoran, Cui Jiaheng, Chen Xianyan, Ma Ping, Zhong Wenxuan, Zhao Yiping
Department of Statistics, Franklin College of Arts and Sciences, University of Georgia, Athens, Georgia 30602, United States.
Department of Physics and Astronomy, Franklin College of Arts and Sciences, University of Georgia, Athens, Georgia 30602, United States.
ACS Sens. 2025 Jun 27;10(6):3941-3952. doi: 10.1021/acssensors.4c03397. Epub 2025 May 18.
Surface-enhanced Raman spectroscopy (SERS) is a transformative tool for infectious disease diagnostics, offering rapid and sensitive species identification. However, background spectra in biological samples complicate analyte peak detection, increase the limit of detection, and hinder data augmentation. To address these challenges, we developed a deep learning framework utilizing dual neural networks to extract true virus SERS spectra and estimate concentration coefficients in water for 12 different respiratory viruses. The extracted spectra showed a high similarity to those obtained at the highest viral concentration, validating their accuracy. Using these spectra and the derived concentration coefficients, we augmented spectral data sets across varying virus concentrations in water. XGBoost models trained on these augmented data sets achieved overall classification and concentration prediction accuracy of 92.3% with a coefficient of determination (R²) > 0.95. Additionally, the extracted spectra and coefficients were used to augment data sets in saliva backgrounds. When tested against real virus-in-saliva spectra, the augmented spectra-trained XGBoost models achieved 91.9% accuracy in classification and concentration prediction with R² > 0.9, demonstrating the robustness of the approach. By delivering clean and uncontaminated spectra, this methodology can significantly improve species identification, differentiation, and quantification and advance SERS-based detection and diagnostics.
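The augmentation step described in the abstract (combining an extracted virus spectrum, a concentration coefficient, and a background) can be illustrated with a minimal sketch. The abstract does not state the mixing model, so the linear form below (coefficient × virus spectrum + background + Gaussian noise) is an assumption, and the function name `augment_spectrum`, the parameter `noise_sd`, and all numeric values are hypothetical placeholders rather than the authors' implementation or data.

```python
import random


def augment_spectrum(virus_spectrum, background, coefficient, noise_sd=0.0, rng=None):
    """Synthesize one spectrum under an ASSUMED linear mixing model:
    coefficient * virus_spectrum + background + Gaussian noise."""
    if len(virus_spectrum) != len(background):
        raise ValueError("spectra must share the same wavenumber grid")
    if rng is None:
        rng = random.Random()
    return [
        coefficient * v + b + rng.gauss(0.0, noise_sd)
        for v, b in zip(virus_spectrum, background)
    ]


# Toy values for illustration only (not measured data).
virus = [0.0, 1.0, 0.2, 0.0]        # hypothetical extracted virus spectrum
saliva = [0.5, 0.5, 0.6, 0.4]       # hypothetical background spectrum
coefficients = [0.1, 0.5, 1.0]      # hypothetical concentration coefficients

rng = random.Random(0)              # seeded for reproducibility
augmented = [
    augment_spectrum(virus, saliva, c, noise_sd=0.01, rng=rng)
    for c in coefficients
]
```

Each entry of `augmented` is one synthetic spectrum at a given coefficient; in a workflow like the one summarized above, such spectra would be paired with their virus label and concentration and passed to a classifier or regressor.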