Fitzpatrick Institute for Photonics, Durham, North Carolina, USA.
Department of Biomedical Engineering, Duke University, Durham, North Carolina, USA.
Appl Spectrosc. 2024 Jan;78(1):84-98. doi: 10.1177/00037028231209053. Epub 2023 Nov 1.
Surface-enhanced Raman spectroscopy (SERS) has wide diagnostic applications because its narrow spectral features allow multiplexed analysis. We previously developed a multiplexed, SERS-based nanosensor for microRNA (miRNA) detection called the inverse molecular sentinel (iMS). Machine learning (ML) algorithms have been increasingly adopted for spectral analysis because they can discover underlying patterns and relationships within large and complex data sets. However, the high dimensionality of SERS data poses a challenge for traditional ML techniques, which can be prone to overfitting and poor generalization. Non-negative matrix factorization (NMF) reduces the dimensionality of SERS data while preserving information content. In this paper, we compared the performance of ML methods, including convolutional neural network (CNN), support vector regression, and extreme gradient boosting, each with and without NMF, for spectral unmixing of four-way multiplexed SERS spectra from iMS assays used for miRNA detection. CNN achieved high accuracy in spectral unmixing. Applying NMF before CNN drastically decreased memory and training demands without sacrificing model performance on SERS spectral unmixing. Additionally, the models were interpreted using gradient class activation maps and partial dependence plots to understand their predictions. These models were then used to analyze clinical SERS data from single-plexed iMS assays in RNA extracted from 17 endoscopic tissue biopsies. CNN and CNN-NMF, trained on multiplexed data, performed most accurately, with root mean square error (RMSE) = 0.101 and 9.68 × 10, respectively. We demonstrated that CNN-based ML shows great promise for spectral unmixing of multiplexed SERS spectra and examined the effect of dimensionality reduction on performance and training speed.
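To make the idea of NMF as a dimensionality-reduction step before regression concrete, the following is a minimal sketch on synthetic data. It is not the authors' pipeline: the four Gaussian "reporter" peaks, the noise level, the helper names (`gaussian_peak`, `nmf`), and the plain least-squares regressor standing in for the CNN, support vector regression, or gradient boosting models are all assumptions made for illustration. The NMF itself uses standard multiplicative updates for the Frobenius loss.

```python
import numpy as np


def gaussian_peak(axis, center, width):
    """Hypothetical stand-in for one narrow Raman reporter peak."""
    return np.exp(-0.5 * ((axis - center) / width) ** 2)


def nmf(X, k, n_iter=1000, H=None, seed=0, eps=1e-9):
    """Approximate X by W @ H with non-negative factors.

    Uses multiplicative updates for the Frobenius loss. If H is supplied,
    it is held fixed and only the activations W are fitted, which is how
    new spectra are projected onto an already learned basis.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    fixed_H = H is not None
    if not fixed_H:
        H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        if not fixed_H:
            H *= (W.T @ X) / (W.T @ W @ H + eps)
    return W, H


# Synthetic four-way multiplexed spectra (assumed, not from the paper).
rng = np.random.default_rng(42)
axis = np.linspace(400.0, 1800.0, 300)            # Raman shift bins
centers = [600.0, 900.0, 1200.0, 1500.0]          # one peak per reporter
S = np.stack([gaussian_peak(axis, c, 15.0) for c in centers])  # (4, 300)
C = rng.random((120, 4))                          # true mixing fractions
X = C @ S + 0.01 * rng.random((120, 300))         # non-negative noise

X_train, X_test = X[:90], X[90:]
C_train, C_test = C[:90], C[90:]

# Learn a 4-component basis, then project both sets onto it so that
# training and test features are computed the same way.
_, H = nmf(X_train, k=4)
W_train, _ = nmf(X_train, k=4, H=H)
W_test, _ = nmf(X_test, k=4, H=H)

# Least-squares map (with intercept) from 4 NMF features to 4 fractions.
A_train = np.hstack([W_train, np.ones((W_train.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A_train, C_train, rcond=None)
A_test = np.hstack([W_test, np.ones((W_test.shape[0], 1))])
C_pred = A_test @ coef

rmse = float(np.sqrt(np.mean((C_pred - C_test) ** 2)))
print(f"features per spectrum: {X.shape[1]} -> {W_train.shape[1]}")
print(f"test RMSE on synthetic mixtures: {rmse:.4f}")
```

In this sketch the regressor sees 4 features per spectrum instead of 300, which is the kind of reduction in input size that lowers memory and training cost for a downstream model. The numbers it prints describe only the synthetic mixtures and say nothing about the reported clinical results.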