College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China.
Anal Chem. 2023 Mar 21;95(11):4863-4870. doi: 10.1021/acs.analchem.2c03853. Epub 2023 Mar 12.
Raman spectroscopy is widely used to provide a structural fingerprint for molecular identification. Identifying components from Raman spectra is challenging, especially for mixtures, because of interference from coexisting components, noise, baseline drift, and systematic differences between spectrometers. In this study, a method named DeepRaman is proposed to address these problems by combining the comparison ability of a pseudo-Siamese neural network (pSNN) with the input-shape flexibility of spatial pyramid pooling (SPP). DeepRaman was trained, validated, and tested with 41,564 augmented Raman spectra from two databases (pharmaceutical material and S.T. Japan). It achieves 96.29% accuracy, a 98.40% true positive rate (TPR), and a 94.36% true negative rate (TNR) on the test set. Six further data sets measured on different instruments were used to evaluate the method from different aspects. DeepRaman provides accurate identification results and significantly outperforms the hit quality index (HQI) method and other deep learning models. It also performs well across different levels of spectral complexity and for low-content components. Once the model is established, it can be applied directly to different data sets without retraining or transfer learning. It further obtains promising results on surface-enhanced Raman spectroscopy (SERS) data sets and Raman imaging data sets. In summary, it is an accurate, universal, and ready-to-use method for component identification in various application scenarios.
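Two ideas named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the DeepRaman implementation. The function names `spp_max_1d` and `hqi`, the pyramid levels `(1, 2, 4)`, and the use of max pooling are assumptions made for illustration. The HQI shown is the common squared-cosine form, which may differ from the exact variant used in the paper.

```python
def spp_max_1d(signal, levels=(1, 2, 4)):
    """Pool a 1D sequence of any length into sum(levels) values.

    Illustrative spatial pyramid pooling: at each level the signal is split
    into `bins` roughly equal segments and the maximum of each is kept, so
    the output length does not depend on the input length.
    """
    n = len(signal)
    if n < max(levels):
        raise ValueError("signal is shorter than the finest pyramid level")
    pooled = []
    for bins in levels:
        for i in range(bins):
            start = (i * n) // bins            # floor of the left edge
            end = -((-(i + 1) * n) // bins)    # ceiling of the right edge
            pooled.append(max(signal[start:end]))
    return pooled


def hqi(a, b):
    """Squared cosine similarity between two spectra on the same axis.

    Assumed form: HQI = (a.b)^2 / ((a.a) * (b.b)), in the range [0, 1].
    """
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of points")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a)
    norm_b = sum(y * y for y in b)
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return (dot * dot) / (norm_a * norm_b)


if __name__ == "__main__":
    short_spectrum = [1.0, 3.0, 2.0, 5.0, 4.0]
    long_spectrum = [0.5 * k for k in range(12)]
    # Both spectra map to vectors of the same length despite different sizes.
    print(len(spp_max_1d(short_spectrum)), len(spp_max_1d(long_spectrum)))
    # A spectrum and a scaled copy of it are a perfect HQI match.
    print(hqi([1, 2, 3], [2, 4, 6]))
```

The contrast is the point of the sketch: `hqi` requires both spectra to share the same number of points, whereas pyramid pooling yields a fixed-length vector from inputs of different lengths, which is the kind of input-shape flexibility the abstract attributes to SPP.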