School of Physics and Astronomy, University of Exeter, Exeter EX4 4QL, UK.
Central Laser Facility, Research Complex at Harwell, STFC Rutherford Appleton Laboratory, Harwell Oxford, OX11 0QX, UK.
Analyst. 2023 Dec 18;149(1):205-211. doi: 10.1039/d3an00820g.
There is increasing interest in the application of Raman spectroscopy in a medical setting, ranging from supporting real-time clinical decisions on surgical margins to assisting pathologists with disease classification. However, a number of barriers to adoption in the medical setting remain, owing to the increased complexity of probing highly heterogeneous, dynamic biological materials. This inherent challenge can also limit the deployment of higher-level analytical approaches such as artificial intelligence (AI), including convolutional neural networks (CNNs), because complex clinical samples lack the ground truth required for training. Principal component analysis (PCA) is an unsupervised data reduction approach (an orthogonal linear transformation) that has been used extensively in spectroscopy for more than 30 years because it simplifies the analysis of complex spectroscopic data. However, because PCA is unsupervised, features will inherently appear mixed and their rank may vary between experiments. Here we propose Guided PCA (GPCA), a simple approach in which a reference (guiding) spectrum is included in the data set so that PCA is guided by spectral data and a key target moiety appears at a consistent rank. This simplifies analysis, increases the robustness of PCA, improves quantification and the limits of detection, and decreases the root-mean-square error (RMSE).
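The idea of guiding PCA by adding a reference spectrum to the data set can be illustrated with a minimal sketch on synthetic data. This is not the authors' implementation: the abstract does not state how the reference is scaled or how many copies are added, so the `weight` parameter, the helper names (`pca_loadings`, `guided_pca_loadings`), the Gaussian band shapes and all numerical values below are assumptions made purely for illustration.

```python
import numpy as np


def gaussian(x, center, width):
    """Gaussian band used as a stand-in for a spectral feature (illustrative)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def pca_loadings(data, n_components=3):
    """Mean-centre the spectra (one per row) and return the leading loadings via SVD."""
    centred = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:n_components]


def guided_pca_loadings(data, reference, weight, n_components=3):
    """Append a scaled reference spectrum as an extra row, then run ordinary PCA.

    `weight` is an assumed knob: a larger value gives the reference more
    variance and therefore pushes the matching component to a higher rank.
    """
    augmented = np.vstack([data, weight * reference])
    return pca_loadings(augmented, n_components)


def abs_cosine(a, b):
    """Absolute cosine similarity; the sign of a principal component is arbitrary."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


# Synthetic data: a weak, narrow target band on a strong, broad, variable background.
rng = np.random.default_rng(0)
x = np.linspace(400, 1800, 300)            # hypothetical wavenumber axis
target = gaussian(x, 1000, 10)             # hypothetical target moiety
background = gaussian(x, 1400, 80)         # hypothetical matrix signal

n_spectra = 50
target_conc = rng.uniform(0.0, 0.1, n_spectra)
background_amp = rng.uniform(0.5, 1.5, n_spectra)
spectra = (
    np.outer(target_conc, target)
    + np.outer(background_amp, background)
    + rng.normal(0.0, 0.01, (n_spectra, x.size))
)

# Ordinary PCA: the first component follows the dominant background variation.
plain_similarity = abs_cosine(pca_loadings(spectra)[0], target)

# Guided PCA: the heavily weighted reference makes the target the first component.
guided_similarity = abs_cosine(
    guided_pca_loadings(spectra, target, weight=50.0)[0], target
)

print(f"PC1 vs target, plain PCA:  {plain_similarity:.3f}")
print(f"PC1 vs target, guided PCA: {guided_similarity:.3f}")
```

In this toy setting the target band is a secondary source of variance under ordinary PCA, whereas appending the weighted reference fixes it as the first component. How the reference should be weighted for real Raman data is a choice this sketch does not settle.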