Department of Physics, The University of British Columbia-Okanagan, Kelowna, BC, Canada.
Southern Medical Program, Faculty of Medicine, The University of British Columbia-Okanagan, Kelowna, BC, Canada.
Appl Spectrosc. 2023 Jul;77(7):698-709. doi: 10.1177/00037028231169971. Epub 2023 Apr 25.
Raman spectroscopy is a useful tool for obtaining biochemical information from biological samples. However, interpreting Raman spectroscopy data to draw meaningful conclusions about the biochemical makeup of cells and tissues is often difficult, and can be misleading if care is not taken in the deconstruction of the spectral data. Our group has previously demonstrated a group- and basis-restricted non-negative matrix factorization (GBR-NMF) framework as an alternative to more widely used dimensionality reduction techniques, such as principal component analysis (PCA), for deconstructing Raman spectroscopy data in radiation response monitoring of both cells and tissues. While this method provides better biological interpretability of the Raman spectroscopy data, several important factors must be considered in order to build the most robust GBR-NMF model. Here we evaluate and compare the accuracy of a GBR-NMF model in the reconstruction of three mixture solutions of known concentrations. The factors assessed include the effect of solid versus solution bases spectra, the number of unconstrained components used in the model, the tolerance to different signal-to-noise thresholds, and how different groups of biochemicals compare to each other. The robustness of the model was assessed by how well the relative concentration of each individual biochemical in the solution mixture is reflected in the GBR-NMF scores obtained. We also evaluated how well the model can reconstruct the original data, both with and without the inclusion of an unconstrained component. Overall, we found that solid bases spectra were generally comparable to solution bases spectra in the GBR-NMF model for all groups of biochemicals. The model was found to be relatively tolerant of high levels of noise in the mixture solutions when using solid bases spectra. Additionally, the inclusion of an unconstrained component did not have a significant effect on the deconstruction, provided that all biochemicals in the mixture were included as bases chemicals in the model. We also report that some groups of biochemicals achieve a more accurate deconstruction using GBR-NMF than others, likely due to similarity in the individual bases spectra.
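The basis-restricted idea described above can be illustrated with a minimal sketch. This is not the authors' GBR-NMF implementation: it assumes standard multiplicative updates for a least-squares objective, it does not model the group restriction, and the function name `basis_restricted_nmf`, its parameters, and the synthetic Gaussian "spectra" are all hypothetical. Known bases spectra are held fixed as rows of `H`, only the optional unconstrained rows are learned, and the scores `W` are then compared with the known mixture concentrations.

```python
import numpy as np

def basis_restricted_nmf(X, fixed_bases, n_free=0, n_iter=500, eps=1e-12, seed=0):
    """Factor X ~= W @ H with the first rows of H fixed to known bases spectra.

    X           : (n_samples, n_channels) non-negative spectra
    fixed_bases : (k_fixed, n_channels) non-negative reference spectra (never updated)
    n_free      : number of unconstrained components learned from the data
    """
    rng = np.random.default_rng(seed)
    fixed_bases = np.asarray(fixed_bases, dtype=float)
    n_samples, n_channels = X.shape
    k_fixed = fixed_bases.shape[0]

    # Stack the fixed bases on top of randomly initialised unconstrained rows.
    H = np.vstack([fixed_bases, rng.random((n_free, n_channels))])
    W = rng.random((n_samples, k_fixed + n_free))

    for _ in range(n_iter):
        # Multiplicative update of the scores for every component.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        if n_free:
            # Multiplicative update of the unconstrained rows of H only.
            num = W.T @ X
            den = W.T @ W @ H + eps
            H[k_fixed:] *= num[k_fixed:] / den[k_fixed:]
    return W, H

# Synthetic stand-in for a mixture experiment (illustrative data only).
channels = np.arange(100)

def peak(center, width=5.0):
    return np.exp(-0.5 * ((channels - center) / width) ** 2)

bases = np.vstack([peak(20), peak(50), peak(80)])   # three "biochemicals"
true_conc = np.array([[0.2, 0.3, 0.5],
                      [0.6, 0.3, 0.1],
                      [0.1, 0.8, 0.1]])             # three mixtures
X = true_conc @ bases

# Without an unconstrained component: scores should track concentrations.
W, H = basis_restricted_nmf(X, bases, n_free=0)
relative_scores = W / W.sum(axis=1, keepdims=True)

# With one unconstrained component: the fixed bases must stay untouched.
W_free, H_free = basis_restricted_nmf(X, bases, n_free=1)
```

In this toy setting the bases barely overlap, so the scores recover the concentrations easily. Bases with strongly overlapping peaks would be expected to be harder to separate, which is in line with the abstract's remark about similarity between individual bases spectra.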