School of Instrumentation and Optoelectronic Engineering, Precision Opto-Mechatronics Technology Key Laboratory of Education Ministry, Beihang University, Beijing 100191, China.
Department of Neurosurgery, Air Force Medical Center, PLA, Beijing 100142, China.
Anal Methods. 2023 Aug 3;15(30):3661-3674. doi: 10.1039/d3ay00748k.
Raman spectroscopy is a promising diagnostic tool for brain gliomas because it is non-invasive and information-dense. However, identifying patterns in glioma tissue and healthy brain tissue is challenging, and outlier spectra caused by operator error or changes in external conditions can compromise a model's robustness and its generalizability to new data. Because glioma tissue is heterogeneous, data acquired with a portable Raman spectrometer have relatively high within-group variance, and inconsistencies in instrument repeatability and experimental conditions can leave the non-outlier points loosely distributed, complicating outlier detection. Strict outlier criteria may delete non-outlier points and thereby reduce sample utilization. To address these issues, we propose the SPCN outlier detection algorithm, which segments and prunes a competitive network to extract global outlier features, identifies topological errors, and divides initial outlier domains using the α-β region segmentation method. The algorithm also uses a two-stage pruning method based on the characteristics of the manifold map and visualizes the outlier measure as a normalized histogram. Unlike traditional methods, SPCN is label-free and does not require estimating an outlier distance threshold or the data distribution density. We compared the accuracy of six outlier detection algorithms on Raman spectra collected from brain glioma tissue of 113 patients and examined how pattern recognition accuracy changed after the outliers were removed, confirming the precision and robustness of SPCN. This method has the potential to improve the accuracy and reliability of Raman spectroscopy-based glioma diagnosis and can also be applied to outlier detection in other spectra, such as near-infrared and mid-infrared.
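The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of pruning a competitive network to find outliers without labels or a distance threshold. It is not the SPCN algorithm: the α-β region segmentation, the topological-error check, and the two-stage pruning are not reproduced. All function names, the farthest-point initialisation, the `min_hits` pruning rule, and the synthetic spectra are assumptions made for illustration.

```python
import numpy as np

def init_units_farthest_point(X, n_units):
    """Assumed initialisation: start at the sample nearest the median spectrum,
    then repeatedly add the sample farthest from all units chosen so far."""
    first = int(np.argmin(np.linalg.norm(X - np.median(X, axis=0), axis=1)))
    chosen = [first]
    dist = np.linalg.norm(X - X[first], axis=1)
    while len(chosen) < n_units:
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen].copy()

def train_competitive(X, units, epochs=20, lr=0.2, seed=0):
    """Winner-take-all learning: only the nearest unit moves toward each sample."""
    rng = np.random.default_rng(seed)
    units = units.copy()
    for epoch in range(epochs):
        step = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            winner = np.argmin(np.linalg.norm(units - X[i], axis=1))
            units[winner] += step * (X[i] - units[winner])
    return units

def flag_sparse_units(X, units, min_hits=3):
    """Prune units that win fewer than min_hits samples and flag their samples.
    min_hits is a hypothetical parameter, not taken from the paper."""
    dists = np.linalg.norm(X[:, None, :] - units[None, :, :], axis=2)
    winners = dists.argmin(axis=1)
    hits = np.bincount(winners, minlength=len(units))
    support = hits[winners]  # per-sample measure: low support = outlier candidate
    return support < min_hits, support

def normalized_histogram(measure, bins=10):
    """Histogram whose bar heights sum to 1."""
    counts, edges = np.histogram(measure, bins=bins)
    return counts / counts.sum(), edges

# Synthetic stand-in for spectra: one band whose intensity varies by about 10 %,
# plus three spectra carrying a spike artefact in a single channel.
channels = np.arange(50)
base = np.exp(-((channels - 25) / 5.0) ** 2)
inliers = np.linspace(0.9, 1.1, 40)[:, None] * base
outliers = np.tile(base, (3, 1))
outliers[[0, 1, 2], [3, 10, 45]] += 5.0
X = np.vstack([inliers, outliers])

units = train_competitive(X, init_units_farthest_point(X, n_units=6))
flagged, support = flag_sparse_units(X, units)
freq, edges = normalized_histogram(support)
print("flagged sample indices:", np.flatnonzero(flagged))
```

On this toy data the spiked spectra each end up alone on their own unit, so pruning sparsely hit units isolates them. Real Raman spectra would need preprocessing (baseline correction, normalisation) and a more careful choice of network size than this sketch assumes.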