Massachusetts Institute of Technology, G. R. Harrison Spectroscopy Laboratory, Laser Biomedical Research Center, Cambridge, Massachusetts 02139, USA.
J Biomed Opt. 2011 Aug;16(8):087009. doi: 10.1117/1.3611006.
While Raman spectroscopy provides a powerful tool for noninvasive, real-time diagnostics of biological samples, its translation to the clinical setting has been impeded by the lack of robustness of spectroscopic calibration models and by the size and cumbersome nature of conventional laboratory Raman systems. Linear multivariate calibration models employing full spectrum analysis are often misled by spurious correlations, such as system drift and covariations among constituents. In addition, such calibration schemes are prone to overfitting, especially in the presence of external interferences that may create nonlinearities in the spectra-concentration relationship. To address both of these issues, we combine residue error plot-based wavelength selection with nonlinear support vector regression (SVR). Wavelength selection is used to eliminate uninformative regions of the spectrum, while SVR is used to model curved (nonlinear) effects such as those created by tissue turbidity and temperature fluctuations. Using glucose detection in tissue phantoms as a representative example, we show that even a substantial reduction in the number of wavelengths analyzed using SVR leads to calibration models with prediction accuracy equivalent to that of linear full spectrum analysis. Further, with clinical datasets obtained from human subject studies, we demonstrate the prospective applicability of the selected wavelength subsets without sacrificing prediction accuracy, which has extensive implications for calibration maintenance and transfer. Additionally, such wavelength selection could substantially reduce the collection time of serial Raman acquisition systems. Given the reduced footprint of serial Raman systems relative to conventional dispersive Raman spectrometers, we anticipate that incorporating wavelength selection into such hardware designs will enhance the possibility of miniaturized clinical systems for disease diagnosis in the near future.
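The idea of discarding uninformative spectral channels before calibration can be illustrated with a minimal sketch. This is not the residue error plot procedure of the paper, whose details the abstract does not give. As a simplified stand-in, each channel is scored by the residual error of a univariate least-squares fit to the reference concentration, and the lowest-error channels are kept. The function names, the synthetic data, and the choice of `k` are all hypothetical.

```python
from math import sqrt


def channel_residual_error(x, y):
    """RMSE of a univariate least-squares fit y ~ slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # A flat channel cannot explain the concentration; use a zero slope.
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sq = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return sqrt(sq / n)


def select_wavelengths(spectra, concentrations, k):
    """Return (sorted indices of the k lowest-error channels, all errors).

    spectra: list of samples, each a list of intensities per channel.
    concentrations: reference concentration for each sample.
    """
    n_channels = len(spectra[0])
    errors = [
        channel_residual_error([row[j] for row in spectra], concentrations)
        for j in range(n_channels)
    ]
    ranked = sorted(range(n_channels), key=lambda j: errors[j])
    return sorted(ranked[:k]), errors


# Synthetic, noise-free toy data (an assumption for illustration only):
# channels 1 and 3 carry an analyte signal, the others carry interference.
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
drift = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
other = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
spectra = [
    [d, 2.0 * c + 0.1 * d, e, c + 0.2 * e, 5.0]
    for c, d, e in zip(conc, drift, other)
]

selected, errors = select_wavelengths(spectra, conc, k=2)
print("selected channels:", selected)
print("per-channel residual error:", [round(err, 3) for err in errors])
```

In a fuller pipeline, only the selected columns of the spectra would be passed to a nonlinear regressor such as scikit-learn's `sklearn.svm.SVR` with an RBF kernel, and the number of retained channels would be chosen by cross-validation rather than fixed in advance.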