1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland.
2 Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland.
Appl Spectrosc. 2017 Oct;71(10):2253-2262. doi: 10.1177/0003702817726766. Epub 2017 Aug 22.
Near-infrared (NIR) spectroscopy has been successful in the nondestructive assessment of biological tissue properties, such as the stiffness of articular cartilage, and has been proposed for use in clinical arthroscopies. NIR spectroscopic data comprise absorbance values from a broad wavelength region, resulting in a large number of contributing factors. This broad spectrum includes information from potentially noisy variables, which may contribute to errors during regression analysis. We hypothesized that partial least squares regression (PLSR) is the optimal multivariate regression technique and that it requires variable selection methods to further improve the performance of NIR spectroscopy-based prediction of cartilage tissue properties, including instantaneous, equilibrium, and dynamic moduli and cartilage thickness. To test this hypothesis, we conducted, for the first time, a comparative analysis of multivariate regression techniques, namely principal component regression (PCR), PLSR, ridge regression, least absolute shrinkage and selection operator (Lasso), and least squares support vector machines (LS-SVM), on NIR spectral data of equine articular cartilage. Additionally, we evaluated the effect of variable selection methods, including Monte Carlo uninformative variable elimination (MC-UVE), competitive adaptive reweighted sampling (CARS), variable combination population analysis (VCPA), backward interval PLS (BiPLS), genetic algorithm (GA), and jackknife, on the performance of the optimal regression technique. PLSR was found to be the optimal regression tool for cartilage NIR data (R² for tissue thickness = 75.6%, R² for dynamic modulus = 64.9%); variable selection methods simplified the prediction models, enabling the use of fewer regression components. However, the improvements in model performance obtained with variable selection methods were not statistically significant. Thus, PLSR is recommended as the multivariate regression tool for predicting articular cartilage properties from NIR spectra.