Senger Ryan S, Robertson John L
Department of Biological Systems Engineering, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America.
Department of Chemical Engineering, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, United States of America.
PeerJ. 2020 Jan 6;8:e8179. doi: 10.7717/peerj.8179. eCollection 2020.
Existing tools for chemometric analysis of vibrational spectroscopy data have enabled characterization of materials and biologicals by their broad molecular composition. The Rametrix LITE Toolbox v1.0 for MATLAB is one such publicly available tool. It applies discriminant analysis of principal components (DAPC) to spectral data to classify spectra into user-defined groups. However, additional functionality is needed to evaluate how well these models predict when "unknown" samples are introduced. Here, the Rametrix PRO Toolbox v1.0 is introduced to provide this capability.
The Rametrix PRO Toolbox v1.0 was constructed for MATLAB and works with the Rametrix LITE Toolbox v1.0. It performs leave-one-out analysis of chemometric DAPC models and reports predictive capabilities in terms of accuracy, sensitivity (true-positive rate), and specificity (true-negative rate). Rametrix PRO is publicly available through GitHub under a license agreement at: https://github.com/SengerLab/RametrixPROToolbox. Rametrix PRO was used to validate Rametrix LITE models built to detect chronic kidney disease (CKD) from Raman spectra of urine. The dataset included Raman spectra of urine from 20 healthy individuals and 31 patients undergoing peritoneal dialysis treatment for CKD.
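The leave-one-out procedure and the three reported metrics can be illustrated with a minimal Python sketch. This is not the Rametrix PRO code, which is written for MATLAB. The function names, the toy two-feature "spectra", and the nearest-centroid classifier are all assumptions made for illustration, with nearest-centroid standing in for a DAPC model.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT DAPC): pick the class whose mean vector is closest to x."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [row for row, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda lab: dist(centroids[lab], x))


def leave_one_out(X, y, positive, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, and tally the confusion counts."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predicted = predict(train_X, train_y, X[i])
        if y[i] == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / len(X),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),  # true-positive rate
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),  # true-negative rate
    }


# Hypothetical toy data: two well-separated groups of two-feature "spectra".
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]
y = ["healthy", "healthy", "healthy", "CKD", "CKD", "CKD"]

metrics = leave_one_out(X, y, positive="CKD")
print(metrics)
```

Because every sample is predicted by a model that never saw it during training, the resulting metrics estimate performance on "unknown" samples rather than goodness of fit to the training data.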
The number of spectral principal components (PCs) used in building the DAPC model affected the model accuracy, sensitivity, and specificity in leave-one-out analyses. For the dataset in this study, using 35 PCs in the DAPC model resulted in 100% accuracy, sensitivity, and specificity in classifying an unknown Raman spectrum of urine as belonging to a CKD patient or a healthy volunteer. Models built with fewer or more PCs showed inferior performance, which demonstrated the value of Rametrix PRO in evaluating chemometric models constructed with Rametrix LITE.
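For reference, the three metrics follow the standard confusion-matrix definitions, taking CKD as the positive class. The symbols TP, TN, FP, and FN (true/false positives/negatives) are notation introduced here, not taken from the toolbox.

```latex
\begin{align*}
\text{sensitivity} &= \frac{TP}{TP + FN}, &
\text{specificity} &= \frac{TN}{TN + FP}, &
\text{accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}
\end{align*}
```

All three reach 100% only when FP = FN = 0. As an illustration, assuming one spectrum per individual (the abstract gives numbers of individuals, not spectra), this would correspond to 31/31 CKD spectra and 20/20 healthy spectra classified correctly, or 51/51 overall.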