Kendrick Sarah K, Zheng Qi, Garbett Nichola C, Brock Guy N
University of Louisville, School of Public Health and Information Sciences, Department of Bioinformatics and Biostatistics, Louisville, KY, United States of America.
University of Louisville, James Graham Brown Cancer Center, Department of Medicine, Louisville, KY, United States of America.
PLoS One. 2017 Nov 9;12(11):e0186232. doi: 10.1371/journal.pone.0186232. eCollection 2017.
Differential scanning calorimetry (DSC) is used to determine thermally induced conformational changes of biomolecules within a blood plasma sample. Recent research has indicated that DSC curves (or thermograms) may have different characteristics depending on disease status and thus may be useful as a monitoring and diagnostic tool for some diseases. Since thermograms are curves measured over a range of temperature values, they can be treated as functional data. In this paper we apply functional data analysis techniques to DSC data from individuals in the Lupus Family Registry and Repository (LFRR). The aim was to assess the effect of lupus disease status, as well as additional covariates, on the thermogram profiles, and to use functional data analysis methods to build models for classifying lupus vs. control patients on the basis of their thermogram curves.
Thermograms were collected for 300 lupus patients and 300 controls without lupus who were matched to the lupus patients on sex, race, and age. First, functional regression with a functional response (the DSC thermogram) and a categorical predictor (disease status) was used to determine how thermogram curve structure varied with disease status and with other covariates, including sex, race, and year of birth. Next, functional logistic regression with disease status as the response and functional principal component analysis (FPCA) scores as the predictors was used to model how well thermogram structure predicts disease status. Prediction accuracy was also calculated for patients with osteoarthritis (OA) or rheumatoid arthritis (RA) but without lupus, to determine the ability of the classifier to differentiate between lupus and other diseases. To evaluate predictions, the data were split 1000 times into separate training (2/3) and test (1/3) sets. Finally, derivatives of the thermogram curves were included in the models to determine whether they aided prediction of disease status.
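The classification step described above (FPCA scores fed to a logistic regression, evaluated over repeated 2/3 vs. 1/3 splits) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes all thermograms are sampled on a common temperature grid, approximates FPCA by a singular value decomposition of the centered, discretized curves (no basis smoothing), refits the FPCA within each training split, and fits the logistic regression by plain gradient ascent. All function names, the number of components, and the optimizer settings are hypothetical choices made for illustration.

```python
import numpy as np

def fpca_fit(train_curves, n_components):
    """Discretized FPCA (assumption: curves share one temperature grid).
    Returns the training mean curve and the leading principal component curves."""
    mean_curve = train_curves.mean(axis=0)
    _, _, vt = np.linalg.svd(train_curves - mean_curve, full_matrices=False)
    return mean_curve, vt[:n_components]

def fpca_scores(curves, mean_curve, components):
    """Project centered curves onto the principal component curves."""
    return (curves - mean_curve) @ components.T

def fit_logistic(X, y, n_iter=500, lr=0.1):
    """Logistic regression with an intercept, fitted by gradient ascent
    on the mean log-likelihood (a simple stand-in for a standard GLM fit)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w += lr * Xb.T @ (y - p) / len(y)
    return w

def predict_labels(X, w):
    """Classify as 1 (case) when the fitted probability exceeds 0.5."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return (Xb @ w > 0).astype(int)

def repeated_split_accuracy(curves, labels, n_splits=1000, n_components=3, seed=0):
    """Mean test-set accuracy over repeated 2/3 training, 1/3 test splits."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    n_train = (2 * n) // 3
    accuracies = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        # Fit FPCA on training curves only, so the test set stays unseen.
        mean_curve, comps = fpca_fit(curves[train], n_components)
        s_train = fpca_scores(curves[train], mean_curve, comps)
        s_test = fpca_scores(curves[test], mean_curve, comps)
        scale = s_train.std(axis=0) + 1e-12  # standardize scores for stable fitting
        w = fit_logistic(s_train / scale, labels[train])
        accuracies.append(np.mean(predict_labels(s_test / scale, w) == labels[test]))
    return float(np.mean(accuracies))
```

Here `curves` would be an array with one row per subject and one column per temperature point, and `labels` an array of 0 (control) and 1 (lupus). Estimating the mean curve and components from the training split alone is a design choice that avoids leaking test information into the features.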
Functional regression with the thermogram as a functional response and disease status as the predictor showed a clear separation in thermogram curve structure between cases and controls. The logistic regression model with FPCA scores as the predictors gave the most accurate results, with a mean correct classification rate of 79.22%, a mean sensitivity of 79.70%, and a mean specificity of 81.48%. The model correctly classified OA and RA patients without lupus as controls at a mean rate of 75.92%, with a mean sensitivity of 79.70% and a mean specificity of 77.6%. Regression models that included FPCA scores for derivative curves did not perform as well, nor did regression models that included covariates.
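For reference, the three performance measures reported above can be computed from true and predicted labels on a single test set as in the following sketch. The function name and the coding of lupus as 1 and control as 0 are assumptions made for illustration. In a repeated-split evaluation, these per-split values would then be averaged over the splits.

```python
def classification_summary(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity for binary labels.
    Returns NaN for a measure whose denominator is zero."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),   # correct classification rate
        "sensitivity": ratio(tp, tp + fn),        # cases correctly called cases
        "specificity": ratio(tn, tn + fp),        # controls correctly called controls
    }
```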
Changes in thermograms observed in the disease state likely reflect covalent modifications of plasma proteins or changes in large protein-protein interaction networks that stabilize plasma proteins against thermal denaturation. By relating functional principal components of the thermograms to disease status, our FPCA model provides results that are more easily interpretable than those of prior studies. Further, the model could potentially be coupled with other biomarkers to improve diagnostic classification for lupus.