Zhao Yijun, Healy Brian C, Rotstein Dalia, Guttmann Charles R G, Bakshi Rohit, Weiner Howard L, Brodley Carla E, Chitnis Tanuja
Department of Computer Science, Tufts University, Medford, Massachusetts, United States of America.
Partners MS Center, Brigham and Women's Hospital, Brookline, Massachusetts, United States of America.
PLoS One. 2017 Apr 5;12(4):e0174866. doi: 10.1371/journal.pone.0174866. eCollection 2017.
To explore the value of machine learning methods for predicting multiple sclerosis disease course.
A total of 1693 CLIMB study patients were classified as having an EDSS increase of ≥1.5 (worsening) or not (non-worsening) at up to five years after the baseline visit. Support vector machines (SVM) were used to build the classifier and were compared with logistic regression (LR), using demographic, clinical, and MRI data obtained at years one and two to predict EDSS at five-year follow-up.
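The labeling rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `label_worsening` and the example EDSS values are hypothetical, and the sketch assumes worsening is judged by comparing a single follow-up score with the baseline score, which the abstract does not spell out.

```python
def label_worsening(baseline_edss: float, followup_edss: float,
                    threshold: float = 1.5) -> str:
    """Return "worsening" if EDSS rose by at least `threshold`, else "non-worsening"."""
    increase = followup_edss - baseline_edss
    return "worsening" if increase >= threshold else "non-worsening"


# Hypothetical patients as (baseline EDSS, follow-up EDSS) pairs; not study data.
examples = [(1.0, 2.5), (2.0, 3.0), (3.0, 2.5)]
labels = [label_worsening(b, f) for b, f in examples]
```

Only the first hypothetical patient reaches the 1.5-point increase, so only that patient falls in the worsening class.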
Baseline data alone provided little predictive value. Clinical observation for one year improved overall SVM sensitivity to 62% and specificity to 65% in predicting worsening cases. The addition of one-year MRI data improved sensitivity to 71% and specificity to 68%. Use of non-uniform misclassification costs in the SVM model, weighted towards increased sensitivity, improved predictions (up to 86%). Sensitivity, specificity, and overall accuracy improved minimally with additional follow-up data. Predictions improved within specific groups defined by baseline EDSS. LR performed more poorly than SVM in most cases. Race, family history of MS, and brain parenchymal fraction ranked highly as predictors of the non-worsening group. Brain T2 lesion volume ranked highly as a predictor of the worsening group.
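To make the reported metrics and the idea of non-uniform misclassification costs concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy scores, and the cost of 3 for the worsening class are all assumptions chosen for illustration. A real SVM would also include a regularization term and an optimizer; the sketch only shows how class-specific costs reweight the hinge loss so that errors on worsening cases count for more.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); labels are +1 (worsening) or -1 (non-worsening)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def weighted_hinge_loss(scores, y_true, cost_pos=1.0, cost_neg=1.0):
    """Sum of hinge losses, with a separate cost for each class."""
    total = 0.0
    for s, t in zip(scores, y_true):
        cost = cost_pos if t == 1 else cost_neg
        total += cost * max(0.0, 1.0 - t * s)
    return total


# Toy decision scores from a hypothetical linear classifier (not study data).
y_true = [1, 1, 1, -1, -1, -1]
scores = [2.0, -0.5, 0.5, -2.0, -0.5, 0.5]
y_pred = [1 if s >= 0 else -1 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)

# Uniform costs versus a hypothetical 3x penalty on errors in the worsening class.
uniform_loss = weighted_hinge_loss(scores, y_true)
weighted_loss = weighted_hinge_loss(scores, y_true, cost_pos=3.0)
```

With the higher cost on the worsening class, the same margin violations contribute more to the loss, so a model trained against it is pushed toward catching worsening cases at the expense of some specificity.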
SVM incorporating short-term clinical and brain MRI data, class imbalance corrective measures, and classification costs may be a promising means of predicting MS disease course and of selecting patients suitable for more aggressive treatment regimens.