Casanova Ramon, Saldana Santiago, Chew Emily Y, Danis Ronald P, Greven Craig M, Ambrosius Walter T
Department of Biostatistical Sciences, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.
National Eye Institute, National Institutes of Health [NIH], Bethesda, Maryland, United States of America.
PLoS One. 2014 Jun 18;9(6):e98587. doi: 10.1371/journal.pone.0098587. eCollection 2014.
Diabetic retinopathy (DR) is one of the leading causes of blindness in the United States and worldwide. DR is a silent disease that may go unnoticed until it is too late for effective treatment. Early detection could therefore improve the chances that therapeutic interventions alleviate its effects.
Graded fundus photography and systemic data from 3443 ACCORD-Eye Study participants were used to estimate Random Forest (RF) and logistic regression classifiers. We studied the impact of sample size on classifier performance and the possibility of using RF-generated class conditional probabilities as metrics describing DR risk. RF measures of variable importance were used to detect factors that affect classification performance.
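The two RF outputs mentioned above, class conditional probabilities and variable importance, can be illustrated with a minimal sketch. It is not the study's implementation: it uses only the Python standard library, a bagged ensemble of one-split trees (stumps) as a simplified stand-in for a full Random Forest, and synthetic data in which feature 0 is informative and features 1 and 2 are noise. All function names, the data generator, and the parameter values (`n_trees`, `mtry`) are assumptions made for illustration, and importance is measured here by permutation on a held-out set, which may differ from the criterion the authors used.

```python
import random

def make_data(n, rng):
    """Synthetic stand-in data (not ACCORD-Eye): feature 0 carries signal, 1 and 2 are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(3.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y

def fit_stump(X, y, features):
    """Return the (feature, threshold, left_label, right_label) with the fewest training errors."""
    best_err, best = len(y) + 1, None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            l_lab = int(2 * sum(left) >= len(left))
            r_lab = int(2 * sum(right) >= len(right)) if right else l_lab
            err = sum(yi != l_lab for yi in left) + sum(yi != r_lab for yi in right)
            if err < best_err:
                best_err, best = err, (f, t, l_lab, r_lab)
    return best

def fit_forest(X, y, n_trees, mtry, rng):
    """Each tree sees a bootstrap sample and a random subset of `mtry` features."""
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        features = rng.sample(range(p), mtry)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest

def predict_proba(forest, row):
    """Fraction of trees voting for class 1: the ensemble's estimate of P(class 1 | row)."""
    votes = sum(r_lab if row[f] > t else l_lab for f, t, l_lab, r_lab in forest)
    return votes / len(forest)

def accuracy(forest, X, y):
    hits = sum((predict_proba(forest, row) >= 0.5) == bool(yi) for row, yi in zip(X, y))
    return hits / len(y)

def permutation_importance(forest, X, y, rng):
    """Accuracy drop after shuffling one feature column at a time."""
    base = accuracy(forest, X, y)
    drops = []
    for f in range(len(X[0])):
        col = [row[f] for row in X]
        rng.shuffle(col)
        X_perm = [row[:f] + [v] + row[f + 1:] for row, v in zip(X, col)]
        drops.append(base - accuracy(forest, X_perm, y))
    return drops

rng = random.Random(0)
X_train, y_train = make_data(150, rng)
X_test, y_test = make_data(200, rng)
forest = fit_forest(X_train, y_train, n_trees=30, mtry=2, rng=rng)
risk_scores = [predict_proba(forest, row) for row in X_test]  # per-participant risk metric
test_accuracy = accuracy(forest, X_test, y_test)
importances = permutation_importance(forest, X_test, y_test, rng)
print("held-out accuracy:", test_accuracy)
print("permutation importances:", importances)
```

In this sketch the vote fraction returned by `predict_proba` plays the role of the class conditional probability used as a risk metric, and a large accuracy drop for a shuffled feature marks that feature as important for classification.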
Both graded fundus photography data and systemic data were informative when discriminating participants with DR from those without. RF-based models produced much higher classification accuracy than those based on logistic regression. Combining both types of data did not increase accuracy but did increase statistical discrimination between healthy participants who subsequently did and did not have DR events during four years of follow-up. RF variable importance criteria revealed that microaneurysm counts in both eyes seemed to play the most important role in discrimination among the graded fundus variables, while the number of medicines and diabetes duration were the most relevant among the systemic variables.
We have introduced RF methods to DR classification analyses based on fundus photography data. In addition, we propose an approach to DR risk assessment based on metrics derived from graded fundus photography and systemic data. Our results suggest that RF methods could be a valuable tool to diagnose DR and evaluate its progression.