Fraccaro Paolo, Nicolo Massimo, Bonetto Monica, Giacomini Mauro, Weller Peter, Traverso Carlo Enrico, Prosperi Mattia, O'Sullivan Dympna
Centre for Health Informatics, City University London, London, UK.
Centre for Health Informatics, University of Manchester, Manchester, UK.
BMC Ophthalmol. 2015 Jan 27;15:10. doi: 10.1186/1471-2415-15-10.
To investigate machine learning methods, ranging from simpler interpretable techniques to complex (non-linear) "black-box" approaches, for automated diagnosis of Age-related Macular Degeneration (AMD).
Data from healthy subjects and patients diagnosed with AMD or other retinal diseases were collected during routine visits via an Electronic Health Record (EHR) system. Patients' attributes included demographics and, for each eye, presence/absence of major AMD-related clinical signs (soft drusen, retinal pigment epithelium defects/pigment mottling, depigmentation area, subretinal haemorrhage, subretinal fluid, macula thickness, macular scar, subretinal fibrosis). Interpretable ("white-box") techniques, including logistic regression and decision trees, as well as less interpretable ("black-box") techniques, such as support vector machines (SVM), random forests and AdaBoost, were used to develop models (trained and validated on unseen data) to diagnose AMD. The gold standard was confirmed diagnosis of AMD by physicians. Sensitivity, specificity and area under the receiver operating characteristic curve (AUC) were used to assess performance.
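The three performance measures named above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores are all hypothetical, and AUC is computed here through its rank interpretation (the probability that a randomly chosen AMD eye receives a higher score than a randomly chosen non-AMD eye), which is one standard way to obtain it.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, with 1 = AMD."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # AMD correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # AMD missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-AMD correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-AMD wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = physician-confirmed AMD, 0 = no AMD.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # made-up model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)  # 0.75 0.75 0.9375
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the two kinds of measure are reported together.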
The study population included 487 patients (912 eyes). In terms of AUC, random forests, logistic regression and AdaBoost showed a mean performance of 0.92, followed by SVM and decision trees (0.90). All machine learning models identified soft drusen and age as the most discriminating variables in clinicians' decision pathways to diagnose AMD.
Both black-box and white-box methods performed well in identifying diagnoses of AMD and their decision pathways. Machine learning models developed through the proposed approach, relying on clinical signs identified by retinal specialists, could be embedded into EHR systems to provide physicians with real-time (interpretable) support.