IEEE J Biomed Health Inform. 2014 Sep;18(5):1717-28. doi: 10.1109/JBHI.2013.2294635.
This paper presents a computer-aided screening system (DREAM) that analyzes fundus images with varying illumination and fields of view and generates a severity grade for diabetic retinopathy (DR) using machine learning. Classifiers such as the Gaussian mixture model (GMM), k-nearest neighbor (kNN), support vector machine (SVM), and AdaBoost are analyzed for separating retinopathy lesions from nonlesions. GMM and kNN are found to be the best classifiers for bright and red lesion classification, respectively. A main contribution of this paper is a reduction in the number of features used for lesion classification: feature ranking with AdaBoost selects the top 30 features out of 78. A novel two-step hierarchical classification approach is proposed. In the first step, nonlesions (false positives) are rejected. In the second step, bright lesions are classified as hard exudates or cotton wool spots, and red lesions are classified as hemorrhages or microaneurysms. Because this lesion classification problem involves unbalanced datasets, SVM and combination classifiers derived from SVM using Dempster-Shafer theory are found to incur more classification error than the GMM and kNN classifiers. The DR severity grading system is tested on 1200 images from the publicly available MESSIDOR dataset. For classifying images as with or without DR, the DREAM system achieves 100% sensitivity, 53.16% specificity, and 0.904 AUC, compared to the best reported 96% sensitivity, 51% specificity, and 0.875 AUC. Feature reduction also reduces the average computation time for DR severity grading from 59.54 s to 3.46 s per image.
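The two-step hierarchy described in the abstract can be illustrated for red lesions with a minimal sketch. It assumes a plain kNN with Euclidean distance and majority voting, and it uses made-up two-dimensional features. The function names, the feature values, and the choice of k = 3 are hypothetical and are not taken from the paper, which does not specify its implementation here.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify_red_candidate(step1_train, step2_train, features, k=3):
    """Two-step hierarchy: reject nonlesions first, then name the lesion type."""
    # Step 1: lesion vs. nonlesion (false positive) rejection.
    if knn_predict(step1_train, features, k) == "nonlesion":
        return "nonlesion"
    # Step 2: only candidates that survive step 1 are split into subtypes.
    return knn_predict(step2_train, features, k)


# Toy candidate regions with two hypothetical features; all values are invented.
labelled = [
    ((1.0, 1.0), "microaneurysm"),
    ((1.2, 0.9), "microaneurysm"),
    ((0.9, 1.1), "microaneurysm"),
    ((5.0, 5.0), "hemorrhage"),
    ((5.2, 4.8), "hemorrhage"),
    ((4.9, 5.1), "hemorrhage"),
    ((9.0, 1.0), "nonlesion"),
    ((9.2, 0.8), "nonlesion"),
    ((8.8, 1.2), "nonlesion"),
]

# Step 1 sees a binary problem; step 2 sees only the true lesions.
step1_train = [
    (x, "nonlesion" if label == "nonlesion" else "lesion") for x, label in labelled
]
step2_train = [(x, label) for x, label in labelled if label != "nonlesion"]
```

Calling `classify_red_candidate(step1_train, step2_train, (1.1, 1.0))` runs both steps on one candidate region. The point of the split is that the second classifier never sees nonlesion samples, which is one way to keep the subtype decision from being dominated by the much larger nonlesion class.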