Pan Hongyi, Miao Jingpeng, Yu Jie, Li Jingmin, Wang Xiaobing, Feng Jihong
Department of Biomedical Engineering, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing 100124, People's Republic of China.
Beijing Tongren Eye Center, Beijing Ophthalmology & Visual Sciences Key Lab, Beijing Tongren Hospital, Capital Medical University, Beijing 100730, People's Republic of China.
Biomed Phys Eng Express. 2025 Jul 28;11(4). doi: 10.1088/2057-1976/adeb92.
Retinal diseases such as age-related macular degeneration and diabetic retinopathy can lead to irreversible blindness without timely diagnosis and treatment. Optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA) images provide complementary views of the retina, and integrating the two modalities can improve the accuracy of retinal disease classification. We propose a two-branch multi-modal classification model that automatically diagnoses retinal diseases by integrating OCT and OCTA images, improving both the accuracy and the efficiency of diagnosis. Bright-line cropping removes the uninformative black edge regions while preserving lesion features and reducing the computational load. To address the shortage of data, data augmentation and loose matching are used to increase the amount of data. A two-step training method is used to train the proposed model, mitigating the effect of the limited number of training images. The model is evaluated on an external test set rather than on the training set, which makes the classification results more rigorous. Using intermediate fusion and two-step training, our multi-class classification model achieves an average accuracy, precision, recall, specificity, and F1-score of 0.9667, 0.9418, 0.8569, 0.9422, and 0.8921, respectively. Our multi-modal model outperforms the single-modal model and the early-fusion and late-fusion multi-modal models in accuracy. The model can support doctors with less human error, lower cost, and more uniform and effective mass screening, and it offers a way to improve deep learning performance when training data are relatively scarce and classes are imbalanced.
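The five averaged scores reported above are standard multi-class classification metrics. The abstract does not state how the averaging was done, so the following is only a minimal sketch under an assumption: each metric is computed per class in a one-vs-rest fashion from a confusion matrix and then averaged with equal weight per class (macro averaging). The function names `per_class_metrics` and `macro_average` are hypothetical, and the 3-class confusion matrix is a made-up toy example, not data from the study.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics for each class.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    rows = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp  # another class, predicted as k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        rows.append({
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        })
    return rows


def macro_average(rows):
    """Unweighted mean of each metric over all classes (assumed averaging scheme)."""
    return {name: sum(r[name] for r in rows) / len(rows) for name in rows[0]}


# Toy confusion matrix with hypothetical counts (rows: true class, columns: predicted).
cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
averages = macro_average(per_class_metrics(cm))
for name, value in averages.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced classes, macro averaging gives each class the same weight regardless of its size, which is one reason recall and F1 can sit noticeably below accuracy and specificity, as they do in the figures reported above.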