Zedadra Amina, Salah-Salah Mahmoud Yassine, Zedadra Ouarda, Guerrieri Antonio
LabSTIC Laboratory, University 8 May 1945 Guelma, Algeria, BP 401, Guelma 24000, Algeria.
Medical-Surgical Ophthalmology Office, 284 CNEP Apartments, University Road, Block 37, No. 1, Guelma 24006, Algeria.
Sensors (Basel). 2025 Jul 19;25(14):4492. doi: 10.3390/s25144492.
Ocular diseases can significantly affect vision and overall quality of life, and their diagnosis is often time-consuming and dependent on expert interpretation. Whereas previous computer-aided diagnostic systems have focused primarily on medical imaging, this paper proposes VisionTrack, a multi-modal AI system for predicting multiple retinal conditions: Diabetic Retinopathy (DR), Age-related Macular Degeneration (AMD), Diabetic Macular Edema (DME), drusen, Central Serous Retinopathy (CSR), and Macular Hole (MH), as well as normal cases. The framework integrates a Convolutional Neural Network (CNN) for image-based feature extraction, a Graph Neural Network (GNN) to model complex relationships among clinical risk factors, and a Large Language Model (LLM) to process patient medical reports. By combining these data sources, VisionTrack improves prediction accuracy and offers a more comprehensive assessment of retinal health, with potential for early detection, risk assessment, and personalized ophthalmic care. Experiments were conducted on two publicly available datasets, RetinalOCT and RFMID, which provide OCT images and fundus images, respectively. In multi-label disease prediction on the RetinalOCT dataset, the model achieved an accuracy of 0.980, F1-score of 0.979, recall of 0.978, and precision of 0.979. On the RFMID dataset, it reached an accuracy of 0.989, F1-score of 0.881, recall of 0.866, and precision of 0.897. These results confirm the robustness, reliability, and generalization capability of the proposed approach across different imaging modalities.
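The RFMID figures pair a high accuracy (0.989) with a noticeably lower F1-score (0.881). A minimal Python sketch may help show how this can arise in multi-label prediction. It assumes micro-averaging over all sample/label pairs and a per-label accuracy, neither of which the abstract specifies. The function names and the toy labels are hypothetical and are not taken from the paper or its datasets.

```python
from typing import List, Tuple


def multilabel_metrics(
    y_true: List[List[int]], y_pred: List[List[int]]
) -> Tuple[float, float, float, float]:
    """Return (accuracy, precision, recall, f1), micro-averaged.

    Assumption: every (sample, label) pair counts as one binary decision.
    The abstract does not say which averaging scheme the authors used.
    """
    tp = fp = fn = tn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
            else:
                tn += 1
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = harmonic_f1(precision, recall)
    return accuracy, precision, recall, f1


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 4 samples, 5 disease labels, few positives.
y_true = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0],
]
y_pred = [
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],  # one positive label is missed here
    [0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0],
]

accuracy, precision, recall, f1 = multilabel_metrics(y_true, y_pred)
# accuracy = 19/20 = 0.95, precision = 1.0, recall = 0.75, f1 = 6/7 (about 0.857)

# Sanity check on the reported RFMID values: the harmonic mean of
# precision 0.897 and recall 0.866 rounds to the reported F1 of 0.881.
rfmid_f1 = round(harmonic_f1(0.897, 0.866), 3)
```

In the toy data, a single missed label among 20 decisions leaves accuracy at 0.95 while recall drops to 0.75, because true negatives dominate the accuracy count when positive labels are sparse. Under the stated assumptions, this would account for a gap like the one reported on RFMID.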