Ou Chubin, Wei Xifei, An Lin, Qin Jia, Zhu Min, Jin Mei, Kong Xiangbin
Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China.
Guangdong Eye Intelligent Medical Imaging Equipment Engineering Technology Research Center, Foshan, China.
Transl Vis Sci Technol. 2024 Dec 2;13(12):31. doi: 10.1167/tvst.13.12.31.
Accurate diagnosis of retinal disease based on optical coherence tomography (OCT) requires scrutiny of both B-scan and en face images. The aim of this study was to investigate the effectiveness of fusing en face and B-scan images for better diagnostic performance of deep learning models.
A multiview fusion network (MVFN) with a decision fusion module integrating fast-axis B-scans, slow-axis B-scans, and en face information was proposed and compared with five state-of-the-art methods: a model using B-scans, a model using en face images, a model using three-dimensional volumes, and two other relevant methods. All models were evaluated on the OCTA-500 public dataset and a private multicenter dataset of 2330 cases; cases from the first center were used for training, and cases from the second center were used for external validation. Performance was assessed by averaged area under the curve (AUC), accuracy, sensitivity, specificity, and precision.
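The abstract does not specify the exact fusion rule inside the decision fusion module; a common decision-level strategy is to average the per-view class probabilities. The sketch below is an illustrative assumption, not the authors' implementation: three hypothetical branches (fast-axis B-scan, slow-axis B-scan, en face) each emit class logits, which are converted to probabilities and fused by weighted averaging.

```python
import numpy as np

def softmax(logits):
    """Convert a 1-D array of logits to class probabilities."""
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()

def decision_fusion(branch_logits, weights=None):
    """Fuse per-view predictions by (weighted) probability averaging.

    branch_logits: list of 1-D logit arrays, one per view
    (e.g., fast-axis B-scan, slow-axis B-scan, en face).
    Returns the fused probability vector and the predicted class.
    """
    probs = np.stack([softmax(l) for l in branch_logits])
    if weights is None:
        # Default: equal weight per view.
        weights = np.full(len(branch_logits), 1.0 / len(branch_logits))
    fused = np.average(probs, axis=0, weights=weights)
    return fused, int(np.argmax(fused))

# Example: two B-scan views favor class 0, the en face view favors class 1.
fused, cls = decision_fusion([
    np.array([2.0, 0.0]),   # fast-axis B-scan branch
    np.array([1.5, 0.5]),   # slow-axis B-scan branch
    np.array([0.0, 3.0]),   # en face branch
])
```

Unequal `weights` could be learned or validation-tuned; equal weighting is simply the least-assumption default here.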
On the private external test set, our MVFN achieved the highest AUC (0.994), significantly outperforming the other models (P < 0.01). On the OCTA-500 public dataset, the proposed method likewise outperformed the other methods with the highest AUC (0.976), further demonstrating its effectiveness. Typical cases were illustrated with activation heatmaps to show the synergy of combining en face and B-scan images.
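The sensitivity, specificity, and precision reported alongside AUC follow the standard confusion-matrix definitions. A minimal self-contained sketch (not tied to the study's code) for binary labels:

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and precision
    from binary ground-truth and predicted labels (0 or 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }

# Example: 6 cases, 2 errors (one false negative, one false positive).
m = binary_metrics(y_true=[1, 1, 1, 0, 0, 0],
                   y_pred=[1, 1, 0, 0, 0, 1])
```

For the multiclass setting implied by "averaged" metrics, these would typically be computed one-vs-rest per class and then macro-averaged.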
The fusion of en face and B-scan information is an effective strategy for improving the diagnostic accuracy of deep learning models.
Multiview fusion models combining B-scan and en face images demonstrate great potential for improving AI performance in retinal disease diagnosis.