School of Information and Communication Engineering, Hainan University, Haikou, 570228, China.
Australian AI Institute, School of Computer Science, FEIT, University of Technology Sydney, Sydney, 2008, NSW, Australia.
Med Biol Eng Comput. 2024 Sep;62(9):2839-2852. doi: 10.1007/s11517-024-03101-3. Epub 2024 May 3.
Retinal optical coherence tomography (OCT) images provide crucial insight into the health of the posterior ocular segment. Automated image analysis methods are therefore needed to give clinicians and researchers quantitative data that supports informed decision-making. Deep learning (DL)-based approaches are now widely used for these analysis tasks and perform remarkably well compared with labor-intensive manual analysis. However, retinal OCT images are often difficult to acquire because of privacy concerns and resource-intensive labeling procedures, which conflicts with the prevailing notion that DL models need large volumes of data to achieve superior performance. Moreover, limited computational resources constrain the progress of high-performance medical artificial intelligence, particularly in less developed regions and countries. This paper introduces a novel ensemble learning mechanism for recognizing retinal diseases under limited resources (e.g., data, computation). The mechanism leverages multiple pre-trained models, transferring and adapting their knowledge to retinal OCT images. This yields a robust model even with limited labeled data and avoids the large number of parameters required when learning from scratch. Comprehensive experiments on real-world datasets demonstrate that the ensemble models built with the proposed method outperform the baseline models under sparse labeled data. In particular, the triple ensemble model achieves an accuracy of 92.06%, which is 8.27%, 7.99%, and 11.14% better than the three baseline models, respectively. In addition, compared with the three baseline models learned from scratch, the triple ensemble model has fewer trainable parameters: only 3.677M, versus 8.013M, 4.302M, and 20.158M for the three baseline models, respectively.
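To make the parameter comparison concrete, the short sketch below computes how much smaller the triple ensemble's trainable-parameter count is relative to each baseline. It uses only the four figures reported in the abstract. The variable and function names are illustrative assumptions, and the baselines are left unnamed because the abstract does not identify them.

```python
# Trainable-parameter counts in millions, as reported in the abstract.
ENSEMBLE_PARAMS_M = 3.677
BASELINE_PARAMS_M = [8.013, 4.302, 20.158]  # three baselines, in the order listed


def relative_reduction(ensemble: float, baseline: float) -> float:
    """Fraction of the baseline's trainable parameters that the ensemble avoids."""
    return 1.0 - ensemble / baseline


# Hypothetical helper output: one reduction ratio per baseline.
reductions = [relative_reduction(ENSEMBLE_PARAMS_M, b) for b in BASELINE_PARAMS_M]

for baseline, reduction in zip(BASELINE_PARAMS_M, reductions):
    print(
        f"baseline {baseline:.3f}M -> ensemble {ENSEMBLE_PARAMS_M:.3f}M: "
        f"{reduction:.1%} fewer trainable parameters"
    )
```

Under these figures, the triple ensemble trains roughly 54%, 15%, and 82% fewer parameters than the three baselines, respectively.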