Verma Tanvi, Jin Liyuan, Zhou Jun, Huang Jia, Tan Mingrui, Choong Benjamin Chen Ming, Tan Ting Fang, Gao Fei, Xu Xinxing, Ting Daniel S, Liu Yong
Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.
Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore, Singapore.
Front Med (Lausanne). 2023 Aug 14;10:1227515. doi: 10.3389/fmed.2023.1227515. eCollection 2023.
Deploying deep learning models for medical image classification poses significant challenges, including gradual performance degradation and limited adaptability to new diseases. Frequent retraining is impractical, and it raises healthcare privacy concerns because prior patient data must be retained. To address these issues, this study investigated privacy-preserving continual learning methods as an alternative.
We evaluated twelve privacy-preserving, non-storage continual learning algorithms, applied to deep learning models, for classifying retinal diseases from public optical coherence tomography (OCT) images in a class-incremental learning scenario. The OCT dataset comprises 108,309 images across four classes: normal (47.21%), drusen (7.96%), choroidal neovascularization (CNV) (34.35%), and diabetic macular edema (DME) (10.48%). The test set contained 250 images per class. For continual training, the first task comprised the CNV and normal classes, the second task the DME class, and the third task the drusen class. All selected algorithms were also evaluated with different training sequence combinations. Performance was measured as the final model's average class accuracy and compared with that of a joint model obtained through retraining and of the original finetune model trained without any continual learning algorithm. Additionally, a publicly available histology-slide dataset for colon cancer detection was used as a proof of concept, and the CIFAR10 dataset was included as a continual learning benchmark.
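The class-incremental setup and the evaluation metric described above can be illustrated with a minimal sketch. The task order is taken from the abstract. The helper names (`classes_seen`, `average_class_accuracy`) are hypothetical, and reading "average class accuracy" as the unweighted mean of per-class accuracies is an assumption, since the abstract does not define the metric formally.

```python
from collections import defaultdict

# Task order as stated in the abstract; everything else here is illustrative.
TASKS = [["CNV", "normal"], ["DME"], ["drusen"]]


def classes_seen(tasks, step):
    """Return every class introduced up to and including task `step` (0-based)."""
    return [c for task in tasks[: step + 1] for c in task]


def average_class_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracies (assumed meaning of the metric)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


if __name__ == "__main__":
    # After the final task the model must distinguish all four classes.
    print(classes_seen(TASKS, 2))

    # Toy balanced test set: a model that predicts only the last-learned class.
    y_true = ["CNV", "normal", "DME", "drusen"] * 250
    y_pred = ["drusen"] * len(y_true)
    print(average_class_accuracy(y_true, y_pred))
```

On a balanced four-class test set, a model that predicts only the most recently learned class scores 0.25 under this metric, which gives a rough sense of what complete forgetting of earlier tasks would look like. With equal test counts per class, the metric coincides with overall accuracy.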
Among the continual learning algorithms, Brain-inspired-replay (BIR) outperformed the others in the continual learning-based classification of retinal diseases from OCT images, achieving an accuracy of 62.00% (95% confidence interval: 59.36-64.64%), with consistent top performance observed in different training sequences. For colon cancer histology classification, Efficient Feature Transformations (EFT) attained the highest accuracy of 66.82% (95% confidence interval: 64.23-69.42%). In comparison, the joint model achieved accuracies of 90.76% and 89.28%, respectively. The finetune model demonstrated catastrophic forgetting in both datasets.
Although the joint retraining model achieved higher accuracy, continual learning holds promise for mitigating catastrophic forgetting and enabling continual model updates while preserving privacy in healthcare deep learning models. It is therefore a promising approach for the long-term clinical deployment of such models.