Vakanski Aleksandar, Xian Min
Department of Nuclear Engineering and Industrial Management, University of Idaho, Idaho Falls, USA.
Department of Computer Science, University of Idaho, Idaho Falls, USA.
IEEE Int Workshop Mach Learn Signal Process. 2021 Oct;2021. doi: 10.1109/mlsp52302.2021.9596501. Epub 2021 Nov 15.
The generalization error of deep learning models for medical image analysis often increases on images collected with different acquisition devices, device settings, or patient populations. A better understanding of how well a model generalizes to new images is crucial for earning clinicians' trust. Although significant effort has recently been directed toward establishing generalization bounds and complexity measures, a significant discrepancy remains between predicted and actual generalization performance. In addition, related large-scale empirical studies have been validated primarily on general-purpose image datasets. This paper presents an empirical study of the correlation between 25 complexity measures and the generalization ability of deep learning classifiers for breast ultrasound images. The results indicate that PAC-Bayes flatness and path norm measures produce the most consistent explanation for the combination of models and data. We also report that a multi-task approach that combines classification and segmentation of breast images is conducive to improved generalization.
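As a rough illustration of what correlating a complexity measure with generalization can look like, the sketch below computes a rank correlation between a measure and the generalization gap across several trained models. The abstract does not say which correlation statistic the study uses; Kendall's tau is assumed here only because it is a common choice for this kind of comparison. The function name `kendall_tau` and all numeric values are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        prod = (x1 - x2) * (y1 - y2)
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
        score += (prod > 0) - (prod < 0)
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return score / n_pairs


# Hypothetical values for five models (made up for illustration only):
# one complexity measure per model, and that model's generalization gap
# (test error minus training error).
complexity = [0.8, 1.1, 1.5, 2.0, 2.4]
gen_gap = [0.02, 0.03, 0.05, 0.04, 0.09]

tau = kendall_tau(complexity, gen_gap)
print(f"Kendall tau = {tau:.2f}")
```

A tau near 1 would mean the measure ranks the models in nearly the same order as their generalization gaps, while a value near 0 would mean the measure carries little ranking information.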