Guo Xutao, Ye Chenfei, Yang Yanwu, Zhang Li, Liang Li, Lu Shang, Lv Haiyan, Guo Chunjie, Ma Ting
School of Electronics and Information Engineering, Harbin Institute of Technology, Shenzhen, China.
Peng Cheng Laboratory, Shenzhen, China.
Front Neurosci. 2022 Sep 15;16:946343. doi: 10.3389/fnins.2022.946343. eCollection 2022.
Because lesion boundaries are ambiguous and annotations vary between observers, white matter hyperintensity segmentation annotations are inherently noisy and uncertain. At the same time, the high capacity of deep neural networks (DNNs) lets them overfit noisy and uncertain labels, which may lead to biased models that generalize poorly. This challenge has previously been addressed by leveraging multiple annotations per image. However, multiple annotations are often unavailable in real-world scenarios. To mitigate this issue, this paper proposes a supervision augmentation method (SA) and combines it with ensemble learning (SA-EN) to improve the generalization ability of the model. SA obtains diverse supervision information by estimating annotation uncertainty in the real-world setting where each image has only one ambiguous annotation. The different base learners in EN are then trained with this diverse supervision information. Experimental results on two white matter hyperintensity segmentation datasets show that SA-EN achieves the highest accuracy among the state-of-the-art ensemble methods compared. SA-EN is more effective on small datasets, which makes it well suited to medical image segmentation with few annotations. A quantitative study shows the effect of ensemble size and the effectiveness of the ensemble model. Furthermore, SA-EN can capture two types of uncertainty: aleatoric uncertainty, modeled in SA, and epistemic uncertainty, modeled in EN.
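The abstract does not specify how the two uncertainty types are computed. As an illustrative sketch only, and not the authors' implementation, the following shows one common entropy-based decomposition for an ensemble. It assumes each base learner outputs a per-voxel foreground probability, flattened to a plain list. Here the entropy of the averaged prediction is treated as total uncertainty, the average of the per-member entropies as the aleatoric part, and their difference as the epistemic part. The names `binary_entropy` and `decompose_uncertainty` are hypothetical.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli distribution with foreground probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def decompose_uncertainty(member_probs):
    """Split ensemble uncertainty per voxel.

    member_probs: list of M lists, one flattened foreground-probability
    map per base learner (an assumed input format for this sketch).
    Returns one (mean_prob, total, aleatoric, epistemic) tuple per voxel.
    """
    n_members = len(member_probs)
    results = []
    for voxel_probs in zip(*member_probs):
        mean_prob = sum(voxel_probs) / n_members
        # Total uncertainty: entropy of the averaged prediction.
        total = binary_entropy(mean_prob)
        # Aleatoric part: average entropy of the individual members.
        aleatoric = sum(binary_entropy(p) for p in voxel_probs) / n_members
        # Epistemic part: disagreement between members.
        epistemic = total - aleatoric
        results.append((mean_prob, total, aleatoric, epistemic))
    return results


# Voxel 0: both members say 0.5 (ambiguous, but they agree).
# Voxel 1: members say 0.01 and 0.99 (confident, but they disagree).
example = decompose_uncertainty([[0.5, 0.01], [0.5, 0.99]])
```

In this toy example, the first voxel has high aleatoric and near-zero epistemic uncertainty, while the second has low aleatoric and high epistemic uncertainty, even though both have the same averaged prediction.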