Kurz Alexander, Hauser Katja, Mehrtens Hendrik Alexander, Krieghoff-Henning Eva, Hekler Achim, Kather Jakob Nikolas, Fröhling Stefan, von Kalle Christof, Brinker Titus Josef
Digital Biomarkers for Oncology Group, German Cancer Research Center (DKFZ), Heidelberg, Germany.
Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany.
JMIR Med Inform. 2022 Aug 2;10(8):e36427. doi: 10.2196/36427.
Deep neural networks achieve impressive results across a range of medical image classification tasks. For real-world applications, however, the network's uncertainty needs to be estimated together with its prediction.
In this review, we investigate how uncertainty estimation has been applied to medical image classification. We also examine which metrics are used to describe the effectiveness of the applied uncertainty estimation.
Google Scholar, PubMed, IEEE Xplore, and ScienceDirect were screened for peer-reviewed studies, published between 2016 and 2021, that deal with uncertainty estimation in medical image classification. The search terms "uncertainty," "uncertainty estimation," "network calibration," and "out-of-distribution detection" were used in combination with the terms "medical images," "medical image analysis," and "medical image classification."
A total of 22 papers were chosen for detailed analysis through the systematic review process. This paper provides a table for a systematic comparison of the included works with respect to the applied method for estimating the uncertainty.
The applied methods for estimating uncertainty are diverse, but the sampling-based methods Monte-Carlo Dropout and Deep Ensembles are used most frequently. We conclude that future work could investigate the benefits of uncertainty estimation in collaborative settings of artificial intelligence systems and human experts.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/11936.
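As an illustration of the sampling-based idea behind Monte-Carlo Dropout mentioned in the abstract, the following is a minimal sketch, not the implementation of any reviewed study. It assumes a hypothetical two-layer network with random weights (`w1`, `w2`), keeps dropout active at prediction time, averages the softmax outputs over several stochastic forward passes, and uses the entropy of the averaged prediction as one possible uncertainty score. The function name `mc_dropout_predict` and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mc_dropout_predict(x, w1, w2, n_samples=50, drop_rate=0.5, seed=0):
    """Average class probabilities over stochastic forward passes.

    Hypothetical toy network: one ReLU hidden layer with dropout,
    followed by a linear layer and softmax.
    """
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_samples):
        h = np.maximum(x @ w1, 0.0)               # hidden activations (ReLU)
        mask = rng.random(h.shape) >= drop_rate   # dropout stays ON at test time
        h = h * mask / (1.0 - drop_rate)          # inverted-dropout scaling
        probs.append(softmax(h @ w2))
    probs = np.stack(probs)                       # (n_samples, batch, classes)
    mean_probs = probs.mean(axis=0)               # averaged prediction
    # Predictive entropy of the averaged prediction as an uncertainty score.
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)
    return mean_probs, entropy


# Illustrative inputs: 4 samples, 16 features, 3 classes (all assumed).
data_rng = np.random.default_rng(42)
x = data_rng.normal(size=(4, 16))
w1 = data_rng.normal(size=(16, 32))
w2 = data_rng.normal(size=(32, 3))

mean_probs, entropy = mc_dropout_predict(x, w1, w2)
print(mean_probs.shape, entropy.shape)
```

A Deep Ensemble follows the same averaging pattern, except that the individual predictions come from several independently trained networks rather than from different dropout masks applied to one network.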