Gao Long, Wu Shandong
College of Computer, National University of Defense Technology, Changsha 410073, China; Department of Radiology, School of Medicine, University of Pittsburgh, 4200 Fifth Ave, Pittsburgh, PA 15260, USA.
Department of Radiology, School of Medicine, University of Pittsburgh, 4200 Fifth Ave, Pittsburgh, PA 15260, USA; Department of Biomedical Informatics, University of Pittsburgh, 4200 Fifth Ave, Pittsburgh, PA 15260, USA; Department of Bioengineering, University of Pittsburgh, 4200 Fifth Ave, Pittsburgh, PA 15260, USA; Intelligent Systems Program, University of Pittsburgh, 4200 Fifth Ave, Pittsburgh, PA 15260, USA.
J Biomed Inform. 2020 Jul;107:103442. doi: 10.1016/j.jbi.2020.103442. Epub 2020 May 22.
Deep convolutional neural networks have achieved remarkable performance in a variety of classification tasks. Because deep learning is data-driven, a model's behavior reflects the data used to train it, and dataset quality can substantially influence the model's performance, especially when dealing with complicated clinical images. In this paper, we propose a simple and novel method to investigate and quantify a deep learning model's response to a given sample, which allows us to detect out-of-distribution samples using a newly proposed metric, the Response Score. The key idea is that samples belonging to different classes may have different degrees of influence on a model. We quantify the effect of a single sample on a trained model and use this quantitative measure (the Response Score) to detect out-of-distribution samples. The proposed method has multiple applications, such as (1) recognizing abnormal samples, (2) detecting mixed-domain data, and (3) identifying mislabeled data. We present extensive experiments on these three applications using four biomedical imaging datasets. Experimental results show that our method achieves strong performance and outperforms the compared methods.
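The abstract does not give the formula for the Response Score, so the sketch below should not be read as the authors' method. It only illustrates the general idea of scoring how strongly a single sample would act on a trained model, using a stand-in that is entirely an assumption here: the norm of the per-sample loss gradient for a small logistic regression model fitted on synthetic 2-D data. The function names (`train_logreg`, `proxy_response`), the synthetic data, and the 5% label-flip rate are all hypothetical. The setting loosely mirrors application (3), identifying mislabeled data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, steps=500):
    """Fit logistic regression by full-batch gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def proxy_response(w, X, y):
    """Hypothetical proxy (NOT the paper's Response Score):
    norm of each sample's log-loss gradient with respect to w."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    residual = sigmoid(Xb @ w) - y  # d(loss)/d(logit) per sample
    return np.abs(residual) * np.linalg.norm(Xb, axis=1)

# Synthetic two-class data (assumption: two well-separated Gaussians).
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(-2.0, 1.0, (n, 2)), rng.normal(2.0, 1.0, (n, 2))])
y_true = np.concatenate([np.zeros(n), np.ones(n)])

# Flip the labels of 20 randomly chosen samples to simulate mislabeling.
flipped = rng.choice(2 * n, size=20, replace=False)
y_noisy = y_true.copy()
y_noisy[flipped] = 1.0 - y_noisy[flipped]

# Train on the noisy labels, then score every sample against the trained model.
w = train_logreg(X, y_noisy)
scores = proxy_response(w, X, y_noisy)

is_flipped = np.zeros(2 * n, dtype=bool)
is_flipped[flipped] = True
mean_flipped = scores[is_flipped].mean()
mean_clean = scores[~is_flipped].mean()
print(f"mean proxy score, flipped labels: {mean_flipped:.3f}")
print(f"mean proxy score, clean labels:   {mean_clean:.3f}")
```

In this toy setting, samples whose labels disagree with the learned decision boundary would pull hardest on the parameters, so ranking by the proxy score is one plausible way to surface them. How the actual Response Score is computed for convolutional networks and clinical images is described only in the full paper.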