Li Xiaoxiao, Zhou Yuan, Dvornek Nicha C, Gu Yufeng, Ventola Pamela, Duncan James S
Biomedical Engineering, Yale University, New Haven, CT, USA.
Radiology & Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA.
Med Image Comput Comput Assist Interv. 2020;12261:792-801. doi: 10.1007/978-3-030-59710-8_77. Epub 2020 Sep 29.
Complex deep learning models have shown impressive power in analyzing high-dimensional medical image data. To increase trust in applying deep learning models in the medical field, it is essential to understand why a particular prediction was reached. Estimating data feature importance is an important approach to understanding both the model and the underlying properties of the data. Shapley value explanation (SHAP) is a technique for fairly evaluating the input feature importance of a given model. However, existing SHAP-based explanation methods have limitations: 1) computational complexity, which hinders their application to high-dimensional medical image data, and 2) sensitivity to noise, which can lead to serious errors. Therefore, we propose an uncertainty estimation method for the feature importance results calculated by SHAP. We then theoretically justify the method under a Shapley value framework. Finally, we evaluate our methods on MNIST and a public neuroimaging dataset, and show the potential of our method to discover disease-related biomarkers from neuroimaging data.
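As background for the abstract, the Shapley value of a feature is its marginal contribution to a model's output, averaged over all orderings in which features can be added. The sketch below computes exact Shapley values for a tiny additive "model"; it is an illustration of the general definition, not the authors' method, and all names (`shapley_values`, `value_fn`) are hypothetical. The factorial cost of enumerating orderings is exactly the computational-complexity limitation the abstract notes for high-dimensional images.

```python
from itertools import permutations

def shapley_values(value_fn, features):
    """Exact Shapley values: average each feature's marginal
    contribution over every ordering of the features.
    Feasible only for a small number of features (n! orderings)."""
    phi = {f: 0.0 for f in features}
    perms = list(permutations(features))
    for order in perms:
        coalition = set()
        for f in order:
            before = value_fn(coalition)   # value without the feature
            coalition.add(f)
            phi[f] += value_fn(coalition) - before  # marginal gain
    return {f: phi[f] / len(perms) for f in features}

# Toy additive model: the value of a coalition is the sum of fixed
# weights, so each feature's Shapley value equals its own weight.
weights = {"x1": 2.0, "x2": -1.0, "x3": 0.5}
v = lambda S: sum(weights[f] for f in S)
print(shapley_values(v, list(weights)))
# → {'x1': 2.0, 'x2': -1.0, 'x3': 0.5}
```

By the efficiency axiom, the values sum to the model output on the full feature set; practical SHAP methods approximate this average by sampling, which is where the noise sensitivity discussed in the abstract arises.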