Assessing the Trustworthiness of Saliency Maps for Localizing Abnormalities in Medical Imaging.

Author information

Arun Nishanth, Gaw Nathan, Singh Praveer, Chang Ken, Aggarwal Mehak, Chen Bryan, Hoebel Katharina, Gupta Sharut, Patel Jay, Gidwani Mishka, Adebayo Julius, Li Matthew D, Kalpathy-Cramer Jayashree

Affiliations

Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, 149 13th St, Boston, MA 02129 (N.A., P.S., K.C., M.A., B.C., K.H., S.G., J.P., M.G., M.D.L., J.K.C.); Department of Computer Science, Shiv Nadar University, Greater Noida, India (N.A.); Department of Operational Sciences, Graduate School of Engineering and Management, Air Force Institute of Technology, Wright-Patterson AFB, Dayton, Ohio (N.G.); and Massachusetts Institute of Technology, Cambridge, Mass (K.C., B.C., K.H., J.P., J.A.).

Publication information

Radiol Artif Intell. 2021 Oct 6;3(6):e200267. doi: 10.1148/ryai.2021200267. eCollection 2021 Nov.

DOI:10.1148/ryai.2021200267

PMID:34870212

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8637231/

Abstract

PURPOSE

To evaluate the trustworthiness of saliency maps for abnormality localization in medical imaging.

MATERIALS AND METHODS

Using two large publicly available radiology datasets (the Society for Imaging Informatics in Medicine-American College of Radiology Pneumothorax Segmentation dataset and the Radiological Society of North America Pneumonia Detection Challenge dataset), the performance of eight commonly used saliency map techniques was quantified with regard to localization utility (segmentation and detection), sensitivity to model weight randomization, repeatability, and reproducibility. Their performance was compared with that of baseline methods and localization network architectures, using area under the precision-recall curve (AUPRC) and structural similarity index measure (SSIM) as metrics.
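As an illustration of how localization utility can be scored with AUPRC, the sketch below ranks pixels by saliency and computes step-wise average precision against a binary ground-truth mask. The abstract does not say how the AUPRC was computed, so the pixel-wise formulation, the function name `average_precision`, and the toy 2x2 arrays are assumptions made for illustration only.

```python
def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve (average precision).

    scores: per-pixel saliency values (higher means more salient).
    labels: per-pixel ground truth, 1 inside the abnormality and 0 outside.
    Assumption: ties in scores are broken by pixel order, which can differ
    slightly from library implementations that group tied thresholds.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("ground-truth mask contains no positive pixels")
    # Rank pixels from most to least salient.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Hypothetical 2x2 saliency map and ground-truth mask.
saliency = [[0.9, 0.8],
            [0.1, 0.7]]
mask = [[1, 0],
        [0, 1]]

flat_scores = [v for row in saliency for v in row]
flat_labels = [v for row in mask for v in row]
auprc = average_precision(flat_scores, flat_labels)
print(round(auprc, 3))  # 0.833
```

A map that ranks every abnormal pixel above every normal pixel scores 1.0 under this measure, and the score falls as salient pixels land outside the mask.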

RESULTS

All eight saliency map techniques failed at least one of the criteria and were inferior in performance compared with localization networks. For pneumothorax segmentation, the AUPRC ranged from 0.024 to 0.224, while a U-Net achieved a significantly superior AUPRC of 0.404 (P < .005). For pneumonia detection, the AUPRC ranged from 0.160 to 0.519, while a RetinaNet achieved a significantly superior AUPRC of 0.596 (P < .005). Five and two saliency methods (of eight) failed the model randomization test on the segmentation and detection datasets, respectively, suggesting that these methods are not sensitive to changes in model parameters. The repeatability and reproducibility of the majority of the saliency methods were worse than those of localization networks for both the segmentation and detection datasets.
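The idea behind the model randomization test can be sketched as follows: compute a saliency map from the trained model, compute another after the model weights are randomized, and measure how similar the two maps are. The sketch assumes SSIM is the similarity measure and uses a simplified global (single-window) SSIM, whereas SSIM is usually computed over local windows and averaged. The function name `global_ssim` and the toy pixel values are hypothetical, and no pass/fail threshold from the study is implied.

```python
def global_ssim(x, y, data_range=1.0):
    """Simplified single-window SSIM between two flattened maps.

    Assumption: whole-image statistics replace the usual sliding window.
    The constants 0.01 and 0.03 are the conventional SSIM stabilizers.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("maps must be non-empty and the same size")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical flattened saliency maps scaled to [0, 1].
trained_map = [0.0, 0.2, 0.9, 1.0]
unchanged_map = [0.0, 0.2, 0.9, 1.0]   # map that ignores the randomized weights
changed_map = [1.0, 0.9, 0.2, 0.0]     # map that responds to the randomization

ssim_unchanged = global_ssim(trained_map, unchanged_map)
ssim_changed = global_ssim(trained_map, changed_map)
```

Under this reading, a similarity that stays near 1 after randomization indicates a map that does not depend on the learned parameters, which is the behavior the abstract describes as failing the test.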

CONCLUSION

The use of saliency maps in the high-risk domain of medical imaging warrants additional scrutiny, and we recommend that detection or segmentation models be used if localization is the desired output of the network.

Keywords: Technology Assessment, Technical Aspects, Feature Detection, Convolutional Neural Network (CNN)

Supplemental material is available for this article.

© RSNA, 2021.
