Deep Learning Pre-training Strategy for Mammogram Image Classification: an Evaluation Study.

Affiliations

Department of Computer Science, University of Pittsburgh, 3240 Craft Place, Pittsburgh, PA, 15213, USA.

Department of Biomedical Informatics, University of Pittsburgh, 3240 Craft Place, Pittsburgh, PA, 15213, USA.

Publication information

J Digit Imaging. 2020 Oct;33(5):1257-1265. doi: 10.1007/s10278-020-00369-3.

Abstract

In this work, we assess how pre-training strategy affects deep learning performance for the task of distinguishing false-recall from malignant and normal (benign) findings in digital mammography images. A cohort of 1303 breast cancer screening patients (4935 digital mammogram images in total) was retrospectively analyzed as the target dataset for this study. We assessed six convolutional neural network (CNN) model structures on six classification tasks, using four imaging datasets for pre-training (more than 1.4 million images in total, including ImageNet; the medical images differ in scale, modality, organ, and source), to determine how CNN performance varies with training strategy. Representative pre-training strategies included transfer learning with medical and non-medical datasets, layer freezing, varied network structure, and multi-view input for both binary and three-class classification of mammogram images. The area under the receiver operating characteristic curve (AUC) was used as the model performance metric. The best-performing model across all experimental settings was an AlexNet model incrementally pre-trained on ImageNet and a large Breast Density dataset. The AUC for the six classification tasks using this model ranged from 0.68 to 0.77. In the case of distinguishing recalled-benign mammograms from others, four out of five pre-training strategies tested produced significant performance differences from the baseline model. This study suggests that pre-training strategy significantly influences performance, especially when distinguishing recalled-benign from malignant and benign screening patients.
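As an illustration of the AUC metric named in the abstract, the following is a minimal sketch of how a binary AUC can be computed from model scores, and how a three-class problem can be reduced to a binary "one class versus the others" task such as recalled-benign versus the rest. The abstract does not say how the authors computed AUC; the function names, class labels, and scores here are hypothetical and chosen only for the example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary AUC via the pairwise (Mann-Whitney) formulation.

    AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(classes: Sequence[str], target_scores: Sequence[float],
                    target: str) -> float:
    """AUC for separating one class (e.g. recalled-benign) from all others.

    `target_scores` holds the model's score for the target class on each image.
    """
    labels = [1 if c == target else 0 for c in classes]
    return roc_auc(labels, target_scores)


# Hypothetical toy data, not taken from the study.
toy_classes = ["malignant", "negative", "recalled-benign", "recalled-benign", "negative"]
toy_scores = [0.2, 0.1, 0.8, 0.4, 0.4]
toy_auc = one_vs_rest_auc(toy_classes, toy_scores, "recalled-benign")
```

The pairwise form is quadratic in the number of examples, which is fine for a sketch; a rank-based implementation would be the usual choice at the scale of thousands of images.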
