Huynh Benjamin Q, Li Hui, Giger Maryellen L
University of Chicago, Department of Radiology, 5841 South Maryland Avenue, Chicago, Illinois 60637, United States.
J Med Imaging (Bellingham). 2016 Jul;3(3):034501. doi: 10.1117/1.JMI.3.3.034501. Epub 2016 Aug 22.
Convolutional neural networks (CNNs) show potential for computer-aided diagnosis (CADx) by learning features directly from the image data instead of using analytically extracted features. However, CNNs are difficult to train from scratch for medical images due to small sample sizes and variations in tumor presentations. Instead, transfer learning can be used to extract tumor information from medical images via CNNs originally pretrained for nonmedical tasks, alleviating the need for large datasets. Our database includes 219 breast lesions (607 full-field digital mammographic images). In the task of distinguishing between benign and malignant breast lesions, we compared support vector machine classifiers based on the CNN-extracted image features with classifiers based on our prior computer-extracted tumor features. Five-fold cross-validation (by lesion) was conducted with the area under the receiver operating characteristic (ROC) curve as the performance metric. Results show that classifiers based on CNN-extracted features (with transfer learning) perform comparably to those using analytically extracted features (area under the ROC curve [Formula: see text]). Further, the performance of ensemble classifiers based on both types was significantly better than that of either classifier type alone ([Formula: see text] versus 0.81, [Formula: see text]). We conclude that transfer learning can improve current CADx methods while also providing standalone classifiers without large datasets, facilitating machine-learning methods in radiomics and precision medicine.
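The evaluation protocol described in the abstract (cross-validation split by lesion, area under the ROC curve as the metric, and an ensemble of two classifiers) can be illustrated with a minimal sketch. This is not the authors' code. The function names `lesion_folds`, `auc`, and `ensemble_scores` are hypothetical, the example data are made up, and averaging the two classifier outputs is an assumption, since the abstract does not say how the ensemble was formed. CNN feature extraction and SVM training are omitted.

```python
import random


def lesion_folds(lesion_ids, n_folds=5, seed=0):
    """Give each image a fold index so all images of one lesion share a fold.

    Splitting by lesion rather than by image keeps views of the same
    lesion from appearing in both the training and the test partition.
    """
    unique = sorted(set(lesion_ids))
    random.Random(seed).shuffle(unique)
    fold_of = {lesion: i % n_folds for i, lesion in enumerate(unique)}
    return [fold_of[lesion] for lesion in lesion_ids]


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Counts the fraction of (malignant, benign) pairs in which the
    malignant case receives the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_scores(scores_a, scores_b):
    """Assumed ensemble rule: average the two classifiers' outputs per case."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


# Made-up example: ten images from six lesions, labels 1 = malignant.
image_lesions = ["A", "A", "B", "C", "C", "C", "D", "E", "F", "F"]
folds = lesion_folds(image_lesions)
for fold in range(5):
    test_idx = [i for i, f in enumerate(folds) if f == fold]
    train_idx = [i for i, f in enumerate(folds) if f != fold]
    # A classifier would be fit on train_idx and scored on test_idx here.
```

In practice a library routine such as a grouped k-fold splitter would replace `lesion_folds`; the hand-written version is used here only to make the by-lesion constraint explicit.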