Samala Ravi K, Chan Heang-Ping, Hadjiiski Lubomir, Helvie Mark A, Wei Jun, Cha Kenny
Department of Radiology, University of Michigan, Ann Arbor, Michigan 48109.
Med Phys. 2016 Dec;43(12):6654. doi: 10.1118/1.4967345.
To develop a computer-aided detection (CAD) system for masses in digital breast tomosynthesis (DBT) volumes using a deep convolutional neural network (DCNN) with transfer learning from mammograms.
A data set containing 2282 digitized film and digital mammograms and 324 DBT volumes was collected with IRB approval. The mass of interest on the images was marked by an experienced breast radiologist as the reference standard. The data set was partitioned into a training set (2282 mammograms with 2461 masses and 230 DBT views with 228 masses) and an independent test set (94 DBT views with 89 masses). For DCNN training, the region of interest (ROI) containing the mass (true positive) was extracted from each image. False positive (FP) ROIs were identified at prescreening by the authors' previously developed CAD systems. After data augmentation, a total of 45 072 mammographic ROIs and 37 450 DBT ROIs were obtained. Data normalization and reduction of non-uniformity in the ROIs across the heterogeneous data were achieved using a background correction method applied to each ROI. A DCNN with four convolutional layers and three fully connected (FC) layers was first trained on the mammography data. Jittering and dropout techniques were used to reduce overfitting. After training with the mammographic ROIs, all weights in the first three convolutional layers were frozen, and only the last convolutional layer and the FC layers were randomly reinitialized and trained using the DBT training ROIs.
The authors compared the performance of two CAD systems for mass detection in DBT: one used the DCNN-based approach and the other used their previously developed feature-based approach for FP reduction. The prescreening stage was identical in both systems, passing the same set of mass candidates to the FP reduction stage. For the feature-based CAD system, a 3D clustering and active contour method was used for segmentation; morphological, gray level, and texture features were extracted and merged with a linear discriminant classifier to score the detected masses. For the DCNN-based CAD system, ROIs from five consecutive slices centered at each candidate were passed through the trained DCNN and a mass likelihood score was generated. The performance of the CAD systems was evaluated using free-response ROC (FROC) curves, and the performance difference was analyzed using a non-parametric method.
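The freeze-and-reinitialize step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Layer` class, the layer names, the placeholder weight lists, and the Gaussian initializer are all assumptions made for illustration, and only the layer counts (four convolutional, three FC) and the choice of which layers are frozen come from the abstract.

```python
import random
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    kind: str          # "conv" or "fc"
    weights: list      # placeholder standing in for a real weight tensor
    trainable: bool = True


def init_weights(n, rng):
    """Small random values standing in for a real initializer (assumption)."""
    return [rng.gauss(0.0, 0.01) for _ in range(n)]


def build_dcnn(rng, n_weights=4):
    """Four convolutional layers followed by three FC layers, as in the abstract."""
    spec = [("conv1", "conv"), ("conv2", "conv"), ("conv3", "conv"),
            ("conv4", "conv"), ("fc1", "fc"), ("fc2", "fc"), ("fc3", "fc")]
    return [Layer(name, kind, init_weights(n_weights, rng)) for name, kind in spec]


def prepare_for_dbt(layers, rng, n_frozen_conv=3):
    """Freeze the first conv layers; randomly reinitialize and unfreeze the rest."""
    conv_seen = 0
    for layer in layers:
        if layer.kind == "conv" and conv_seen < n_frozen_conv:
            conv_seen += 1
            layer.trainable = False       # keep mammography-trained weights
        else:
            layer.weights = init_weights(len(layer.weights), rng)
            layer.trainable = True        # to be trained on the DBT ROIs
    return layers


rng = random.Random(0)
dcnn = build_dcnn(rng)        # imagine this has been trained on mammographic ROIs
prepare_for_dbt(dcnn, rng)
for layer in dcnn:
    print(layer.name, "trainable" if layer.trainable else "frozen")
```

In a real deep learning framework the same idea is usually expressed by excluding the frozen layers' parameters from gradient updates and passing only the remaining parameters to the optimizer.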
Before transfer learning, the DCNN trained only on mammograms, which reached an AUC of 0.99 on the mammographic data, classified DBT masses with an AUC of 0.81 in the DBT training set. After transfer learning with DBT, the AUC improved to 0.90. For breast-based CAD detection in the test set, the sensitivities of the feature-based and the DCNN-based CAD systems were 83% and 91%, respectively, at 1 FP/DBT volume. The difference in performance between the two systems was statistically significant (p-value < 0.05).
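To show how an operating point such as "sensitivity at 1 FP/DBT volume" is read off an FROC curve, the sketch below sweeps a decision threshold over scored candidates. The function names and the toy candidate list are hypothetical and are not the study's data. Each true-positive candidate carries an identifier for the unit being counted (a mass here; for breast-based scoring it would identify the breast), and a false positive is marked with `None`.

```python
def froc_points(candidates, n_positives, n_volumes):
    """Return (FPs per volume, sensitivity) pairs for decreasing thresholds.

    candidates: list of (score, unit_id) where unit_id is None for a false
    positive and otherwise identifies the mass (or breast) that was hit.
    """
    ordered = sorted(candidates, key=lambda c: c[0], reverse=True)
    points, detected, n_fp, i = [], set(), 0, 0
    while i < len(ordered):
        threshold = ordered[i][0]
        # Accept every candidate tied at the current threshold.
        while i < len(ordered) and ordered[i][0] == threshold:
            unit_id = ordered[i][1]
            if unit_id is None:
                n_fp += 1
            else:
                detected.add(unit_id)   # multiple hits on one unit count once
            i += 1
        points.append((n_fp / n_volumes, len(detected) / n_positives))
    return points


def sensitivity_at(points, max_fps_per_volume):
    """Highest sensitivity reachable without exceeding the FP rate."""
    eligible = [sens for fps, sens in points if fps <= max_fps_per_volume]
    return max(eligible, default=0.0)


# Hypothetical toy data: 2 volumes, 3 masses, 6 scored candidates.
toy = [(0.9, "m1"), (0.8, None), (0.7, "m2"),
       (0.6, None), (0.5, None), (0.4, "m3")]
curve = froc_points(toy, n_positives=3, n_volumes=2)
print(sensitivity_at(curve, 1.0))
```

In this toy case two of the three masses are found before the second false positive per two volumes is exceeded, so the sensitivity at 1 FP/volume is 2/3.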
The image patterns learned from the mammograms were transferred to mass detection on DBT slices through the DCNN. This study demonstrated that large data sets collected from mammography are useful for developing new CAD systems for DBT, reducing the effort of collecting entirely new large data sets for the new modality.