Cha Kenny H, Petrick Nicholas, Pezeshk Aria, Graff Christian G, Sharma Diksha, Badal Andreu, Sahiner Berkman
U.S. Food and Drug Administration, Silver Spring, Maryland, United States.
J Med Imaging (Bellingham). 2020 Jan;7(1):012703. doi: 10.1117/1.JMI.7.1.012703. Epub 2019 Nov 22.
We evaluated whether using synthetic mammograms for training data augmentation can reduce the effects of overfitting and increase the performance of a deep learning algorithm for breast mass detection. Synthetic mammograms were generated using procedural analytic breast and breast mass modeling algorithms, followed by simulated x-ray projections of the breast models into mammographic images. Breast phantoms containing masses were modeled across the four BI-RADS breast density categories, and the masses were modeled with different sizes, shapes, and margins. A Monte Carlo-based x-ray transport simulation code, MC-GPU, was used to project the three-dimensional phantoms into realistic synthetic mammograms. In total, 2000 synthetic mammograms with 2522 masses were generated to augment a real data set during training. From the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) data set, we used 1111 mammograms (1198 masses) for training, 120 mammograms (120 masses) for validation, and 361 mammograms (378 masses) for testing. Our deep learning network was Faster R-CNN with the ResNet-101 architecture, pretrained on ImageNet. We compared the detection performance when the network was trained using different percentages of the real CBIS-DDSM training set (100%, 50%, and 25%), and when each of these subsets was augmented with 250, 500, 1000, or 2000 synthetic mammograms. Free-response receiver operating characteristic (FROC) analysis was performed to compare performance with and without the synthetic mammograms. We generally observed an improved test FROC curve when training with the synthetic images compared to training without them, and the amount of improvement depended on the number of real and synthetic images used in training. Our study shows that enlarging the training data with synthetic samples can increase the performance of deep learning systems.
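For readers unfamiliar with FROC analysis, the following is a minimal sketch of how points on an FROC curve (sensitivity versus false positives per image) can be computed from scored detections. It is not the authors' evaluation code. It assumes each detection has already been labeled as hitting a specific true mass or as a false positive; the abstract does not state the hit criterion, and the function name `froc_points` and the toy data are hypothetical.

```python
from typing import List, Optional, Tuple

# One detection: (confidence score, id of the true mass it hits, or None for a false positive).
# How a "hit" is decided (e.g. an overlap criterion) is an assumption left outside this sketch.
Detection = Tuple[float, Optional[str]]


def froc_points(
    detections: List[Detection], n_masses: int, n_images: int
) -> List[Tuple[float, float, float]]:
    """Return (threshold, false positives per image, sensitivity) for each distinct score."""
    points = []
    # Sweep the decision threshold from the highest score to the lowest.
    for threshold in sorted({score for score, _ in detections}, reverse=True):
        kept = [(s, m) for s, m in detections if s >= threshold]
        # A mass counts once, even if several kept detections hit it.
        hit_masses = {m for _, m in kept if m is not None}
        false_positives = sum(1 for _, m in kept if m is None)
        points.append(
            (threshold, false_positives / n_images, len(hit_masses) / n_masses)
        )
    return points


# Toy example (made-up numbers, not data from the study).
toy = [(0.9, "m1"), (0.8, None), (0.7, "m2"), (0.6, "m1"), (0.4, None)]
for thr, fppi, sens in froc_points(toy, n_masses=3, n_images=2):
    print(f"threshold={thr:.1f}  FPs/image={fppi:.2f}  sensitivity={sens:.2f}")
```

Plotting sensitivity against false positives per image for two training configurations (for example, with and without synthetic mammograms) gives the kind of curve comparison described above.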