Zhou Yuanpin, Wei Jun, Wu Dongmei, Zhang Yaqin
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
Perception Vision Medical Technology Company Ltd., Guangzhou, China.
Front Oncol. 2022 Apr 29;12:868257. doi: 10.3389/fonc.2022.868257. eCollection 2022.
The development of deep learning algorithms for breast cancer screening is limited by the lack of labeled full-field digital mammograms (FFDMs). Because FFDM is a relatively new technique that replaced digitized screen-film mammography (DFM) as the main modality for breast cancer screening only in recent decades, most mammogram datasets are still stored as DFMs. One way to develop FFDM-based deep learning algorithms while leveraging existing labeled DFM datasets is a generative algorithm that synthesizes FFDMs from DFMs. Generating high-resolution FFDMs from DFMs remains challenging because of limited network capacity and GPU memory.
In this study, we developed a deep-learning-based generative algorithm, HRGAN, to generate synthesized FFDMs (SFFDMs) from DFMs. Importantly, the algorithm preserves image resolution and detail when high-resolution DFMs are used as input. The model was trained on both FFDMs and DFMs. First, a sliding window was used to crop the DFMs and FFDMs into patches of 256 × 256 pixels. Second, the patches were divided into three categories (breast, background, and boundary) according to the breast masks. Patches from the DFM and FFDM datasets were then paired as training inputs, with both patches in a pair sampled from the same category in the two image sets. The algorithm used U-Net-like generators and modified discriminators with a two-channel output: one channel distinguishes real FFDMs from SFFDMs, and the other represents a probability map for the breast mask. Last, a study comprising a mass segmentation task and a calcification detection task was designed to evaluate the usefulness of HRGAN.
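The patch preparation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the non-overlapping stride, and the rule used to assign a category (a patch is "breast" if its mask is entirely foreground, "background" if entirely empty, and "boundary" otherwise) are assumptions, since the abstract does not state them.

```python
import random
from collections import defaultdict

import numpy as np

PATCH = 256  # patch side length in pixels, as stated in the abstract


def categorize(mask_patch):
    """Assumed rule: classify a patch by the fraction of breast pixels in its mask."""
    frac = float(mask_patch.mean())
    if frac == 0.0:
        return "background"
    if frac == 1.0:
        return "breast"
    return "boundary"


def crop_patches(image, mask, size=PATCH, stride=PATCH):
    """Slide a window over a 2D image and group the patches by category.

    `mask` is a binary breast mask with the same shape as `image`.
    The stride is an assumption; partial windows at the edges are skipped.
    """
    buckets = defaultdict(list)
    height, width = image.shape
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            window = (slice(top, top + size), slice(left, left + size))
            buckets[categorize(mask[window])].append(image[window])
    return buckets


def sample_pair(dfm_buckets, ffdm_buckets, rng):
    """Draw one DFM patch and one FFDM patch from the same category."""
    shared = sorted(c for c in dfm_buckets if ffdm_buckets.get(c))
    category = rng.choice(shared)
    return (
        category,
        rng.choice(dfm_buckets[category]),
        rng.choice(ffdm_buckets[category]),
    )
```

The pairs are unaligned: the two patches come from different images and share only their category, which is what the abstract describes for the two image sets.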
Two public mammography datasets, CBIS-DDSM and INbreast, were included in our experiments. The CBIS-DDSM dataset includes 753 calcification cases and 891 mass cases with verified pathology information, for a total of 3568 DFMs. The INbreast dataset contains 410 FFDMs with annotations of masses, calcifications, asymmetries, and distortions. A total of 1784 DFMs and 205 FFDMs were randomly selected to form Dataset A. The remaining DFMs from CBIS-DDSM formed Dataset B, and the remaining FFDMs from INbreast formed Dataset C. All DFMs and FFDMs were normalized to 100 × 100 in our experiments. A study with a mass segmentation task and a calcification detection task was performed to evaluate the usefulness of HRGAN.
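The split into Datasets A, B, and C can be illustrated with a short sketch. Only the counts come from the abstract; the function name, the fixed seed, and the choice to split by image identifier are assumptions made for illustration (the abstract does not say whether the split was made per image or per patient).

```python
import random


def split_datasets(dfm_ids, ffdm_ids, n_dfm_a=1784, n_ffdm_a=205, seed=0):
    """Randomly split image identifiers into Datasets A, B, and C.

    A: randomly selected DFMs and FFDMs
    B: remaining DFMs (CBIS-DDSM)
    C: remaining FFDMs (INbreast)
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    dfm = list(dfm_ids)
    ffdm = list(ffdm_ids)
    rng.shuffle(dfm)
    rng.shuffle(ffdm)
    return {
        "A": {"dfm": dfm[:n_dfm_a], "ffdm": ffdm[:n_ffdm_a]},
        "B": dfm[n_dfm_a:],
        "C": ffdm[n_ffdm_a:],
    }


# With the counts reported above, half of each source lands in Dataset A.
splits = split_datasets(range(3568), range(410))
```

With 3568 DFMs and 410 FFDMs as input, Dataset B holds the other 1784 DFMs and Dataset C the other 205 FFDMs.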
The proposed HRGAN can generate high-resolution SFFDMs from DFMs. Extensive experiments showed that the SFFDMs helped improve the performance of deep-learning-based breast cancer screening algorithms on DFMs when the training dataset is small.