Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
Med Phys. 2022 Dec;49(12):7596-7608. doi: 10.1002/mp.15883. Epub 2022 Aug 19.
Because digital breast tomosynthesis (DBT) produces volumetric images, its reading times are longer than those for 2D mammograms. A robust computer-aided diagnosis system for DBT could help radiologists reduce their workload and reading times.
The purpose of this study was to develop algorithms for detecting biopsy-proven breast lesions on DBT using multi-depth level convolutional models and leveraging non-biopsied samples. Because biopsied positive samples in a lesion dataset are limited, we hypothesized that the false positive (FP) findings a detection algorithm produces on non-biopsied benign lesions could improve that algorithm when used as data augmentation.
We first extracted 2D slices from DBT volumes with biopsy-proven breast lesions (cancer and benign), from volumes with non-biopsied benign lesions (actionable), and from controls. Then, to provide lesion continuity along the z-direction, we combined each lesion slice with its immediately adjacent slices to synthesize 2.5-dimensional (2.5D) images of the lesion by assigning the slices to the R, G, and B color channels. We used 224 biopsy-proven lesions from 39 cancer and 62 benign patients from a DBTex challenge dataset of 1000 scans. To increase the number of training samples, we also included the 2.5D images of the slices immediately neighboring the lesion's center. For lesion detection, we used the YOLOv5 algorithm as our base network. We trained a baseline algorithm (medium-depth level) on biopsied samples and used it to detect actionable FPs in non-biopsied images. Afterward, we fine-tuned the baseline model on the augmented image set (actionable FPs added). For inference, we processed the DBT volume slice by slice to estimate bounding boxes in each slice, and then combined them by connecting bounding boxes along the depth via volumetric morphological closing. We trained an additional model (large) with deeper-depth levels by repeating the above process. Finally, we developed an ensemble algorithm by combining the medium and large detection models. We used the free-response operating characteristic curve to evaluate our algorithms. We reported mean sensitivity per FPs per DBT volume only for biopsied views, and sensitivity at 2 false positives per image (2FPI) for all views. However, because access to the ground truth of the challenge validation and test datasets was limited, we used sensitivity at 2FPI for statistical evaluation.
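The 2.5D synthesis and the depth-wise linking of per-slice boxes can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `make_25d` and `link_boxes`, the channel order, the clamping at the volume boundary, the box tuple layout, and the `z_gap` parameter are all assumptions made for illustration, and NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy import ndimage


def make_25d(volume, z):
    """Stack slices z-1, z, z+1 of a (Z, H, W) volume into an (H, W, 3) image."""
    depth = volume.shape[0]
    # Assumption: at the first/last slice, the missing neighbor is replaced
    # by the boundary slice itself.
    idx = [max(z - 1, 0), z, min(z + 1, depth - 1)]
    return np.stack([volume[i] for i in idx], axis=-1)


def link_boxes(boxes, shape, z_gap=1):
    """Merge per-slice boxes (z, x0, y0, x1, y1, score) into 3D boxes.

    Returns tuples (z_start, z_stop, x0, y0, x1, y1, max_score), with
    exclusive upper bounds.
    """
    mask = np.zeros(shape, dtype=bool)
    score_map = np.zeros(shape, dtype=float)
    for z, x0, y0, x1, y1, score in boxes:
        mask[z, y0:y1, x0:x1] = True
        region = score_map[z, y0:y1, x0:x1]
        np.maximum(region, score, out=region)  # keep the best score per voxel

    # A structuring element extending only along z bridges gaps of up to
    # z_gap slices. Zero-padding along z keeps boxes on the first and last
    # slices from being eroded away by the closing.
    structure = np.ones((2 * z_gap + 1, 1, 1), dtype=bool)
    padded = np.pad(mask, ((z_gap, z_gap), (0, 0), (0, 0)))
    closed = ndimage.binary_closing(padded, structure=structure)
    closed = closed[z_gap:z_gap + shape[0]]

    # Each connected component of the closed mask becomes one 3D box.
    labels, _ = ndimage.label(closed)
    merged = []
    for lab, sl in enumerate(ndimage.find_objects(labels), start=1):
        zs, ys, xs = sl
        score = float(score_map[sl][labels[sl] == lab].max())
        merged.append((zs.start, zs.stop, xs.start, ys.start,
                       xs.stop, ys.stop, score))
    return merged
```

In this sketch, two detections of the same region on slices 0 and 2 are joined into one box spanning slices 0 to 2 when `z_gap=1`, while a detection elsewhere in the volume stays separate. Rasterizing boxes into a mask is only one way to realize "volumetric morphological closing"; the abstract does not state how the closing was applied.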
On the DBTex independent validation set, the medium baseline model achieved a mean sensitivity per FPs per DBT volume of 0.627 and a sensitivity of 0.640 at 2FPI. After adding actionable FP lesions, the model's sensitivity at 2FPI improved to 0.769 over the baseline (p-value = 0.013). Our ensemble algorithm with multi-depth levels (medium + large) achieved a mean sensitivity per FPs per DBT volume of 0.815 and an improved sensitivity at 2FPI of 0.80 over the baseline (p-value < 0.001) on the validation set. Finally, our ensemble model achieved a mean sensitivity per FPs per DBT volume of 0.786 and a sensitivity of 0.743 at 2FPI on the DBTex independent test set.
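As an illustration of how a sensitivity at a fixed FP rate such as 2FPI can be read off a free-response operating characteristic curve, the following sketch sweeps the score threshold from high to low and keeps the highest sensitivity reached while the mean number of FPs per image stays at or below the limit. The function name, the input layout (each detection already matched to a lesion identifier, or `None` for a false positive), and the tie handling are assumptions; the challenge's own hit criterion and matching rules are not reproduced here.

```python
from itertools import groupby


def sensitivity_at_fp_rate(detections, n_lesions, n_images, fp_per_image=2.0):
    """Highest sensitivity with mean FPs per image <= fp_per_image.

    detections: iterable of (score, lesion_id), where lesion_id is None
    for a false positive (hypothetical layout for this sketch).
    """
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    found = set()  # distinct lesions hit so far
    fps = 0        # false positives accepted so far
    best = 0.0
    # Detections sharing a score are admitted together, because no
    # threshold can separate them.
    for _, group in groupby(ordered, key=lambda d: d[0]):
        for _, lesion_id in group:
            if lesion_id is None:
                fps += 1
            else:
                found.add(lesion_id)
        if fps / n_images > fp_per_image:
            break
        best = len(found) / n_lesions
    return best
```

Counting distinct lesion identifiers means that several detections of the same lesion raise the sensitivity only once.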
Our results show that actionable FP findings hold useful information for lesion detection algorithms, and our ensemble detection model with multi-depth levels improves lesion detection performance.