An Ulzee, Bhardwaj Ankit, Shameer Khader, Subramanian Lakshminarayanan
Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, United States.
Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, United States.
Front Big Data. 2021 Dec 3;4:742779. doi: 10.3389/fdata.2021.742779. eCollection 2021.
Breast cancer screening using mammography serves as the earliest defense against breast cancer, revealing anomalous tissue years before it can be detected through physical examination. Despite the use of high-resolution radiography, the presence of densely overlapping patterns challenges the consistency of human-driven diagnosis and drives interest in leveraging the state-of-the-art localization ability of deep convolutional neural networks (DCNN). The growing availability of digitized clinical archives enables the training of deep segmentation models, but training on the most widely available form of annotation, coarse hand-drawn outlines, works against learning the precise boundary of cancerous tissue and produces results aligned with the annotations rather than the underlying lesions. The expense of collecting high-quality pixel-level data in medical science makes this even more difficult. To surmount this fundamental challenge, we propose LatentCADx, a deep learning segmentation model capable of precisely annotating the cancer lesions underlying hand-drawn annotations, which we obtain procedurally using joint classification training and a strict segmentation penalty. We demonstrate the capability of LatentCADx on a publicly available dataset of 2,620 mammogram case files, where LatentCADx obtains a classification ROC of 0.97, AP of 0.87, and segmentation AP of 0.75 (IOU = 0.5), giving comparable or better performance than other models. Qualitative and precision evaluation of LatentCADx annotations on validation samples reveals that LatentCADx increases the specificity of segmentations beyond that of existing models trained on hand-drawn annotations, with pixel-level specificity reaching 0.90. It also obtains sharp boundaries around lesions, unlike other methods, reducing the confused pixels in the output by more than 60%.