Keller Brad M, Nathan Diane L, Wang Yan, Zheng Yuanjie, Gee James C, Conant Emily F, Kontos Despina
Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
Med Phys. 2012 Aug;39(8):4903-17. doi: 10.1118/1.4736530.
The amount of fibroglandular tissue content in the breast as estimated mammographically, commonly referred to as breast percent density (PD%), is one of the most significant risk factors for developing breast cancer. Approaches to quantify breast density commonly focus on either semiautomated methods or visual assessment, both of which are highly subjective. Furthermore, most studies published to date investigating computer-aided assessment of breast PD% have been performed using digitized screen-film mammograms, while digital mammography is increasingly replacing screen-film mammography in breast cancer screening protocols. Digital mammography imaging generates two types of images for analysis, raw (i.e., "FOR PROCESSING") and vendor postprocessed (i.e., "FOR PRESENTATION"), of which postprocessed images are commonly used in clinical practice. Development of an algorithm which effectively estimates breast PD% in both raw and postprocessed digital mammography images would be beneficial in terms of direct clinical application and retrospective analysis.
This work proposes a new algorithm for fully automated quantification of breast PD% based on adaptive multiclass fuzzy c-means (FCM) clustering and support vector machine (SVM) classification, optimized for the imaging characteristics of both raw and processed digital mammography images as well as for individual patient and image characteristics. Our algorithm first delineates the breast region within the mammogram via an automated thresholding scheme to identify background air, followed by a straight-line Hough transform to extract the pectoral muscle region. The algorithm then applies adaptive FCM clustering, with the number of clusters derived from image properties of the specific mammogram, to subdivide the breast into regions of similar gray-level intensity. Finally, an SVM classifier is trained to identify which clusters within the breast tissue are likely fibroglandular; these clusters are then aggregated into a final dense tissue segmentation that is used to compute breast PD%. Our method is validated on a group of 81 women for whom bilateral, mediolateral oblique, raw and processed screening digital mammograms were available, and agreement is assessed with both continuous and categorical density estimates made by a trained breast-imaging radiologist.
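The clustering and aggregation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it runs a plain (non-adaptive) fuzzy c-means on gray levels with a fixed number of clusters, and it replaces the trained SVM with a hypothetical stand-in rule that treats the `n_dense` brightest clusters as fibroglandular. The function names, parameters, and the assumption that a breast mask is already available (air and pectoral muscle removed) are all illustrative.

```python
import numpy as np


def fuzzy_c_means_1d(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-6):
    """Plain fuzzy c-means on 1-D gray levels.

    Returns (centers, memberships); memberships has shape
    (n_pixels, n_clusters) and comes from the last membership update.
    """
    x = np.asarray(values, dtype=float).ravel()
    # Deterministic initialisation: spread centers over intensity quantiles.
    centers = np.quantile(x, (np.arange(n_clusters) + 0.5) / n_clusters)
    for _ in range(n_iter):
        # Distance from every pixel to every center (small offset avoids 0).
        dist = np.abs(x[:, None] - centers[None, :]) + 1e-12
        # Standard FCM membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the gray levels weighted by u^m.
        w = u ** m
        new_centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def percent_density(image, breast_mask, n_clusters=3, n_dense=1):
    """Percent density inside breast_mask.

    ASSUMPTION: the n_dense brightest clusters are labelled dense. This is a
    hypothetical stand-in for the SVM cluster classifier in the abstract.
    """
    pixels = np.asarray(image, dtype=float)[breast_mask]
    centers, u = fuzzy_c_means_1d(pixels, n_clusters=n_clusters)
    labels = np.argmax(u, axis=1)              # hard label per pixel
    dense_ids = np.argsort(centers)[-n_dense:]  # brightest cluster(s)
    dense = np.isin(labels, dense_ids)
    return 100.0 * dense.sum() / pixels.size
```

Under these assumptions, `percent_density(image, breast_mask)` returns the dense-pixel area as a percentage of the masked breast area. In the method described above, both the cluster count and the dense/non-dense decision are adapted to each image rather than fixed as they are here.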
Strong association between algorithm-estimated and radiologist-provided breast PD% was detected for both raw (r = 0.82, p < 0.001) and processed (r = 0.85, p < 0.001) digital mammograms on a per-breast basis. Stronger agreement was found when overall breast density was assessed on a per-woman basis for both raw (r = 0.85, p < 0.001) and processed (r = 0.89, p < 0.001) mammograms. Strong agreement between categorical density estimates was also seen (weighted Cohen's κ ≥ 0.79). Repeated measures analysis of variance demonstrated no statistically significant differences between the PD% estimates (p > 0.1) due to either presentation of the image (raw vs processed) or method of PD% assessment (radiologist vs algorithm).
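For readers unfamiliar with the two agreement statistics reported above, the following sketch shows how a Pearson correlation and a weighted Cohen's κ are typically computed. The abstract does not state which weighting scheme was used, so the `weighting` option (linear or quadratic) is an assumption, as are the function names and the integer coding of density categories as 0 to `n_categories - 1`.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation between two continuous PD% series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


def weighted_kappa(rater_a, rater_b, n_categories, weighting="linear"):
    """Weighted Cohen's kappa for ordinal categories coded 0..n_categories-1."""
    a = np.asarray(rater_a, dtype=int)
    b = np.asarray(rater_b, dtype=int)
    # Observed joint distribution of the two raters' categories.
    observed = np.zeros((n_categories, n_categories))
    np.add.at(observed, (a, b), 1)
    observed /= observed.sum()
    # Joint distribution expected if the raters were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    # Disagreement weights grow with the distance between categories.
    idx = np.arange(n_categories)
    diff = np.abs(idx[:, None] - idx[None, :]) / (n_categories - 1)
    weights = diff if weighting == "linear" else diff ** 2
    return float(1.0 - (weights * observed).sum() / (weights * expected).sum())
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; the weighting penalises disagreements by several categories more heavily than disagreements between adjacent categories.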
The proposed fully automated algorithm was successful in estimating breast percent density from both raw and processed digital mammographic images. Accurate assessment of a woman's breast density is critical if the estimate is to be incorporated into risk assessment models. These results show promise for the clinical application of the algorithm in quantifying breast density in a repeatable manner, both at the time of imaging and in retrospective studies.