Mullooly Maeve, Ehteshami Bejnordi Babak, Pfeiffer Ruth M, Fan Shaoqi, Palakal Maya, Hada Manila, Vacek Pamela M, Weaver Donald L, Shepherd John A, Fan Bo, Mahmoudzadeh Amir Pasha, Wang Jeff, Malkov Serghei, Johnson Jason M, Herschorn Sally D, Sprague Brian L, Hewitt Stephen, Brinton Louise A, Karssemeijer Nico, van der Laak Jeroen, Beck Andrew, Sherman Mark E, Gierach Gretchen L
1 Division of Population Health Sciences, Royal College of Surgeons in Ireland, Dublin, Ireland.
2 Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA.
NPJ Breast Cancer. 2019 Nov 19;5:43. doi: 10.1038/s41523-019-0134-6. eCollection 2019.
Breast density, a breast cancer risk factor, is a radiologic feature that reflects fibroglandular tissue content relative to breast area or volume. Its histology is incompletely characterized. Here we use deep learning approaches to identify histologic correlates in radiologically-guided biopsies that may underlie breast density and distinguish cancer among women with elevated and low density. We evaluated hematoxylin and eosin (H&E)-stained digitized images from image-guided breast biopsies (n = 852 patients). Breast density was assessed as global and localized fibroglandular volume (%). A convolutional neural network characterized H&E composition. In total 37 features were extracted from the network output, describing tissue quantities and morphological structure. A random forest regression model was trained to identify correlates most predictive of fibroglandular volume (n = 588). Correlations between predicted and radiologically quantified fibroglandular volume were assessed in 264 independent patients. A second random forest classifier was trained to predict diagnosis (invasive vs. benign); performance was assessed using area under receiver-operating characteristics curves (AUC). Using extracted features, regression models predicted global (r = 0.94) and localized (r = 0.93) fibroglandular volume, with fat and non-fatty stromal content representing the strongest correlates, followed by epithelial organization rather than quantity. For predicting cancer among high and low fibroglandular volume, the classifier achieved AUCs of 0.92 and 0.84, respectively, with epithelial organizational features ranking most important. These results suggest non-fatty stroma, fat tissue quantities and epithelial region organization predict fibroglandular volume. The model holds promise for identifying histological correlates of cancer risk in patients with high and low density and warrants further evaluation.
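The two evaluation metrics named in the abstract, the correlation between predicted and radiologically quantified fibroglandular volume and the AUC of the diagnosis classifier, can be illustrated with a minimal sketch. It assumes the reported correlation is a Pearson coefficient, which the abstract does not state. The function names and the toy values are hypothetical and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Labels are 1 (e.g. invasive) or 0 (e.g. benign).
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up values for illustration only (not data from the study).
measured_fgv = [10.0, 20.0, 30.0, 40.0, 50.0]    # radiologic fibroglandular volume (%)
predicted_fgv = [12.0, 18.0, 33.0, 41.0, 47.0]   # hypothetical model predictions (%)
diagnosis = [1, 1, 1, 0, 0, 0]                   # 1 = invasive, 0 = benign
cancer_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]    # hypothetical classifier scores

print("r   =", round(pearson_r(measured_fgv, predicted_fgv), 3))
print("AUC =", round(roc_auc(diagnosis, cancer_score), 3))
```

The pairwise AUC is quadratic in the number of cases, which keeps the sketch short and easy to check by hand. A rank-based formulation or a library routine would be the usual choice for cohorts of the size reported here.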