Holley Susan O, Cardoza Daniel, Matthews Thomas P, Tibatemwa Elisha E, Morales Hoil Rodrigo, Toriola Adetunji T, Gastounioti Aimilia
Onsite Women's Health, Nashville, TN 37203, United States.
Whiterabbit.ai, Redwood City, CA 94065, United States.
BJR Artif Intell. 2025 Mar 3;2(1):ubaf004. doi: 10.1093/bjrai/ubaf004. eCollection 2025 Jan.
To assess whether use of an artificial intelligence (AI) model for mammography could result in more longitudinally consistent breast density assessments compared with interpreting radiologists.
The AI model was evaluated retrospectively on a large mammography dataset from an outpatient radiology practice with 50 sites across the United States. Examinations were acquired on Hologic imaging systems between 2016 and 2021 and were interpreted by 39 radiologists (36% fellowship trained; years of experience: 2-37 years). For all women with at least 3 examinations (61 177 women; 214 158 examinations), longitudinal patterns in 4-category breast density and binary breast density (non-dense vs. dense) were characterized as constant, descending, ascending, or bi-directional. Differences in longitudinal density patterns were assessed using paired proportion hypothesis testing.
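As an illustration of the four longitudinal patterns, the following is a minimal sketch, not the authors' implementation. It assumes each woman's density history is an ordered sequence of integers (1 to 4 for the 4-category scale, or 0/1 for binary density) and that a pattern is "ascending" when it contains at least one increase and no decrease, "descending" in the mirrored case, and "bi-directional" when it contains both. The function name `classify_pattern` and these exact definitions are assumptions; the abstract does not spell them out.

```python
from typing import Sequence


def classify_pattern(densities: Sequence[int]) -> str:
    """Label an ordered density history as one of four longitudinal patterns.

    Assumed definitions (not stated in the abstract):
      constant        no change between any consecutive examinations
      ascending       at least one increase, no decrease
      descending      at least one decrease, no increase
      bi-directional  at least one increase and at least one decrease
    """
    if len(densities) < 3:
        # The study only characterizes women with at least 3 examinations.
        raise ValueError("need at least 3 examinations")

    # Differences between consecutive examinations.
    steps = [later - earlier for earlier, later in zip(densities, densities[1:])]
    has_increase = any(step > 0 for step in steps)
    has_decrease = any(step < 0 for step in steps)

    if has_increase and has_decrease:
        return "bi-directional"
    if has_increase:
        return "ascending"
    if has_decrease:
        return "descending"
    return "constant"
```

The same function covers binary density if non-dense is coded as 0 and dense as 1.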
The AI model produced more constant (P < .001) and fewer bi-directional (P < .001) longitudinal density patterns compared to radiologists (AI: constant 81.0%, bi-directional 4.9%; radiologists: constant 56.8%, bi-directional 15.3%). The AI density model also produced more constant (P < .001) and fewer bi-directional (P < .001) longitudinal patterns for binary breast density. These findings held in various subset analyses, which minimize (1) change in breast density (post-menopausal women, women with stable image-based BMI), (2) inter-observer variability (same radiologist), and (3) variability by radiologist's training level (fellowship-trained radiologists).
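The abstract does not name the specific paired proportion test behind these P values. One common choice for paired binary outcomes is McNemar's test, and the sketch below shows its exact (binomial) form under that assumption. Each woman contributes one pair of flags, for example whether her AI-derived pattern is constant and whether her radiologist-derived pattern is constant. The function name `mcnemar_exact_p` and the input layout are hypothetical.

```python
from math import comb
from typing import Sequence


def mcnemar_exact_p(ai_flags: Sequence[bool], rad_flags: Sequence[bool]) -> float:
    """Two-sided exact McNemar P value for paired binary outcomes.

    ai_flags[i] and rad_flags[i] describe the same woman, e.g. whether her
    pattern is "constant" under the AI model and under the radiologists.
    Only discordant pairs carry information about the difference.
    """
    if len(ai_flags) != len(rad_flags):
        raise ValueError("inputs must be paired and of equal length")

    # Discordant pairs: AI yes / radiologist no, and the reverse.
    ai_only = sum(1 for a, r in zip(ai_flags, rad_flags) if a and not r)
    rad_only = sum(1 for a, r in zip(ai_flags, rad_flags) if r and not a)
    n = ai_only + rad_only
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference

    # Under the null hypothesis, discordant pairs split 50/50 (Binomial(n, 0.5)).
    k = min(ai_only, rad_only)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * lower_tail)
```

At the scale of this study, a chi-square or normal approximation to the same test would be the more practical computation; the exact form is shown only because it needs no external libraries.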
AI produces more longitudinally consistent breast density assessments compared with interpreting radiologists.
Our results extend the advantages of AI in breast density evaluation beyond automation and reproducibility, showing a potential path to improved longitudinal consistency and more consistent downstream care for screened women.