Improving annotation efficiency for fully labeling a breast mass segmentation dataset.

Authors

Sharma Vaibhav, Barnett Alina Jade, Yang Julia, Cheon Sangwook, Kim Giyoung, Regina Schwartz Fides, Wang Avivah, Hall Neal, Grimm Lars, Chen Chaofan, Lo Joseph Y, Rudin Cynthia

Affiliations

Duke University, Department of Computer Science, Durham, North Carolina, United States.

Duke University School of Medicine, Department of Radiology, Durham, North Carolina, United States.

Publication information

J Med Imaging (Bellingham). 2025 May;12(3):035501. doi: 10.1117/1.JMI.12.3.035501. Epub 2025 May 21.

DOI:10.1117/1.JMI.12.3.035501

PMID:40415867

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12094908/

Abstract

PURPOSE

Breast cancer remains a leading cause of death for women. Screening programs are deployed to detect cancer at early stages. One current barrier identified by breast imaging researchers is a shortage of labeled image datasets. Addressing this problem is crucial to improve early detection models. We present an active learning (AL) framework for segmenting breast masses from 2D digital mammography, and we publish labeled data. Our method aims to reduce the input needed from expert annotators to reach a fully labeled dataset.

APPROACH

We create a dataset of 1136 mammographic masses with pixel-wise binary segmentation labels, with the test subset labeled independently by two different teams. With this dataset, we simulate a human annotator within an AL framework to develop and compare AI-assisted labeling methods, using a discriminator model and a simulated oracle to collect acceptable segmentation labels. A UNet model is retrained on these labels, generating new segmentations. We evaluate various oracle heuristics using the percentage of segmentations that the oracle relabels and measure the quality of the proposed labels by evaluating the intersection over union over a validation dataset.
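The evaluation described above rests on two quantities: the intersection over union (IoU) between a proposed mask and a reference mask, and the fraction of proposals the simulated oracle has to relabel. The following is a minimal sketch of both, not the authors' implementation. The function names, the IoU-threshold acceptance rule, and the threshold value are hypothetical; the abstract says only that several oracle heuristics were compared, and the discriminator model is omitted here.

```python
import numpy as np


def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over union of two binary masks with the same shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Both masks are empty; treat this as perfect agreement (an assumption).
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


def simulated_oracle(proposals, references, threshold=0.9):
    """Hypothetical oracle heuristic: accept a proposed mask when its IoU with
    the reference mask reaches `threshold`, otherwise substitute the reference
    (simulating a manual relabel).

    Returns the resulting labels and the fraction of proposals relabeled.
    """
    labels = []
    relabeled = 0
    for pred, ref in zip(proposals, references):
        if iou(pred, ref) >= threshold:
            labels.append(pred)   # proposal accepted as-is
        else:
            labels.append(ref)    # oracle relabels this case
            relabeled += 1
    return labels, relabeled / len(proposals)


if __name__ == "__main__":
    # Toy 4x4 masks, purely illustrative (not data from the paper).
    ref = np.zeros((4, 4), dtype=np.uint8)
    ref[1:3, 1:3] = 1
    good = ref.copy()
    bad = np.zeros((4, 4), dtype=np.uint8)
    bad[0:2, 0:2] = 1
    _, frac = simulated_oracle([good, bad], [ref, ref], threshold=0.5)
    print(f"IoU(good)={iou(good, ref):.3f}, IoU(bad)={iou(bad, ref):.3f}, relabeled={frac:.0%}")
```

In a full loop under these assumptions, the labels returned by the oracle would be used to retrain the segmentation model, and the relabeled fraction would be tracked across rounds as the measure of annotator effort.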

RESULTS

Our method reduces expert annotator input by 44%. We present a dataset of 1136 binary segmentation labels approved by board-certified radiologists and make the 143-image validation set public for comparison with other researchers' methods.

CONCLUSIONS

We demonstrate that AL can significantly improve the efficiency and time-effectiveness of creating labeled mammogram datasets. Our framework facilitates the development of high-quality datasets while minimizing manual effort in the domain of digital mammography.
