Impact of synthetic data on training a deep learning model for lesion detection and classification in contrast-enhanced mammography.

Authors

Van Camp Astrid, Woodruff Henry C, Cockmartin Lesley, Lobbes Marc, Majer Michael, Balleyguier Corinne, Marshall Nicholas W, Bosmans Hilde, Lambin Philippe

Affiliations

Maastricht University, GROW - Research Institute for Oncology and Reproduction, Department of Precision Medicine, Maastricht, The Netherlands.

KU Leuven, Division of Medical Physics & Quality Assessment, Department of Imaging and Pathology, Leuven, Belgium.

Publication information

J Med Imaging (Bellingham). 2025 Nov;12(Suppl 2):S22006. doi: 10.1117/1.JMI.12.S2.S22006. Epub 2025 Apr 28.

Abstract

PURPOSE

Predictive models for contrast-enhanced mammography often perform better at detecting and classifying enhancing masses than (non-enhancing) microcalcification clusters. We aim to investigate whether incorporating synthetic data with simulated microcalcification clusters during training can enhance model performance.

APPROACH

Microcalcification clusters were simulated in low-energy images of lesion-free breasts from 782 patients, considering local texture features. Enhancement was simulated in the corresponding recombined images. A deep learning (DL) model for lesion detection and classification was trained with varying ratios of synthetic and real (850 patients) data. In addition, a handcrafted radiomics classifier was trained using delineations and class labels from real data, and predictions from both models were ensembled. Validation was performed on internal (212 patients) and external (279 patients) real datasets.
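
The idea of training with varying mixes of synthetic and real cases can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `mix_training_set`, the patient-ID lists, and the random subsampling strategy are all assumptions, since the abstract does not state how the ratios were constructed.

```python
import random


def mix_training_set(real_ids, synthetic_ids, n_real, n_synthetic, seed=0):
    """Return a shuffled list of (patient_id, source) pairs.

    Hypothetical helper: draws n_real real cases and n_synthetic synthetic
    cases at random, so that different real/synthetic mixes can be compared.
    """
    if not 0 <= n_real <= len(real_ids):
        raise ValueError("n_real exceeds the number of available real cases")
    if not 0 <= n_synthetic <= len(synthetic_ids):
        raise ValueError("n_synthetic exceeds the number of available synthetic cases")

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    chosen_real = rng.sample(list(real_ids), n_real)
    chosen_synth = rng.sample(list(synthetic_ids), n_synthetic)

    mixed = [(pid, "real") for pid in chosen_real]
    mixed += [(pid, "synthetic") for pid in chosen_synth]
    rng.shuffle(mixed)
    return mixed


# Pool sizes taken from the abstract: 850 real and 782 synthetic patients.
real_pool = [f"real_{i}" for i in range(850)]
synthetic_pool = [f"synth_{i}" for i in range(782)]

synthetic_only = mix_training_set(real_pool, synthetic_pool, 0, 782)
small_real_plus_synth = mix_training_set(real_pool, synthetic_pool, 200, 782)
```

Setting `n_real` to 0 corresponds to the synthetic-only configuration, and a small `n_real` combined with the full synthetic pool corresponds to augmenting a smaller real training set. The value 200 is an arbitrary example, not a figure from the study.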

RESULTS

The DL model trained exclusively with synthetic data detected over 60% of malignant lesions. Adding synthetic data to smaller real training sets improved detection sensitivity for malignant lesions but decreased precision. Performance plateaued at a detection sensitivity of 0.80. The ensembled DL and radiomics models performed worse than the standalone DL model, decreasing the area under the receiver operating characteristic curve from 0.75 to 0.60 on the external validation set, likely due to falsely detected suspicious regions of interest.
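
For readers less familiar with the metrics reported above, the following sketch shows the standard definitions of detection sensitivity, precision, and the area under the receiver operating characteristic curve. The function names and all counts and scores are made up for illustration; they are not data from the study, and the study's exact matching criteria for a "detected" lesion are not given in the abstract.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Return (sensitivity, precision) from detection counts."""
    lesions = true_positives + false_negatives      # all real lesions
    detections = true_positives + false_positives   # all regions the model flagged
    sensitivity = true_positives / lesions if lesions else 0.0
    precision = true_positives / detections if detections else 0.0
    return sensitivity, precision


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts: more detections raise sensitivity but lower precision.
before = detection_metrics(true_positives=70, false_positives=30, false_negatives=30)
after = detection_metrics(true_positives=80, false_positives=60, false_negatives=20)

# Hypothetical malignancy scores for four cases (1 = malignant, 0 = benign).
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

With these invented counts, sensitivity rises from 0.70 to 0.80 while precision falls from 0.70 to about 0.57, which is the kind of trade-off the results describe: extra false detections cost precision even as more true lesions are found.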

CONCLUSIONS

Synthetic data can enhance DL model performance, provided model setup and data distribution are optimized. The ability to detect malignant lesions with no real data in the training set confirms the utility of synthetic data. Synthetic data can serve as a helpful tool, especially when real data are scarce, and are most effective when complementing real data.
