From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.).
Radiology. 2024 Sep;312(3):e232471. doi: 10.1148/radiol.232471.
Artificial intelligence (AI) models for medical imaging tasks, such as classification or segmentation, require large and diverse datasets of images. However, due to privacy and ethical issues, as well as data sharing infrastructure barriers, these datasets are scarce and difficult to assemble. Synthetic medical imaging data generated by AI from existing data could address this challenge by augmenting and anonymizing real imaging data. In addition, synthetic data enable new applications, including modality translation, contrast synthesis, and professional training for radiologists. However, the use of synthetic data also poses technical and ethical challenges. These challenges include ensuring the realism and diversity of the synthesized images while keeping data unidentifiable, evaluating the performance and generalizability of models trained on synthetic data, and managing high computational costs. Because existing regulations are not sufficient to guarantee the safe and ethical use of synthetic images, updated laws and more rigorous oversight are needed. Regulatory bodies, physicians, and AI developers should collaborate to develop, maintain, and continually refine best practices for synthetic data. This review provides an overview of current knowledge of synthetic data in medical imaging and highlights key challenges in the field to guide future research and development.