Mimicking clinical trials with synthetic acute myeloid leukemia patients using generative artificial intelligence.

Author information

Eckardt Jan-Niklas, Hahn Waldemar, Röllig Christoph, Stasik Sebastian, Platzbecker Uwe, Müller-Tidow Carsten, Serve Hubert, Baldus Claudia D, Schliemann Christoph, Schäfer-Eckart Kerstin, Hanoun Maher, Kaufmann Martin, Burchert Andreas, Thiede Christian, Schetelig Johannes, Sedlmayr Martin, Bornhäuser Martin, Wolfien Markus, Middeke Jan Moritz

Affiliations

Department of Internal Medicine I, University Hospital Carl Gustav Carus, Technical University Dresden, Dresden, Germany.

Else Kröner Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany.

Publication information

NPJ Digit Med. 2024 Mar 20;7(1):76. doi: 10.1038/s41746-024-01076-x.

DOI:10.1038/s41746-024-01076-x

PMID:38509224

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10954666/

Abstract

Clinical research relies on high-quality patient data; however, obtaining large data sets is costly, and access to existing data is often hindered by privacy and regulatory concerns. Synthetic data generation holds the promise of effectively bypassing these barriers, allowing for simplified data accessibility and the prospect of synthetic control cohorts. We employed two different methodologies of generative artificial intelligence, CTAB-GAN+ and normalizing flows (NFlow), to synthesize patient data derived from 1606 patients with acute myeloid leukemia, a heterogeneous hematological malignancy, who were treated within four multicenter clinical trials. Both generative models accurately captured distributions of demographic, laboratory, molecular, and cytogenetic variables, as well as patient outcomes, yielding high performance scores regarding fidelity and usability of both synthetic cohorts (n = 1606 each). Survival analysis demonstrated close resemblance of survival curves between original and synthetic cohorts. Inter-variable relationships were preserved in univariable outcome analysis, enabling explorative analysis in our synthetic data. Additionally, training sample privacy is safeguarded, mitigating possible patient re-identification, which we quantified using Hamming distances. We provide not only a proof-of-concept for synthetic data generation in multimodal clinical data for rare diseases, but also full public access to synthetic data sets to foster further research.
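
The abstract states that re-identification risk was quantified with Hamming distances but does not describe the procedure. As a rough illustration only, the sketch below assumes one common formulation: for each synthetic record, count the variables on which it differs from its closest original record. The variable layout, the toy records, and the function names `hamming` and `nearest_distances` are hypothetical and are not taken from the paper; continuous variables would presumably need to be binned before such a comparison.

```python
from typing import List, Sequence

Record = Sequence[object]


def hamming(a: Record, b: Record) -> int:
    """Number of positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of variables")
    return sum(x != y for x, y in zip(a, b))


def nearest_distances(synthetic: Sequence[Record],
                      original: Sequence[Record]) -> List[int]:
    """For each synthetic record, the Hamming distance to its closest original record."""
    return [min(hamming(s, o) for o in original) for s in synthetic]


# Toy, made-up records (NOT study data).
# Assumed layout: (sex, NPM1 status, FLT3-ITD status, complete remission)
original = [
    ("f", "mut", "wt", "yes"),
    ("m", "wt", "wt", "no"),
    ("m", "mut", "mut", "yes"),
]
synthetic = [
    ("f", "mut", "wt", "yes"),  # exact copy of an original record
    ("f", "wt", "mut", "no"),   # matches no original record exactly
]

distances = nearest_distances(synthetic, original)
print(distances)
```

Under this assumed formulation, a distance of 0 flags a synthetic record that exactly reproduces a training record, while larger minimum distances suggest the generator is not simply copying patients.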

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf0d/10954666/45b87695b964/41746_2024_1076_Fig1_HTML.jpg

Similar articles

Synthetic Data Generation in Hematology - Paving the Way for OMOP and FHIR Integration.

Stud Health Technol Inform. 2024 Aug 22;316:1472-1476. doi: 10.3233/SHTI240692.

Synthetic Data Generation by Artificial Intelligence to Accelerate Research and Precision Medicine in Hematology.

JCO Clin Cancer Inform. 2023 Jun;7:e2300021. doi: 10.1200/CCI.23.00021.

Unlocking the potential of synthetic patients for accelerating clinical trials: Results of the first GIMEMA experience on acute myeloid leukemia patients.

EJHaem. 2024 Mar 15;5(2):353-359. doi: 10.1002/jha2.873. eCollection 2024 Apr.

Creating High Fidelity Synthetic Pelvis Radiographs Using Generative Adversarial Networks: Unlocking the Potential of Deep Learning Models Without Patient Privacy Concerns.

J Arthroplasty. 2023 Oct;38(10):2037-2043.e1. doi: 10.1016/j.arth.2022.12.013. Epub 2022 Dec 17.

Generating synthetic mixed-type longitudinal electronic health records for artificial intelligent applications.

NPJ Digit Med. 2023 May 27;6(1):98. doi: 10.1038/s41746-023-00834-7.

Assessment of Deep Generative Models for High-Resolution Synthetic Retinal Image Generation of Age-Related Macular Degeneration.

JAMA Ophthalmol. 2019 Mar 1;137(3):258-264. doi: 10.1001/jamaophthalmol.2018.6156.

Demonstrating the successful application of synthetic learning in spine surgery for training multi-center models with increased patient privacy.

Sci Rep. 2023 Aug 1;13(1):12481. doi: 10.1038/s41598-023-39458-y.

Generative adversarial network based synthetic data training model for lightweight convolutional neural networks.

Multimed Tools Appl. 2023 May 20:1-23. doi: 10.1007/s11042-023-15747-6.

CTAB-GAN+: enhancing tabular data synthesis.

Front Big Data. 2024 Jan 8;6:1296508. doi: 10.3389/fdata.2023.1296508. eCollection 2023.

Cited by

Artificial intelligence across the cancer care continuum.

Cancer. 2025 Aug 15;131(16):e70050. doi: 10.1002/cncr.70050.

Unconditional latent diffusion models memorize patient imaging data.

Nat Biomed Eng. 2025 Aug 11. doi: 10.1038/s41551-025-01468-8.

Ethics and Algorithms to Navigate AI's Emerging Role in Organ Transplantation.

J Clin Med. 2025 Apr 17;14(8):2775. doi: 10.3390/jcm14082775.

Synthetic data generation: a privacy-preserving approach to accelerate rare disease research.

Front Digit Health. 2025 Mar 18;7:1563991. doi: 10.3389/fdgth.2025.1563991. eCollection 2025.

Artificial Intelligence-Enabled Clinical Trials in Inflammatory Bowel Disease: Automating and Enhancing Disease Assessment and Study Management.

Gastroenterology. 2025 Aug;169(3):432-443. doi: 10.1053/j.gastro.2025.02.039. Epub 2025 Mar 28.

AI-driven synthetic data generation for accelerating hepatology research: A study of the United Network for Organ Sharing (UNOS) database.

Hepatology. 2025 Mar 11. doi: 10.1097/HEP.0000000000001299.

Augmenting Insufficiently Accruing Oncology Clinical Trials Using Generative Models: Validation Study.

J Med Internet Res. 2025 Mar 5;27:e66821. doi: 10.2196/66821.

Improving medical machine learning models with generative balancing for equity and excellence.

NPJ Digit Med. 2025 Feb 14;8(1):100. doi: 10.1038/s41746-025-01438-z.

Application of multimodal large language models for safety indicator calculation and contraindication prediction in laser vision correction.

NPJ Digit Med. 2025 Feb 3;8(1):82. doi: 10.1038/s41746-025-01487-4.

Retrospective Analysis of R-COMP Therapy in Patients with Diffuse Large B-Cell Lymphoma (DLBCL): Assessing the Impact of Sample Selection Bias.

J Clin Med. 2025 Jan 20;14(2):639. doi: 10.3390/jcm14020639.
