RCT-Twin-GAN Generates Digital Twins of Randomized Control Trials Adapted to Real-world Patients to Enhance their Inference and Application.

Authors

Thangaraj Phyllis M, Shankar Sumukh Vasisht, Oikonomou Evangelos K, Khera Rohan

Affiliations

Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA.

Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.

Publication information

medRxiv. 2023 Dec 15:2023.12.06.23299464. doi: 10.1101/2023.12.06.23299464.

DOI:10.1101/2023.12.06.23299464

PMID:38106089

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10723568/

Abstract

BACKGROUND

Randomized clinical trials (RCTs) are designed to produce evidence in selected populations. Assessing their effects in the real world is essential to changing medical practice; however, key populations have historically been underrepresented in RCTs. We define an approach to simulate RCT-based effects in real-world settings using RCT digital twins that reflect the covariate patterns in an electronic health record (EHR).

METHODS

We developed a Generative Adversarial Network (GAN) model, RCT-Twin-GAN, which generates a digital twin of an RCT (RCT-Twin) conditioned on covariate distributions from an EHR cohort. We improved upon a traditional tabular conditional GAN, CTGAN, with a loss function adapted for data distributions and by conditioning on multiple discrete and continuous covariates simultaneously. We assessed the similarity between a Heart Failure with preserved Ejection Fraction (HFpEF) RCT (TOPCAT), a Yale HFpEF EHR cohort, and RCT-Twin. We also evaluated cardiovascular event-free survival stratified by spironolactone (treatment) use.
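To make the idea of conditioning on several discrete and continuous covariates at once more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names (`build_condition_vectors`, `generator_input`), the one-hot and z-score encodings, the toy data, and the noise dimension are all assumptions chosen for illustration. It only shows one plausible way to turn sampled EHR rows into condition vectors that could be concatenated with noise as generator input.

```python
import numpy as np

def build_condition_vectors(ehr_discrete, ehr_continuous, n_categories, n_samples, rng):
    """Sample joint covariate rows from an EHR cohort and encode them (hypothetical helper).

    ehr_discrete:   int array (n_patients, n_disc), category codes starting at 0
    ehr_continuous: float array (n_patients, n_cont)
    n_categories:   number of levels of each discrete covariate
    """
    # Sampling whole rows keeps the joint distribution of the covariates intact.
    idx = rng.integers(0, ehr_discrete.shape[0], size=n_samples)
    disc, cont = ehr_discrete[idx], ehr_continuous[idx]

    # One-hot encode each discrete covariate (assumed encoding).
    one_hot = [np.eye(k)[disc[:, j]] for j, k in enumerate(n_categories)]

    # Standardize continuous covariates with EHR cohort statistics (assumed encoding).
    mu = ehr_continuous.mean(axis=0)
    sd = ehr_continuous.std(axis=0)
    cont_std = (cont - mu) / np.where(sd > 0, sd, 1.0)

    return np.hstack(one_hot + [cont_std])

def generator_input(cond, noise_dim, rng):
    """Concatenate Gaussian noise with the condition vectors (hypothetical helper)."""
    noise = rng.standard_normal((cond.shape[0], noise_dim))
    return np.hstack([noise, cond])

# Toy data standing in for an EHR cohort; values are synthetic, not from the study.
rng = np.random.default_rng(0)
ehr_disc = rng.integers(0, 2, size=(500, 2))   # two binary covariates
ehr_cont = rng.normal(70, 10, size=(500, 3))   # three continuous covariates
cond = build_condition_vectors(ehr_disc, ehr_cont, [2, 2], n_samples=64, rng=rng)
z = generator_input(cond, noise_dim=16, rng=rng)
```

Sampling complete rows, rather than each covariate independently, is one simple way to preserve correlations among the conditioning covariates; the abstract does not say which strategy RCT-Twin-GAN actually uses.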

RESULTS

By applying RCT-Twin-GAN to 3445 TOPCAT participants and conditioning on 3445 Yale EHR HFpEF patients, we generated RCT-Twin datasets ranging from 1141 to 3445 patients in size, depending on covariate conditioning and model parameters. RCT-Twin randomly allocated spironolactone (S) and placebo (P) arms like an RCT, was similar to the RCT by a multi-dimensional distance metric, and had balanced covariates (median absolute standardized mean difference (MASMD) 0.017, IQR 0.0034-0.030). The 5 EHR-conditioned covariates in RCT-Twin were closer to the EHR cohort than to the RCT (MASMD 0.008 vs 0.63, IQR 0.005-0.018 vs 0.59-1.11). RCT-Twin reproduced the overall effect size seen in TOPCAT (5-year cardiovascular composite outcome odds ratio (95% confidence interval) of 0.89 (0.75-1.06) in RCT vs 0.85 (0.69-1.04) in RCT-Twin).
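The balance metric reported above, the median absolute standardized mean difference (MASMD), can be sketched as follows. This is a minimal illustration, assuming the common pooled-standard-deviation definition of the standardized mean difference; the abstract does not state which variant the authors used, and the function names and toy values are hypothetical.

```python
import numpy as np

def abs_smd(x_a, x_b):
    """Absolute standardized mean difference with a pooled standard deviation (assumed definition)."""
    x_a, x_b = np.asarray(x_a, float), np.asarray(x_b, float)
    pooled_sd = np.sqrt((x_a.var(ddof=1) + x_b.var(ddof=1)) / 2.0)
    if pooled_sd == 0:
        return 0.0  # both groups are constant; treat as perfectly balanced
    return abs(x_a.mean() - x_b.mean()) / pooled_sd

def median_abs_smd(cohort_a, cohort_b):
    """Median and interquartile range of per-covariate absolute SMDs.

    cohort_a, cohort_b: dicts mapping covariate name -> 1-D array of values.
    """
    smds = [abs_smd(cohort_a[name], cohort_b[name]) for name in cohort_a]
    q1, med, q3 = np.percentile(smds, [25, 50, 75])
    return med, (q1, q3)

# Toy covariates (synthetic values, not from the study).
arm_s = {"age": [1.0, 2.0, 3.0], "bmi": [5.0, 6.0, 7.0]}
arm_p = {"age": [2.0, 3.0, 4.0], "bmi": [5.0, 6.0, 7.0]}
masmd, iqr = median_abs_smd(arm_s, arm_p)
```

The same function applies both to comparing treatment arms within RCT-Twin and to comparing RCT-Twin covariates against the EHR cohort or the original RCT; values near zero indicate closely matched distributions.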

CONCLUSIONS

RCT-Twin-GAN simulates RCT-derived effects in real-world patients by translating these effects to the covariate distributions of EHR patients. This key methodological advance may enable the direct translation of RCT-derived effects into real-world patient populations and may enable causal inference in real-world settings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161e/10727763/2634224eed57/nihpp-2023.12.06.23299464v2-f0001.jpg

Similar articles

A Novel Digital Twin Strategy to Examine the Implications of Randomized Clinical Trials for Real-World Populations.

medRxiv. 2024 Sep 6:2024.03.25.24304868. doi: 10.1101/2024.03.25.24304868.

Computational Phenomapping of Randomized Clinical Trials to Enable Assessment of their Real-world Representativeness and Personalized Inference.

medRxiv. 2025 Jan 24:2024.05.15.24306285. doi: 10.1101/2024.05.15.24306285.

Computational Phenomapping of Randomized Clinical Trial Participants to Enable Assessment of Their Real-World Representativeness and Personalized Inference.

Circ Cardiovasc Qual Outcomes. 2025 May;18(5):e011306. doi: 10.1161/CIRCOUTCOMES.124.011306. Epub 2025 Apr 22.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Assessment of a Clinical Trial-Derived Survival Model in Patients With Metastatic Castration-Resistant Prostate Cancer.

JAMA Netw Open. 2021 Jan 4;4(1):e2031730. doi: 10.1001/jamanetworkopen.2020.31730.

Tabular transformer generative adversarial network for heterogeneous distribution in healthcare.

Sci Rep. 2025 Mar 25;15(1):10254. doi: 10.1038/s41598-025-93077-3.

A composite metric for predicting benefit from spironolactone in heart failure with preserved ejection fraction.

ESC Heart Fail. 2021 Oct;8(5):3495-3503. doi: 10.1002/ehf2.13523. Epub 2021 Aug 8.

Generating synthetic mixed-type longitudinal electronic health records for artificial intelligent applications.

NPJ Digit Med. 2023 May 27;6(1):98. doi: 10.1038/s41746-023-00834-7.

Enhanced Conditional GAN for High-Quality Synthetic Tabular Data Generation in Mobile-Based Cardiovascular Healthcare.

Sensors (Basel). 2024 Nov 30;24(23):7673. doi: 10.3390/s24237673.
