
RCT-Twin-GAN Generates Digital Twins of Randomized Control Trials Adapted to Real-world Patients to Enhance their Inference and Application.

Author information

Thangaraj Phyllis M, Shankar Sumukh Vasisht, Oikonomou Evangelos K, Khera Rohan

Affiliations

Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA.

Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.

Publication information

medRxiv. 2023 Dec 15:2023.12.06.23299464. doi: 10.1101/2023.12.06.23299464.

Abstract

BACKGROUND

Randomized clinical trials (RCTs) are designed to produce evidence in selected populations. Assessing their effects in the real world is essential to changing medical practice; however, key populations have historically been underrepresented in RCTs. We define an approach to simulate RCT-based effects in real-world settings using RCT digital twins that reflect the covariate patterns in an electronic health record (EHR).

METHODS

We developed a generative adversarial network (GAN) model, RCT-Twin-GAN, which generates a digital twin of an RCT (RCT-Twin) conditioned on covariate distributions from an EHR cohort. We improved upon a traditional tabular conditional GAN, CTGAN, with a loss function adapted for data distributions and by conditioning on multiple discrete and continuous covariates simultaneously. We assessed the similarity among a heart failure with preserved ejection fraction (HFpEF) RCT (TOPCAT), a Yale HFpEF EHR cohort, and RCT-Twin. We also evaluated cardiovascular event-free survival stratified by spironolactone (treatment) use.
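
To make "conditioning on multiple discrete and continuous covariates simultaneously" more concrete, here is a minimal sketch of how a generator input could be assembled. It is not the authors' implementation: the abstract does not describe the network, the loss function, or the five conditioned covariates, so the function names, the covariates (`sex`, `diabetes`, `age`), the one-hot and z-score encodings, and the toy cohort are all assumptions made for illustration. The one idea it shows is that drawing a whole EHR patient row at a time keeps the joint covariate pattern intact.

```python
import random
import statistics


def build_condition_vector(patient, discrete_levels, continuous_stats):
    """Encode one patient as one-hot discrete values plus z-scored continuous values."""
    vec = []
    for name, levels in discrete_levels.items():
        vec.extend(1.0 if patient[name] == level else 0.0 for level in levels)
    for name, (mean, sd) in continuous_stats.items():
        vec.append((patient[name] - mean) / sd)
    return vec


def sample_generator_inputs(ehr_cohort, discrete_levels, continuous_cols,
                            n, noise_dim, seed=0):
    """Return n generator inputs, each a noise vector followed by a condition vector."""
    rng = random.Random(seed)
    continuous_stats = {}
    for col in continuous_cols:
        values = [p[col] for p in ehr_cohort]
        continuous_stats[col] = (statistics.mean(values), statistics.pstdev(values))
    inputs = []
    for _ in range(n):
        # Draw a full row so all covariates are conditioned on jointly.
        patient = rng.choice(ehr_cohort)
        condition = build_condition_vector(patient, discrete_levels, continuous_stats)
        noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
        inputs.append(noise + condition)
    return inputs


# Toy cohort with invented values; not study data.
toy_ehr = [
    {"sex": "F", "diabetes": "yes", "age": 78.0},
    {"sex": "M", "diabetes": "no", "age": 66.0},
    {"sex": "F", "diabetes": "no", "age": 84.0},
]
toy_levels = {"sex": ["F", "M"], "diabetes": ["no", "yes"]}
toy_inputs = sample_generator_inputs(toy_ehr, toy_levels, ["age"], n=4, noise_dim=3)
```

In a real conditional GAN, vectors like these would be fed to the generator, and the discriminator would typically receive the same condition alongside each real or generated row.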

RESULTS

By applying RCT-Twin-GAN to 3445 TOPCAT participants and conditioning on 3445 Yale EHR HFpEF patients, we generated RCT-Twin datasets ranging from 1141 to 3445 patients in size, depending on covariate conditioning and model parameters. RCT-Twin randomly allocated patients to spironolactone (S) and placebo (P) arms like an RCT, was similar to the RCT by a multi-dimensional distance metric, and had balanced covariates (median absolute standardized mean difference (MASMD) 0.017, IQR 0.0034-0.030). The 5 EHR-conditioned covariates in RCT-Twin were closer to the EHR than the corresponding RCT covariates were (MASMD 0.008 vs 0.63, IQR 0.005-0.018 vs 0.59-1.11). RCT-Twin reproduced the overall effect size seen in TOPCAT (5-year cardiovascular composite outcome odds ratio (95% confidence interval) of 0.89 (0.75-1.06) in RCT vs 0.85 (0.69-1.04) in RCT-Twin).
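
The two summary statistics reported above can be illustrated with a short sketch. The abstract does not say how MASMD or the odds ratio was computed, so the choices here are assumptions: the standardized mean difference uses the pooled standard deviation of two groups for continuous covariates, and the odds ratio uses an unadjusted 2x2 table with a Wald confidence interval. The function names and toy numbers are hypothetical and are not study data.

```python
import math
import statistics


def standardized_mean_difference(a, b):
    """SMD for one continuous covariate, scaled by the pooled standard deviation."""
    pooled_sd = math.sqrt((statistics.variance(a) + statistics.variance(b)) / 2)
    if pooled_sd == 0:
        return 0.0
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def median_absolute_smd(group_a, group_b):
    """Median of |SMD| across covariates; each group maps covariate name to values."""
    abs_smds = [
        abs(standardized_mean_difference(group_a[name], group_b[name]))
        for name in group_a
    ]
    return statistics.median(abs_smds)


def odds_ratio_wald_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Unadjusted odds ratio with a Wald interval computed on the log-odds scale."""
    a, b = events_treated, n_treated - events_treated
    c, d = events_control, n_control - events_control
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Toy values invented for illustration only.
arm_s = {"age": [70.0, 72.0, 74.0], "sbp": [120.0, 130.0, 140.0]}
arm_p = {"age": [71.0, 73.0, 75.0], "sbp": [120.0, 130.0, 140.0]}
masmd = median_absolute_smd(arm_s, arm_p)
odds_ratio, ci_lower, ci_upper = odds_ratio_wald_ci(10, 100, 20, 100)
```

A small MASMD means the two groups have similar covariate means relative to their spread. A confidence interval that includes 1, as both reported intervals do, means the data are compatible with no difference in odds between arms.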

CONCLUSIONS

RCT-Twin-GAN simulates RCT-derived effects in real-world patients by translating these effects to the covariate distributions of EHR patients. This key methodological advance may enable the direct translation of RCT-derived effects into real-world patient populations and may enable causal inference in real-world settings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/161e/10727763/2634224eed57/nihpp-2023.12.06.23299464v2-f0001.jpg
