Bayer Oy, Tuulikuja 2, 02100, Espoo, Finland.
HUS Helsinki University Hospital, Helsinki, Finland.
Sci Rep. 2024 Jan 19;14(1):1731. doi: 10.1038/s41598-024-51938-3.
A growing body of research focuses on real-world data (RWD) to supplement or replace randomized controlled trials (RCTs). However, because the two kinds of data are generated by different mechanisms, differences between them are likely and must be scrutinized before the datasets are merged. We compared the characteristics of RCT data from 5734 diabetic kidney disease patients with corresponding RWD from electronic health records (EHRs) of 23,523 patients. Demographics, diagnoses, medications, laboratory measurements, and vital signs were analyzed using visualization, statistical comparison, and cluster analysis. RCT and RWD sets exhibited significant differences in prevalence, longitudinality, completeness, and sampling density. The cluster analysis revealed distinct patient subgroups within both RCT and RWD sets, as well as clusters containing patients from both sets. We stress the importance of validation to verify the feasibility of combining RCT and RWD, for instance, in building an external control arm. Our results highlight general differences between RCT and RWD sets, which should be considered during the planning stages of an RCT-RWD study. When these differences are taken into account, RWD has the potential to enrich RCT data by providing first-hand baseline data, filling in missing data, or subgrouping or matching individuals, all of which call for advanced methods to mitigate the differences between datasets.