Genesis Research Group, 111 River St, Ste 1120, Hoboken, NJ, 07030, USA.
BMC Med Res Methodol. 2024 Sep 13;24(1):203. doi: 10.1186/s12874-024-02313-3.
Evaluating outcome reliability is critical in real-world evidence studies. Overall survival is a common outcome in these studies; however, its capture in real-world data (RWD) sources is often incomplete and is supplemented with linked mortality information from external sources. Conflicting recommendations exist for censoring overall survival in real-world evidence studies. This simulation study aimed to understand the impact of different censoring methods on estimates of median survival and log hazard ratios when external mortality information is only partially captured.
We used Monte Carlo simulation to emulate a non-randomized comparative effectiveness study of two treatments with RWD from electronic health records and linked external mortality data. We simulated the time to death, the time to last database activity, and the time to data cutoff. Death events after the last database activity were attributed to linked external mortality data and randomly set to missing to reflect the sensitivity of contemporary real-world data sources. Two schemes for censoring patients without an observed death were evaluated: (1) censoring at the last activity date and (2) censoring at the end of data availability (data cutoff). We assessed the performance of each method in estimating median survival and log hazard ratios using bias, coverage, variance, and rejection rate under varying amounts of incomplete mortality information and varying treatment effects, lengths of follow-up, and sample sizes.
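The data-generating steps and the two censoring schemes described above can be sketched as follows. This is a minimal single-arm illustration, not the authors' simulation code. The exponential distributions, the 12-month and 9-month medians, the 36-month cutoff, and all function names (`simulate_arm`, `apply_censoring`, `km_median`, `compare_schemes`) are assumptions introduced here for illustration only.

```python
import numpy as np

LN2 = np.log(2.0)

def simulate_arm(n, median_death, median_activity, cutoff, sensitivity, rng):
    """Simulate one arm; exponential times are an illustrative assumption."""
    t_death = rng.exponential(median_death / LN2, size=n)
    # Last database activity cannot occur after the data cutoff.
    t_last = np.minimum(rng.exponential(median_activity / LN2, size=n), cutoff)
    # Assumption: deaths on or before the last activity are seen in the database.
    in_database = t_death <= t_last
    # Deaths after last activity but before cutoff can only come from linked data,
    # which captures each death with probability `sensitivity`.
    linked = (t_death > t_last) & (t_death <= cutoff)
    captured = rng.random(n) < sensitivity
    observed = in_database | (linked & captured)
    return t_death, t_last, observed

def apply_censoring(t_death, t_last, observed, cutoff, scheme):
    """Return (time, event) under one of the two censoring schemes."""
    if scheme == "last_activity":
        censor_time = t_last
    elif scheme == "data_cutoff":
        censor_time = np.full(t_death.shape, cutoff)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    time = np.where(observed, t_death, censor_time)
    return time, observed.astype(int)

def km_median(time, event):
    """Kaplan-Meier median: first event time at which survival drops to <= 0.5."""
    surv = 1.0
    for t in np.unique(time[event == 1]):
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & (event == 1))
        surv *= 1.0 - deaths / at_risk
        if surv <= 0.5:
            return float(t)
    return float("nan")  # median not reached

def compare_schemes(sensitivity, n=2000, seed=0):
    """Estimate median survival under both schemes on the same simulated cohort."""
    rng = np.random.default_rng(seed)
    cutoff = 36.0  # hypothetical months from index date to data cutoff
    t_death, t_last, observed = simulate_arm(n, 12.0, 9.0, cutoff, sensitivity, rng)
    return {
        scheme: km_median(*apply_censoring(t_death, t_last, observed, cutoff, scheme))
        for scheme in ("last_activity", "data_cutoff")
    }

if __name__ == "__main__":
    for s in (1.0, 0.8, 0.5):
        print(s, compare_schemes(s))
```

Under these assumptions the true median is 12 months, so comparing the two estimates across values of `sensitivity` gives a rough sense of how each scheme responds to missing linked deaths. A fuller replication would repeat this over many seeds and add a second arm to study the log hazard ratio.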
When mortality information was fully captured, median survival estimates were unbiased when censoring at data cutoff and underestimated when censoring at the last activity date. When linked mortality information was missing, censoring at the last activity date underestimated the median survival, while censoring at the data cutoff overestimated it. As the amount of missing linked mortality information increased, bias decreased when censoring at the last activity date and increased when censoring at data cutoff.
Researchers should consider the completeness of linked external mortality information when choosing how to censor overall survival in analyses using RWD. Substantial bias in median survival estimates can occur if an inappropriate censoring scheme is selected. We advocate for RWD providers to perform validation studies of their mortality data and publish their findings to better inform methodological decisions.