Evaluation of the impact of defining observable time in real-world data on outcome incidence.

Author information

Blacketer Clair, DeFalco Frank J, Conover Mitchell M, Ryan Patrick B, Schuemie Martijn J, Rijnbeek Peter R

Affiliations

Coordinating Center, Observational Health Data Sciences and Informatics (OHDSI), New York, NY, 10032, United States.

Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, 3015 GD, The Netherlands.

Publication information

J Am Med Inform Assoc. 2025 Sep 1;32(9):1434-1444. doi: 10.1093/jamia/ocaf119.

Abstract

OBJECTIVE

In real-world data (RWD), defining the observation period (the time during which a patient is considered observable) is critical for estimating incidence rates (IRs) and other outcomes. Yet, in the absence of explicit enrollment information, this period must often be inferred, introducing potential bias.

MATERIALS AND METHODS

This study evaluates methods for defining observation periods and their impact on IR estimates across multiple database types. We applied 3 methods for defining observation periods: (1) a persistence + surveillance window approach, (2) an age- and gender-adjusted method based on time between healthcare events, and (3) the min/max method. These were tested across 11 RWD databases, including both enrollment-based and encounter-based sources. Enrollment time was used as the reference standard in eligible databases. To assess the impact on epidemiologic results, we replicated a prior study of adverse event incidence, comparing IRs and calculating mean squared error between methods.
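To make the difference between two of these methods concrete, here is a minimal Python sketch. The abstract does not give the exact algorithms, so the logic below is an assumed reading: the min/max method takes one period from a patient's first to last recorded event, and the persistence + surveillance approach chains events that fall within a persistence window of each other and then extends each period by a surveillance window. The function names, the window lengths, and the event dates are hypothetical and are not taken from the study.

```python
from datetime import date, timedelta


def min_max_period(event_dates):
    """Min/max method: a single period from the first to the last event."""
    return [(min(event_dates), max(event_dates))]


def persistence_surveillance_periods(event_dates, persistence_days, surveillance_days):
    """Assumed reading of the persistence + surveillance approach.

    Events no more than `persistence_days` apart are chained into one
    period. Each period is extended by `surveillance_days` after its last
    event, but never past the day before the next period starts.
    """
    dates = sorted(event_dates)
    periods = []
    start = prev = dates[0]
    for d in dates[1:]:
        if (d - prev).days > persistence_days:
            end = min(prev + timedelta(days=surveillance_days), d - timedelta(days=1))
            periods.append((start, end))
            start = d
        prev = d
    periods.append((start, prev + timedelta(days=surveillance_days)))
    return periods


def person_days(periods):
    """Total observable days, counting both endpoints of each period."""
    return sum((end - start).days + 1 for start, end in periods)


# Hypothetical patient: two visits a month apart, then one visit 16 months later.
events = [date(2020, 1, 1), date(2020, 2, 1), date(2021, 6, 1)]

min_max = min_max_period(events)
windowed = persistence_surveillance_periods(events, persistence_days=180, surveillance_days=30)

print(person_days(min_max))   # the long gap between visits counts as observed
print(person_days(windowed))  # the long gap is excluded
```

In this toy case the min/max method treats the 16-month gap as observable time, while the windowed method splits the history into two short periods, which is the kind of difference in the person-time denominator that the study examines.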

RESULTS

Incidence rates decreased as observation periods lengthened, driven by increases in the person-time denominator. The persistence + surveillance method produced estimates closest to enrollment-based rates when appropriately balanced. The min/max approach yielded inconsistent results, particularly in encounter-based databases, with greater error observed in databases with longer time spans.
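The denominator effect can be illustrated with a short Python sketch. The event count and person-time values below are invented for illustration and are not results from the study. The abstract also does not say on what scale the mean squared error was computed, so the raw-rate version here is an assumption.

```python
def incidence_rate(events, person_years, per=1000.0):
    """Incidence rate expressed per `per` person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return per * events / person_years


def mean_squared_error(estimates, references):
    """Mean squared difference between estimated and reference rates."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((e - r) ** 2 for e, r in zip(estimates, references)) / len(estimates)


# Hypothetical: the same 50 outcome events under two observation period definitions.
ir_short = incidence_rate(events=50, person_years=10_000)  # shorter inferred periods
ir_long = incidence_rate(events=50, person_years=20_000)   # longer inferred periods

# Hypothetical enrollment-based reference rate of 5.0 per 1000 person-years.
mse = mean_squared_error([ir_short, ir_long], [5.0, 5.0])

print(ir_short, ir_long, mse)
```

With the event count held fixed, doubling the person-time halves the rate, so any definition that inflates observable time pulls the estimate below the enrollment-based reference.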

DISCUSSION

These findings suggest that assumptions about data completeness and population observability significantly affect incidence estimates. Observation period definitions substantially influence outcome measurement in RWD studies.

CONCLUSION

Standardized, transparent approaches are necessary to ensure valid, reproducible results, especially in databases lacking defined enrollment.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7ee/12361855/55101943b42a/ocaf119f1.jpg
