Can we develop real-world prognostic models using observational healthcare data? Large-scale experiment to investigate model sensitivity to database and phenotypes.

Author information

Reps Jenna M, Rijnbeek Peter R, Ryan Patrick B

Affiliations

Johnson & Johnson, Raritan, NJ, USA.

Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands.

Publication information

Diagn Progn Res. 2025 Apr 17;9(1):10. doi: 10.1186/s41512-025-00191-x.

Abstract

BACKGROUND

Large observational healthcare databases are frequently used to develop models intended for real-world clinical practice. For example, these databases were used to develop COVID severity models that guided interventions such as whom to prioritize for vaccination during the pandemic. However, the clinical setting and observational databases often differ in the types of patients (case mix), and identifying patients with medical conditions (phenotyping) in these databases is a nontrivial process. In this study, we investigate how sensitive a model's performance is to the choice of development database, population, and outcome phenotype.

METHODS

We developed more than 450 logistic regression models for nine prediction tasks across seven databases, using a range of suitable population and outcome phenotypes. Performance stability within each task was calculated by applying each model to data created by permuting the database, population, or outcome phenotype. We investigated the stability of performance (AUROC, scaled Brier score, and calibration-in-the-large) and of individual risk estimates.
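To make the three performance measures concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names are hypothetical, the scaled Brier score is taken as 1 minus the Brier score divided by the Brier score of a model that always predicts the observed outcome rate, and calibration-in-the-large is taken as mean predicted risk minus observed outcome rate, which is one common definition and may differ from the one used in the study.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random case is ranked above a random non-case.

    Tied predictions count as half a win (Mann-Whitney formulation).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def scaled_brier(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """1 - Brier / Brier_max (assumed definition; 1 is perfect, 0 is uninformative)."""
    n = len(y_true)
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n
    rate = sum(y_true) / n
    brier_max = rate * (1.0 - rate)  # Brier score of always predicting the outcome rate
    if brier_max == 0.0:
        raise ValueError("scaled Brier is undefined when all outcomes are identical")
    return 1.0 - brier / brier_max


def calibration_in_the_large(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean predicted risk minus observed outcome rate (assumed definition; 0 is ideal)."""
    n = len(y_true)
    return sum(y_prob) / n - sum(y_true) / n


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    y = [0, 0, 1, 1]
    p = [0.1, 0.4, 0.35, 0.8]
    print(auroc(y, p), scaled_brier(y, p), calibration_in_the_large(y, p))
```

Because AUROC depends only on the ranking of predictions, adding a constant to every predicted risk leaves it unchanged while moving calibration-in-the-large, which is why discrimination and calibration are assessed separately.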

RESULTS

In general, changing the outcome definition or population phenotype had little impact on validation discrimination. However, validation discrimination was unstable when the database changed. Calibration and Brier performance were unstable when the population, outcome definition, or database changed. This instability may be problematic if a model developed using observational data is implemented in a real-world setting.

CONCLUSIONS

These results highlight the importance of validating a model developed using observational data in the clinical setting before using it for decision-making. Calibration and Brier score should be evaluated to prevent miscalibrated risk estimates from being used to aid clinical decisions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb7a/12004590/25d100e8ec74/41512_2025_191_Fig1_HTML.jpg