Handling missing predictor values when validating and applying a prediction model to new patients.

Authors

Hoogland Jeroen, van Barreveld Marit, Debray Thomas P A, Reitsma Johannes B, Verstraelen Tom E, Dijkgraaf Marcel G W, Zwinderman Aeilko H

Affiliations

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.

Department of Clinical Epidemiology, Biostatistics, & Bioinformatics, Academic Medical Center, Amsterdam University Medical Centers, Amsterdam, The Netherlands.

Publication information

Stat Med. 2020 Nov 10;39(25):3591-3607. doi: 10.1002/sim.8682. Epub 2020 Jul 20.

DOI: 10.1002/sim.8682
PMID: 32687233
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7586995/
Abstract

Missing data present challenges for development and real-world application of clinical prediction models. While these challenges have received considerable attention in the development setting, there is only sparse research on the handling of missing data in applied settings. The main unique feature of handling missing data in these settings is that missing data methods have to be performed for a single new individual, precluding direct application of mainstay methods used during model development. Correspondingly, we propose that it is desirable to perform model validation using missing data methods that transfer to practice in single new patients. This article compares existing and new methods to account for missing data for a new individual in the context of prediction. These methods are based on (i) submodels based on observed data only, (ii) marginalization over the missing variables, or (iii) imputation based on fully conditional specification (also known as chained equations). They were compared in an internal validation setting to highlight the use of missing data methods that transfer to practice while validating a model. As a reference, they were compared to the use of multiple imputation by chained equations in a set of test patients, because this has been used in validation studies in the past. The methods were evaluated in a simulation study where performance was measured by means of optimism corrected C-statistic and mean squared prediction error. Furthermore, they were applied in data from a large Dutch cohort of prophylactic implantable cardioverter defibrillator patients.
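To make the single-patient setting concrete, the sketch below contrasts two ways of producing a risk for one new patient with a missing predictor: plugging in the conditional mean of the missing value (a simple single imputation) and marginalizing the model's prediction over the conditional distribution of the missing value, in the spirit of approach (ii). It is not the authors' implementation. It assumes, purely for illustration, a logistic prediction model with made-up coefficients (`beta0`, `beta`) and predictors that are multivariate normal with a mean `mu` and covariance `sigma` stored from the development data; all function and variable names are hypothetical. Approach (i) is not shown because it would require refitting a separate submodel on the observed predictors only.

```python
import numpy as np


def conditional_normal(mu, sigma, x, observed):
    """Mean and covariance of the missing entries of x given the observed
    ones, assuming x follows a multivariate normal N(mu, sigma)."""
    obs = np.asarray(observed, dtype=bool)
    mis = ~obs
    s_oo = sigma[np.ix_(obs, obs)]
    s_mo = sigma[np.ix_(mis, obs)]
    s_mm = sigma[np.ix_(mis, mis)]
    weights = s_mo @ np.linalg.inv(s_oo)
    cond_mean = mu[mis] + weights @ (x[obs] - mu[obs])
    cond_cov = s_mm - weights @ s_mo.T
    return cond_mean, cond_cov


def predict_risk(beta0, beta, x):
    """Predicted probability from a logistic model (x may be 1-D or 2-D)."""
    return 1.0 / (1.0 + np.exp(-(beta0 + x @ beta)))


def risk_conditional_mean(beta0, beta, mu, sigma, x, observed):
    """Fill missing values with their conditional mean, then predict once."""
    obs = np.asarray(observed, dtype=bool)
    cond_mean, _ = conditional_normal(mu, sigma, x, obs)
    filled = x.copy()
    filled[~obs] = cond_mean
    return float(predict_risk(beta0, beta, filled))


def risk_marginalized(beta0, beta, mu, sigma, x, observed,
                      n_draws=20000, seed=0):
    """Average the predicted risk over Monte Carlo draws of the missing
    values from their conditional distribution given the observed values."""
    obs = np.asarray(observed, dtype=bool)
    rng = np.random.default_rng(seed)
    cond_mean, cond_cov = conditional_normal(mu, sigma, x, obs)
    draws = rng.multivariate_normal(cond_mean, cond_cov, size=n_draws)
    filled = np.tile(x, (n_draws, 1))
    filled[:, ~obs] = draws
    return float(predict_risk(beta0, beta, filled).mean())


# Hypothetical quantities stored at model development time.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
beta0, beta = -1.0, np.array([1.0, 2.0])

# One new patient: first predictor observed, second predictor missing.
x_new = np.array([1.0, np.nan])
observed = [True, False]

risk_cm = risk_conditional_mean(beta0, beta, mu, sigma, x_new, observed)
risk_marg = risk_marginalized(beta0, beta, mu, sigma, x_new, observed)
print(f"conditional-mean imputation: {risk_cm:.3f}")
print(f"marginalized prediction:     {risk_marg:.3f}")
```

Because the logistic function is nonlinear, the two numbers generally differ: plugging in a single best guess ignores the uncertainty about the missing value, whereas marginalization averages the risk over it. Both routes need only the new patient's observed values plus summaries saved from the development data, which is what makes them usable for a single individual.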

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/781e/7586995/6a62f5e7bf71/SIM-39-3591-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/781e/7586995/412f5f95187f/SIM-39-3591-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/781e/7586995/28bb1368ea56/SIM-39-3591-g002.jpg
