Combining Missing Data Imputation and Internal Validation in Clinical Risk Prediction Models.

Authors

Mi Junhui, Tendulkar Rahul D, Sittenfeld Sarah M C, Patil Sujata, Zabor Emily C

Affiliations

Department of Quantitative Health Sciences, Cleveland Clinic Research, Cleveland, Ohio, USA.

Department of Radiation Oncology, Taussig Cancer Institute, Cleveland Clinic, Cleveland, Ohio, USA.

Publication information

Stat Med. 2025 Aug;44(18-19):e70203. doi: 10.1002/sim.70203.

DOI: 10.1002/sim.70203

PMID: 40772740

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12330338/

Abstract

Methods to handle missing data have been extensively explored in the context of estimation and descriptive studies, with multiple imputation being the most widely used method in clinical research. However, in the context of clinical risk prediction models, where the goal is often to achieve high prediction accuracy and to make predictions for future patients, there are different considerations regarding the handling of missing covariate data. As a result, deterministic imputation is better suited to the setting of clinical risk prediction models, since the outcome is not included in the imputation model and the imputation method can be easily applied to future patients. In this paper, we provide a tutorial demonstrating how to conduct bootstrapping followed by deterministic imputation of missing covariate data to construct and internally validate the performance of a clinical risk prediction model in the presence of missing data. Simulation study results are provided to help guide when imputation may be appropriate in real-world applications.
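The workflow named in the abstract (bootstrap first, then deterministic imputation inside each resample, then internal validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: median imputation stands in for whatever deterministic imputation model the tutorial uses, a Harrell-style optimism correction is assumed as the internal validation scheme, the logistic model is fitted by plain gradient descent, and the data are simulated. All function names are hypothetical.

```python
import math
import random
import statistics

def fit_medians(rows):
    """Learn one median per covariate from observed (non-None) values only."""
    p = len(rows[0])
    return [statistics.median([r[j] for r in rows if r[j] is not None]) for j in range(p)]

def impute(rows, medians):
    """Deterministic imputation: the outcome is never used, so the same rule
    can be applied to a future patient with missing covariates."""
    return [[m if v is None else v for v, m in zip(r, medians)] for r in rows]

def predict_one(w, xi):
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, steps=200, lr=0.1):
    """Plain gradient-descent logistic regression (intercept plus slopes)."""
    p, n = len(X[0]), len(X)
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_one(w, xi) - yi
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def auc(scores, y):
    """Concordance (AUC); assumes both outcome classes are present."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def develop(rows, y):
    """Learn the imputation rule and the prediction model on the same data."""
    medians = fit_medians(rows)
    return medians, fit_logistic(impute(rows, medians), y)

def evaluate(model, rows, y):
    medians, w = model
    return auc([predict_one(w, xi) for xi in impute(rows, medians)], y)

def optimism_corrected_auc(rows, y, n_boot=20, seed=0):
    """Assumed scheme: Harrell-style optimism correction, with imputation
    re-learned inside every bootstrap sample."""
    rng = random.Random(seed)
    n = len(rows)
    apparent = evaluate(develop(rows, y), rows, y)
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows, b_y = [rows[i] for i in idx], [y[i] for i in idx]
        model_b = develop(b_rows, b_y)
        optimism.append(evaluate(model_b, b_rows, b_y) - evaluate(model_b, rows, y))
    return apparent, apparent - statistics.mean(optimism)

def simulate(n=200, seed=1, miss=0.2):
    """Simulated data: two covariates, the second missing completely at random."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = 1.0 / (1.0 + math.exp(-(x1 + 0.5 * x2)))
        y.append(1 if rng.random() < p else 0)
        rows.append([x1, None if rng.random() < miss else x2])
    return rows, y

rows, y = simulate()
apparent, corrected = optimism_corrected_auc(rows, y)
print(f"apparent AUC={apparent:.3f}, optimism-corrected AUC={corrected:.3f}")
```

The point of the ordering is that the imputation rule is treated as part of the model: it is re-estimated in each bootstrap sample and then applied unchanged to the original data, so the validation reflects how the full pipeline would behave for a new patient.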
