Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures?

Author information

1 Research Institute for Primary Care and Health Sciences, Keele University, Staffordshire, UK.

2 Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.

Publication information

Stat Methods Med Res. 2018 Nov;27(11):3505-3522. doi: 10.1177/0962280217705678. Epub 2017 May 8.

DOI:10.1177/0962280217705678

PMID:28480827

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6193210/

Abstract

If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
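
The recommendation above can be illustrated with a minimal sketch: transform each study's C-statistic to the logit scale and each E/O ratio to the log scale, pool on that scale with a random-effects model, then back-transform the summary. This is an illustration under stated assumptions, not the authors' code. The DerSimonian-Laird estimator is used here only because it is compact (the abstract does not say which estimator the authors used), the standard errors on the transformed scale come from a delta-method approximation, and all function names and input values are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit; maps the real line back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def to_logit_scale(c, se_c):
    """C-statistic to logit scale; delta method: SE ~= SE(c) / (c * (1 - c))."""
    return logit(c), se_c / (c * (1.0 - c))


def to_log_scale(eo, se_eo):
    """E/O ratio to log scale; delta method: SE ~= SE(E/O) / (E/O)."""
    return math.log(eo), se_eo / eo


def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling (DerSimonian-Laird; an assumed, simple estimator).

    Returns the pooled mean, its standard error, and the between-study
    variance tau^2, all on the scale of the inputs.
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mu = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_mu = math.sqrt(1.0 / sum(w_star))
    return mu, se_mu, tau2


def pool_on_transformed_scale(values, std_errors, transform, back_transform):
    """Transform, pool, and back-transform the summary and its 95% CI."""
    transformed = [transform(v, se) for v, se in zip(values, std_errors)]
    y = [t[0] for t in transformed]
    s = [t[1] for t in transformed]
    mu, se_mu, tau2 = dersimonian_laird(y, s)
    return {
        "estimate": back_transform(mu),
        "ci_lower": back_transform(mu - 1.96 * se_mu),
        "ci_upper": back_transform(mu + 1.96 * se_mu),
        "tau2": tau2,  # between-study variance on the transformed scale
    }


# Hypothetical per-study results (not taken from the paper).
c_stats = [0.72, 0.68, 0.81, 0.75, 0.64]
c_ses = [0.020, 0.030, 0.025, 0.015, 0.040]
eo_ratios = [0.95, 1.20, 0.80, 1.05, 1.40]
eo_ses = [0.08, 0.10, 0.07, 0.06, 0.15]

print(pool_on_transformed_scale(c_stats, c_ses, to_logit_scale, inv_logit))
print(pool_on_transformed_scale(eo_ratios, eo_ses, to_log_scale, math.exp))
```

Pooling on the transformed scale keeps the back-transformed interval inside the valid range: (0, 1) for the C-statistic and positive values for E/O. The reported `tau2` is on the logit or log scale, so it should not be read as a variance of the untransformed measure.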

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c5b7/6193210/ce35afc66941/10.1177_0962280217705678-fig1.jpg
