Pitfalls of single-study external validation illustrated with a model predicting functional outcome after aneurysmal subarachnoid hemorrhage.

Affiliations

Department of Neurology, Erasmus MC University Medical Center Rotterdam, 40 Doctor Molewaterplein, P.O. Box 2040, Rotterdam, Zuid-Holland, 3015 GD, The Netherlands.

Department of Public Health, Erasmus MC University Medical Center Rotterdam, Rotterdam, Zuid-Holland, The Netherlands.

Publication information

BMC Med Res Methodol. 2024 Aug 8;24(1):176. doi: 10.1186/s12874-024-02280-9.

DOI: 10.1186/s12874-024-02280-9
PMID: 39118007
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11308226/

Abstract

BACKGROUND

Prediction models are often externally validated with data from a single study or cohort. However, the interpretation of performance estimates obtained with single-study external validation is not as straightforward as assumed. We aimed to illustrate this by conducting a large number of external validations of a prediction model for functional outcome in subarachnoid hemorrhage (SAH) patients.

METHODS

We used data from the Subarachnoid Hemorrhage International Trialists (SAHIT) data repository (n = 11,931, 14 studies) to refit the SAHIT model for predicting a dichotomous functional outcome (favorable versus unfavorable), measured with the (extended) Glasgow Outcome Scale or modified Rankin Scale score at a minimum of three months after discharge. We performed leave-one-cluster-out cross-validation to mimic the process of multiple single-study external validations, with each study representing one cluster. In each of these validations, we assessed discrimination with Harrell's c-statistic and calibration with calibration plots, intercepts, and slopes. We used random effects meta-analysis to obtain the (reference) mean performance estimates and the between-study heterogeneity (I² statistic). The influence of case-mix variation on discriminative performance was assessed with the model-based c-statistic, and we fitted a "membership model" to obtain a gross estimate of transportability.
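
The leave-one-cluster-out procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are simulated, the model has a single hypothetical predictor `x` rather than the SAHIT predictors, and all function names (`fit_logistic`, `c_statistic`, `calibration_intercept`, `leave_one_cluster_out`, `simulate_studies`) are invented for this example. For a binary outcome, the c-statistic reduces to the share of event/non-event pairs that the model ranks correctly. The calibration slope is taken here as the coefficient from regressing the outcome on the linear predictor, and the calibration intercept is estimated with the linear predictor as a fixed offset.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, iters=25):
    """Fit logit P(y=1) = a + b*x by Newton-Raphson; returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def calibration_intercept(lp, y, iters=25):
    """Intercept of a logistic model with the linear predictor as offset (slope fixed at 1)."""
    a = 0.0
    for _ in range(iters):
        ps = [sigmoid(a + l) for l in lp]
        grad = sum(yi - p for yi, p in zip(y, ps))
        hess = sum(p * (1.0 - p) for p in ps)
        a += grad / hess
    return a


def c_statistic(scores, y):
    """Share of (event, non-event) pairs ranked correctly; ties count one half."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


def leave_one_cluster_out(data):
    """data maps study name -> (x values, y values); each study is held out once."""
    results = {}
    for held_out in data:
        x_train = [v for s, (xs, _) in data.items() if s != held_out for v in xs]
        y_train = [v for s, (_, ys) in data.items() if s != held_out for v in ys]
        a, b = fit_logistic(x_train, y_train)      # refit without the held-out study
        x_val, y_val = data[held_out]
        lp = [a + b * xi for xi in x_val]          # linear predictor in held-out study
        _, slope = fit_logistic(lp, y_val)
        results[held_out] = {
            "c": c_statistic(lp, y_val),
            "intercept": calibration_intercept(lp, y_val),
            "slope": slope,
        }
    return results


def simulate_studies(n_studies=5, n_per_study=300, seed=1):
    """Hypothetical data: each study gets its own baseline risk shift."""
    rng = random.Random(seed)
    data = {}
    for k in range(n_studies):
        shift = rng.gauss(0.0, 0.5)
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        ys = [1 if rng.random() < sigmoid(shift + xi) else 0 for xi in xs]
        data[f"study_{k}"] = (xs, ys)
    return data


if __name__ == "__main__":
    for study, perf in leave_one_cluster_out(simulate_studies()).items():
        print(study, {k: round(v, 2) for k, v in perf.items()})
```

Each pass through the loop corresponds to one single-study external validation, so the spread of the per-study results is what a reader would see if only one of those studies had been available.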

RESULTS

Across 14 single-study external validations, model performance was highly variable. The mean c-statistic was 0.74 (95% CI 0.70 to 0.78, range 0.52 to 0.84, I² = 0.92), the mean intercept was -0.06 (95% CI -0.37 to 0.24, range -1.40 to 0.75, I² = 0.97), and the mean slope was 0.96 (95% CI 0.78 to 1.13, range 0.53 to 1.31, I² = 0.90). The decrease in discriminative performance was attributable to case-mix variation, between-study heterogeneity, or a combination of both. Incidentally, we observed poor generalizability or transportability of the model.
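
As an illustration of how per-study estimates could be pooled into a mean with a heterogeneity measure, the sketch below implements a DerSimonian-Laird random effects meta-analysis. It rests on several assumptions: the abstract does not say which estimator the authors used, the heterogeneity statistic reported as I above is taken to be the conventional I² statistic, and the function name `random_effects_pool` and the input values in the usage example are hypothetical. In practice c-statistics are often pooled on the logit scale; that transformation is omitted here for brevity.

```python
import math


def random_effects_pool(estimates, std_errors):
    """DerSimonian-Laird random effects pooling of per-study estimates."""
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]               # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                          # between-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mean = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_mean = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0          # share of variation due to heterogeneity
    return {
        "mean": mean,
        "ci": (mean - 1.96 * se_mean, mean + 1.96 * se_mean),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical per-study c-statistics and standard errors, not taken from the paper.
    print(random_effects_pool([0.62, 0.74, 0.81, 0.70], [0.02, 0.03, 0.02, 0.04]))
```

An I² close to 1 means that nearly all of the observed spread between studies reflects true differences rather than sampling error, which is why a single validation study can land far from the pooled mean.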

CONCLUSIONS

We demonstrate two potential pitfalls in the interpretation of model performance with single-study external validation: (1) model performance is highly variable and depends on the choice of validation data, and (2) no insight is provided into the generalizability or transportability of the model, which is needed to guide local implementation. As such, a single-study external validation can easily be misinterpreted and lead to a false appreciation of the clinical prediction model. Cross-validation is better equipped to address these pitfalls.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/60ea/11308226/ec2da5a612ff/12874_2024_2280_Fig1_HTML.jpg
