Calibration belt for quality-of-care assessment based on dichotomous outcomes.

Affiliation

Astrophysics Sector, Scuola Internazionale Superiore di Studi Avanzati and Istituto Nazionale di Fisica Nucleare Sezione di Trieste, Trieste, Italy.

Publication information

PLoS One. 2011 Feb 23;6(2):e16110. doi: 10.1371/journal.pone.0016110.

DOI:10.1371/journal.pone.0016110

PMID:21373178

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3043050/

Abstract

Prognostic models applied in medicine must be validated on independent samples, before their use can be recommended. The assessment of calibration, i.e., the model's ability to provide reliable predictions, is crucial in external validation studies. Besides having several shortcomings, statistical techniques such as the computation of the standardized mortality ratio (SMR) and its confidence intervals, the Hosmer-Lemeshow statistics, and the Cox calibration test, are all non-informative with respect to calibration across risk classes. Accordingly, calibration plots reporting expected versus observed outcomes across risk subsets have been used for many years. Erroneously, the points in the plot (frequently representing deciles of risk) have been connected with lines, generating false calibration curves. Here we propose a methodology to create a confidence band for the calibration curve based on a function that relates expected to observed probabilities across classes of risk. The calibration belt allows the ranges of risk to be spotted where there is a significant deviation from the ideal calibration, and the direction of the deviation to be indicated. This method thus offers a more analytical view in the assessment of quality of care, compared to other approaches.
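The idea of a confidence band around a calibration curve can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a first-degree logit-linear relation between expected and observed probabilities, fits it by Newton-Raphson, and builds a Wald-type simultaneous band from the chi-square quantile with 2 degrees of freedom. The function names (fit_logistic, calibration_belt) and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of logit(P(y=1)) = b0 + b1*x; returns (beta, cov)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        mu = expit(X @ beta)
        w = mu * (1.0 - mu)
        info = X.T @ (X * w[:, None])          # Fisher information matrix
        step = np.linalg.solve(info, X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = expit(X @ beta)
    info = X.T @ (X * (mu * (1.0 - mu))[:, None])
    return beta, np.linalg.inv(info)

def calibration_belt(p_pred, y, alpha=0.05, n_grid=200):
    """Band for the curve relating expected (p_pred) to observed probability.

    Assumption: degree-1 polynomial in logit(p_pred) and a Wald-type
    simultaneous band; both are illustrative choices.
    """
    p_pred = np.clip(p_pred, 1e-6, 1.0 - 1e-6)
    beta, cov = fit_logistic(logit(p_pred), y)
    grid = np.linspace(p_pred.min(), p_pred.max(), n_grid)
    G = np.column_stack([np.ones(n_grid), logit(grid)])
    eta = G @ beta
    se = np.sqrt(np.einsum("ij,jk,ik->i", G, cov, G))
    # The chi-square(2 dof) quantile at 1 - alpha has the closed form -2*ln(alpha).
    crit = np.sqrt(-2.0 * np.log(alpha))
    return grid, expit(eta - crit * se), expit(eta + crit * se)

# Simulated example (hypothetical data): the model under-predicts risk
# because the true log-odds are shifted upward by 1.5.
rng = np.random.default_rng(0)
p_model = rng.uniform(0.05, 0.95, size=5000)
y_obs = (rng.uniform(size=5000) < expit(logit(p_model) + 1.5)).astype(float)

grid, lower, upper = calibration_belt(p_model, y_obs)
under = grid[lower > grid]   # risk ranges where observed exceeds expected
over = grid[upper < grid]    # risk ranges where observed is below expected
```

Risk values where the whole band lies above the diagonal (under) or below it (over) mark the ranges of significant deviation from ideal calibration, together with the direction of that deviation.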

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3b5c/3043050/4199d2549fa6/pone.0016110.g001.jpg

Similar articles

Community-wide assessment of intensive care outcomes using a physiologically based prognostic measure: implications for critical care delivery from Cleveland Health Quality Choice.

Chest. 1999 Mar;115(3):793-801. doi: 10.1378/chest.115.3.793.

External validation of the SAPS II, APACHE II and APACHE III prognostic models in South England: a multicentre study.

Intensive Care Med. 2003 Feb;29(2):249-56. doi: 10.1007/s00134-002-1607-9. Epub 2003 Jan 18.

[Prognostic estimation in critical patients. Validation of a new and very simple system of prognostic estimation of survival in an intensive care unit].

Med Intensiva. 2006 Apr;30(3):101-8. doi: 10.1016/s0210-5691(06)74482-5.

Prospective independent validation of APACHE III models in an Australian tertiary adult intensive care unit.

Anaesth Intensive Care. 2002 Jun;30(3):308-15. doi: 10.1177/0310057X0203000307.

A comparison of the performance of a model based on administrative data and a model based on clinical data: effect of severity of illness on standardized mortality ratios of intensive care units.

Crit Care Med. 2012 Feb;40(2):373-8. doi: 10.1097/CCM.0b013e318232d7b0.

Assessment of the performance of five intensive care scoring models within a large Scottish database.

Crit Care Med. 2000 Jun;28(6):1820-7. doi: 10.1097/00003246-200006000-00023.

Factors affecting the performance of the models in the Mortality Probability Model II system and strategies of customization: a simulation study.

Crit Care Med. 1996 Jan;24(1):57-63. doi: 10.1097/00003246-199601000-00011.

Hospital mortality prediction for intermediate care patients: Assessing the generalizability of the Intermediate Care Unit Severity Score (IMCUSS).

J Crit Care. 2018 Aug;46:94-98. doi: 10.1016/j.jcrc.2018.05.009. Epub 2018 May 19.

A new calibration test and a reappraisal of the calibration belt for the assessment of prediction models based on dichotomous outcomes.

Stat Med. 2014 Jun 30;33(14):2390-407. doi: 10.1002/sim.6100. Epub 2014 Feb 4.

Cited by

External validation of the Malaria Scoring System in a non-endemic emergency department.

Intern Emerg Med. 2025 Jul 17. doi: 10.1007/s11739-025-04044-9.

Development and validation of a risk score to predict neonatal mortality among NICU admissions in Southern Ethiopia: a retrospective follow-up study.

Front Pediatr. 2025 Jun 12;13:1496019. doi: 10.3389/fped.2025.1496019. eCollection 2025.

Preoperative CT-based radiomics model for predicting muscle invasion in patients with upper tract urothelial carcinoma below T3 stage.

Abdom Radiol (NY). 2025 May 17. doi: 10.1007/s00261-025-04979-9.

Predicting Suicidal Ideation Among Native American High Schoolers in California.

Arch Suicide Res. 2025 Apr 18:1-18. doi: 10.1080/13811118.2025.2490154.

Performance of Pediatric Risk of Mortality IV in Brazilian PICUs: A Multicenter Prospective Study.

Crit Care Explor. 2025 Mar 28;7(4):e1243. doi: 10.1097/CCE.0000000000001243. eCollection 2025 Apr 1.

Prognostic factors for successful extubation in newborns with congenital diaphragmatic hernia.

Front Pediatr. 2025 Jan 27;13:1530467. doi: 10.3389/fped.2025.1530467. eCollection 2025.

Optimizing anesthesia management based on early identification of electroencephalogram burst suppression risk in non-cardiac surgery patients: a visualized dynamic nomogram.

Ann Med. 2024 Dec;56(1):2407067. doi: 10.1080/07853890.2024.2407067. Epub 2024 Sep 24.

Development and validation of a prognosis risk score model for neonatal mortality in the Amhara region, Ethiopia. A prospective cohort study.

Glob Health Action. 2024 Dec 31;17(1):2392354. doi: 10.1080/16549716.2024.2392354. Epub 2024 Aug 30.

Pretreatment Sarcopenia and MRI-Based Radiomics to Predict the Response of Neoadjuvant Chemotherapy in Triple-Negative Breast Cancer.

Bioengineering (Basel). 2024 Jun 28;11(7):663. doi: 10.3390/bioengineering11070663.

Risk prediction models for postpartum glucose intolerance in women with a history of gestational diabetes mellitus: a scoping review.

J Diabetes Metab Disord. 2023 Oct 27;23(1):115-124. doi: 10.1007/s40200-023-01330-1. eCollection 2024 Jun.

References

Assessing the calibration of mortality benchmarks in critical care: The Hosmer-Lemeshow test revisited.

Crit Care Med. 2007 Sep;35(9):2052-6. doi: 10.1097/01.CCM.0000275267.64078.B0.

One model, several results: the paradox of the Hosmer-Lemeshow goodness-of-fit test for the logistic regression model.

J Epidemiol Biostat. 2000;5(4):251-3.

A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study.

JAMA. 1993;270(24):2957-63. doi: 10.1001/jama.270.24.2957.

A review of goodness of fit statistics for use in the development of logistic regression models.

Am J Epidemiol. 1982 Jan;115(1):92-106. doi: 10.1093/oxfordjournals.aje.a113284.

The quality of care. How can it be assessed?

JAMA. 1988;260(12):1743-8. doi: 10.1001/jama.260.12.1743.

Validation techniques for logistic regression models.

Stat Med. 1991 Aug;10(8):1213-26. doi: 10.1002/sim.4780100805.
