Astrophysics Sector, Scuola Internazionale Superiore di Studi Avanzati and Istituto Nazionale di Fisica Nucleare Sezione di Trieste, Trieste, Italy.
PLoS One. 2011 Feb 23;6(2):e16110. doi: 10.1371/journal.pone.0016110.
Prognostic models applied in medicine must be validated on independent samples before their use can be recommended. The assessment of calibration, i.e., the model's ability to provide reliable predictions, is crucial in external validation studies. Besides having several shortcomings, statistical techniques such as the computation of the standardized mortality ratio (SMR) and its confidence intervals, the Hosmer-Lemeshow statistics, and the Cox calibration test are all non-informative with respect to calibration across risk classes. Accordingly, calibration plots reporting expected versus observed outcomes across risk subsets have been used for many years. Erroneously, the points in the plot (frequently representing deciles of risk) have been connected with lines, generating false calibration curves. Here we propose a methodology to create a confidence band for the calibration curve based on a function that relates expected to observed probabilities across classes of risk. The calibration belt identifies the ranges of risk in which calibration deviates significantly from the ideal, and indicates the direction of the deviation. This method thus offers a more analytical view in the assessment of quality of care, compared to other approaches.
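To make the idea of a confidence band around a calibration curve concrete, the following is a minimal sketch in Python, not the authors' procedure. It assumes a first-order logistic recalibration, logit(observed) = a + b * logit(expected), fitted by Newton-Raphson, and draws a pointwise Wald band around the fitted curve; the published calibration belt may use a different functional form and a different band construction. All function names and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_recalibration(p_pred, y, n_iter=25):
    """Fit logit(P(y=1)) = a + b * logit(p_pred) by Newton-Raphson.

    Assumption: a first-order logistic link between expected and
    observed probabilities (illustrative choice only).
    """
    X = np.column_stack([np.ones_like(p_pred), logit(p_pred)])
    beta = np.array([0.0, 1.0])  # start at perfect calibration
    for _ in range(n_iter):
        mu = expit(X @ beta)
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    mu = expit(X @ beta)
    # Inverse Fisher information approximates the covariance of (a, b)
    cov = np.linalg.inv(X.T @ (X * (mu * (1.0 - mu))[:, None]))
    return beta, cov

def calibration_band(p_grid, beta, cov, z=1.96):
    """Pointwise Wald band (about 95% for z=1.96) on the probability scale."""
    G = np.column_stack([np.ones_like(p_grid), logit(p_grid)])
    eta = G @ beta
    se = np.sqrt(np.einsum("ij,jk,ik->i", G, cov, G))
    return expit(eta - z * se), expit(eta), expit(eta + z * se)

# Synthetic, well-calibrated data (simulated purely for illustration)
rng = np.random.default_rng(0)
p_pred = rng.uniform(0.05, 0.95, size=5000)  # expected probabilities
y = rng.binomial(1, p_pred)                  # simulated outcomes

smr = y.sum() / p_pred.sum()                 # observed / expected events
beta, cov = fit_recalibration(p_pred, y)

p_grid = np.linspace(0.05, 0.95, 19)
lower, curve, upper = calibration_band(p_grid, beta, cov)

# Risk ranges where the band excludes the diagonal (ideal calibration)
over_predicted = upper < p_grid    # observed risk below expected
under_predicted = lower > p_grid   # observed risk above expected

print(f"SMR = {smr:.3f}, intercept = {beta[0]:.3f}, slope = {beta[1]:.3f}")
```

A single SMR summarizes overall agreement only, whereas comparing the band with the diagonal on a grid of expected probabilities shows in which risk ranges, and in which direction, the predictions deviate.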