The Brier score does not evaluate the clinical utility of diagnostic tests or prediction models.

Author information

Assel Melissa, Sjoberg Daniel D, Vickers Andrew J

Affiliation

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA.

Publication information

Diagn Progn Res. 2017 Dec 2;1:19. doi: 10.1186/s41512-017-0020-3. eCollection 2017.

DOI:10.1186/s41512-017-0020-3

PMID:31093548

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6460786/

Abstract

BACKGROUND

A variety of statistics have been proposed as tools to help investigators assess the value of diagnostic tests or prediction models. The Brier score has been recommended on the grounds that it is a proper scoring rule that is affected by both discrimination and calibration. However, the Brier score is prevalence dependent in such a way that the rank ordering of tests or models may inappropriately vary by prevalence.
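The prevalence dependence described above can be illustrated with a small sketch. The Brier score is the mean squared difference between predicted probabilities and observed 0/1 outcomes; when a binary test result is scored as a prediction of exactly 0 or 1, its expected Brier score reduces to the misclassification rate. The function names and the sensitivity and specificity values for the two tests below are hypothetical, chosen only for illustration, and are not taken from the article.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_brier_binary_test(prevalence, sensitivity, specificity):
    """Expected Brier score when a binary test is scored as a 0/1 prediction.

    Each false negative and each false positive contributes a squared error
    of 1, so the expected score equals the misclassification rate.
    """
    false_negative_rate = prevalence * (1 - sensitivity)
    false_positive_rate = (1 - prevalence) * (1 - specificity)
    return false_negative_rate + false_positive_rate


# Hypothetical tests (assumed values, not from the article):
# test A favours sensitivity, test B favours specificity.
test_a = {"sensitivity": 0.95, "specificity": 0.80}
test_b = {"sensitivity": 0.80, "specificity": 0.95}

for prevalence in (0.1, 0.9):
    score_a = expected_brier_binary_test(prevalence, **test_a)
    score_b = expected_brier_binary_test(prevalence, **test_b)
    print(f"prevalence={prevalence}: Brier A={score_a:.3f}, Brier B={score_b:.3f}")
```

With these assumed values, the two tests swap rank between the low-prevalence and high-prevalence settings even though their sensitivity and specificity are unchanged. Whether the swap is appropriate depends on the clinical consequences of each kind of error, which the Brier score does not take into account.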

METHODS

We explored four common clinical scenarios: comparison of a highly accurate binary test with a continuous prediction model of moderate predictiveness; comparison of two binary tests where the importance of sensitivity versus specificity is inversely associated with prevalence; comparison of models and tests to default strategies of assuming that all or no patients are positive; and comparison of two models with miscalibration in opposite directions.

RESULTS

In each case, we found that the Brier score gave an inappropriate rank ordering of the tests and models. Conversely, net benefit, a decision-analytic measure, gave results that always favored the preferable test or model.
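For readers unfamiliar with net benefit, the following is a minimal sketch of its standard definition from decision curve analysis: true positives per patient minus false positives per patient, with false positives weighted by the odds of the threshold probability. The abstract itself does not state the formula, and the cohort size, prevalence, test accuracy, and threshold below are hypothetical values used only for illustration.

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Net benefit at a given threshold probability.

    threshold is the probability of disease at which a patient or clinician
    would opt for intervention; threshold / (1 - threshold) is the weight
    given to a false positive relative to a true positive.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    odds = threshold / (1 - threshold)
    return true_positives / n - (false_positives / n) * odds


# Hypothetical cohort (assumed values, not from the article).
n = 1000
diseased = 100            # 10% prevalence
healthy = n - diseased
threshold = 0.10

# Hypothetical binary test: sensitivity 0.90, specificity 0.80.
test_tp = 90              # 0.90 * 100 diseased
test_fp = 180             # (1 - 0.80) * 900 healthy

nb_test = net_benefit(test_tp, test_fp, n, threshold)
nb_treat_all = net_benefit(diseased, healthy, n, threshold)  # everyone positive
nb_treat_none = net_benefit(0, 0, n, threshold)              # no one positive

print(f"test: {nb_test:.3f}, treat all: {nb_treat_all:.3f}, treat none: {nb_treat_none:.3f}")
```

At a given threshold, the strategy with the highest net benefit is preferred, and treating no one always has a net benefit of zero. This is how a test or model can be compared directly with the default strategies of assuming that all or no patients are positive.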

CONCLUSIONS

The Brier score does not evaluate the clinical value of diagnostic tests or prediction models. As an alternative, we advocate the use of decision-analytic measures such as net benefit.

TRIAL REGISTRATION

Not applicable.
