Confidence intervals for the length of the receiver-operating characteristic curve based on a smooth estimator.

Affiliations

Anesthesiology Department, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.

Faculty of Health Sciences, Universidad Autonoma de Chile, Providencia, Chile.

Publication information

Stat Methods Med Res. 2023 May;32(5):978-993. doi: 10.1177/09622802231160053. Epub 2023 Mar 15.

Abstract

A good diagnostic test should behave differently in the positive and the negative populations. However, this alone does not make a good classification system. Binary classification is a complex task that requires defining decision criteria: knowing the level of dissimilarity between the two distributions involved is not enough; we also have to know how to define those criteria. The length of the receiver-operating characteristic (ROC) curve has been proposed as an index of the optimal discriminatory capacity of a biomarker. It relates not to the actual but to the optimal classification capacity of the diagnostic test considered. One particularity of this index is that its estimation should be based on parametric or smoothed models. Here we explore the behavior of a kernel density estimator-based approximation for estimating the length of the ROC curve. We derive the asymptotic distribution of the resulting statistic, propose a parametric bootstrap algorithm for constructing confidence intervals, discuss the role the bandwidth parameter plays in the quality of the estimates and, via Monte Carlo simulations, study its finite-sample behavior under four different bandwidth selection criteria. The practical use of the length of the ROC curve is illustrated through two real-world examples.
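
To make the idea of a kernel-based length estimate concrete, here is a minimal sketch, not the authors' implementation. It assumes continuous test values, so that the receiver-operating characteristic (ROC) curve traced by the threshold t has arc length equal to the integral of sqrt(f(t)^2 + g(t)^2) over t, where f and g are the densities of the negative and positive populations. Both densities are replaced by Gaussian kernel estimates and the integral is approximated with the trapezoidal rule. The function names, the grid size, and the rule-of-thumb bandwidth are all assumptions made for illustration; the abstract does not say which four bandwidth criteria the paper compares.

```python
import numpy as np


def gaussian_kde_pdf(sample, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `sample` on `grid`."""
    z = (grid[:, None] - sample[None, :]) / bandwidth
    norm = len(sample) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


def rule_of_thumb_bandwidth(sample):
    """Silverman-style rule of thumb (an assumption, not the paper's choice)."""
    n = len(sample)
    sd = np.std(sample, ddof=1)
    q75, q25 = np.percentile(sample, [75, 25])
    return 0.9 * min(sd, (q75 - q25) / 1.34) * n ** (-0.2)


def roc_length(neg, pos, n_grid=4001):
    """Approximate the ROC curve length from smoothed density estimates."""
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    h_neg = rule_of_thumb_bandwidth(neg)
    h_pos = rule_of_thumb_bandwidth(pos)

    # Extend the grid past the data so the kernel tails are covered.
    lo = min(neg.min() - 6.0 * h_neg, pos.min() - 6.0 * h_pos)
    hi = max(neg.max() + 6.0 * h_neg, pos.max() + 6.0 * h_pos)
    grid = np.linspace(lo, hi, n_grid)

    f = gaussian_kde_pdf(neg, grid, h_neg)  # negative-population density
    g = gaussian_kde_pdf(pos, grid, h_pos)  # positive-population density
    integrand = np.sqrt(f**2 + g**2)

    # Trapezoidal rule on the uniform grid.
    dx = grid[1] - grid[0]
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dx)


rng = np.random.default_rng(0)
controls = rng.normal(0.0, 1.0, 300)
cases = rng.normal(1.5, 1.0, 300)
print(roc_length(controls, cases))
```

Under these assumptions the estimate lies between sqrt(2), reached when the two estimated densities coincide and the curve is the diagonal, and 2, approached when the two populations barely overlap. A confidence interval would additionally require a resampling scheme such as the parametric bootstrap mentioned in the abstract, whose details are not given here.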

