Estimation of Diagnostic Test Accuracy Without Gold Standards.

Author information

Sun Ao, Zhou Xiao-Hua

Affiliations

Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.

Department of Biostatistics and Beijing International Center for Mathematical Research, Peking University, Beijing, China.

Publication information

Stat Med. 2025 Feb 10;44(3-4):e10315. doi: 10.1002/sim.10315.

Abstract

The ideal evaluation of diagnostic test performance requires a reference test that is free of errors. However, for many diseases, obtaining such a "gold standard" reference is either impossible or prohibitively expensive. Estimating test accuracy in the absence of a gold standard is therefore a significant challenge. In this article, we introduce and categorize existing methods for evaluating diagnostic tests without a gold standard, considering factors such as the type and number of tests, as well as the structure of the observed data. For each method, we provide a comprehensive introduction and analysis of its underlying assumptions, model architecture, identifiability, estimation techniques, and inference procedures. We use R to conduct simulations for widely applicable models, validating assumptions, comparing models, and assessing their reliability. Additionally, we present real-world examples along with the corresponding R code for these models, enabling readers to better understand how to apply them effectively. Beyond diagnostic medicine, we underscore that the issue of imperfect gold standards affects other fields, drawing parallels to the noisy label problem in machine learning. By highlighting similarities and differences across these domains, we open pathways for further research. The primary aim of this article is to consolidate existing methods for assessing test accuracy in the absence of a gold standard and to provide practical guidance for researchers seeking to apply these methods effectively.
