Diagnostic Test Score Validation With a Fallible Criterion

Author information

Jewsbury Paul A

Affiliation

Educational Testing Service, Princeton, NJ, USA.

Publication information

Appl Psychol Meas. 2019 Nov;43(8):579-596. doi: 10.1177/0146621618817785. Epub 2018 Dec 13.

DOI:10.1177/0146621618817785

PMID:31551637

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6745629/

Abstract

Criterion-related validation of diagnostic test scores for a construct of interest is complicated by the unavailability of the construct directly. The standard method, Known Group Validation, assumes an infallible reference test in place of the construct, but infallible reference tests are rare. In contrast, Mixed Group Validation allows for a fallible reference test, but has been found to make strong assumptions not appropriate for the majority of diagnostic test validation studies. The Neighborhood model is adapted for the purpose of diagnostic test validation, which makes alternate, but also strong, assumptions. The statistical properties of the Neighborhood model are evaluated and the assumptions are reviewed in the context of diagnostic test validation. Alternatively, strong assumptions may be avoided by estimating only intervals for the validity estimates, instead of point estimates. The Method of Bounds is also adapted for the purpose of diagnostic test validation, and an extension, Method of Bounds-Test Validation, is introduced here for the first time. All three point-estimate methods were found to make strong assumptions concerning the conditional relationships between the tests and the construct of interest, and all three lack robustness to assumption violation. The Method of Bounds-Test Validation was found to perform well across a range of plausible simulated datasets where the point-estimate methods failed. The point-estimate methods are recommended in special cases where the assumptions can be justified, while the interval methods are appropriate more generally.
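The abstract's point that Known Group Validation assumes an infallible reference test can be illustrated numerically. The following is a minimal sketch, not the article's method: it assumes, purely for illustration, that the test under validation and the reference test err independently given true construct status. The function name `apparent_accuracy` and all numeric values are hypothetical. It computes the sensitivity and specificity that would be observed if a fallible reference were treated as the truth.

```python
def apparent_accuracy(prevalence, se_test, sp_test, se_ref, sp_ref):
    """Return (sensitivity, specificity) of a test as measured against a
    fallible reference that is treated as if it were the construct itself.

    Illustrative assumption: test and reference errors are independent
    given the true construct status.
    """
    p = prevalence          # P(construct present)
    q = 1.0 - prevalence    # P(construct absent)

    # P(test+, reference+) and P(reference+), summed over true status
    both_pos = p * se_test * se_ref + q * (1 - sp_test) * (1 - sp_ref)
    ref_pos = p * se_ref + q * (1 - sp_ref)

    # P(test-, reference-) and P(reference-), summed over true status
    both_neg = p * (1 - se_test) * (1 - se_ref) + q * sp_test * sp_ref
    ref_neg = p * (1 - se_ref) + q * sp_ref

    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical inputs: true test accuracy 0.9 / 0.9, fallible reference 0.8 / 0.9
se_hat, sp_hat = apparent_accuracy(0.3, 0.9, 0.9, 0.8, 0.9)
print(f"apparent sensitivity: {se_hat:.3f}, apparent specificity: {sp_hat:.3f}")
```

With these illustrative inputs, both apparent values fall below the true 0.9, whereas an infallible reference (`se_ref = sp_ref = 1`) recovers the true values. Under other dependence structures between the two tests, the bias can differ in size and direction.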


Similar articles

Diagnostic Test Score Validation With a Fallible Criterion.

Appl Psychol Meas. 2019 Nov;43(8):579-596. doi: 10.1177/0146621618817785. Epub 2018 Dec 13.

A description of mixed group validation.

Assessment. 2014 Apr;21(2):170-80. doi: 10.1177/1073191112473176. Epub 2013 Jan 29.

Considerations underlying the use of mixed group validation.

Psychol Assess. 2013 Mar;25(1):204-15. doi: 10.1037/a0030063. Epub 2012 Oct 1.

Biomarker validation with an imperfect reference: Issues and bounds.

Stat Methods Med Res. 2018 Oct;27(10):2933-2945. doi: 10.1177/0962280216689806. Epub 2017 Feb 6.

Diagnosing diagnostic tests: evaluating the assumptions underlying the estimation of sensitivity and specificity in the absence of a gold standard.

Prev Vet Med. 2005 Apr;68(1):19-33. doi: 10.1016/j.prevetmed.2005.01.006.

Interval estimation for a proportion using a double-sampling scheme with two fallible classifiers.

Stat Methods Med Res. 2018 Aug;27(8):2478-2503. doi: 10.1177/0962280216681599. Epub 2016 Dec 29.

Latent class models in diagnostic studies when there is no reference standard--a systematic review.

Am J Epidemiol. 2014 Feb 15;179(4):423-31. doi: 10.1093/aje/kwt286. Epub 2013 Nov 21.

On the interpretation of test sensitivity in the two-test two-population problem: assumptions matter.

Prev Vet Med. 2009 Oct 1;91(2-4):116-21. doi: 10.1016/j.prevetmed.2009.06.006. Epub 2009 Aug 3.

[Evaluation of diagnostic tests in the absence of a gold standard using an Anaplasma marginale field data set].

Berl Munch Tierarztl Wochenschr. 2005 Sep-Oct;118(9-10):416-22.

Evaluation of diagnostic tests when there is no gold standard. A review of methods.

Health Technol Assess. 2007 Dec;11(50):iii, ix-51. doi: 10.3310/hta11500.

Cited by

Invited Commentary: Bayesian Inference with Multiple Tests.

Neuropsychol Rev. 2023 Sep;33(3):643-652. doi: 10.1007/s11065-023-09604-4. Epub 2023 Aug 18.

References

Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs.

Psychol Med. 2016 Jun;46(8):1567-79. doi: 10.1017/S0033291715001944. Epub 2016 Mar 21.

Dyadic Short Forms of the Wechsler Adult Intelligence Scale-IV.

Arch Clin Neuropsychol. 2015 Aug;30(5):404-12. doi: 10.1093/arclin/acv035. Epub 2015 Jun 8.

A Bayesian approach to mixed group validation of performance validity tests.

Psychol Assess. 2015 Sep;27(3):763-76. doi: 10.1037/pas0000085. Epub 2015 Mar 30.

The reliability of clinical diagnoses: state of the art.

Annu Rev Clin Psychol. 2014;10:111-30. doi: 10.1146/annurev-clinpsy-032813-153739. Epub 2014 Jan 2.

Diagnostic accuracy of a bayesian latent group analysis for the detection of malingering-related poor effort.

Clin Neuropsychol. 2013;27(6):1019-42. doi: 10.1080/13854046.2013.806677. Epub 2013 Jun 14.

Social and neuro-cognition as distinct cognitive factors in schizophrenia: a systematic review.

Schizophr Res. 2013 Aug;148(1-3):3-11. doi: 10.1016/j.schres.2013.05.009. Epub 2013 May 31.

Evaluating research for clinical significance: using critically appraised topics to enhance evidence-based neuropsychology.

Clin Neuropsychol. 2014;28(4):653-68. doi: 10.1080/13854046.2013.776636. Epub 2013 Mar 7.

A description of mixed group validation.

Assessment. 2014 Apr;21(2):170-80. doi: 10.1177/1073191112473176. Epub 2013 Jan 29.

Considerations underlying the use of mixed group validation.

Psychol Assess. 2013 Mar;25(1):204-15. doi: 10.1037/a0030063. Epub 2012 Oct 1.

Estimating the accuracy of neurocognitive effort measures in the absence of a "gold standard".

Psychol Assess. 2012 Dec;24(4):815-22. doi: 10.1037/a0028195. Epub 2012 Apr 30.
