Dewei Wang, Christopher S. McMahan, Joshua M. Tebbs, Christopher R. Bilder
Department of Statistics, University of South Carolina, Columbia, SC 29208, USA.
Department of Mathematical Sciences, Clemson University, Clemson, SC 29634, USA.
Comput Stat Data Anal. 2018 Jun;122:156-166. doi: 10.1016/j.csda.2018.01.005. Epub 2018 Feb 1.
Screening procedures for infectious diseases, such as HIV, often involve pooling individual specimens together and testing the pools. For diseases with low prevalence, group testing (or pooled testing) can be used to classify individuals as diseased or not while providing considerable cost savings when compared to testing specimens individually. The pooling literature is replete with group testing case identification algorithms, including Dorfman testing, higher-stage hierarchical procedures, and array testing. Although these algorithms are usually evaluated on the basis of the expected number of tests and classification accuracy, most evaluations in the literature do not account for the continuous nature of the testing responses and thus invoke potentially restrictive assumptions to characterize an algorithm's performance. This article takes a different approach to evaluating commonly used case identification algorithms in group testing. Instead of treating testing responses as binary random variables (i.e., diseased/not), evaluations are made by exploiting an assay's underlying continuous biomarker distributions for positive and negative individuals. The result is a general framework that describes the operating characteristics of group testing case identification algorithms when these distributions are known. The methodology is illustrated using two HIV testing examples taken from the pooling literature.
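For orientation, the conventional binary evaluation that the abstract contrasts with can be sketched for two-stage Dorfman testing. This is a minimal illustration and not the authors' continuous-biomarker framework: it assumes a fixed sensitivity `se` and specificity `sp` that do not change with pool size, which is exactly the kind of potentially restrictive assumption the abstract mentions. The function names and the default search range are hypothetical.

```python
def dorfman_tests_per_person(p, k, se=1.0, sp=1.0):
    """Expected tests per individual for two-stage Dorfman testing.

    Binary-response model (an assumption of this sketch):
    p  : disease prevalence, individuals independent
    k  : pool size (k >= 2)
    se : assay sensitivity, assumed constant for any pool size
    sp : assay specificity, assumed constant for any pool size
    """
    if k < 2:
        raise ValueError("pool size k must be at least 2")
    prob_pool_truly_negative = (1.0 - p) ** k
    # The pool tests positive if it is truly positive and detected,
    # or truly negative and falsely flagged.
    prob_pool_tests_positive = (
        se * (1.0 - prob_pool_truly_negative)
        + (1.0 - sp) * prob_pool_truly_negative
    )
    # One pool test, plus k individual retests when the pool tests positive.
    expected_tests = 1.0 + k * prob_pool_tests_positive
    return expected_tests / k


def best_pool_size(p, se=1.0, sp=1.0, max_k=50):
    """Pool size in 2..max_k minimizing expected tests per individual."""
    return min(range(2, max_k + 1),
               key=lambda k: dorfman_tests_per_person(p, k, se, sp))


if __name__ == "__main__":
    prevalence = 0.01  # illustrative value, not taken from the paper
    k = best_pool_size(prevalence)
    print(k, dorfman_tests_per_person(prevalence, k))
```

Under these assumptions, a value below 1 means pooling needs fewer tests than testing everyone individually. In practice a pooled specimen dilutes the biomarker, so sensitivity may depend on pool size, which is the motivation for working with the continuous biomarker distributions instead.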