McMahan Christopher S, Tebbs Joshua M, Hanson Timothy E, Bilder Christopher R
Department of Mathematical Sciences, Clemson University, Clemson, South Carolina 29634, U.S.A.
Department of Statistics, University of South Carolina, Columbia, South Carolina 29208, U.S.A.
Biometrics. 2017 Dec;73(4):1443-1452. doi: 10.1111/biom.12704. Epub 2017 Apr 12.
Group testing involves pooling individual specimens (e.g., blood, urine, swabs) and testing the pools for the presence of a disease. When individual covariate information is available (e.g., age, gender, number of sexual partners), a common goal is to relate an individual's true disease status to the covariates in a regression model. Estimating this relationship is a nonstandard problem in group testing because true individual statuses are not observed and all testing responses (on pools and on individuals) are subject to misclassification arising from assay error. Previous regression methods for group testing data can be inefficient because they are restricted to using only initial pool responses and/or they make potentially unrealistic assumptions regarding the assay accuracy probabilities. To overcome these limitations, we propose a general Bayesian regression framework for modeling group testing data. The novelty of our approach is that it can be easily implemented with data from any group testing protocol. Furthermore, our approach will simultaneously estimate assay accuracy probabilities (along with the covariate effects) and can even be applied in screening situations where multiple assays are used. We apply our methods to group testing data collected in Iowa as part of statewide screening efforts for chlamydia, and we make user-friendly R code available to practitioners.
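To make the setting concrete, the following Python sketch simulates the kind of data described above: true individual statuses follow a logistic regression on one covariate, and each pool's test result is misclassified according to the assay's sensitivity and specificity. It is an illustration of the data-generating model only, not the Bayesian estimation method of the paper. All function names and parameter values are hypothetical, and it assumes that assay accuracy does not depend on pool size and that only initial pool responses are recorded.

```python
import math
import random


def individual_risk(x, beta0, beta1):
    """Logistic regression: P(true status = 1 | covariate x)."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def pool_positive_prob(risks, se, sp):
    """P(pool tests positive) given member risks and assay accuracy.

    A pool is truly positive if at least one member is truly positive.
    se = sensitivity, sp = specificity (assumed free of dilution effects).
    """
    p_all_negative = math.prod(1.0 - p for p in risks)
    return se * (1.0 - p_all_negative) + (1.0 - sp) * p_all_negative


def simulate_pool_tests(xs, pool_size, beta0, beta1, se, sp, rng):
    """Simulate initial pool responses only (no retesting stage)."""
    results = []
    for start in range(0, len(xs), pool_size):
        members = xs[start:start + pool_size]
        # Latent true statuses: never observed in real group testing data.
        statuses = [rng.random() < individual_risk(x, beta0, beta1)
                    for x in members]
        truly_positive = any(statuses)
        # Observed response is subject to misclassification.
        p_test_pos = se if truly_positive else 1.0 - sp
        results.append(rng.random() < p_test_pos)
    return results


if __name__ == "__main__":
    rng = random.Random(2017)
    ages = [rng.uniform(-1.0, 1.0) for _ in range(20)]  # standardized covariate
    print(simulate_pool_tests(ages, pool_size=4, beta0=-2.0, beta1=0.5,
                              se=0.95, sp=0.98, rng=rng))
```

Because only the pool responses are returned, the sketch also shows why estimation is nonstandard: the regression parameters `beta0` and `beta1` must be recovered without ever seeing the individual statuses.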