Cai Bryan, Ioannidis John P A, Bendavid Eran, Tian Lu
Department of Computer Science, Stanford University, Stanford, CA, USA.
Department of Medicine, Stanford University, Stanford, CA, USA.
J Appl Stat. 2022 Jan 4;50(11-12):2599-2623. doi: 10.1080/02664763.2021.2019687. eCollection 2023.
To make informed public policy decisions in battling the ongoing COVID-19 pandemic, it is important to know the disease prevalence in a population. There are two intertwined difficulties in estimating this prevalence from the testing results of a group of subjects. First, the test is prone to measurement error, with unknown sensitivity and specificity. Second, the prevalence tends to be low at the initial stage of the pandemic, and because test specificity is imperfect we may not be able to determine whether a positive test result is a false positive. Statistical inference based on a large-sample approximation or the conventional bootstrap may not be valid in such cases. In this paper, we propose a set of confidence intervals whose validity does not depend on the sample size in the unweighted setting. For the weighted setting, the proposed inference is equivalent to hybrid bootstrap methods, whose performance is also more robust than that of methods based on asymptotic approximations. The methods are used to reanalyze data from a study investigating antibody prevalence in Santa Clara County, California, as well as data from several other seroprevalence studies. Simulation studies have been conducted to examine the finite-sample performance of the proposed method.
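To make the two difficulties concrete, the sketch below computes the standard Rogan-Gladen point estimate, which corrects the raw positive rate for an imperfect test. This is not the confidence-interval method proposed in the paper; it only shows why low prevalence combined with imperfect specificity is hard. The function name `rogan_gladen` and all numbers are hypothetical, and sensitivity and specificity are treated as known, which the abstract notes is not the case in practice.

```python
def rogan_gladen(positives, n_tested, sensitivity, specificity):
    """Rogan-Gladen prevalence point estimate, truncated to [0, 1].

    Illustrative only: assumes sensitivity and specificity are known
    constants, whereas in practice they are themselves estimated.
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    # Youden index; the correction is only meaningful when it is positive.
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    raw_rate = positives / n_tested
    # Subtract the expected false-positive rate, then rescale.
    adjusted = (raw_rate + specificity - 1.0) / youden
    # The unconstrained estimate can fall outside [0, 1]; truncate it.
    return min(1.0, max(0.0, adjusted))


# Hypothetical survey: 50 positives among 3000 subjects, sensitivity 0.80.
est_high_spec = rogan_gladen(50, 3000, 0.80, 0.99)
est_low_spec = rogan_gladen(50, 3000, 0.80, 0.98)
print(f"specificity 0.99: {est_high_spec:.4f}")
print(f"specificity 0.98: {est_low_spec:.4f}")
```

With these made-up inputs, moving the assumed specificity from 0.99 to 0.98 takes the estimate from just under 1% to zero, since the expected false-positive rate then exceeds the observed positive rate. Sensitivity of this kind is why the uncertainty in specificity has to be carried into the interval rather than plugged in as a fixed value.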