Epidemiology and Biostatistics Program, School of Public Health, University of Nevada Las Vegas, Las Vegas, NV, USA.
Stat Methods Med Res. 2020 Oct;29(10):3006-3018. doi: 10.1177/0962280220913971. Epub 2020 Apr 3.
Clustered binary data, in which each cluster contributes several binary outcomes, are common in medical research. Confidence intervals are traditionally calculated with asymptotic methods. However, these intervals often have unsatisfactory coverage when the sample size is small or the true proportion is near the boundary. To improve the coverage probability, exact Buehler's one-sided intervals may be used, but they are computationally intensive in this setting. We therefore propose using importance sampling to calculate confidence intervals that almost always guarantee the coverage. We conduct extensive simulation studies to compare the performance of the existing asymptotic intervals with that of the new accurate intervals based on importance sampling. The new intervals that use the asymptotic Wilson score to order the sample space perform better than the others and are recommended for use in practice.
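For readers unfamiliar with the asymptotic Wilson score mentioned in the abstract, the following is a minimal sketch of a Wilson score interval applied to clustered binary data through an effective sample size. It is an illustration only, not the importance-sampling procedure proposed in the paper. The function names `wilson_interval` and `clustered_wilson`, the design-effect adjustment, and the assumption that the intracluster correlation `icc` is supplied by the user are all hypothetical choices made for this example.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def wilson_interval(successes, n, z=Z_95):
    """Two-sided Wilson score interval for a proportion.

    `n` may be a non-integer effective sample size.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def clustered_wilson(cluster_successes, cluster_sizes, icc, z=Z_95):
    """Wilson interval with a design-effect adjustment (illustrative assumption).

    The total sample size is deflated by 1 + (m_w - 1) * icc, where m_w is the
    size-weighted mean cluster size and icc is an assumed, user-supplied
    intracluster correlation.
    """
    total = sum(cluster_sizes)
    p_hat = sum(cluster_successes) / total
    m_weighted = sum(m * m for m in cluster_sizes) / total
    design_effect = 1.0 + (m_weighted - 1.0) * icc
    n_eff = total / design_effect
    return wilson_interval(p_hat * n_eff, n_eff, z)


if __name__ == "__main__":
    successes = [1, 0, 2, 1]   # made-up counts, for illustration only
    sizes = [5, 5, 5, 5]
    print(clustered_wilson(successes, sizes, icc=0.0))
    print(clustered_wilson(successes, sizes, icc=0.3))
```

With `icc=0.0` the adjustment disappears and the result matches the ordinary Wilson interval for the pooled data; a positive `icc` shrinks the effective sample size and widens the interval. Like any asymptotic interval, this one can still undercover in small samples or near the boundary, which is the problem the abstract describes.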