Shan Guogen, Gerstenberger Shawn
School of Community Health Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, United States of America.
PLoS One. 2017 Dec 20;12(12):e0188709. doi: 10.1371/journal.pone.0188709. eCollection 2017.
This research is motivated by one of our survey studies, which assessed the potential influence of introducing zebra mussels to the Lake Mead National Recreation Area, Nevada. One research question in this study is the association between boating activity type and awareness of zebra mussels. A chi-squared test is often used to test independence between two factors with nominal levels. When the null hypothesis of independence between the two factors is rejected, we are often left wondering where the significance comes from. Cell residuals, including standardized residuals and adjusted residuals, are traditionally used to test for cell significance, a procedure often described as a post hoc test after a statistically significant chi-squared test. In practice, the limiting distributions of these residuals are used for statistical inference. However, the two residuals may lead to different conclusions based on the calculated p-values, and their p-values could be over- or under-estimated because asymptotic approaches perform unsatisfactorily with regard to type I error control. In this article, we propose new exact p-values by using Fisher's approach based on three commonly used test statistics to order the sample space. We theoretically prove that the proposed new exact p-values based on these test statistics are the same. Based on our extensive simulation studies, we show that the existing asymptotic approach based on the adjusted residual is often more likely to reject the null hypothesis than the exact approach, because of the inflated family-wise error rates observed for the asymptotic approach. We recommend the proposed exact p-value for use in practice as a valuable post hoc analysis technique for chi-squared analysis.
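To make the contrast between residual-based asymptotic p-values and an exact alternative concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes the common definitions in which the standardized residual is (observed - expected) / sqrt(expected) and the adjusted residual further divides by sqrt((1 - row share)(1 - column share)). The exact p-value shown is a hypergeometric calculation for a single cell that conditions on the table margins and orders outcomes by their probability; this ordering is an assumption, since the abstract does not specify the three test statistics. All function names are hypothetical.

```python
import math


def cell_residuals(table):
    """Return (standardized, adjusted) residuals for an r x c table of counts.

    Assumed definitions (not taken from the abstract):
      standardized = (obs - exp) / sqrt(exp)
      adjusted     = standardized / sqrt((1 - row/n) * (1 - col/n))
    """
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    std, adj = [], []
    for i, counts in enumerate(table):
        std_row, adj_row = [], []
        for j, obs in enumerate(counts):
            exp = row[i] * col[j] / n
            s = (obs - exp) / math.sqrt(exp)
            a = s / math.sqrt((1 - row[i] / n) * (1 - col[j] / n))
            std_row.append(s)
            adj_row.append(a)
        std.append(std_row)
        adj.append(adj_row)
    return std, adj


def normal_two_sided_p(z):
    """Asymptotic two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def exact_cell_p(table, i, j):
    """Exact two-sided p-value for cell (i, j), conditioning on the margins.

    With margins fixed, the cell count is hypergeometric. Outcomes no more
    probable than the observed count are summed (an assumed ordering).
    """
    row_total = sum(table[i])
    col_total = sum(r[j] for r in table)
    n = sum(sum(r) for r in table)
    lo = max(0, row_total + col_total - n)
    hi = min(row_total, col_total)

    def pmf(k):
        return (math.comb(col_total, k)
                * math.comb(n - col_total, row_total - k)
                / math.comb(n, row_total))

    p_obs = pmf(table[i][j])
    # Small relative tolerance so floating-point ties count as equal.
    total = sum(pmf(k) for k in range(lo, hi + 1)
                if pmf(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the study).
    counts = [[10, 20], [30, 40]]
    std, adj = cell_residuals(counts)
    for i in range(2):
        for j in range(2):
            print(i, j,
                  round(normal_two_sided_p(adj[i][j]), 4),
                  round(exact_cell_p(counts, i, j), 4))
```

Because several cells are tested after one chi-squared test, the per-cell p-values from either approach would normally be compared against a multiplicity-adjusted threshold (for example, a Bonferroni-style division of the significance level by the number of cells) to keep the family-wise error rate in check.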