Yin Guosheng, Ma Yanyuan
Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong
Electron J Stat. 2013;7:412-427. doi: 10.1214/13-EJS773.
The Pearson test statistic is constructed by partitioning the data into bins and comparing the observed and expected counts in these bins. If the expected counts are computed with the maximum likelihood estimator (MLE) based on the original data, the statistic generally follows neither a chi-squared distribution nor any other explicit distribution. We propose a bootstrap-based modification of the Pearson test statistic that recovers the chi-squared distribution. We compute the observed and expected counts in the partitioned bins using the MLE obtained from a bootstrap sample. This bootstrap-sample MLE adds exactly the right amount of randomness to the test statistic and thereby recovers the chi-squared distribution. The bootstrap chi-squared test is easy to implement: it only requires fitting exactly the same model to the bootstrap data to obtain the corresponding MLE, and then constructing the bin counts from the original data. We examine the size and power of the new model diagnostic procedure in simulation studies and illustrate it with a real data set.
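The procedure described in the abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a normal model (so the MLE is the sample mean and the divide-by-n standard deviation), a nonparametric bootstrap that resamples the original data with replacement, fixed bin edges, and a chi-squared reference distribution with K - 1 degrees of freedom for K bins, a figure the abstract itself does not state. The names `bootstrap_chi2_test` and `chi2_sf_even_df` are hypothetical. To stay within the standard library, the tail probability uses the closed form that is exact only for even degrees of freedom, so the example uses an odd number of bins.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only supports positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def bootstrap_chi2_test(data, edges, rng):
    """Pearson-type test of normality with the MLE taken from one bootstrap sample.

    edges: interior cut points, giving len(edges) + 1 bins over the real line.
    """
    n = len(data)
    # Step 1: resample the original data with replacement
    # (assumption: nonparametric bootstrap).
    boot = rng.choices(data, k=n)
    # Step 2: fit the same model to the bootstrap sample to get its MLE.
    model = NormalDist(fmean(boot), pstdev(boot))
    # Step 3: observed bin counts come from the ORIGINAL data.
    cuts = [-math.inf, *edges, math.inf]
    observed = [sum(lo < x <= hi for x in data) for lo, hi in zip(cuts, cuts[1:])]
    # Step 4: expected bin counts use the bootstrap-sample MLE.
    cdf = [0.0, *(model.cdf(e) for e in edges), 1.0]
    expected = [n * (b - a) for a, b in zip(cdf, cdf[1:])]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(expected) - 1  # assumption: chi-squared reference with K - 1 df
    return stat, chi2_sf_even_df(stat, df)


# Usage: data simulated under the null, 5 bins, so df = 4.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(500)]
stat, p_value = bootstrap_chi2_test(data, edges=[-1.0, -0.3, 0.3, 1.0], rng=rng)
print(f"statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

Only the parameter estimate changes relative to the classical Pearson test: the bin counts are still taken from the original sample, while the expected counts are evaluated at the estimate fitted to the resampled data.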