Sham P C, Curtis D
Department of Psychological Medicine, Institute of Psychiatry, Denmark Hill, London.
Ann Hum Genet. 1995 Jan;59(1):97-105. doi: 10.1111/j.1469-1809.1995.tb01608.x.
In an association analysis comparing cases and controls with respect to allele frequencies at a highly polymorphic locus, a potential problem is that the conventional chi-squared test may not be valid for a large, sparse contingency table. However, reliance on statistics with known asymptotic distribution is now unnecessary, as Monte Carlo simulations can be performed to estimate the significance level of any test statistic. We have implemented a Monte Carlo method for four 'chi-squared' test statistics, three of which involved combination of alleles, and evaluated their performance on a real data set. Combining rare alleles to avoid small expected cell counts, and considering each allele in turn against the rest, reduced the power to detect a genuine association when the number of alleles was very large. We should either not combine alleles at all, or combine them in such a way that preserves the evidence for an association.
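As an illustration of the Monte Carlo idea described above, the following is a minimal sketch and not the authors' implementation. It assumes a 2 x k table of allele counts (cases in row 0, controls in row 1), simulates the null hypothesis by randomly reassigning alleles to cases and controls with both margins held fixed, and uses the ordinary Pearson chi-squared statistic on the uncombined table. The function names, the example counts, and the (hits + 1) / (n_sim + 1) estimator are assumptions made for this sketch.

```python
import numpy as np


def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table."""
    table = np.asarray(table, dtype=float)
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / table.sum()
    mask = expected > 0  # skip alleles that were never observed
    return float(((table - expected) ** 2)[mask].dot(1.0 / expected[mask]))


def monte_carlo_p(table, n_sim=2000, seed=0):
    """Estimate the significance of the statistic by simulation.

    Tables are simulated under the null with the same row and column
    totals as the observed 2 x k table (hypothetical helper).
    """
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=int)
    n_cases = int(table[0].sum())
    k = table.shape[1]
    col_totals = table.sum(axis=0)
    # One entry per observed allele copy, labelled by allele index.
    alleles = np.repeat(np.arange(k), col_totals)
    observed = chi_squared(table)
    hits = 0
    for _ in range(n_sim):
        perm = rng.permutation(alleles)
        cases = np.bincount(perm[:n_cases], minlength=k)
        simulated = np.vstack([cases, col_totals - cases])
        # Small tolerance guards against floating-point ties.
        if chi_squared(simulated) >= observed - 1e-12:
            hits += 1
    # Add-one estimator so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_sim + 1)


# Made-up sparse counts for a locus with eight alleles.
example = [[12, 1, 0, 3, 0, 2, 1, 1],
           [4, 2, 1, 6, 1, 0, 3, 3]]
stat, p_value = monte_carlo_p(example)
print(f"chi-squared = {stat:.2f}, Monte Carlo p = {p_value:.4f}")
```

Because the reference distribution is generated from the data's own margins, the estimate does not rely on the asymptotic chi-squared distribution, so small expected counts do not require alleles to be combined beforehand.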