Zhou Yi-Hui, Mayhew Gregory, Sun Zhibin, Xu Xiaolin, Zou Fei, Wright Fred A
Department of Statistics, North Carolina State University, Raleigh, NC, U.S.A.
Department of Biostatistics, University of North Carolina, Chapel Hill, NC, U.S.A.
Stat. 2013 Jan 1;2(1):292-302. doi: 10.1002/sta4.37.
The Mantel and Knox space-time clustering statistics are popular tools for establishing the transmissibility of a disease and for detecting outbreaks. The most commonly used approximations to their null distributions may fit poorly, and researchers often resort to direct sampling from the permutation distribution. However, the exact first four moments of these statistics are available, and Pearson distributional approximations are often effective. Thus, our first goal is to clarify the literature and to make these tools more widely available. In addition, by rewriting terms in the statistics, we obtain the exact first four permutation moments for the most commonly used quadratic form statistics, which need not be positive definite. The extension of this work to quadratic forms greatly expands the utility of density approximations for these problems, including in high-dimensional applications, where the statistics must be extreme in order to exceed stringent testing thresholds. We demonstrate the methods using examples from the investigation of disease transmission in cattle, the association of a gene expression pathway with breast cancer survival, regional genetic association with cystic fibrosis lung disease, and hypothesis testing for smoothed local linear regression.
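To make the setting concrete, the following is a minimal sketch of a Mantel-type statistic, its exact first permutation moment, and the direct-sampling baseline that the abstract mentions. It is not the authors' implementation: the function names, the toy data, and the choice of absolute differences as distances are assumptions made for illustration, and only the first moment is shown, not the four-moment Pearson approximation. The statistic is assumed to take the form of a sum over pairs i ≠ j of a[i][j] * b[i][j], with subject labels permuted in b.

```python
import random


def mantel_statistic(a, b, perm=None):
    """Sum of a[i][j] * b[perm[i]][perm[j]] over all ordered pairs i != j."""
    n = len(a)
    if perm is None:
        perm = range(n)  # identity permutation
    return sum(
        a[i][j] * b[perm[i]][perm[j]]
        for i in range(n)
        for j in range(n)
        if i != j
    )


def off_diagonal_sum(m):
    """Sum of all off-diagonal entries of a square matrix."""
    n = len(m)
    return sum(m[i][j] for i in range(n) for j in range(n) if i != j)


def exact_permutation_mean(a, b):
    """Exact mean of the statistic over all n! relabelings of b.

    Under a uniformly random permutation, each off-diagonal entry of b is
    equally likely to be paired with a given off-diagonal entry of a.
    """
    n = len(a)
    return off_diagonal_sum(a) * off_diagonal_sum(b) / (n * (n - 1))


def permutation_p_value(a, b, n_perm=999, seed=0):
    """Monte Carlo upper-tail p-value by direct sampling of permutations."""
    rng = random.Random(seed)
    observed = mantel_statistic(a, b)
    idx = list(range(len(a)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if mantel_statistic(a, b, idx) >= observed:
            hits += 1
    # Count the observed labeling as one of the sampled permutations.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one-dimensional locations and onset times of 5 cases.
locations = [0.0, 1.0, 5.0, 6.0, 9.0]
onsets = [0.0, 2.0, 10.0, 11.0, 3.0]
space_dist = [[abs(x - y) for y in locations] for x in locations]
time_dist = [[abs(s - t) for t in onsets] for s in onsets]

observed_stat = mantel_statistic(space_dist, time_dist)
null_mean = exact_permutation_mean(space_dist, time_dist)
p_value = permutation_p_value(space_dist, time_dist)
```

Direct sampling as in `permutation_p_value` cannot resolve p-values smaller than 1 / (n_perm + 1), which is one reason moment-based density approximations are attractive when testing thresholds are stringent.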