Chen Bingshu E, Sakoda Lori C, Hsing Ann W, Rosenberg Philip S
Biostatistics Branch, Department of Health and Human Services, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, Maryland 20852-7244, USA.
Genet Epidemiol. 2006 Sep;30(6):495-507. doi: 10.1002/gepi.20162.
In case-control studies of unrelated subjects, gene-based hypothesis tests consider whether any tested feature in a candidate gene (single nucleotide polymorphisms [SNPs], haplotypes, or both) is associated with disease. Standard statistical tests are available that control the false-positive rate at the nominal level over all polymorphisms considered. However, more powerful tests can be constructed that use permutation resampling to account for correlations among polymorphisms and test statistics. A key question is whether the gain in power is large enough to justify the computational burden. We compared the computationally simple Simes global test to the min P test, which considers the permutation distribution of the minimum p-value from marginal tests of each SNP. In simulation studies incorporating empirical haplotype structures in 15 genes, the min P test controlled the type I error and was modestly more powerful than the Simes test, by 2.1 percentage points on average. When disease susceptibility was conferred by a haplotype, the min P test sometimes, but not always, under-performed haplotype analysis. A resampling-based omnibus test combining the min P test and the haplotype frequency test controlled the type I error and closely tracked the more powerful of the two component tests. This omnibus test achieved consistent gains in power (5.7 percentage points on average) compared with a simple Bonferroni combination of the Simes test and haplotype analysis. In data from the Shanghai Biliary Tract Cancer Study, a population-based study of bile duct cancer and polymorphisms in the prostaglandin-endoperoxide synthase 2 (PTGS2) gene, the advantages of the newly proposed omnibus test were apparent.
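The two SNP-based tests compared in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names `simes_p`, `trend_pvalues`, and `min_p_test` are hypothetical. The abstract does not say which marginal per-SNP test was used, so the sketch assumes a Cochran-Armitage trend test on additive genotype codes (0, 1, 2) with a normal approximation. It also assumes every SNP is polymorphic in the sample.

```python
import math
import numpy as np


def simes_p(pvals):
    """Simes global p-value: min over i of m * p_(i) / i for sorted p-values."""
    p = np.sort(np.asarray(pvals, dtype=float))
    m = p.size
    return float(min(1.0, np.min(m * p / np.arange(1, m + 1))))


def trend_pvalues(genotypes, status):
    """Per-SNP two-sided p-values from a trend test (assumed marginal test).

    genotypes: (n_subjects, n_snps) array of 0/1/2 minor-allele counts.
    status:    (n_subjects,) array of 0 (control) / 1 (case).
    The trend chi-square equals n * r^2, where r is the Pearson correlation
    between genotype and status, so z = sqrt(n) * r.
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(status, dtype=float)
    n = y.size
    gc = g - g.mean(axis=0)
    yc = y - y.mean()
    r = (gc * yc[:, None]).sum(axis=0) / np.sqrt(
        (gc ** 2).sum(axis=0) * (yc ** 2).sum()
    )
    z = np.sqrt(n) * r
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])


def min_p_test(genotypes, status, n_perm=1000, seed=0):
    """Permutation p-value for the minimum marginal p-value across SNPs.

    Case/control labels are shuffled while genotype rows stay intact,
    which preserves the correlation structure among SNPs.
    """
    rng = np.random.default_rng(seed)
    observed = trend_pvalues(genotypes, status).min()
    hits = 0
    for _ in range(n_perm):
        permuted = rng.permutation(status)
        if trend_pvalues(genotypes, permuted).min() <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The Simes test needs only the observed marginal p-values, whereas the min P test recomputes every marginal test once per permutation; that difference is the computational burden the abstract weighs against the gain in power.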