Medical Research Council Biostatistics Unit, Institute of Public Health, Cambridge, United Kingdom.
PLoS One. 2012;7(7):e41018. doi: 10.1371/journal.pone.0041018. Epub 2012 Jul 31.
It has been suggested that pathway analysis can complement single-SNP analysis in exploring genome-wide association data. Pathway analysis incorporates the available biological knowledge of genes and SNPs and is expected to improve the chances of revealing the underlying genetic architecture of complex traits. Methods for pathway analysis can be classified as competitive (enrichment) or self-contained (association) according to the hypothesis tested. Although association tests are statistically more powerful than enrichment tests, they can be difficult to calibrate because biases in the analysis accumulate across multiple SNPs or genes. Furthermore, enrichment tests can be more scientifically relevant than association tests, as they detect pathways with relatively more evidence for association than the remaining genes. Here we show how some well-known association tests can be simply adapted to test for enrichment, and compare their performance to some established enrichment tests. We propose versions of the Adaptive Rank Truncated Product (ARTP), Tail Strength Measure and Fisher's combination of p-values for testing the enrichment null hypothesis. We compare the behaviour of these proposed methods with the established Hypergeometric Test and Gene-Set Enrichment Analysis (GSEA). The results of the simulation study show that the modified version of the ARTP method generally has the best performance across the situations considered. The methods were also applied to find enriched pathways for body mass index (BMI) and platelet function phenotypes. The pathway analysis of BMI identified the Vasoactive Intestinal Peptide pathway as significantly associated with BMI. This pathway has previously been reported as associated with BMI and the risk of obesity. The ARTP method identified the largest number of enriched pathways across all tested pathway databases and phenotypes.
The simulation and data application results agree with previous work on association tests and suggest that the ARTP should be preferred for both enrichment and association testing.
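To make the distinction between the two hypotheses concrete, the following is a minimal sketch of the two standard, unmodified tests named above: the Hypergeometric Test (competitive, enrichment) and Fisher's combination of p-values (self-contained, association). It is not the authors' adapted enrichment versions, which the abstract does not specify. The function names, argument names and the idea of counting "significant" genes by a fixed threshold are illustrative assumptions, and Fisher's combination as written assumes independent p-values.

```python
import math


def hypergeom_enrichment_p(n_genes, n_significant, pathway_size, pathway_significant):
    """Competitive (enrichment) test, standard hypergeometric upper tail.

    Returns P(X >= pathway_significant), where X is the number of significant
    genes in a random gene set of size `pathway_size` drawn from `n_genes`
    genes, `n_significant` of which are significant.
    All argument names are hypothetical, chosen for this sketch.
    """
    total = math.comb(n_genes, pathway_size)
    upper = min(pathway_size, n_significant)
    tail = sum(
        math.comb(n_significant, k)
        * math.comb(n_genes - n_significant, pathway_size - k)
        for k in range(pathway_significant, upper + 1)
    )
    return tail / total


def fisher_combination_p(p_values):
    """Self-contained (association) test, standard Fisher combination.

    The statistic -2 * sum(log p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, assuming independent p-values.
    For even degrees of freedom the survival function has the closed form
    exp(-h) * sum_{i<k} h**i / i!, with h equal to half the statistic.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)
    return math.exp(-half_stat) * sum(
        half_stat**i / math.factorial(i) for i in range(k)
    )


if __name__ == "__main__":
    # Toy numbers, invented purely for illustration.
    print(hypergeom_enrichment_p(10, 5, 4, 4))
    print(fisher_combination_p([0.01, 0.20, 0.03]))
```

The first function asks whether a pathway holds more significant genes than a random gene set of the same size would, so it depends on the genes outside the pathway. The second asks only whether the pathway's own p-values are jointly smaller than expected under no association.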