Laboratory of Population Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America.
PLoS Genet. 2011 Jun;7(6):e1002101. doi: 10.1371/journal.pgen.1002101. Epub 2011 Jun 9.
Genome-wide association studies (GWAS) have become increasingly common due to advances in technology and have permitted the identification of differences in single nucleotide polymorphism (SNP) alleles that are associated with diseases. However, while typical GWAS analysis techniques treat markers individually, complex diseases (cancers, diabetes, and Alzheimer's disease, among others) are unlikely to have a single causative gene. Thus, there is a pressing need for multi-SNP analysis methods that can reveal system-level differences between cases and controls. Here, we present a novel multi-SNP GWAS analysis method called Pathways of Distinction Analysis (PoDA). The method uses GWAS data and known pathway-gene and gene-SNP associations to identify pathways that, ideally, permit cases to be distinguished from controls. The technique is based upon the hypothesis that, if a pathway is related to disease risk, cases will appear more similar to other cases than to controls (or vice versa) for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater similarity within classes than across classes. Importantly, PoDA improves on existing single-SNP and SNP-set enrichment analyses in that it does not require the SNPs in a pathway to exhibit independent main effects. This permits PoDA to reveal pathways in which epistatic interactions drive risk. In this paper, we detail the PoDA method and apply it to two GWAS: one of breast cancer and the other of liver cancer. The results obtained strongly suggest that pathway-wide genomic differences exist that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems level.
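The within-class similarity hypothesis can be illustrated with a minimal sketch. The abstract does not give PoDA's actual statistic, so everything here is an assumption made for illustration: genotypes coded as 0/1/2 allele counts, a Manhattan distance over the pathway's SNPs, a leave-one-out comparison of each sample against the remaining cases and controls, and the hypothetical names `manhattan`, `relative_distance`, and `pathway_separation`. The toy genotypes are invented, not taken from the paper.

```python
from statistics import mean


def manhattan(a, b, snps):
    """Genotype distance restricted to the given SNP column indices."""
    return sum(abs(a[j] - b[j]) for j in snps)


def relative_distance(i, genotypes, is_case, snps):
    """Mean distance from sample i to controls minus mean distance to cases.

    Sample i is left out of its own class. A positive value means the
    sample sits closer to the cases than to the controls on these SNPs.
    """
    others = [k for k in range(len(genotypes)) if k != i]
    to_cases = [manhattan(genotypes[i], genotypes[k], snps)
                for k in others if is_case[k]]
    to_controls = [manhattan(genotypes[i], genotypes[k], snps)
                   for k in others if not is_case[k]]
    return mean(to_controls) - mean(to_cases)


def pathway_separation(genotypes, is_case, snps):
    """Mean relative distance in cases minus mean relative distance in controls.

    Assumed summary score (not the paper's): larger values mean the SNP set
    separates the two classes more clearly.
    """
    scores = [relative_distance(i, genotypes, is_case, snps)
              for i in range(len(genotypes))]
    case_scores = [s for s, c in zip(scores, is_case) if c]
    control_scores = [s for s, c in zip(scores, is_case) if not c]
    return mean(case_scores) - mean(control_scores)


# Invented toy data: rows are samples, columns are SNPs (0/1/2 allele counts).
genotypes = [
    [2, 2, 0],  # case
    [2, 1, 2],  # case
    [1, 2, 1],  # case
    [0, 0, 2],  # control
    [0, 1, 0],  # control
    [1, 0, 1],  # control
]
is_case = [True, True, True, False, False, False]

pathway_snps = [0, 1]    # hypothetical pathway: SNPs 0 and 1
unrelated_snps = [2]     # hypothetical SNP outside the pathway

sep_pathway = pathway_separation(genotypes, is_case, pathway_snps)
sep_unrelated = pathway_separation(genotypes, is_case, unrelated_snps)
print(sep_pathway, sep_unrelated)
```

On this toy data the separation score is larger for the hypothetical pathway SNPs than for the unrelated SNP. Because the score is computed from the joint genotype across all SNPs in the set, it does not rely on testing each SNP for its own main effect. A real analysis would also need a significance test and correction for pathway size, both omitted here.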