Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, QLD, Australia.
Shandong Provincial Key Laboratory of Functional Macromolecular Biophysics, Dezhou University, Dezhou, Shandong, China.
Sci Rep. 2017 Jun 14;7(1):3512. doi: 10.1038/s41598-017-03826-2.
Genome-wide association studies (GWAS) have successfully identified single variants associated with diseases. To increase the power of GWAS, gene-based and pathway-based tests are commonly employed to detect more risk factors. However, gene- and pathway-based association tests may be biased towards genes or pathways that contain many single-nucleotide polymorphisms (SNPs) whose small P-values arise from high linkage disequilibrium (LD) correlations. To address such bias, numerous pathway-based methods have been developed. Here we propose a novel method, DGAT-path, which divides all SNPs assigned to the genes in each pathway into LD blocks and sums the chi-square statistics of the LD blocks, assessing the significance of the pathway by permutation tests. The method proved robust, with a type I error rate more than 1.6 times lower than that of other methods. It also shows higher power and is not biased by pathway size. Applications to GWAS summary statistics for schizophrenia and breast cancer indicate that the top pathways detected by DGAT-path contain more genes close to associated SNPs than those detected by other methods. The method identified 17 significant pathways containing 20 novel associated genes for schizophrenia, and 12 significant pathways containing 21 novel associated genes for breast cancer. The method is available online at http://sparks-lab.org/server/DGAT-path .
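The block-then-sum idea described above can be illustrated with a minimal Python sketch. It is not the DGAT-path implementation: the abstract does not specify how LD blocks are delimited, how a block's chi-square statistic is derived, or how the permutations are performed. The sketch therefore assumes, purely for illustration, that adjacent SNPs are merged into a block when their pairwise r² meets a threshold, that each block is represented by its largest SNP chi-square, and that the null distribution is approximated by drawing random sets of background block statistics. The function names `ld_blocks`, `pathway_statistic`, and `permutation_pvalue`, and the 0.5 threshold, are hypothetical.

```python
import random

def ld_blocks(r2, threshold=0.5):
    """Group adjacent SNPs into blocks (assumed rule, not from the paper):
    SNP i joins the current block if r2[i-1][i] >= threshold."""
    if not r2:
        return []
    blocks = [[0]]
    for i in range(1, len(r2)):
        if r2[i - 1][i] >= threshold:
            blocks[-1].append(i)
        else:
            blocks.append([i])
    return blocks

def pathway_statistic(chi2, blocks):
    """Sum one chi-square value per block.
    Assumption: a block is represented by its largest SNP chi-square."""
    return sum(max(chi2[i] for i in block) for block in blocks)

def permutation_pvalue(observed, background, n_blocks, n_perm=1000, seed=0):
    """Empirical P-value: how often a random set of n_blocks background
    block statistics sums to at least the observed pathway statistic."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        if sum(rng.sample(background, n_blocks)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy data: 5 SNPs; only adjacent-pair r^2 values matter for this sketch.
r2 = [
    [1.0, 0.9, 0.0, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0, 0.0],
    [0.0, 0.1, 1.0, 0.8, 0.0],
    [0.0, 0.0, 0.8, 1.0, 0.7],
    [0.0, 0.0, 0.0, 0.7, 1.0],
]
chi2 = [4.0, 9.0, 1.0, 2.0, 3.0]

blocks = ld_blocks(r2)
stat = pathway_statistic(chi2, blocks)
p_value = permutation_pvalue(stat, [1.0] * 100, n_blocks=len(blocks))
```

Collapsing correlated SNPs into one value per block means that a long run of SNPs in high LD contributes once to the pathway statistic rather than once per SNP, which is the source of bias the abstract describes.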