de Leeuw Christiaan A, Mooij Joris M, Heskes Tom, Posthuma Danielle
Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, VU University Amsterdam, Amsterdam, The Netherlands; Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands.
Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands.
PLoS Comput Biol. 2015 Apr 17;11(4):e1004219. doi: 10.1371/journal.pcbi.1004219. eCollection 2015 Apr.
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
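As an illustration of the regression structure described for the gene-set layer, the following is a minimal sketch, not MAGMA's implementation. It assumes each gene already has a z-score from the gene analysis and regresses those scores on a 0/1 indicator of set membership. The function name, the input layout, and the toy values are hypothetical. The sketch also ignores correlation between genes and any further gene-level covariates.

```python
import math


def gene_set_regression(z_scores, in_set):
    """Simple OLS of gene z-scores on a 0/1 set-membership indicator.

    Hypothetical helper for illustration only. Returns (slope, t_statistic),
    where a positive slope means genes in the set have higher z-scores on
    average than genes outside it.
    """
    n = len(z_scores)
    if n != len(in_set) or n < 3:
        raise ValueError("need equal-length inputs with at least 3 genes")

    mean_x = sum(in_set) / n
    mean_z = sum(z_scores) / n

    # Sum of squares of the predictor and cross-product with the outcome.
    sxx = sum((x - mean_x) ** 2 for x in in_set)
    if sxx == 0:
        raise ValueError("indicator must contain both 0 and 1")
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(in_set, z_scores))

    slope = sxz / sxx
    intercept = mean_z - slope * mean_x

    # Residual variance with n - 2 degrees of freedom (intercept + slope).
    rss = sum((z - intercept - slope * x) ** 2 for x, z in zip(in_set, z_scores))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, t_stat


if __name__ == "__main__":
    # Toy data (made up): the first three genes belong to the set.
    z = [2.1, 1.8, 2.5, 0.1, -0.3, 0.4, -0.2, 0.0]
    member = [1, 1, 1, 0, 0, 0, 0, 0]
    slope, t_stat = gene_set_regression(z, member)
    print(f"slope={slope:.3f}, t={t_stat:.2f}")
```

Because the predictor is just another column in a regression, the same structure extends to a continuous gene property in place of the 0/1 indicator, or to several predictors at once, which is the generalization the abstract refers to.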