Vital-IT Group, Molecular Modeling Group, Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Bioinformatics. 2010 Jun 1;26(11):1468-9. doi: 10.1093/bioinformatics/btq147. Epub 2010 Apr 7.
Genome-wide association studies have become widely used tools to study effects of genetic variants on complex diseases. While it is of great interest to extend existing analysis methods by considering interaction effects between pairs of loci, the large number of possible tests presents a significant computational challenge. The number of computations is further multiplied in the study of gene expression quantitative trait mapping, in which tests are performed for thousands of gene phenotypes simultaneously.
We present FastEpistasis, an efficient parallel solution extending the PLINK epistasis module, designed to test for epistasis effects when analyzing continuous phenotypes. Our results show that the algorithm scales with the number of processors and offers a reduction in computation time when several phenotypes are analyzed simultaneously. FastEpistasis is capable of testing the association of a continuous trait with all single nucleotide polymorphism (SNP) pairs from 500 000 SNPs, totaling approximately 125 billion tests, in a population of 5000 individuals in 29, 4 or 0.5 days using 8, 64 or 512 processors, respectively.
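The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of FastEpistasis itself: the function names `pair_count` and `parallel_efficiency` are hypothetical, and the 8-processor run is assumed as the baseline for the efficiency estimate. Because the reported run times are rounded, the resulting efficiencies are only approximate.

```python
import math

N_SNPS = 500_000
# Wall-clock days reported in the abstract, keyed by processor count.
REPORTED_DAYS = {8: 29.0, 64: 4.0, 512: 0.5}


def pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs, i.e. n choose 2."""
    return math.comb(n_snps, 2)


def parallel_efficiency(days_by_procs: dict, baseline_procs: int) -> dict:
    """Observed speedup divided by ideal (linear) speedup.

    Assumption: the run on `baseline_procs` processors is taken as the
    reference point, since no single-processor timing is reported.
    """
    base_days = days_by_procs[baseline_procs]
    efficiency = {}
    for procs, days in sorted(days_by_procs.items()):
        speedup = base_days / days        # how much faster than baseline
        ideal = procs / baseline_procs    # speedup under perfect scaling
        efficiency[procs] = speedup / ideal
    return efficiency


if __name__ == "__main__":
    print(f"SNP pairs: {pair_count(N_SNPS):,}")
    for procs, eff in parallel_efficiency(REPORTED_DAYS, 8).items():
        print(f"{procs:>3} processors: efficiency ~ {eff:.2f}")
```

Under these assumptions, the pair count is 124 999 750 000, consistent with the rounded figure of 125 billion, and both the 64- and 512-processor runs come out at roughly 90% of ideal linear scaling relative to the 8-processor run.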
FastEpistasis is open source and available free of charge, for non-commercial users only, from http://www.vital-it.ch/software/FastEpistasis.