Department of Medical Statistics and Epidemiology, School of Public Health, Guangdong Medical College, Dongguan, Guangdong, China.
PLoS One. 2011;6(10):e26435. doi: 10.1371/journal.pone.0026435. Epub 2011 Oct 25.
BACKGROUND: Rapid advances in large-scale SNP-chip technologies offer great opportunities for elucidating the genetic basis of complex diseases, and several groups have been developing methods for large-scale interaction analysis. Because of several difficult issues (e.g., sparseness of data in high dimensions and low replication or validation rates), developing fast, powerful and robust methods for detecting various forms of gene-gene interaction remains a challenging task.
METHODOLOGY/PRINCIPAL FINDINGS: In this article, we have developed an evolution-based method to search for genome-wide epistasis in a case-control design. From an evolutionary perspective, we assume that human diseases originate from ancient mutations and that the underlying genetic variants play a role in differentiating the human population into the healthy and the diseased. Based on this concept, a traditional evolutionary measure, the fixation index (Fst) for two unlinked loci, which measures the genetic distance between populations, should be able to reveal the genetic interplay responsible for disease traits. To validate our proposal, we first investigated the theoretical distribution of Fst using extensive simulations. We then explored its power for detecting gene-gene interactions via SNP markers, and compared it with the conventional Pearson chi-square test, a mutual-information-based test and a linkage-disequilibrium (LD)-based test under several disease models. The proposed evolution-based method outperformed these methods in dominant and additive models, regardless of the disease allele frequency. However, its performance was relatively poor in a recessive model. Finally, we applied the proposed method to the analysis of a published dataset. Our results showed that the P value of the Fst-based statistic was smaller than those obtained with the LD-based statistic or Poisson regression models.
CONCLUSIONS/SIGNIFICANCE: With the rapid growth of large-scale genetic association studies, the proposed evolution-based method may be a promising tool for identifying epistatic effects.
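To make the core idea concrete (treating cases and controls as two populations and measuring their genetic distance at a pair of SNPs), the following is a minimal sketch, not the authors' statistic. It assumes a Nei-style definition, Fst = (H_T - H_S) / H_T, with the diversity index 1 - sum(p_i^2) computed over joint two-SNP genotype categories, and it assesses significance by a simple label permutation rather than by the theoretical distribution studied in the paper. The function names `diversity`, `fst_two_locus` and `permutation_p_value` are hypothetical.

```python
import random
from collections import Counter


def diversity(categories):
    """Diversity index 1 - sum(p_i^2) over the observed categories."""
    n = len(categories)
    counts = Counter(categories)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def fst_two_locus(cases, controls):
    """Nei-style Fst between cases and controls for one SNP pair.

    Assumption (not taken from the paper): each individual is a tuple
    (g1, g2) of genotype codes 0/1/2 at two SNPs, and every distinct
    tuple is treated as one category of a single multi-state locus.
    """
    cases, controls = list(cases), list(controls)
    n_case, n_ctrl = len(cases), len(controls)
    # Within-group diversity, weighted by group size.
    h_s = (n_case * diversity(cases) + n_ctrl * diversity(controls)) / (n_case + n_ctrl)
    # Diversity of the pooled sample.
    h_t = diversity(cases + controls)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def permutation_p_value(cases, controls, n_perm=1000, seed=0):
    """Empirical P value: shuffle case/control labels and recompute Fst."""
    rng = random.Random(seed)
    observed = fst_two_locus(cases, controls)
    pooled = list(cases) + list(controls)
    n_case = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if fst_two_locus(pooled[:n_case], pooled[n_case:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

In a genome-wide scan this function would be called once per SNP pair, so a closed-form or simulated null distribution, as investigated in the article, would be far cheaper than per-pair permutation; the permutation step here only illustrates how a P value can be attached to the statistic.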