Moore Jason H, Hahn Lance W, Ritchie Marylyn D, Thornton Tricia A, White Bill C
Program in Human Genetics, Department of Molecular Physiology and Biophysics, 519 Light Hall, Vanderbilt University Medical School, Nashville, TN 37232-0700, USA.
Appl Soft Comput. 2004 Feb 1;4(1):79-86. doi: 10.1016/j.asoc.2003.08.003.
Simulation studies are useful in various disciplines for a number of reasons including the development and evaluation of new computational and statistical methods. This is particularly true in human genetics and genetic epidemiology where new analytical methods are needed for the detection and characterization of disease susceptibility genes whose effects are complex, nonlinear, and partially or solely dependent on the effects of other genes (i.e. epistasis or gene-gene interaction). Despite this need, the development of complex genetic models that can be used to simulate data is not always intuitive. In fact, only a few such models have been published. We have previously developed a genetic algorithm approach to discovering complex genetic models in which two single nucleotide polymorphisms (SNPs) influence disease risk solely through nonlinear interactions. In this paper, we extend this approach for the discovery of high-order epistasis models involving three to five SNPs. We demonstrate that the genetic algorithm is capable of routinely discovering interesting high-order epistasis models in which each SNP influences risk of disease only through interactions with the other SNPs in the model. This study opens the door for routine simulation of complex gene-gene interactions among SNPs for the development and evaluation of new statistical and computational approaches for identifying common, complex multifactorial disease susceptibility genes.
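To make concrete what it means for each SNP to influence risk "only through interactions", the sketch below checks a penetrance table for the absence of single-locus (marginal) effects. It is not the genetic algorithm described in the paper. The function names, the parity-style penetrance table, the allele frequency of 0.5, and the assumption of Hardy-Weinberg proportions with independent loci are all illustrative choices made here, not taken from the abstract.

```python
from itertools import product


def genotype_freqs(p):
    """Hardy-Weinberg genotype frequencies for one biallelic SNP.

    p is the frequency of one allele; genotypes are coded 0, 1, 2
    by the number of copies of the other allele.
    """
    q = 1.0 - p
    return [p * p, 2 * p * q, q * q]


def marginal_penetrances(penetrance, allele_freqs):
    """Average disease risk per genotype at each locus.

    penetrance maps a tuple of genotypes (one entry per SNP, each 0..2)
    to a disease probability. Loci are assumed to be independent.
    """
    k = len(allele_freqs)
    freqs = [genotype_freqs(p) for p in allele_freqs]
    margins = [[0.0, 0.0, 0.0] for _ in range(k)]
    for combo in product(range(3), repeat=k):
        risk = penetrance[combo]
        for locus in range(k):
            # Weight by the genotype frequencies at all *other* loci.
            weight = 1.0
            for other in range(k):
                if other != locus:
                    weight *= freqs[other][combo[other]]
            margins[locus][combo[locus]] += weight * risk
    return margins


def has_no_marginal_effects(penetrance, allele_freqs, tol=1e-9):
    """True if every single-locus marginal penetrance is the same value."""
    margins = marginal_penetrances(penetrance, allele_freqs)
    flat = [m for locus in margins for m in locus]
    return max(flat) - min(flat) < tol


def parity_model(k, risk=0.1):
    """Illustrative table: risk is `risk` when the genotype codes sum to an odd number."""
    return {
        combo: (risk if sum(combo) % 2 == 1 else 0.0)
        for combo in product(range(3), repeat=k)
    }


# Hypothetical three-SNP example with all allele frequencies set to 0.5.
three_snp_table = parity_model(3)
three_snp_margins = marginal_penetrances(three_snp_table, [0.5, 0.5, 0.5])
```

Under these assumptions the three-SNP parity table yields the same marginal risk for every genotype at every locus, so no SNP is detectable on its own. A search procedure such as a genetic algorithm could use a check of this kind as part of its fitness evaluation, although the abstract does not specify the fitness function actually used.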