Hodge Susan E, Hager Valerie R, Greenberg David A
Battelle Center for Mathematical Medicine, The Research Institute, Nationwide Children's Hospital, Columbus, Ohio, 43215, United States of America.
Department of Pediatrics, College of Medicine, The Ohio State University, Columbus, Ohio, 43215, United States of America.
PLoS One. 2016 Jan 11;11(1):e0146240. doi: 10.1371/journal.pone.0146240. eCollection 2016.
Detecting gene-gene interaction in complex diseases has become an important priority for common disease genetics, but most current approaches to detecting interaction start with disease-marker associations. These approaches are based on population allele frequency correlations, not genetic inheritance, and therefore cannot exploit the rich information about inheritance contained within families. They are also hampered by issues of rigorous phenotype definition, multiple test correction, and allelic and locus heterogeneity. We recently developed, tested, and published a powerful gene-gene interaction detection strategy based on conditioning family data on a known disease-causing allele or a disease-associated marker allele [4]. We successfully applied the method to disease data and used computer simulation to test the method exhaustively for some epistatic models. We knew that the statistic we developed to indicate interaction was less reliable when applied to more complex interaction models. Here, we improve the statistic and expand the testing procedure. We simulated multipoint linkage data for a disease caused by two interacting loci. We examined epistatic as well as additive models and compared them with heterogeneity models. In all our models, the at-risk genotypes are "major" in the sense that a substantial proportion of affected individuals have a disease-related genotype. One of the loci (A) has a known disease-related allele (as would have been determined from a previous analysis). We removed (pruned) family members who did not carry this allele; the resulting dataset is referred to as "stratified." This pruning step has the effect of raising the "penetrance" and detectability at the second locus (B). We used the lod scores for the stratified and unstratified datasets to calculate a statistic that either indicated the presence of interaction or indicated that no interaction was detectable.
We show that the new method is robust and reliable for a wide range of parameters. Our statistic performs well both with the epistatic models (false negative rates, i.e., failing to detect interaction, ranging from 0 to 2.5%) and with the heterogeneity models (false positive rates, i.e., falsely detecting interaction, ≤1%). It works well with the additive model except when allele frequencies at the two loci differ widely. We explore those features of the additive model that make detecting interaction more difficult. All testing of this method suggests that it provides a reliable approach to detecting gene-gene interaction.
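The pruning step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Member` record, its field names, and the toy pedigree are hypothetical, and the interaction statistic itself is not reproduced because the abstract does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical pedigree record; field names are illustrative only."""
    family_id: str
    person_id: str
    affected: bool
    carries_a_allele: bool  # carrier of the known disease-related allele at locus A


def stratify(members):
    """Return the 'stratified' dataset: drop members who do not carry the locus A allele."""
    return [m for m in members if m.carries_a_allele]


# Toy data, invented for illustration (not from the paper).
pedigree = [
    Member("F1", "1", affected=False, carries_a_allele=True),
    Member("F1", "2", affected=True, carries_a_allele=True),
    Member("F1", "3", affected=True, carries_a_allele=False),
    Member("F2", "1", affected=True, carries_a_allele=True),
    Member("F2", "2", affected=False, carries_a_allele=False),
]

stratified = stratify(pedigree)
```

In the workflow the abstract describes, linkage analysis at locus B would then be run on both `pedigree` (unstratified) and `stratified`, and the two sets of lod scores would feed the interaction statistic.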