Talluri Rajesh, Shete Sanjay
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
Cancer Inform. 2015 Feb 10;14(Suppl 2):17-23. doi: 10.4137/CIN.S17289. eCollection 2015.
Epistasis helps to explain how multiple single-nucleotide polymorphisms (SNPs) interact to cause disease. A variety of tools have been developed to detect epistasis. In this article, we explore the strengths and weaknesses of an information theory approach for detecting epistasis and compare it to the logistic regression approach through simulations. We consider several scenarios that simulate the involvement of SNPs in an epistasis network, varying the linkage disequilibrium patterns among the SNPs and the presence or absence of main and interaction effects. We conclude that the information theory approach detects interaction effects more efficiently when main effects are absent, whereas the logistic regression approach is generally appropriate in all scenarios but yields more false positives. To demonstrate the practical application of the methods, we compute epistasis networks for SNPs in the FSD1L gene using a two-phase head and neck cancer genome-wide association study involving 2,185 cases and 4,507 controls.
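The abstract does not state which information-theoretic measure the authors use. As an illustration only, the sketch below assumes one common choice, the interaction information gain I(A,B;C) - I(A;C) - I(B;C), for two SNPs A and B and case/control status C. The function names and the toy XOR data are hypothetical and are not taken from the article. The toy data have no main effects and a pure interaction, which is the scenario in which the abstract reports the information theory approach performing best.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of hashable observations."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired observations."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_gain(snp_a, snp_b, status):
    """Assumed measure: I(A,B;C) - I(A;C) - I(B;C).

    A positive value suggests the SNP pair jointly carries information
    about disease status beyond what each SNP carries on its own.
    """
    joint = list(zip(snp_a, snp_b))
    return (mutual_information(joint, status)
            - mutual_information(snp_a, status)
            - mutual_information(snp_b, status))


# Hypothetical toy data: status is the XOR of two binary genotype codes,
# so neither SNP has a main effect but the pair determines the status.
snp_a = [0, 0, 1, 1] * 25
snp_b = [0, 1, 0, 1] * 25
status = [a ^ b for a, b in zip(snp_a, snp_b)]

print(mutual_information(snp_a, status))       # main effect of A alone
print(interaction_gain(snp_a, snp_b, status))  # pairwise interaction gain
```

In practice such a statistic would be computed for every SNP pair and compared against a permutation-based null distribution to build the network. That step is omitted here, and the abstract does not indicate how the authors assess significance.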