Abo Alchamlat Sinan, Farnir Frédéric
Department of Biostatistics, Faculty of Veterinary Medicine, FARAH, University of Liège, Sart Tilman B43, 4000, Liege, Belgium.
BMC Bioinformatics. 2017 Mar 21;18(1):184. doi: 10.1186/s12859-017-1599-7.
Finding epistatic interactions in large association studies such as genome-wide association studies (GWAS), given the large volume of genomic data now available, is a challenging and largely unsolved problem. Few previous studies could handle genome-wide data, because the search space grows combinatorially and epistatic interactions must be evaluated statistically with a limited number of samples. Our work is a contribution to this field. We propose a novel approach that combines K-Nearest Neighbors (KNN) and Multifactor Dimensionality Reduction (MDR) for detecting gene-gene interactions, as a possible alternative to existing algorithms, especially in situations where the number of involved determinants is high. After describing the approach, we compare our method (KNN-MDR) with a set of the best-performing existing methods (MDR, BOOST, BHIT, MegaSNPHunter and AntEpiSeeker) for detecting interactions in simulated data as well as real genome-wide data.
Experimental results on both simulated data and real genome-wide data show that KNN-MDR performs well in terms of accuracy and power and that, in many cases, it significantly outperforms its recent competitors.
The presented methodology (KNN-MDR) is valuable for mapping loci and interactions and can be seen as a useful addition to the set of tools used in complex trait analyses.
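To illustrate the two ingredients named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes genotypes coded 0/1/2, a binary case/control status, a classic MDR step that labels each multilocus genotype cell as high or low risk, and a hypothetical KNN variant that replaces the cell by each sample's k nearest neighbours in genotype space (Manhattan distance). The function names, the value of `k`, the scoring by balanced accuracy, and the simulated data are all assumptions made for illustration.

```python
import numpy as np
from itertools import combinations


def balanced_accuracy(pred, status):
    """Mean of sensitivity and specificity for 0/1 predictions."""
    sens = np.mean(pred[status == 1] == 1)
    spec = np.mean(pred[status == 0] == 0)
    return 0.5 * (sens + spec)


def mdr_score(geno, status, snps):
    """Classic MDR step: pool multilocus genotypes into high/low-risk cells."""
    # Encode each sample's genotype combination as one base-3 cell index.
    cells = np.zeros(len(status), dtype=int)
    for s in snps:
        cells = cells * 3 + geno[:, s]
    n_cells = 3 ** len(snps)
    cases = np.bincount(cells[status == 1], minlength=n_cells)
    controls = np.bincount(cells[status == 0], minlength=n_cells)
    # A cell is high risk if its case:control ratio exceeds the overall ratio.
    threshold = status.sum() / (len(status) - status.sum())
    high_risk = cases > threshold * controls
    return balanced_accuracy(high_risk[cells].astype(int), status)


def knn_mdr_score(geno, status, snps, k=15):
    """Hypothetical KNN variant: risk is judged from the k nearest samples."""
    sub = geno[:, snps].astype(float)
    # Pairwise Manhattan distance between samples on the selected SNPs.
    dist = np.abs(sub[:, None, :] - sub[None, :, :]).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    case_fraction = status[nearest].mean(axis=1)
    # Predict "case" when neighbours hold more cases than the overall rate.
    pred = (case_fraction > status.mean()).astype(int)
    return balanced_accuracy(pred, status)


def demo(seed=0, n_samples=400, n_snps=6):
    """Simulate one interacting SNP pair (0, 1) and rank all pairs by MDR."""
    rng = np.random.default_rng(seed)
    geno = rng.integers(0, 3, size=(n_samples, n_snps))
    # Assumed toy model: risk is high when SNP 0 + SNP 1 is odd.
    p_case = np.where((geno[:, 0] + geno[:, 1]) % 2 == 1, 0.8, 0.2)
    status = (rng.random(n_samples) < p_case).astype(int)
    scores = {pair: mdr_score(geno, status, list(pair))
              for pair in combinations(range(n_snps), 2)}
    best = max(scores, key=scores.get)
    return best, scores[best], knn_mdr_score(geno, status, list(best))


if __name__ == "__main__":
    best_pair, mdr_acc, knn_acc = demo()
    print(best_pair, round(mdr_acc, 3), round(knn_acc, 3))
```

With only two SNPs the neighbour-based score and the cell-based score behave almost identically, since nearest neighbours share the same genotype cell. The neighbour-based variant is meant to suggest how sparse cells might be handled when many determinants are considered at once, which is the situation the abstract highlights.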