Upton Alex, Trelles Oswaldo, Cornejo-García José Antonio, Perkins James Richard
Brief Bioinform. 2016 May;17(3):368-79. doi: 10.1093/bib/bbv058. Epub 2015 Aug 13.
It is becoming clear that most human diseases have a complex etiology that cannot be explained by individual single nucleotide polymorphisms (SNPs) or simple additive combinations of them; the general consensus is that they are caused by combinations of multiple genetic variations. The limited success of some genome-wide association studies is partly a result of their focus on single genetic markers. A more promising approach is to take epistasis into account by testing interactions among multiple SNPs for association with disease. However, as genomic data continue to grow in resolution, and genome and exome sequencing become more established, the number of combinations of variants to consider increases rapidly. Two potential solutions should be considered: high-performance computing, which allows a larger number of variables to be considered, and heuristics that make the problem more tractable, which are essential in the case of genome sequencing. In this review, we look at different computational methods to analyse epistatic interactions within disease-related genetic data sets created by microarray technology. We also review efforts to use epistatic analysis results to produce biomarkers for diagnostic tests and give our views on future directions in this field in light of advances in sequencing technology and of variants in non-coding regions.
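As a rough illustration of why the number of variant combinations grows so quickly, the following minimal sketch counts the unordered SNP sets an exhaustive scan would have to test at a given interaction order. The function name `count_snp_combinations` and the panel sizes are hypothetical, chosen only for illustration; they are not taken from the review.

```python
from math import comb


def count_snp_combinations(n_snps: int, order: int) -> int:
    """Number of distinct unordered sets of `order` SNPs drawn from `n_snps`.

    This is the binomial coefficient C(n_snps, order), i.e. the number of
    candidate interactions an exhaustive scan would need to evaluate.
    """
    return comb(n_snps, order)


if __name__ == "__main__":
    # Hypothetical panel sizes (assumptions for illustration only).
    for n_snps in (1_000, 100_000, 1_000_000):
        for order in (2, 3):
            total = count_snp_combinations(n_snps, order)
            print(f"{n_snps:>9,} SNPs, order {order}: {total:,} combinations")
```

Because the count grows roughly as the number of SNPs raised to the interaction order, moving from pairwise to three-way interactions multiplies the workload by a factor on the order of the panel size, which is the motivation for both high-performance computing and heuristic pruning.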