Transformational Bioinformatics, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation (CSIRO), North Ryde, NSW, 2113, Australia.
The Kinghorn Cancer Centre, Darlinghurst, NSW, 2010, Australia.
Sci Rep. 2021 Aug 5;11(1):15923. doi: 10.1038/s41598-021-94959-y.
Complex genetic diseases may be modulated by a large number of epistatic interactions affecting a polygenic phenotype. Identifying these interactions is computationally demanding, especially for higher-order interactions that involve more than two genomic variants. In this paper, we present BitEpi, a fast and accurate method to test all possible combinations of up to four bi-allelic variants (single nucleotide variants, or SNVs). BitEpi introduces a novel bitwise algorithm that is 1.7 and 56 times faster than established software for 3-SNV and 4-SNV searches, respectively. The novel entropy statistic used in BitEpi is 44% more accurate at identifying interacting SNVs and incorporates p-value-based significance testing. We demonstrate BitEpi on real-world data comprising 4900 samples and 87,000 SNPs. We also present EpiExplorer, which visualizes the potentially large number of individual and interacting SNVs in an interactive Cytoscape graph. EpiExplorer uses various visual elements to facilitate the discovery of true biological events in a complex polygenic environment.
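To make the idea of an exhaustive higher-order search concrete, the following is a minimal Python sketch, not BitEpi's actual implementation. It assumes genotypes are coded 0, 1 or 2, packs each SNV combination into a single integer key with bit shifts (2 bits per SNV), counts cases and controls per key, and ranks combinations by a generic information-gain score. The function names, the 2-bit packing layout, and the information-gain score are illustrative assumptions; the statistic and data layout used in BitEpi itself may differ.

```python
import math
from itertools import combinations


def pack_genotypes(genotypes):
    """Pack genotype codes (0, 1 or 2) into one integer, 2 bits per SNV."""
    key = 0
    for g in genotypes:
        key = (key << 2) | g
    return key


def contingency_table(samples, labels, snv_idx):
    """Map each packed genotype key to [control_count, case_count]."""
    table = {}
    for row, label in zip(samples, labels):
        key = pack_genotypes(row[i] for i in snv_idx)
        table.setdefault(key, [0, 0])[label] += 1
    return table


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(table):
    """H(phenotype) minus H(phenotype | genotype combination)."""
    totals = [sum(c[0] for c in table.values()),
              sum(c[1] for c in table.values())]
    n = sum(totals)
    conditional = sum(sum(c) / n * entropy(c) for c in table.values())
    return entropy(totals) - conditional


def best_combination(samples, labels, order):
    """Exhaustively score every SNV combination of the given order."""
    n_snvs = len(samples[0])
    return max(
        combinations(range(n_snvs), order),
        key=lambda idx: information_gain(
            contingency_table(samples, labels, idx)),
    )


# Toy data (hypothetical): the phenotype is the XOR of SNV 0 and SNV 1,
# so neither SNV is informative alone; SNV 2 is noise.
samples = [[0, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0],
           [0, 0, 0], [0, 1, 1], [1, 0, 0], [1, 1, 1]]
labels = [0, 1, 1, 0, 0, 1, 1, 0]
print(best_combination(samples, labels, order=2))  # (0, 1)
```

The number of combinations grows as C(n, k) for n SNVs and order k, which is why the cost of building each contingency table dominates the run time of 3-SNV and 4-SNV searches.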