Rajaraman Ashok, Ma Jian
Computational Biology Department, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, 15213, USA.
BMC Bioinformatics. 2016 Nov 11;17(Suppl 14):414. doi: 10.1186/s12859-016-1262-8.
Reconstructing ancestral gene orders in the presence of duplications is important for understanding genome evolution. Current methods for ancestral reconstruction are limited by either computational constraints or the availability of reliable gene trees, and often ignore duplications altogether. Methods that account for duplications have recently been developed, but reconstruction quality, measured by the number of contiguous ancestral regions found, degrades rapidly as the number of duplicated genes grows, which complicates applying such approaches to mammalian genomes. Such high fragmentation is not encountered when mammalian genomes are reconstructed at the synteny-block level, although the relative positions of genes cannot be recovered from such reconstructions.
We propose a new heuristic method, MULTIRES, to reconstruct ancestral gene orders with duplications guided by homologous synteny blocks for a set of related descendant genomes. The method uses a synteny-level reconstruction to break the gene-order problem into several subproblems, which are then combined in order to disambiguate duplicated genes. We applied this method to both simulated and real data. Our results showed that MULTIRES outperforms other methods in terms of gene content, gene adjacency, and common interval recovery.
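One of the evaluation criteria named above, gene adjacency recovery, can be illustrated with a minimal sketch. This is not taken from the MULTIRES source code. It assumes linear chromosomes given as lists of signed integer gene-family IDs, ignores telomeres, and counts adjacencies as a multiset so that duplicated genes are handled. The function names `extremities`, `adjacencies`, and `adjacency_scores` are hypothetical.

```python
from collections import Counter


def extremities(gene):
    """Return the (left, right) extremities of a signed gene.

    A gene read in forward orientation goes tail -> head; a reversed
    gene goes head -> tail. The representation is an assumption.
    """
    fam = abs(gene)
    if gene > 0:
        return (fam, "t"), (fam, "h")
    return (fam, "h"), (fam, "t")


def adjacencies(genome):
    """Multiset of adjacencies in a genome (a list of linear chromosomes)."""
    adj = Counter()
    for chrom in genome:
        for a, b in zip(chrom, chrom[1:]):
            # An adjacency joins the right end of a to the left end of b.
            # Sorting makes it independent of reading direction.
            pair = tuple(sorted((extremities(a)[1], extremities(b)[0])))
            adj[pair] += 1
    return adj


def adjacency_scores(reconstructed, truth):
    """Precision and recall of reconstructed adjacencies against the truth."""
    rec, ref = adjacencies(reconstructed), adjacencies(truth)
    shared = sum((rec & ref).values())  # multiset intersection
    n_rec, n_ref = sum(rec.values()), sum(ref.values())
    precision = shared / n_rec if n_rec else 0.0
    recall = shared / n_ref if n_ref else 0.0
    return precision, recall


# Toy example: gene family 2 is duplicated in the true ancestor, and the
# reconstruction is split into two fragments, one of them reversed.
truth = [[1, 2, 3, 2, 4]]
reconstructed = [[1, 2, 3], [-4, -2]]
precision, recall = adjacency_scores(reconstructed, truth)
```

In this toy case every reconstructed adjacency is correct, but the break between the two fragments loses one true adjacency, so recall falls below precision. A more fragmented reconstruction shows up in the same way.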
This work demonstrates that including synteny-level information can yield better gene-level reconstructions. Our algorithm provides a basic toolbox for reconstructing ancestral gene orders with duplications. The source code of MULTIRES is available at https://github.com/ma-compbio/MultiRes.