INRIA Lille-Nord-Europe, Université Lille 1, LIFL, UMR CNRS 8022, Villeneuve d'Ascq, Villeurbanne, France.
Bioinformatics. 2011 Oct 1;27(19):2664-71. doi: 10.1093/bioinformatics/btr461. Epub 2011 Aug 16.
The ancestor of birds and mammals lived approximately 300 million years ago. Inferring its genome organization is key to understanding the differentiated evolution of these two lineages. However, detecting traces of its chromosomal organization in its extant descendants is difficult because of the molecular evolution that has accumulated since the bird and mammal lineages diverged.
We address several methodological issues in the detection and assembly of ancestral genomic features of ancient vertebrate genomes. These features encompass adjacencies, contiguous segments, syntenies and, in the context of a whole genome duplication, double syntenies. Using generic but stringent methods for all these problems, some of them new, we analyze 15 vertebrate genomes, including 12 amniotes and 3 teleost fishes, and infer a high-resolution organization of the amniote ancestral genome, composed of 39 ancestral linkage groups at a resolution of 100 kb. We extensively discuss the validity of the method and its robustness to variations in data and parameters. We introduce a support value for each of the groups, and show that 36 out of 39 have maximum support.
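To make the notion of an adjacency concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes each genome is given as an ordered list of orthologous markers per chromosome, ignores marker orientation, and calls an adjacency conserved when the same two markers are consecutive in both genomes. The function names and the toy marker data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# Assumed representation: chromosome name -> ordered list of marker names.
Genome = Dict[str, List[str]]


def adjacencies(genome: Genome) -> Set[FrozenSet[str]]:
    """Return the unordered pairs of markers that are consecutive on a chromosome."""
    pairs: Set[FrozenSet[str]] = set()
    for markers in genome.values():
        for left, right in zip(markers, markers[1:]):
            pairs.add(frozenset((left, right)))
    return pairs


def conserved_adjacencies(genome_a: Genome, genome_b: Genome) -> Set[FrozenSet[str]]:
    """Adjacencies present in both genomes (orientation is ignored in this sketch)."""
    return adjacencies(genome_a) & adjacencies(genome_b)


# Toy, made-up marker orders for two hypothetical extant genomes.
genome_a = {"chr1": ["m1", "m2", "m3", "m4"], "chr2": ["m5", "m6"]}
genome_b = {"chrA": ["m3", "m2", "m1"], "chrB": ["m4", "m5", "m6"]}

shared = conserved_adjacencies(genome_a, genome_b)
for pair in sorted(sorted(p) for p in shared):
    print(pair)
```

In this toy input, the m3-m4 adjacency is broken in the second genome, so it is not reported. A real analysis would also need signed markers, more than two genomes, and a phylogenetic criterion for deciding which shared adjacencies are ancestral.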
No single methodological principle can currently be used to infer the organization of the amniote ancestral genome, but we demonstrate that several principles can be combined into a computational paleogenomics pipeline. This strategy offers a solid methodological base for the reconstruction of ancient vertebrate genomes.
Source code, in C++ and Python, is available at http://www.cecm.sfu.ca/~cchauve/SUPP/AMNIOTE2010/
Supplementary data are available at Bioinformatics online.