Goryunov D V, Nagaev B E, Nikolaev M Yu, Alexeevski A V, Troitsky A V
Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia.
Biochemistry (Mosc). 2015 Nov;80(11):1522-7. doi: 10.1134/S0006297915110152.
Earlier work showed that gene composition and gene order are stable across 13 mitochondrial genomes of mosses (Rensing, S. A., et al. (2008) Science, 319, 64-69). It is of interest to study the evolution of mitochondrial genomes not only at the gene level, but also at the level of nucleotide sequences. To this end, we constructed a "nucleotide pangenome" for the mitochondrial genomes of 24 moss species. A nucleotide pangenome is a set of aligned nucleotide sequences of orthologous genome fragments that together cover all of the genomes. It was constructed with newly developed software, NPG-explorer (NPGe). The stable part of the mitochondrial genome (232 stable blocks) accounts for, on average, 45% of its length. In the joint alignment of the stable blocks, 82% of positions are conserved. The phylogenetic tree constructed with NPGe agrees well with other phylogenetic reconstructions. NPGe identified 30 blocks containing repeats no shorter than 50 bp; the longest block with repeats is 140 bp. Duplications in the mitochondrial genomes of mosses are rare: on average, a genome contains about 500 bp in large duplications. The total length of insertions and deletions was determined for each genome. Losses and gains of DNA regions are fairly frequent in the mitochondrial genomes of mosses, and such rearrangements could presumably serve as additional markers in the reconstruction of phylogeny.
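The two summary statistics in the abstract (the share of genome length in stable blocks and the share of conserved alignment positions) can be illustrated with a minimal Python sketch. It is not NPGe code and does not reproduce its algorithm. The function names, the toy data, and the working definitions are assumptions made for illustration: a column is counted as "conserved" when every genome carries the same nucleotide with no gap, and the stable fraction is the summed length of stable blocks divided by genome length.

```python
def conserved_fraction(alignment):
    """Return the fraction of alignment columns that are identical in all rows.

    `alignment` is a list of equal-length strings, one per genome.
    Assumption: a column containing a gap ('-') is not counted as conserved.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if length == 0 or any(len(row) != length for row in alignment):
        raise ValueError("all rows must be non-empty and of equal length")
    conserved = sum(
        1
        for column in zip(*alignment)  # iterate over columns
        if len(set(column)) == 1 and column[0] != "-"
    )
    return conserved / length


def stable_fraction(genome_length, stable_block_lengths):
    """Return the share of one genome's length that lies in stable blocks.

    Assumption: the blocks do not overlap, so their lengths can be summed.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(stable_block_lengths) / genome_length


# Toy data, not taken from the study.
toy_alignment = ["ACGT-A", "ACGTTA", "ACCTTA"]
print(conserved_fraction(toy_alignment))
print(stable_fraction(1000, [200, 250]))
```

In the toy alignment, four of the six columns are identical and gap-free, so the conserved fraction is 4/6. The second call corresponds to a hypothetical 1000 bp genome with 450 bp in stable blocks.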