A method for identification of highly conserved elements and evolutionary analysis of superphylum Alveolata.

Authors

Rubanov Lev I, Seliverstov Alexandr V, Zverkov Oleg A, Lyubetsky Vassily A

Affiliation

Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoi Karetnyi per. 19, Building 1, Moscow, 127051, Russia.

Publication

BMC Bioinformatics. 2016 Sep 20;17:385. doi: 10.1186/s12859-016-1257-5.

Abstract

BACKGROUND

Perfectly or highly conserved DNA elements have been found in vertebrates, invertebrates, and plants by various methods. However, little is known about such elements in protists. The evolutionary distance between apicomplexans can be very large, in particular because of the positive selection pressure acting on them. This complicates the identification of highly conserved elements in alveolates, a difficulty that the proposed algorithm overcomes.

RESULTS

A novel algorithm is developed to identify highly conserved DNA elements. It is based on the identification of dense subgraphs in a specially built multipartite graph (whose parts correspond to genomes). Specifically, the algorithm relies neither on genome alignments nor on pre-identified perfectly conserved elements; instead, it performs a fast search for pairs of words (in different genomes) of maximum length whose difference is below the specified edit distance. Each such pair defines an edge whose weight equals the maximum (or total) length of the words assigned to its ends. The graph composed of these edges is then compacted by merging some of its edges and vertices. The dense subgraphs are identified by a cellular automaton-like algorithm; each subgraph defines a cluster composed of similar inextensible words from different genomes. Almost all clusters are considered predicted highly conserved elements. The algorithm is applied to the nuclear genomes of the superphylum Alveolata, and the corresponding phylogenetic tree is built and discussed.
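The graph construction described above can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes fixed-length words instead of maximal inextensible ones, compares all word pairs by brute force rather than using a fast search, and substitutes plain connected components for the cellular automaton-like dense-subgraph step. The function names `edit_distance`, `build_edges`, and `clusters`, and the parameters `word_len`, `max_dist`, and `min_genomes`, are hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def build_edges(genomes, word_len, max_dist):
    """Link words from different genomes whose edit distance is <= max_dist.

    A vertex is (genome_name, start_offset). The edge weight is the word
    length (an assumption: all words here have the same fixed length).
    """
    words = [
        (name, pos, seq[pos:pos + word_len])
        for name, seq in genomes.items()
        for pos in range(len(seq) - word_len + 1)
    ]
    edges = []
    for (g1, p1, w1), (g2, p2, w2) in combinations(words, 2):
        # Multipartite graph: no edges inside a single genome (part).
        if g1 != g2 and edit_distance(w1, w2) <= max_dist:
            edges.append(((g1, p1), (g2, p2), word_len))
    return edges


def clusters(edges, min_genomes):
    """Return connected components that span at least min_genomes genomes.

    Connected components stand in for the dense-subgraph search here.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v, _weight in edges:
        parent[find(u)] = find(v)

    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return [g for g in groups.values()
            if len({name for name, _ in g}) >= min_genomes]
```

For example, calling `build_edges({"g1": "ACGTGC", "g2": "ACGAGC", "g3": "TTTTTT"}, word_len=6, max_dist=1)` links only the words from `g1` and `g2`, and `clusters(edges, min_genomes=2)` then returns a single cluster containing those two vertices. The brute-force pairing is quadratic in the number of words, so it is only suitable for toy inputs.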

CONCLUSION

We proposed an algorithm for the identification of highly conserved elements. The multitude of identified elements was used to infer the phylogeny of Alveolata.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c78f/5028923/399fd8b5d64e/12859_2016_1257_Fig1_HTML.jpg
