Analysis of phylogenetic signal in protostomial intron patterns using Mutual Information.

Author information

Hill Natascha, Leow Alexander, Bleidorn Christoph, Groth Detlef, Tiedemann Ralph, Selbig Joachim, Hartmann Stefanie

Affiliation

Department of Bioinformatics, Institute for Biochemistry and Biology, University of Potsdam, Potsdam, Germany.

Publication information

Theory Biosci. 2013 Jun;132(2):93-104. doi: 10.1007/s12064-012-0173-0. Epub 2012 Dec 18.

Abstract

Many deep evolutionary divergences remain unresolved, such as those among major taxa of the Lophotrochozoa. As alternative phylogenetic markers, the intron-exon structure of eukaryotic genomes and the patterns of absence and presence of spliceosomal introns appear to be promising. However, given the potential homoplasy of intron presence, the phylogenetic analysis of these data using standard evolutionary approaches has remained a challenge. Here, we used Mutual Information (MI) to estimate the phylogeny of Protostomia using gene structure data, and we compared these results with those obtained with Dollo Parsimony. Using full genome sequences from nine Metazoa, we identified 447 groups of orthologous sequences with 21,732 introns in 4,870 unique intron positions. We determined the shared absence and presence of introns in the corresponding sequence alignments and have made these data available in "IntronBase", a web-accessible and downloadable SQLite database. Our results obtained using Dollo Parsimony are clearly misled by systematic errors that arise from multiple intron loss events, but extensive filtering of the data improved the quality of the estimated phylogenies. Mutual Information, in contrast, performs better with larger datasets, but it requires a complete dataset, which is difficult to obtain for orthologs from a large number of taxa. Nevertheless, Mutual Information-based distances proved to be useful in analyzing this kind of data, in part because the estimation of MI-based distances is independent of evolutionary models, so no pre-definition of ancestral and derived character states is necessary.
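
To illustrate the idea of an MI-based distance on intron presence/absence data, the following is a minimal sketch and not the authors' implementation. The abstract does not state which distance formula was used; the normalization d = 1 - I(X;Y)/H(X,Y) is an assumption chosen here for illustration, and the function names and toy vectors are hypothetical. Each vector records, for one taxon, whether an intron is present (1) or absent (0) at each aligned intron position.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy in bits of a distribution given as raw counts."""
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def mutual_information(x, y):
    """Return (I(X;Y), H(X,Y)) for two equally long presence/absence vectors."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(x)
    h_x = entropy(Counter(x).values(), n)
    h_y = entropy(Counter(y).values(), n)
    h_xy = entropy(Counter(zip(x, y)).values(), n)
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return h_x + h_y - h_xy, h_xy


def mi_distance(x, y):
    """Assumed normalization: d = 1 - I(X;Y) / H(X,Y), ranging from 0 to 1."""
    mi, h_xy = mutual_information(x, y)
    if h_xy == 0:
        # Both vectors are constant; treat them as indistinguishable.
        return 0.0
    return 1.0 - mi / h_xy


# Hypothetical toy data: 1 = intron present, 0 = intron absent.
taxon_a = [1, 1, 0, 0]
taxon_b = [1, 1, 0, 0]  # same pattern as taxon_a
taxon_c = [1, 0, 1, 0]  # statistically independent of taxon_a

d_ab = mi_distance(taxon_a, taxon_b)
d_ac = mi_distance(taxon_a, taxon_c)
```

Note that the calculation needs a value for every taxon at every position, which reflects the complete-dataset requirement mentioned above, and that it never labels either state as ancestral or derived. A matrix of such pairwise distances could then be passed to any distance-based tree-building method.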
