Alternative methods for concatenation of core genes indicate a lack of resolution in deep nodes of the prokaryotic phylogeny.

Author information

Bapteste E, Susko E, Leigh J, Ruiz-Trillo I, Bucknam J, Doolittle W F

Affiliation

Canadian Institute for Advanced Research and Genome Atlantic, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.

Publication information

Mol Biol Evol. 2008 Jan;25(1):83-91. doi: 10.1093/molbev/msm229. Epub 2007 Oct 16.

Abstract

It has recently been proposed that a well-resolved Tree of Life can be achieved through concatenation of shared genes. There are, however, several difficulties with such an approach, especially in the prokaryotic part of this tree. We tackled some of them using a new combination of maximum likelihood-based methods, developed in order to practice as safe and careful concatenations as possible. First, we used the application concaterpillar on carefully aligned core genes. This application uses a hierarchical likelihood-ratio test framework to assess both the topological congruence between gene phylogenies (i.e., whether different genes share the same evolutionary history) and branch-length congruence (i.e., whether genes that share the same history share the same pattern of relative evolutionary rates). We thus tested if these core genes can be concatenated or should be instead categorized into different incongruent sets. Second, we developed a heat map approach studying the evolution of the phylogenetic support for different bipartitions, when the number of sites of different phylogenetic quality in the concatenation increases. These heatmaps allow us to follow which phylogenetic signals increase or decrease as the concatenation progresses and to detect emerging artifactual groupings, that is, groups that are more and more supported when more and more homoplasic sites are thrown in the analysis. We showed that, as far as 7 major prokaryotic lineages are concerned, only 22 core genes can be said to be congruent and can be safely concatenated. This number is even smaller than the number of genes retained to reconstruct a "Tree of One Per Cent." Furthermore, the concatenation of these 22 markers leads to an unresolved tree as the only groupings in the concatenation tree seem to reflect emerging artifacts. Using concatenated core genes as a valid framework to classify uncharacterized environmental sequences can thus be misleading.
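
The likelihood-ratio comparison mentioned in the abstract can be illustrated with a small, generic sketch. This is not Concaterpillar's implementation: the function names, the log-likelihood values, and the list of null replicates are all hypothetical, and the example assumes the null distribution of the statistic has already been obtained by some simulation procedure that is not shown here.

```python
def lrt_statistic(lnl_separate, lnl_shared):
    """Return 2 * (lnL of the unconstrained model - lnL of the constrained model).

    lnl_separate: log-likelihoods obtained when each gene gets its own tree.
    lnl_shared:   log-likelihood obtained when all genes share one topology.
    """
    return 2.0 * (sum(lnl_separate) - lnl_shared)


def empirical_p_value(observed, null_statistics):
    """Share of null replicates at least as large as the observed statistic.

    The +1 terms keep the estimate away from exactly zero.
    """
    exceed = sum(1 for s in null_statistics if s >= observed)
    return (exceed + 1) / (len(null_statistics) + 1)


# Hypothetical values, for illustration only.
lnl_gene_a = -1234.5           # gene A on its own best tree
lnl_gene_b = -987.6            # gene B on its own best tree
lnl_shared_topology = -2230.0  # both genes forced onto a single topology

stat = lrt_statistic([lnl_gene_a, lnl_gene_b], lnl_shared_topology)

# Assumed to come from simulated data sets generated under the shared-topology null.
null = [3.1, 5.4, 2.2, 7.9, 4.0, 6.3, 1.8, 9.5, 3.7]
p = empirical_p_value(stat, null)

print(f"LRT statistic: {stat:.2f}, empirical p-value: {p:.2f}")
```

A small p-value would suggest that forcing the genes onto one topology costs more likelihood than expected by chance, which is the kind of evidence that would argue against concatenating them. In a hierarchical setting, a comparison of this sort would be repeated as gene sets are merged.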
