A whole-genome phylogeny of the family Pasteurellaceae.

Affiliation

Sackler Institute of Comparative Genomics, American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA.

Publication information

Mol Phylogenet Evol. 2010 Mar;54(3):950-6. doi: 10.1016/j.ympev.2009.08.010. Epub 2009 Aug 15.

Abstract

A phylogenomic approach was used to generate an amino acid phylogeny for 12 whole genomes representing 10 species in the family Pasteurellaceae. Orthology of genes was determined using an approach similar to OrthologID (http://nypg.bio.nyu.edu/orthologid/about.html) and resulted in the generation of a matrix with 3130 genes with 1,194,615 aligned amino acid characters of which 239,504 characters are phylogenetically informative. Phylogenetic analysis of the concatenated matrix using all standard approaches (maximum parsimony, maximum likelihood, and Bayesian analysis) results in a single extremely robust phylogenetic hypothesis for the species examined in this study. Remarkably, no single gene partition gives the same tree as the concatenated analysis. By analyzing partitioned support in the data matrix, we show that there is very little negative support emanating from individual gene partitions to suggest that the concatenated hypothesis is not tenable. The large number of characters in the matrix allows us to test hypotheses concerning missing data and character number in phylogenomic studies, and we conclude that matrices constructed using genome level information are very robust to missing data. We show that a very large number of concatenated gene sequences (>160) are needed to reliably obtain the same topology as the overall analysis.
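
The abstract distinguishes aligned characters from phylogenetically informative ones. As a minimal sketch of that distinction, the Python below counts parsimony-informative columns in a toy amino acid alignment, using the standard definition that a column is informative when at least two states each occur in at least two taxa. The function names, the toy sequences, and the treatment of `-`, `?`, and `X` as missing data are assumptions made for illustration; the abstract does not say how the authors computed their counts.

```python
from collections import Counter

# Assumption: these symbols mark gaps or missing data and are ignored.
MISSING = {"-", "?", "X"}


def is_parsimony_informative(column):
    """Return True if at least two states each occur in at least two taxa."""
    counts = Counter(state for state in column if state not in MISSING)
    shared_states = [state for state, n in counts.items() if n >= 2]
    return len(shared_states) >= 2


def count_informative(alignment):
    """Count parsimony-informative columns in a dict of taxon -> aligned sequence."""
    sequences = list(alignment.values())
    if not sequences:
        return 0
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*sequences) walks the alignment column by column.
    return sum(is_parsimony_informative(col) for col in zip(*sequences))


# Hypothetical four-taxon alignment, not data from the study.
toy_alignment = {
    "taxonA": "MKVLA",
    "taxonB": "MKVLG",
    "taxonC": "MRILG",
    "taxonD": "MRI-A",
}

print(count_informative(toy_alignment))
```

In this toy case the first column is constant and the fourth has only one state once the gap is ignored, so only the remaining three columns count as informative.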
