Accounting for solvent accessibility and secondary structure in protein phylogenetics is clearly beneficial.

Affiliation

Méthodes et Algorithmes pour la Bioinformatique, LIRMM, CNRS-Université Montpellier II, 161 rue Ada, Montpellier Cedex 5, France.

Publication information

Syst Biol. 2010 May;59(3):277-87. doi: 10.1093/sysbio/syq002. Epub 2010 Mar 10.

Abstract

Amino acid substitution models are essential to most methods for inferring phylogenies from protein data. These models represent the ways in which proteins evolve and substitutions accumulate over time. It is widely accepted that substitution processes vary depending on the structural configuration of the protein residues. However, this information is very rarely used in phylogenetic studies, even though the 3-dimensional structures of tens of thousands of proteins have been elucidated. Here, we reinvestigate the question in order to fill this gap. We use an improved estimation methodology and a very large database comprising 1471 nonredundant globular protein alignments with structural annotations to estimate new amino acid substitution models accounting for the secondary structure and solvent accessibility of the residues. These models incorporate a confidence coefficient that is estimated from the data and reflects the reliability and usefulness of structural annotations in the analyzed sequences. Our results with 300 independent test alignments show an impressive likelihood gain compared with standard models such as JTT or WAG. Moreover, the use of these models induces significant topological changes in the inferred trees, which should be of primary interest to phylogeneticists. Our data, models, and software are available for download from http://atgc.lirmm.fr/phyml-structure/.
