A protein evolution model with independent sites that reproduces site-specific amino acid distributions from the Protein Data Bank.

Author information

Bastolla Ugo, Porto Markus, Roman H Eduardo, Vendruscolo Michele

Affiliation

Centro de Biología Molecular Severo Ochoa, CSIC-UAM, Cantoblanco, 28049 Madrid, Spain.

Publication information

BMC Evol Biol. 2006 May 31;6:43. doi: 10.1186/1471-2148-6-43.

Abstract

BACKGROUND

Since thermodynamic stability is a global property of proteins that has to be conserved during evolution, the selective pressure at a given site of a protein sequence depends on the amino acids present at other sites. However, models of molecular evolution that aim at reconstructing the evolutionary history of macromolecules become computationally intractable if such correlations between sites are explicitly taken into account.

RESULTS

We introduce an evolutionary model with sites evolving independently under a global constraint on the conservation of structural stability. This model consists of a selection process, which depends on two hydrophobicity parameters that can be computed from protein sequences without any fit, and a mutation process for which we consider various models. It reproduces quantitatively the results of Structurally Constrained Neutral (SCN) simulations of protein evolution in which the stability of the native state is explicitly computed and conserved. We then compare the predicted site-specific amino acid distributions with those sampled from the Protein Data Bank (PDB). The parameters of the mutation model, whose number varies between zero and five, are fitted from the data. The mean correlation coefficient between predicted and observed site-specific amino acid distributions is larger than ⟨r⟩ = 0.70 for a mutation model with no free parameters and no genetic code. In contrast, considering only the mutation process with no selection yields a mean correlation coefficient of ⟨r⟩ = 0.56 with three fitted parameters. The mutation model that best fits the data takes into account increased mutation rate at CpG dinucleotides, yielding ⟨r⟩ = 0.90 with five parameters.
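As an illustration of the comparison metric reported above, here is a minimal sketch of a mean correlation coefficient between predicted and observed site-specific distributions. It assumes that a Pearson coefficient is computed per site and then averaged over sites; the abstract does not specify the exact procedure. The function names and the toy four-letter data are hypothetical and not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_site_correlation(predicted, observed):
    """Average the per-site Pearson coefficients (assumed averaging scheme)."""
    per_site = [pearson(p, o) for p, o in zip(predicted, observed)]
    return sum(per_site) / len(per_site)


# Toy data: 2 sites over a 4-letter alphabet (real data would use 20 amino acids).
predicted = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
observed = [[0.4, 0.3, 0.2, 0.1], [0.4, 0.3, 0.2, 0.1]]

mean_r = mean_site_correlation(predicted, observed)
print(f"mean correlation over sites: {mean_r:.2f}")
```

In this toy case the first site matches perfectly and the second is perfectly anti-correlated, so the two per-site coefficients cancel in the average.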

CONCLUSION

The effective selection process that we propose reproduces well the amino acid distributions observed in the protein sequences in the PDB. Its simplicity makes it very promising for likelihood calculations in phylogenetic studies. Interestingly, in this approach the mutation process influences the effective selection process, i.e. selection and mutation must be entangled in order to obtain effectively independent sites. This interdependence between mutation and selection reflects the deep influence that mutation has on the evolutionary process: the mutation bias influences the thermodynamic properties of the evolving proteins, in agreement with comparative studies of bacterial proteomes, and it also influences the rate of accepted mutations.


