What's in a likelihood? Simple models of protein evolution and the contribution of structurally viable reconstructions to the likelihood.

Affiliation

Department of Biological Science, Section of Ecology and Evolution, Florida State University, Tallahassee, FL 32306-4120, USA.

Publication information

Syst Biol. 2011 Mar;60(2):161-74. doi: 10.1093/sysbio/syq088. Epub 2011 Jan 12.

DOI:10.1093/sysbio/syq088

PMID:21233085

Abstract

Most phylogenetic models of protein evolution assume that sites are independent and identically distributed. Interactions between sites are ignored, and the likelihood can be conveniently calculated as the product of the individual site likelihoods. The calculation considers all possible transition paths (also called substitution histories or mappings) that are consistent with the observed states at the terminals, and the probability density of any particular reconstruction depends on the substitution model. The likelihood is the integral of the probability density of each substitution history taken over all possible histories that are consistent with the observed data. We investigated the extent to which transition paths that are incompatible with a protein's three-dimensional structure contribute to the likelihood. Several empirical amino acid models were tested for sequence pairs of different degrees of divergence. When simulating substitutional histories starting from a real sequence, the structural integrity of the simulated sequences quickly disintegrated. This result indicates that simple models are clearly unable to capture the constraints on sequence evolution. However, when we sampled transition paths between real sequences from the posterior probability distribution according to these same models, we found that the sampled histories were largely consistent with the tertiary structure. This suggests that simple empirical substitution models may be adequate for interpolating changes between observed sequences during phylogenetic inference despite the fact that the models cannot predict the effects of structural constraints from first principles. This study is significant because it provides a quantitative assessment of the biological realism of substitution models from the perspective of protein structure, and it provides insight on the prospects for improving models of protein sequence evolution.
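The site-independence assumption described in the abstract can be illustrated with a minimal sketch for a single pair of aligned sequences. It is not the authors' method: the study tested empirical amino acid models, whereas this sketch substitutes the equal-rates (Poisson) model because its transition probabilities have a closed form. The function names, the toy sequences, and the branch length `t` are all hypothetical. Each transition probability already sums over every substitution history that begins in one state and ends in the other, so the per-site term is the integral over histories mentioned above.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_STATES = len(AMINO_ACIDS)  # 20 amino acid states


def poisson_transition_prob(a: str, b: str, t: float) -> float:
    """P(b | a, t) under the equal-rates (Poisson) model.

    Assumption: this model stands in for an empirical amino acid model.
    t is measured in expected substitutions per site.
    """
    decay = math.exp(-N_STATES * t / (N_STATES - 1))
    if a == b:
        return 1.0 / N_STATES + (1.0 - 1.0 / N_STATES) * decay
    return (1.0 / N_STATES) * (1.0 - decay)


def pairwise_log_likelihood(seq_a: str, seq_b: str, t: float) -> float:
    """Log-likelihood of two aligned sequences separated by branch length t.

    Sites are treated as independent and identically distributed, so the
    likelihood is a product of site likelihoods (a sum on the log scale).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    log_lik = 0.0
    for a, b in zip(seq_a, seq_b):
        # Stationary frequency of the starting state (uniform under this
        # model) times the probability of ending in the observed state.
        site_lik = (1.0 / N_STATES) * poisson_transition_prob(a, b, t)
        log_lik += math.log(site_lik)
    return log_lik


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(pairwise_log_likelihood("MKTAYIAK", "MKSAYLAK", t=0.2))
```

Nothing in this calculation refers to the three-dimensional structure: a history that passes through a structurally unviable intermediate contributes to the transition probability just like any other, which is the contribution the study set out to quantify.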

Similar articles

Empirical models for substitution in ribosomal RNA.

Mol Biol Evol. 2004 Mar;21(3):419-27. doi: 10.1093/molbev/msh029. Epub 2003 Dec 5.

Site interdependence attributed to tertiary structure in amino acid sequence evolution.

Gene. 2005 Mar 14;347(2):207-17. doi: 10.1016/j.gene.2004.12.011. Epub 2005 Feb 19.

A new formulation of protein evolutionary models that account for structural constraints.

Mol Biol Evol. 2014 Mar;31(3):736-49. doi: 10.1093/molbev/mst240. Epub 2013 Dec 3.

Assessing site-interdependent phylogenetic models of sequence evolution.

Mol Biol Evol. 2006 Sep;23(9):1762-75. doi: 10.1093/molbev/msl041. Epub 2006 Jun 20.

Maximum-Likelihood Phylogenetic Inference with Selection on Protein Folding Stability.

Mol Biol Evol. 2015 Aug;32(8):2195-207. doi: 10.1093/molbev/msv085. Epub 2015 Apr 2.

An empirical codon model for protein sequence evolution.

Mol Biol Evol. 2007 Jul;24(7):1464-79. doi: 10.1093/molbev/msm064. Epub 2007 Mar 30.

Modelling the evolution of protein coding sequences sampled from Measurably Evolving Populations.

Genome Inform. 2008;21:150-64.

Heterotachy and functional shift in protein evolution.

IUBMB Life. 2003 Apr-May;55(4-5):257-65. doi: 10.1080/1521654031000123330.

Accounting for solvent accessibility and secondary structure in protein phylogenetics is clearly beneficial.

Syst Biol. 2010 May;59(3):277-87. doi: 10.1093/sysbio/syq002. Epub 2010 Mar 10.

Cited by

One origin for metallo-β-lactamase activity, or two? An investigation assessing a diverse set of reconstructed ancestral sequences based on a sample of phylogenetic trees.

J Mol Evol. 2014 Oct;79(3-4):117-29. doi: 10.1007/s00239-014-9639-7. Epub 2014 Sep 4.

The evolution of protein structures and structural ensembles under functional constraint.

Genes (Basel). 2011 Oct 28;2(4):748-62. doi: 10.3390/genes2040748.

A penalized-likelihood method to estimate the distribution of selection coefficients from phylogenetic data.

Genetics. 2014 May;197(1):257-71. doi: 10.1534/genetics.114.162263. Epub 2014 Feb 14.

The interface of protein structure, protein biophysics, and molecular evolution.

Protein Sci. 2012 Jun;21(6):769-85. doi: 10.1002/pro.2071. Epub 2012 Apr 23.

Estimating the distribution of selection coefficients from phylogenetic data using sitewise mutation-selection models.

Genetics. 2012 Mar;190(3):1101-15. doi: 10.1534/genetics.111.136432. Epub 2011 Dec 29.

Biophysical and structural considerations for protein sequence evolution.

BMC Evol Biol. 2011 Dec 16;11:361. doi: 10.1186/1471-2148-11-361.
