What's in a likelihood? Simple models of protein evolution and the contribution of structurally viable reconstructions to the likelihood.

Affiliation

Department of Biological Science, Section of Ecology and Evolution, Florida State University, Tallahassee, FL 32306-4120, USA.

Publication information

Syst Biol. 2011 Mar;60(2):161-74. doi: 10.1093/sysbio/syq088. Epub 2011 Jan 12.

Abstract

Most phylogenetic models of protein evolution assume that sites are independent and identically distributed. Interactions between sites are ignored, and the likelihood can be conveniently calculated as the product of the individual site likelihoods. The calculation considers all possible transition paths (also called substitution histories or mappings) that are consistent with the observed states at the terminals, and the probability density of any particular reconstruction depends on the substitution model. The likelihood is the integral of the probability density of each substitution history taken over all possible histories that are consistent with the observed data. We investigated the extent to which transition paths that are incompatible with a protein's three-dimensional structure contribute to the likelihood. Several empirical amino acid models were tested for sequence pairs of different degrees of divergence. When simulating substitutional histories starting from a real sequence, the structural integrity of the simulated sequences quickly disintegrated. This result indicates that simple models are clearly unable to capture the constraints on sequence evolution. However, when we sampled transition paths between real sequences from the posterior probability distribution according to these same models, we found that the sampled histories were largely consistent with the tertiary structure. This suggests that simple empirical substitution models may be adequate for interpolating changes between observed sequences during phylogenetic inference despite the fact that the models cannot predict the effects of structural constraints from first principles. This study is significant because it provides a quantitative assessment of the biological realism of substitution models from the perspective of protein structure, and it provides insight on the prospects for improving models of protein sequence evolution.
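The site-independence assumption described in the abstract can be illustrated with a minimal sketch. It assumes an equal-rates (Poisson) 20-state amino acid model, chosen here only because its transition probabilities have a closed form; it is not one of the empirical models tested in the study. The function names (`transition_prob`, `pairwise_log_likelihood`, `marginalize_midpoint`) are hypothetical. The sketch shows two things: the pairwise likelihood factorizes into a product over sites, and a transition probability already sums over unobserved intermediate states, which is the sense in which the likelihood integrates over substitution histories.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = len(AMINO_ACIDS)  # 20 states


def transition_prob(a, b, t):
    """P(a -> b) after branch length t (expected substitutions per site).

    Assumption: equal-rates (Poisson) model with uniform stationary
    frequencies, used only for illustration.
    """
    decay = math.exp(-K * t / (K - 1))
    if a == b:
        return 1.0 / K + (K - 1) / K * decay
    return 1.0 / K - decay / K


def pairwise_log_likelihood(seq1, seq2, t):
    """Log-likelihood of two aligned sequences separated by branch length t.

    Sites are treated as independent and identically distributed, so the
    likelihood is a product over sites (a sum in log space).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    total = 0.0
    for a, b in zip(seq1, seq2):
        # Stationary frequency of a (1/K) times the transition probability.
        total += math.log(1.0 / K) + math.log(transition_prob(a, b, t))
    return total


def marginalize_midpoint(a, b, t, s):
    """Sum over every possible state at an intermediate time s (0 <= s <= t).

    This equals transition_prob(a, b, t): the transition probability has
    already summed over the unobserved states along the branch.
    """
    return sum(
        transition_prob(a, k, s) * transition_prob(k, b, t - s)
        for k in AMINO_ACIDS
    )
```

For example, `pairwise_log_likelihood("ACD", "ACE", 0.1)` scores a pair of three-residue sequences that differ at one site. Under this model the score never refers to the protein's three-dimensional structure, which is why structurally incompatible histories can contribute to the likelihood.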

