Site-specific evolutionary rates in proteins are better modeled as non-independent and strictly relative.

Authors

Fernandes Andrew D, Atchley William R

Affiliation

Department of Biochemistry, The University of Western Ontario, London, Ontario N6A5C1, Canada.

Publication information

Bioinformatics. 2008 Oct 1;24(19):2177-83. doi: 10.1093/bioinformatics/btn395. Epub 2008 Jul 28.

DOI:10.1093/bioinformatics/btn395
PMID:18662926
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2553437/
Abstract

MOTIVATION

In a nucleotide or amino acid sequence, not all sites evolve at the same rate, due to differing selective constraints at each site. Currently in computational molecular evolution, models incorporating rate heterogeneity always share two assumptions. First, the rate of evolution at each site is assumed to be independent of every other site. Second, the values of these rates are assumed to be drawn from a known prior distribution. Although often assumed to be small, the actual effect of these assumptions has not been previously quantified in the literature.

RESULTS

Herein we describe an algorithm to simultaneously infer the set of n-1 relative rates that parameterize the likelihood of an n-site alignment. Unlike previous work (a) these relative rates are completely identifiable and distinct from the branch-length parameters, and (b) a far more general class of rate priors can be used, and their effects quantified. Although described in a Bayesian framework, we discuss a future maximum likelihood extension.
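The identifiability point in (a) can be illustrated with a toy calculation. The sketch below is not the authors' algorithm: it assumes a two-sequence alignment under a Jukes-Cantor-style model, and all function names and numbers are hypothetical. It shows that absolute site rates and a branch length enter the likelihood only through their product, so rescaling one against the other changes nothing, whereas constraining the rates to have mean 1 leaves n-1 free relative rates and a separately defined branch length.

```python
import math

def site_loglik(differs, rate, branch_length):
    """Log-probability that two sequences match or differ at one site
    (Jukes-Cantor-style toy model; an illustrative assumption)."""
    d = rate * branch_length  # only the product enters the model
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    return math.log(p_diff) if differs else math.log(1.0 - p_diff)

def alignment_loglik(pattern, rates, branch_length):
    """Sum the per-site log-probabilities over the alignment."""
    return sum(site_loglik(x, r, branch_length) for x, r in zip(pattern, rates))

def to_relative(rates, branch_length):
    """Rescale rates to mean 1 and absorb the scale into the branch length."""
    mean_rate = sum(rates) / len(rates)
    return [r / mean_rate for r in rates], branch_length * mean_rate

# Hypothetical data: True marks a site where the two sequences differ.
pattern = [False, True, False, True, True]
abs_rates = [0.2, 1.5, 0.4, 3.0, 2.4]
t = 0.3

base = alignment_loglik(pattern, abs_rates, t)
# Doubling every rate while halving the branch length gives the same likelihood.
scaled = alignment_loglik(pattern, [2.0 * r for r in abs_rates], t / 2.0)
# Mean-1 relative rates with a rescaled branch length also give the same value.
rel_rates, rel_t = to_relative(abs_rates, t)
relative = alignment_loglik(pattern, rel_rates, rel_t)

print(base, scaled, relative)
```

Because the relative rates are tied together by the mean-1 constraint, fixing any n-1 of them determines the last, which is one way to read the "n-1 relative rates" in the abstract.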

CONCLUSIONS

Using both synthetic data and alignments from the Myc, Max and p53 protein families, we find that inferring relative rather than absolute rates has several advantages. First, both empirical likelihoods and Bayes factors show strong preference for the relative-rate model, with a mean Δ ln P = -0.458 per alignment site. Second, the computed likelihoods and Bayes factors were essentially independent of the relative-rate prior, indicating that good estimates of the posterior rate distribution are not required a priori. Third, a novel finding is that rates can be accurately inferred even when up to approximately 4 substitutions per site have occurred. Thus biologically relevant putative hypervariable sites can be identified as easily as conserved sites. Lastly, our model treats rates and tree branch-lengths as completely identifiable, allowing for the first time coherent simultaneous inference of branch-lengths and site-specific evolutionary rates.

AVAILABILITY

Source code for the utility described is available under a BSD-style license at http://www.fernandes.org/txp/article/9/site-specific-relative-evolutionary-rates.
