
A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach.

Authors

Whelan S, Goldman N

Affiliation

Department of Zoology, University of Cambridge, Cambridge, England.

Publication information

Mol Biol Evol. 2001 May;18(5):691-9. doi: 10.1093/oxfordjournals.molbev.a003851.

DOI: 10.1093/oxfordjournals.molbev.a003851
PMID: 11319253
Abstract

Phylogenetic inference from amino acid sequence data uses mainly empirical models of amino acid replacement and is therefore dependent on those models. Two of the more widely used models, the Dayhoff and JTT models, are estimated using similar methods that can utilize large numbers of sequences from many unrelated protein families but are somewhat unsatisfactory because they rely on assumptions that may lead to systematic error and discard a large amount of the information within the sequences. The alternative method of maximum-likelihood estimation may utilize the information in the sequence data more efficiently and suffers from no systematic error, but it has previously been applicable to relatively few sequences related by a single phylogenetic tree. Here, we combine the best attributes of these two methods using an approximate maximum-likelihood method. We implemented this approach to estimate a new model of amino acid replacement from a database of globular protein sequences comprising 3,905 amino acid sequences split into 182 protein families. While the new model has an overall structure similar to those of other commonly used models, there are significant differences. The new model outperforms the Dayhoff and JTT models with respect to maximum-likelihood values for a large majority of the protein families in our database. This suggests that it provides a better overall fit to the evolutionary process in globular proteins and may lead to more accurate phylogenetic tree estimates. Potentially, this matrix, and the methods used to generate it, may also be useful in other areas of research, such as biological sequence database searching, sequence alignment, and protein structure prediction, for which an accurate description of amino acid replacement is required.
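
The abstract compares replacement models by their maximum-likelihood values. As a rough illustration of what such a likelihood involves, the sketch below evaluates the log-likelihood of a pair of aligned sequences under a generic time-reversible replacement model. It is not the authors' approximate maximum-likelihood procedure and does not use the published matrix: the three-state alphabet, the exchangeability values, the frequencies, and all function names are assumptions made up for this example. A real amino acid model would use 20 states and a full phylogenetic tree rather than a single pair of sequences.

```python
import numpy as np


def build_rate_matrix(exchangeabilities, frequencies):
    """Return a reversible rate matrix with Q[i, j] = s[i, j] * pi[j] for i != j.

    The matrix is scaled so that the expected rate at equilibrium is one
    replacement per unit time.
    """
    s = np.asarray(exchangeabilities, dtype=float)
    pi = np.asarray(frequencies, dtype=float)
    q = s * pi[np.newaxis, :]
    np.fill_diagonal(q, 0.0)
    # Each row of a rate matrix must sum to zero.
    np.fill_diagonal(q, -q.sum(axis=1))
    expected_rate = -np.dot(pi, np.diag(q))
    return q / expected_rate


def transition_probabilities(q, pi, t):
    """Compute P(t) = exp(Q t) using the symmetrized form of a reversible Q."""
    sqrt_pi = np.sqrt(pi)
    # For a reversible Q, D^(1/2) Q D^(-1/2) is symmetric, so eigh applies.
    sym = (sqrt_pi[:, np.newaxis] * q) / sqrt_pi[np.newaxis, :]
    eigvals, eigvecs = np.linalg.eigh(sym)
    exp_sym = (eigvecs * np.exp(eigvals * t)) @ eigvecs.T
    return (exp_sym / sqrt_pi[:, np.newaxis]) * sqrt_pi[np.newaxis, :]


def pairwise_log_likelihood(seq_a, seq_b, q, pi, t):
    """Log-likelihood of two aligned sequences separated by distance t."""
    p = transition_probabilities(q, pi, t)
    return float(sum(np.log(pi[a] * p[a, b]) for a, b in zip(seq_a, seq_b)))


# Hypothetical toy model with three states (values are illustrative only).
pi = np.array([0.5, 0.3, 0.2])
s = np.array([
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 0.5],
    [2.0, 0.5, 0.0],
])
q = build_rate_matrix(s, pi)

# Two short aligned sequences encoded as state indices.
seq_a = [0, 0, 1, 2, 1, 0]
seq_b = [0, 1, 1, 2, 0, 0]
for t in (0.1, 0.5, 1.0):
    print(t, pairwise_log_likelihood(seq_a, seq_b, q, pi, t))
```

Comparing two candidate models on the same alignment would amount to building a rate matrix for each, maximizing this quantity over the branch length (and, for more than two sequences, over the tree), and comparing the resulting values.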
