A new criterion and method for amino acid classification.

Authors

Kosiol Carolin, Goldman Nick, Buttimore Nigel H

Affiliation

School of Mathematics, Trinity College, University of Dublin, Dublin 2, Ireland.

Publication

J Theor Biol. 2004 May 7;228(1):97-106. doi: 10.1016/j.jtbi.2003.12.010.

DOI: 10.1016/j.jtbi.2003.12.010
PMID: 15064085

Abstract

It is accepted that many evolutionary changes of amino acid sequence in proteins are conservative: the replacement of one amino acid by another residue has a far greater chance of being accepted if the two residues have similar properties. It is difficult, however, to identify relevant physicochemical properties that capture this similarity. In this paper we introduce a criterion that determines similarity from an evolutionary point of view. Our criterion is based on the description of protein evolution by a Markov process and the corresponding matrix of instantaneous replacement rates. It is inspired by the conductance, a quantity that reflects the strength of mixing in a Markov process. Furthermore we introduce a method to divide the 20 amino acid residues into subsets that achieve good scores with our criterion. The criterion has the time-invariance property that different time distances of the same amino acid replacement rate matrix lead to the same grouping; but different rate matrices lead to different groupings. Therefore it can be used as an automated method to compare matrices derived from consideration of different types of proteins, or from parts of proteins sharing different structural or functional features. We present the groupings resulting from two standard matrices used in sequence alignment and phylogenetic tree estimation.
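
The abstract does not give the criterion's formula, so the following is only a minimal sketch of the general idea it draws on: the textbook conductance of a subset of states in a reversible Markov process, computed directly from an instantaneous rate matrix. The 4-state exchangeabilities and frequencies are hypothetical toy values (not amino acid data), and the function names `build_rate_matrix`, `conductance`, and `best_bipartition` are illustrative assumptions rather than anything from the paper. Because the score is computed from the rate matrix itself, rescaling the matrix (a different time unit) multiplies every score by the same constant and leaves the preferred grouping unchanged, which mirrors the time-invariance property described above.

```python
import itertools

import numpy as np


def build_rate_matrix(exch, freqs):
    """Reversible rate matrix q_ij = s_ij * pi_j with rows summing to zero."""
    q = exch * freqs[np.newaxis, :]
    np.fill_diagonal(q, 0.0)
    np.fill_diagonal(q, -q.sum(axis=1))
    return q


def conductance(q, freqs, subset):
    """Stationary flow leaving `subset`, divided by the smaller side's mass."""
    inside = np.zeros(len(freqs), dtype=bool)
    inside[list(subset)] = True
    # Flow from states inside the subset to states outside it.
    flow = (freqs[inside][:, np.newaxis] * q[np.ix_(inside, ~inside)]).sum()
    return flow / min(freqs[inside].sum(), freqs[~inside].sum())


def best_bipartition(q, freqs):
    """Exhaustively find the two-way split with the lowest conductance.

    State 0 is always placed in the subset so each split is tried once.
    Exhaustive search is only feasible for tiny toy examples like this one.
    """
    n = len(freqs)
    best_score, best_subset = None, None
    for size in range(0, n - 1):
        for rest in itertools.combinations(range(1, n), size):
            subset = (0,) + rest
            score = conductance(q, freqs, subset)
            if best_score is None or score < best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


# Hypothetical toy data: states 0-1 and 2-3 exchange quickly within each
# pair and slowly between pairs.
exch = np.array([
    [0.0, 5.0, 0.1, 0.2],
    [5.0, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 4.0],
    [0.2, 0.1, 4.0, 0.0],
])
freqs = np.array([0.3, 0.2, 0.25, 0.25])

q = build_rate_matrix(exch, freqs)
score, subset = best_bipartition(q, freqs)
scaled_score, scaled_subset = best_bipartition(3.0 * q, freqs)

print("best subset:", subset, "score:", score)
print("after rescaling the rate matrix:", scaled_subset, scaled_score)
```

In this toy case the split with the lowest score separates the two fast-exchanging pairs, and multiplying the rate matrix by 3 triples the score without changing the split. Grouping all 20 amino acids into several subsets would need a search heuristic rather than exhaustive enumeration; the abstract does not say which one the authors use.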

Similar articles

1. A new criterion and method for amino acid classification.
   J Theor Biol. 2004 May 7;228(1):97-106. doi: 10.1016/j.jtbi.2003.12.010.
2. On the quality of tree-based protein classification.
   Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
3. Estimation of amino acid residue substitution rates at local spatial regions and application in protein function inference: a Bayesian Monte Carlo approach.
   Mol Biol Evol. 2006 Feb;23(2):421-36. doi: 10.1093/molbev/msj048. Epub 2005 Oct 26.
4. Different versions of the Dayhoff rate matrix.
   Mol Biol Evol. 2005 Feb;22(2):193-9. doi: 10.1093/molbev/msi005. Epub 2004 Oct 13.
5. Detecting distant homologs using phylogenetic tree-based HMMs.
   Proteins. 2003 Aug 15;52(3):446-53. doi: 10.1002/prot.10373.
6. Assessment of the probabilities for evolutionary structural changes in protein folds.
   Bioinformatics. 2007 Apr 1;23(7):832-41. doi: 10.1093/bioinformatics/btm022. Epub 2007 Feb 4.
7. A "Long Indel" model for evolutionary sequence alignment.
   Mol Biol Evol. 2004 Mar;21(3):529-40. doi: 10.1093/molbev/msh043. Epub 2003 Dec 23.
8. Sequence similarity is more relevant than species specificity in probabilistic backtranslation.
   BMC Bioinformatics. 2007 Feb 21;8:58. doi: 10.1186/1471-2105-8-58.
9. Testing for covarion-like evolution in protein sequences.
   Mol Biol Evol. 2007 Jan;24(1):294-305. doi: 10.1093/molbev/msl155. Epub 2006 Oct 20.
10. Modeling protein evolution with several amino acid replacement matrices depending on site rates.
    Mol Biol Evol. 2012 Oct;29(10):2921-36. doi: 10.1093/molbev/mss112. Epub 2012 Apr 6.

Cited by

1. Group-based phylogenetic models on 3-sunlet networks.
   Bull Math Biol. 2025 Aug 18;87(9):132. doi: 10.1007/s11538-025-01506-1.
2. Modeling compositional heterogeneity resolves deep phylogeny of flowering plants.
   Plant Divers. 2024 Jul 23;47(1):13-20. doi: 10.1016/j.pld.2024.07.007. eCollection 2025 Jan.
3. BAD2matrix: Phylogenomic matrix concatenation, indel coding, and more.
   Appl Plant Sci. 2024 Sep 24;12(6):e11604. doi: 10.1002/aps3.11604. eCollection 2024 Nov-Dec.
4. Acoelomorph flatworm monophyly is a long-branch attraction artefact obscuring a clade of Acoela and Xenoturbellida.
   Proc Biol Sci. 2024 Sep;291(2031):20240329. doi: 10.1098/rspb.2024.0329. Epub 2024 Sep 18.
5. A deep learning method for predicting the minimum inhibitory concentration of antimicrobial peptides against using Multi-Branch-CNN and Attention.
   mSystems. 2023 Aug 31;8(4):e0034523. doi: 10.1128/msystems.00345-23. Epub 2023 Jul 11.
6. Resolving tricky nodes in the tree of life through amino acid recoding.
   iScience. 2022 Nov 15;25(12):105594. doi: 10.1016/j.isci.2022.105594. eCollection 2022 Dec 22.
7. A Practical Guide to Design and Assess a Phylogenomic Study.
   Genome Biol Evol. 2022 Sep 6;14(9). doi: 10.1093/gbe/evac129.
8. Research progress of reduced amino acid alphabets in protein analysis and prediction.
   Comput Struct Biotechnol J. 2022 Jul 4;20:3503-3510. doi: 10.1016/j.csbj.2022.07.001. eCollection 2022.
9. Geometry-based distance for clustering amino acids.
   J Appl Stat. 2019 Oct 3;47(7):1235-1250. doi: 10.1080/02664763.2019.1673324. eCollection 2020.
10. BioKIT: a versatile toolkit for processing and analyzing diverse types of sequence data.
    Genetics. 2022 Jul 4;221(3). doi: 10.1093/genetics/iyac079.