A new approach to clustering the amino acids.

Author information

Stanfel L E

Affiliation

University of Alabama, Tuscaloosa 35487-0226, USA.

Publication information

J Theor Biol. 1996 Nov 21;183(2):195-205. doi: 10.1006/jtbi.1996.0213.

DOI:10.1006/jtbi.1996.0213

PMID:8977877

Abstract

Each amino acid is represented by a vector of numerical measurements for the attributes of volume, area, hydrophilicity, polarity, hydrogen bonding, shape, and charge. Inter-residue distances are then calculated according to common metrics, and we introduce a new clustering objective function derived from information-theoretic considerations. The arguments of the function are the inter-object distances of the things to be clustered: in this case the amino acids. By means of approximating the solution of an integer programming problem, then, the residues are partitioned into clusters. The clusters obtained are compared with groups obtained in substitution/mutation studies and found to be similar. Thus, probably the strongest and most objective evidence to date is supplied for believing that physico-chemical properties account for the viability of substitutions and that the important similarities/differences are explained by a relatively small and simple set of properties.
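The pipeline the abstract describes (attribute vectors, inter-residue distances, then a partition chosen by optimizing an objective over those distances) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: the two attributes (Kyte-Doolittle hydropathy and formal side-chain charge near neutral pH) stand in for the seven attributes listed, only seven residues are included, and the objective is a plain sum of within-cluster distances, not the information-theoretic function the paper introduces. Because the toy instance is tiny, the assignment problem is solved by brute force, whereas the paper approximates the solution of an integer program.

```python
import itertools
import math

# Illustrative stand-in features, NOT the paper's data:
# (Kyte-Doolittle hydropathy, formal side-chain charge near neutral pH).
FEATURES = {
    "I": (4.5, 0.0),
    "L": (3.8, 0.0),
    "V": (4.2, 0.0),
    "D": (-3.5, -1.0),
    "E": (-3.5, -1.0),
    "K": (-3.9, 1.0),
    "R": (-4.5, 1.0),
}


def standardize(features):
    """Scale each attribute to zero mean and unit variance across residues."""
    names = list(features)
    cols = list(zip(*(features[n] for n in names)))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return {
        n: tuple((x - m) / s for x, m, s in zip(features[n], means, stds))
        for n in names
    }


def distance_matrix(vectors):
    """Euclidean distance between every ordered pair of residues."""
    return {(a, b): math.dist(vectors[a], vectors[b])
            for a in vectors for b in vectors}


def cost(labels, dist):
    """Stand-in objective: total distance between residues sharing a cluster."""
    return sum(dist[a, b]
               for a, b in itertools.combinations(labels, 2)
               if labels[a] == labels[b])


def best_partition(names, dist, k):
    """Enumerate all assignments to k non-empty clusters (tiny inputs only)."""
    best_labels, best_cost = None, math.inf
    for assignment in itertools.product(range(k), repeat=len(names)):
        if len(set(assignment)) < k:
            continue  # skip assignments that leave a cluster empty
        labels = dict(zip(names, assignment))
        c = cost(labels, dist)
        if c < best_cost:
            best_labels, best_cost = labels, c
    return best_labels, best_cost


def clusters_of(labels):
    """Group residue names by label and return them in sorted order."""
    groups = {}
    for name, lab in labels.items():
        groups.setdefault(lab, set()).add(name)
    return sorted(sorted(g) for g in groups.values())


vectors = standardize(FEATURES)
dist = distance_matrix(vectors)
labels, total = best_partition(list(vectors), dist, k=3)
print(clusters_of(labels))
```

On this toy input the search separates the aliphatic residues (I, L, V), the acidic residues (D, E), and the basic residues (K, R). Exhaustive enumeration grows as k^n, so for all 20 residues a heuristic or an integer-programming solver would be needed instead.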

Similar articles

A new approach to clustering the amino acids.

J Theor Biol. 1996 Nov 21;183(2):195-205. doi: 10.1006/jtbi.1996.0213.

Fuzzy cluster analysis of simple physicochemical properties of amino acids for recognizing secondary structure in proteins.

Protein Sci. 1995 Jun;4(6):1178-87. doi: 10.1002/pro.5560040616.

Use of variable selection in modeling the secondary structural content of proteins from their composition of amino acid residues.

J Chem Inf Comput Sci. 2004 Jan-Feb;44(1):113-21. doi: 10.1021/ci034037p.

Prediction of secondary structural content of proteins from their amino acid composition alone. I. New analytic vector decomposition methods.

Proteins. 1996 Jun;25(2):157-68. doi: 10.1002/(SICI)1097-0134(199606)25:2<157::AID-PROT2>3.0.CO;2-F.

Amino acid pairing preferences in parallel beta-sheets in proteins.

J Mol Biol. 2006 Feb 10;356(1):32-44. doi: 10.1016/j.jmb.2005.11.008. Epub 2005 Nov 22.

Inter-residue distances derived from fold contact propensities correlate with evolutionary substitution costs.

BMC Bioinformatics. 2004 Oct 18;5:153. doi: 10.1186/1471-2105-5-153.

An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments.

J Mol Biol. 2000 Aug 18;301(3):691-711. doi: 10.1006/jmbi.2000.3975.

Residual dipolar couplings in short peptides reveal systematic conformational preferences of individual amino acids.

J Am Chem Soc. 2006 Oct 18;128(41):13508-14. doi: 10.1021/ja063606h.

Application of information theory to a three-body coarse-grained representation of proteins in the PDB: insights into the structural and evolutionary roles of residues in protein structure.

Proteins. 2014 Dec;82(12):3450-65. doi: 10.1002/prot.24698. Epub 2014 Oct 21.

Distances and classification of amino acids for different protein secondary structures.

Phys Rev E Stat Nonlin Soft Matter Phys. 2003 May;67(5 Pt 1):051927. doi: 10.1103/PhysRevE.67.051927. Epub 2003 May 27.

Cited by

Research progress of reduced amino acid alphabets in protein analysis and prediction.

Comput Struct Biotechnol J. 2022 Jul 4;20:3503-3510. doi: 10.1016/j.csbj.2022.07.001. eCollection 2022.

Genome wide and evolutionary analysis of heat shock protein 70 proteins in tomato and their role in response to heat and drought stress.

Mol Biol Rep. 2022 Dec;49(12):11229-11241. doi: 10.1007/s11033-022-07734-1. Epub 2022 Jul 4.

Geometry-based distance for clustering amino acids.

J Appl Stat. 2019 Oct 3;47(7):1235-1250. doi: 10.1080/02664763.2019.1673324. eCollection 2020.

Adaptive Molecular Evolution of Gene for Positive Diversifying Selection in Mammals.

Biomed Res Int. 2020 May 19;2020:2584627. doi: 10.1155/2020/2584627. eCollection 2020.

Adaptive molecular evolution of gene reveals the evidence for positive diversifying selection in indigenous goat populations.

Ecol Evol. 2017 Jun 7;7(14):5170-5180. doi: 10.1002/ece3.2919. eCollection 2017 Jul.

IDEPI: rapid prediction of HIV-1 antibody epitopes and other phenotypic features from sequence data using a flexible machine learning platform.

PLoS Comput Biol. 2014 Sep 25;10(9):e1003842. doi: 10.1371/journal.pcbi.1003842. eCollection 2014 Sep.

Non-negative matrix factorization for learning alignment-specific models of protein evolution.

PLoS One. 2011;6(12):e28898. doi: 10.1371/journal.pone.0028898. Epub 2011 Dec 22.

CodonTest: modeling amino acid substitution preferences in coding sequences.

PLoS Comput Biol. 2010 Aug 19;6(8):e1000885. doi: 10.1371/journal.pcbi.1000885.

Benchmarking multi-rate codon models.

PLoS One. 2010 Jul 21;5(7):e11587. doi: 10.1371/journal.pone.0011587.

A maximum likelihood method for detecting directional evolution in protein sequences and its application to influenza A virus.

Mol Biol Evol. 2008 Sep;25(9):1809-24. doi: 10.1093/molbev/msn123. Epub 2008 May 29.
