Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins.

Authors

Tomii K, Kanehisa M

Affiliation

Institute for Chemical Research, Kyoto University, Japan.

Publication information

Protein Eng. 1996 Jan;9(1):27-36. doi: 10.1093/protein/9.1.27.

DOI: 10.1093/protein/9.1.27
PMID: 9053899

Abstract

An amino acid index is a set of 20 numerical values representing any of the different physicochemical and biochemical properties of amino acids. As a follow-up to the previous study, we have increased the size of the database, which currently contains 402 published indices, and re-performed the single-linkage cluster analysis. The results basically confirmed the previous findings. Another important feature of amino acids that can be represented numerically is the similarity between them. Thus, a similarity matrix, also called a mutation matrix, is a set of 20 x 20 numerical values used for protein sequence alignments and similarity searches. We have collected 42 published matrices, performed hierarchical cluster analyses and identified several clusters corresponding to the nature of the data set and the method used for constructing the mutation matrix. Further, we have tried to reproduce each mutation matrix by the combination of amino acid indices in order to understand which properties of amino acids are reflected most. There was a relationship between the PAM units of Dayhoff's mutation matrix and the volume and hydrophobicity of amino acids. The database of 402 amino acid indices and 42 amino acid mutation matrices is made publicly available on the Internet.
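
The clustering idea in the abstract can be illustrated with a minimal sketch. It treats each amino acid index as a vector of 20 values and groups indices by single linkage. Several things here are assumptions and not taken from the paper: the distance measure (1 minus the absolute Pearson correlation), the cutoff value, and the two "toy" indices, which are hypothetical placeholders. Only the Kyte-Doolittle hydropathy scale is a real published index.

```python
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # the 20 standard residues, one-letter codes

# Kyte-Doolittle hydropathy scale: one published index, i.e. 20 numerical values.
KYTE_DOOLITTLE = dict(zip(AMINO_ACIDS, [
    1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
    3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2,
]))

# Hypothetical toy indices for illustration only; they are NOT database entries.
INDICES = {
    "hydropathy": KYTE_DOOLITTLE,
    "toy_hydrophilicity": {aa: -v for aa, v in KYTE_DOOLITTLE.items()},
    "toy_alphabet_rank": {aa: float(ord(aa) - ord("A")) for aa in AMINO_ACIDS},
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def index_distance(index_a, index_b):
    """Assumed distance: 1 - |r|, so anti-correlated indices count as close."""
    xs = [index_a[aa] for aa in AMINO_ACIDS]
    ys = [index_b[aa] for aa in AMINO_ACIDS]
    return 1.0 - abs(pearson(xs, ys))


def single_linkage(indices, cutoff):
    """Merge the two closest clusters until the smallest gap exceeds cutoff.

    The gap between two clusters is the minimum distance over all pairs of
    members, which is what makes this single linkage.
    """
    dist = {frozenset(pair): index_distance(indices[pair[0]], indices[pair[1]])
            for pair in combinations(indices, 2)}
    clusters = [{name} for name in indices]
    while len(clusters) > 1:
        gap, i, j = min(
            (min(dist[frozenset((a, b))] for a in c1 for b in c2), i, j)
            for (i, c1), (j, c2) in combinations(enumerate(clusters), 2)
        )
        if gap > cutoff:
            break
        clusters[i] |= clusters[j]  # i < j, so deleting j leaves i valid
        del clusters[j]
    return clusters


if __name__ == "__main__":
    for group in single_linkage(INDICES, cutoff=0.1):
        print(sorted(group))
```

Using the absolute correlation means a hydrophobicity scale and its sign-flipped hydrophilicity counterpart fall into the same cluster, which seems a reasonable choice when the goal is to group indices that describe the same underlying property.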

Similar articles

1
Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins.
Protein Eng. 1996 Jan;9(1):27-36. doi: 10.1093/protein/9.1.27.
2
AAindex: Amino Acid Index Database.
Nucleic Acids Res. 1999 Jan 1;27(1):368-9. doi: 10.1093/nar/27.1.368.
3
Novel protein weight matrix generated from amino acid indices.
Annu Int Conf IEEE Eng Med Biol Soc. 2015 Aug;2015:8181-4. doi: 10.1109/EMBC.2015.7320293.
4
AAindex: amino acid index database.
Nucleic Acids Res. 2000 Jan 1;28(1):374. doi: 10.1093/nar/28.1.374.
5
Cluster analysis of amino acid indices for prediction of protein structure and function.
Protein Eng. 1988 Jul;2(2):93-100. doi: 10.1093/protein/2.2.93.
6
A method to estimate effects of amino acid substitutions in blood coagulation factor IX from hemophilia B patients.
Medinfo. 1995;8 Pt 2:909.
7
Interpretable numerical descriptors of amino acid space.
J Comput Biol. 2009 May;16(5):703-23. doi: 10.1089/cmb.2008.0173.
8
Using the radial distributions of physical features to compare amino acid environments and align amino acid sequences.
Pac Symp Biocomput. 1997:465-76.
9
An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited.
J Mol Biol. 1995 Jun 16;249(4):816-31. doi: 10.1006/jmbi.1995.0340.
10
Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.
Proteins. 2006 Aug 15;64(3):587-600. doi: 10.1002/prot.21020.

Cited by

1
An artificial intelligence-based approach for identifying the proteins regulating liquid-liquid phase separation.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf313.
2
In silico evolution of globular protein folds from random sequences.
Proc Natl Acad Sci U S A. 2025 Jul 8;122(27):e2509015122. doi: 10.1073/pnas.2509015122. Epub 2025 Jun 30.
3
ConsAMPHemo: A computational framework for predicting hemolysis of antimicrobial peptides based on machine learning approaches.
Protein Sci. 2025 Jul;34(7):e70087. doi: 10.1002/pro.70087.
4
LncSL: A Novel Stacked Ensemble Computing Tool for Subcellular Localization of lncRNA by Amino Acid-Enhanced Features and Two-Stage Automated Selection Strategy.
Int J Mol Sci. 2024 Dec 23;25(24):13734. doi: 10.3390/ijms252413734.
5
MLAFP-XN: Leveraging neural network model for development of antifungal peptide identification tool.
Heliyon. 2024 Sep 11;10(18):e37820. doi: 10.1016/j.heliyon.2024.e37820. eCollection 2024 Sep 30.
6
PeptiHub: a curated repository of precisely annotated cancer-related peptides with advanced utilities for peptide exploration and discovery.
Database (Oxford). 2024 Sep 20;2024. doi: 10.1093/database/baae092.
7
A study of correlations between cephalometric measurements in Koreans with normal occlusion by network analysis.
Sci Rep. 2024 Apr 26;14(1):9660. doi: 10.1038/s41598-024-60410-1.
8
Identification of Family-Specific Features in Cas9 and Cas12 Proteins: A Machine Learning Approach Using Complete Protein Feature Spectrum.
bioRxiv. 2024 Jan 23:2024.01.22.576286. doi: 10.1101/2024.01.22.576286.
9
Prediction of protein-ligand binding affinity with deep learning.
Comput Struct Biotechnol J. 2023 Nov 20;21:5796-5806. doi: 10.1016/j.csbj.2023.11.009. eCollection 2023.
10
TROLLOPE: A novel sequence-based stacked approach for the accelerated discovery of linear T-cell epitopes of hepatitis C virus.
PLoS One. 2023 Aug 25;18(8):e0290538. doi: 10.1371/journal.pone.0290538. eCollection 2023.