Fuzzy clustering of physicochemical and biochemical properties of amino acids.

Affiliation

Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, 02-106 Warsaw, Poland.

Publication information

Amino Acids. 2012 Aug;43(2):583-94. doi: 10.1007/s00726-011-1106-9. Epub 2011 Oct 13.

DOI:10.1007/s00726-011-1106-9
PMID:21993537
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3397137/
Abstract

In this article, we categorize the currently available experimental and theoretical knowledge of the physicochemical and biochemical features of amino acids, as collected in the AAindex database of 544 known amino acid (AA) indices. Previously, 402 of these indices were categorized into six groups using hierarchical clustering, and 142 were left unclustered. However, owing to the increasing diversity of the database, the indices overlap, so a crisp clustering method may not give optimal results. Moreover, in large-scale bioinformatics analyses of whole proteomes, selecting amino acid indices that properly represent biological significance is crucial for efficient and reliable encoding of short functional sequence motifs. In most cases, researchers select the most informative indices through exhaustive manual effort. These two facts motivated us to analyse the widely used AA indices. The goal of this article is twofold. First, we present a novel method of partitioning bioinformatics data using consensus fuzzy clustering, which exploits recently proposed fuzzy clustering techniques. Second, we prepare three high-quality subsets of all available indices. The superiority of the consensus fuzzy clustering method is demonstrated quantitatively, visually and statistically by comparison with the previously reported hierarchical clustering results. The processed AAindex1 database, supplementary material and the software are available at http://sysbio.icm.edu.pl/aaindex/.
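The contrast the abstract draws between crisp and fuzzy clustering of overlapping indices can be illustrated with a minimal sketch. The code below is plain fuzzy c-means in NumPy, not the consensus fuzzy clustering method of the article, and it runs on synthetic data rather than on AAindex: each row stands in for one index (20 values, one per amino acid). The function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the noise level and the group sizes are all illustrative assumptions.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (illustrative, not the article's consensus method).

    X has shape (n_samples, n_features). Returns (centers, U), where
    U[i, k] is the degree of membership of sample i in cluster k.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in for amino acid indices (assumption: not real AAindex data).
rng = np.random.default_rng(42)
proto_a = rng.normal(size=20)  # hypothetical "property A" profile
proto_b = rng.normal(size=20)  # hypothetical "property B" profile
group_a = proto_a + 0.1 * rng.normal(size=(15, 20))
group_b = proto_b + 0.1 * rng.normal(size=(15, 20))
mixed = 0.5 * (proto_a + proto_b)[None, :]  # an index overlapping both groups
X = np.vstack([group_a, group_b, mixed])

centers, U = fuzzy_c_means(X, n_clusters=2)
crisp = U.argmax(axis=1)  # what a crisp method would report

print("memberships of the overlapping index:", np.round(U[-1], 2))
print("crisp label of the overlapping index:", crisp[-1])
```

A crisp assignment forces the overlapping row into a single cluster, while the membership matrix `U` records that it belongs to both groups to a similar degree. That loss of information is the limitation of crisp clustering that the abstract refers to.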


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c62d/3397137/871154f76e4a/726_2011_1106_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c62d/3397137/c0f3dae0ea80/726_2011_1106_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c62d/3397137/30a608ed2b02/726_2011_1106_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c62d/3397137/076ab5c60ae1/726_2011_1106_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c62d/3397137/643689d51837/726_2011_1106_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c62d/3397137/55e9034b54e0/726_2011_1106_Fig6_HTML.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c62d/3397137/bd532fc0a54d/726_2011_1106_Fig7_HTML.jpg
