Amino acid alphabet reduction preserves fold information contained in contact interactions in proteins.

Authors

Solis Armando D

Affiliation

Biological Sciences Department, New York City College of Technology, the City University of New York (CUNY), Brooklyn, New York, 11201.

Publication

Proteins. 2015 Dec;83(12):2198-216. doi: 10.1002/prot.24936.

DOI: 10.1002/prot.24936
PMID: 26407535

Abstract

To reduce complexity, understand generalized rules of protein folding, and facilitate de novo protein design, the 20-letter amino acid alphabet is commonly reduced to a smaller alphabet by clustering amino acids based on some measure of similarity. In this work, we seek the optimal alphabet that preserves as much as possible of the structural information found in long-range (contact) interactions among amino acids in natively folded proteins. We employ the Information Maximization Device, based on information theory, to partition the amino acids into well-defined clusters. Numbering from 2 to 19 groups, these optimal clusters of amino acids, while generated automatically, embody well-known properties of amino acids such as hydrophobicity/polarity, charge, size, and aromaticity, and are demonstrated to maintain the discriminative power of long-range interactions with minimal loss of mutual information. Our measurements suggest that reduced alphabets (of fewer than 10 letters) are able to capture virtually all of the information residing in native contacts and may be sufficient for fold recognition, as demonstrated by extensive threading tests. In an expansive survey of the literature, we observe that alphabets derived from various approaches (including those derived from physicochemical intuition, local structure considerations, and sequence alignments of remote homologs) fare consistently well in preserving contact interaction information, highlighting a convergence in the various factors thought to be relevant to the folding code. Moreover, we find that alphabets commonly used in experimental protein design are nearly optimal and are largely coherent with observations that have arisen in this work.
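
As a rough illustration of the idea in the abstract, the sketch below maps residues onto a reduced alphabet and compares the mutual information between residue-pair identity and contact state before and after the reduction. It is not the Information Maximization Device itself: the two-class hydrophobic/polar grouping `HP_GROUPS`, the helper names, and the `toy` observations are all assumptions invented for the example and carry no data from the paper.

```python
import math
from collections import Counter

# Hypothetical 2-letter grouping (hydrophobic "H" vs. polar "P").
# Illustrative assumption only; not a clustering reported in the paper.
HP_GROUPS = {"H": "AVLIMFWC", "P": "GSTYNQDEKRHP"}


def make_mapping(groups):
    """Build a residue -> class-label lookup from {label: residues}."""
    return {aa: label for label, residues in groups.items() for aa in residues}


def reduce_sequence(seq, mapping):
    """Rewrite a one-letter amino acid sequence in the reduced alphabet."""
    return "".join(mapping[aa] for aa in seq)


def reduce_pairs(observations, mapping):
    """Relabel ((aa_i, aa_j), state) observations as unordered class pairs."""
    return [(tuple(sorted((mapping[a], mapping[b]))), s)
            for (a, b), s in observations]


def mutual_information(observations):
    """Estimate I(X;Y) in bits from a list of (x, y) observations."""
    n = len(observations)
    joint = Counter(observations)
    px = Counter(x for x, _ in observations)
    py = Counter(y for _, y in observations)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Made-up observations: (residue pair, 1 if in contact else 0).
toy = (
    [(("L", "V"), 1)] * 3
    + [(("I", "L"), 1)] * 2
    + [(("K", "E"), 1)] * 1
    + [(("K", "D"), 0)] * 3
    + [(("L", "K"), 0)] * 4
    + [(("V", "E"), 0)] * 2
)

mapping = make_mapping(HP_GROUPS)
identity = {aa: aa for aa in mapping}  # the full 20-letter alphabet

print(reduce_sequence("MKVLA", mapping))
print("full alphabet MI   :", mutual_information(reduce_pairs(toy, identity)))
print("reduced alphabet MI:", mutual_information(reduce_pairs(toy, mapping)))
```

Because the reduced pair label is a deterministic function of the full pair label, the reduced estimate can never exceed the full one; a search for an optimal alphabet, in the spirit described above, would look for the grouping that keeps this loss smallest for a given number of classes.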

Similar articles

1
Amino acid alphabet reduction preserves fold information contained in contact interactions in proteins.
Proteins. 2015 Dec;83(12):2198-216. doi: 10.1002/prot.24936.
2
Accuracy of sequence alignment and fold assessment using reduced amino acid alphabets.
Proteins. 2006 Jun 1;63(4):986-95. doi: 10.1002/prot.20881.
3
Reduced alphabet of prebiotic amino acids optimally encodes the conformational space of diverse extant protein folds.
BMC Evol Biol. 2019 Jul 30;19(1):158. doi: 10.1186/s12862-019-1464-6.
4
Automated alphabet reduction for protein datasets.
BMC Bioinformatics. 2009 Jan 6;10:6. doi: 10.1186/1471-2105-10-6.
5
Reduced alphabet for protein folding prediction.
Proteins. 2015 Apr;83(4):631-9. doi: 10.1002/prot.24762. Epub 2015 Feb 5.
6
Simplified amino acid alphabets for protein fold recognition and implications for folding.
Protein Eng. 2000 Mar;13(3):149-52. doi: 10.1093/protein/13.3.149.
7
Distance-dependent classification of amino acids by information theory.
Proteins. 2010 Aug 1;78(10):2322-8. doi: 10.1002/prot.22744.
8
Folding alphabets.
Nat Struct Biol. 1999 Nov;6(11):994-6. doi: 10.1038/14876.
9
Reduced amino acid alphabets exhibit an improved sensitivity and selectivity in fold assignment.
Bioinformatics. 2009 Jun 1;25(11):1356-62. doi: 10.1093/bioinformatics/btp164. Epub 2009 Apr 7.
10
Protein Folding Prediction in a Cubic Lattice in Hydrophobic-Polar Model.
J Comput Biol. 2017 May;24(5):412-421. doi: 10.1089/cmb.2016.0181. Epub 2016 Nov 30.

Cited by

1
Discovery of antimicrobial peptides in the global microbiome with machine learning.
Cell. 2024 Jul 11;187(14):3761-3778.e16. doi: 10.1016/j.cell.2024.05.013. Epub 2024 Jun 5.
2
Computational exploration of the global microbiome for antibiotic discovery.
bioRxiv. 2023 Sep 11:2023.08.31.555663. doi: 10.1101/2023.08.31.555663.
3
An engineered T7 RNA polymerase that produces mRNA free of immunostimulatory byproducts.
Nat Biotechnol. 2023 Apr;41(4):560-568. doi: 10.1038/s41587-022-01525-6. Epub 2022 Nov 10.
4
Research progress of reduced amino acid alphabets in protein analysis and prediction.
Comput Struct Biotechnol J. 2022 Jul 4;20:3503-3510. doi: 10.1016/j.csbj.2022.07.001. eCollection 2022.
5
Immunoglobulin Classification Based on FC* and GC* Features.
Front Genet. 2022 Jan 24;12:827161. doi: 10.3389/fgene.2021.827161. eCollection 2021.
6
Accurate annotation of protein coding sequences with IDTAXA.
NAR Genom Bioinform. 2021 Sep 16;3(3):lqab080. doi: 10.1093/nargab/lqab080. eCollection 2021 Sep.
7
Antisense Peptide Technology for Diagnostic Tests and Bioengineering Research.
Int J Mol Sci. 2021 Aug 24;22(17):9106. doi: 10.3390/ijms22179106.
8
ANPrAod: Identify Antioxidant Proteins by Fusing Amino Acid Clustering Strategy and -Peptide Combination.
Comput Math Methods Med. 2021 Apr 8;2021:5518209. doi: 10.1155/2021/5518209. eCollection 2021.
9
A Simplified Amino Acidic Alphabet to Unveil the T-Cells Receptors Antigens: A Computational Perspective.
Front Chem. 2021 Feb 25;9:598802. doi: 10.3389/fchem.2021.598802. eCollection 2021.
10
IHEC_RAAC: a online platform for identifying human enzyme classes via reduced amino acid cluster strategy.
Amino Acids. 2021 Feb;53(2):239-251. doi: 10.1007/s00726-021-02941-9. Epub 2021 Jan 23.