Succinct Amyloid and Nonamyloid Patterns in Hexapeptides.

Author information

Keresztes László, Szögi Evelin, Varga Bálint, Farkas Viktor, Perczel András, Grolmusz Vince

Affiliations

PIT Bioinformatics Group, Eötvös University, Budapest H-1117, Hungary.

MTA-ELTE Protein Modeling Research Group, Budapest H-1117, Hungary.

Publication information

ACS Omega. 2022 Sep 27;7(40):35532-35537. doi: 10.1021/acsomega.2c02513. eCollection 2022 Oct 11.

DOI:10.1021/acsomega.2c02513

PMID:36249386

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9558248/

Abstract

Hexapeptides are widely applied as a model system for studying the amyloid-forming properties of polypeptides, including proteins. Recently, large experimental databases have become publicly available with amyloidogenic labels. Using these data sets for training and testing purposes, one may build artificial intelligence (AI)-based classifiers for predicting the amyloid state of peptides. In our previous work (Biomolecules 2021, 11, 500), we described the Support Vector Machine (SVM)-based Budapest Amyloid Predictor (https://pitgroup.org/bap). Here, we apply the Budapest Amyloid Predictor for discovering numerous amyloidogenic and nonamyloidogenic hexapeptide patterns with accuracy between 80% and 84%, as surprising and succinct novel rules for further understanding the amyloid state of peptides. For example, we have shown that for any independently mutated residue (position marked by "x"), the patterns CxFLWx, FxFLFx, or xxIVIV are predicted to be amyloidogenic, while those of PxDxxx, xxKxEx, and xxPQxx are nonamyloidogenic. We note that each amyloidogenic pattern with two x's (e.g., CxFLWx) describes succinctly 20² = 400 hexapeptides, while each nonamyloidogenic pattern comprising four point mutations (e.g., PxDxxx) gives 20⁴ = 160 000 hexapeptides. We also examine the restricted substitutions for positions "x" from subclasses of proteinogenic amino acid residues; for example, if "x" is substituted with hydrophobic amino acids, then there exist patterns containing three x's, like MxVVxx, predicted to be amyloidogenic. If we can choose for the x positions any hydrophobic amino acids, except the "structure breaker" proline, then we get amyloid patterns with five x positions, for example, xxxFxx, each corresponding to 32 768 hexapeptides. To our knowledge, no similar applications of artificial intelligence tools or succinct amyloid patterns were described before the present work.
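The pattern counts quoted in the abstract can be checked with a short sketch. The code below is only an illustration of how an "x" pattern expands into concrete hexapeptides; it is not the Budapest Amyloid Predictor and makes no amyloid predictions. The function names `count_matches` and `expand` are hypothetical. The abstract does not list the hydrophobic residues it allows, so the restricted case is counted by alphabet size alone: 32 768 = 8⁵ implies eight permitted residues, which is an inference from the number rather than a statement in the source.

```python
from itertools import product

# The 20 proteinogenic amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_matches(pattern: str, alphabet_size: int = 20) -> int:
    """Number of peptides a pattern describes: alphabet_size ** (number of x's)."""
    return alphabet_size ** pattern.count("x")


def expand(pattern: str, alphabet: str = AMINO_ACIDS):
    """Yield every peptide obtained by filling each 'x' position independently."""
    free = [i for i, c in enumerate(pattern) if c == "x"]
    for combo in product(alphabet, repeat=len(free)):
        peptide = list(pattern)
        for i, residue in zip(free, combo):
            peptide[i] = residue
        yield "".join(peptide)


two_x = count_matches("CxFLWx")        # two free positions, 20 residues each
four_x = count_matches("PxDxxx")       # four free positions, 20 residues each
# Assumption: eight permitted residues, inferred from 32 768 = 8**5.
five_x = count_matches("xxxFxx", alphabet_size=8)
peptides = list(expand("CxFLWx"))
```

Here `two_x`, `four_x`, and `five_x` reproduce the figures 400, 160 000, and 32 768 from the abstract, and `peptides` holds the 400 concrete sequences covered by CxFLWx, each of which could then be passed to a classifier.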


Similar articles

Succinct Amyloid and Nonamyloid Patterns in Hexapeptides.

ACS Omega. 2022 Sep 27;7(40):35532-35537. doi: 10.1021/acsomega.2c02513. eCollection 2022 Oct 11.

FISH Amyloid - a new method for finding amyloidogenic segments in proteins based on site specific co-occurrence of aminoacids.

BMC Bioinformatics. 2014 Feb 24;15:54. doi: 10.1186/1471-2105-15-54.

The Budapest Amyloid Predictor and Its Applications.

Biomolecules. 2021 Mar 26;11(4):500. doi: 10.3390/biom11040500.

On the amyloid datasets used for training PAFIG--how (not) to extend the experimental dataset of hexapeptides.

BMC Bioinformatics. 2013 Dec 4;14:351. doi: 10.1186/1471-2105-14-351.

Machine learning methods can replace 3D profile method in classification of amyloidogenic hexapeptides.

BMC Bioinformatics. 2013 Jan 17;14:21. doi: 10.1186/1471-2105-14-21.

WALTZ-DB: a benchmark database of amyloidogenic hexapeptides.

Bioinformatics. 2015 May 15;31(10):1698-700. doi: 10.1093/bioinformatics/btv027. Epub 2015 Jan 18.

Amyloid fibril formation propensity is inherent into the hexapeptide tandemly repeating sequence of the central domain of silkmoth chorion proteins of the A-family.

J Struct Biol. 2006 Dec;156(3):480-8. doi: 10.1016/j.jsb.2006.08.011. Epub 2006 Sep 5.

Machine learning study of classifiers trained with biophysiochemical properties of amino acids to predict fibril forming Peptide motifs.

Protein Pept Lett. 2012 Sep;19(9):917-23. doi: 10.2174/092986612802084429.

Cooperativity among short amyloid stretches in long amyloidogenic sequences.

PLoS One. 2012;7(6):e39369. doi: 10.1371/journal.pone.0039369. Epub 2012 Jun 22.

Exploiting heterogeneous features to improve in silico prediction of peptide status - amyloidogenic or non-amyloidogenic.

BMC Bioinformatics. 2011;12 Suppl 13(Suppl 13):S21. doi: 10.1186/1471-2105-12-S13-S21. Epub 2011 Nov 30.

Cited by

iAmyP: A Multi-view Learning for Amyloidogenic Hexapeptides Identification Based on Sequence Least Squares Programming.

Interdiscip Sci. 2025 Jun;17(2):277-292. doi: 10.1007/s12539-024-00666-3. Epub 2024 Nov 15.

Proteomic Evidence for Amyloidogenic Cross-Seeding in Fibrinaloid Microclots.

Int J Mol Sci. 2024 Oct 8;25(19):10809. doi: 10.3390/ijms251910809.

References

Identifying super-feminine, super-masculine and sex-defining connections in the human braingraph.

Cogn Neurodyn. 2021 Dec;15(6):949-959. doi: 10.1007/s11571-021-09687-w. Epub 2021 Jul 15.

On the border of the amyloidogenic sequences: prefix analysis of the parallel beta sheets in the PDB_Amyloid collection.

J Integr Bioinform. 2021 Jul 26;19(1):20200043. doi: 10.1515/jib-2020-0043.

CRISPR-Cas9 In Vivo Gene Editing for Transthyretin Amyloidosis.

N Engl J Med. 2021 Aug 5;385(6):493-502. doi: 10.1056/NEJMoa2107454. Epub 2021 Jun 26.

The Budapest Amyloid Predictor and Its Applications.

Biomolecules. 2021 Mar 26;11(4):500. doi: 10.3390/biom11040500.

Computational prediction of protein aggregation: Advances in proteomics, conformation-specific algorithms and biotechnological applications.

Comput Struct Biotechnol J. 2020 Jun 10;18:1403-1413. doi: 10.1016/j.csbj.2020.05.026. eCollection 2020.

Reverse engineering synthetic antiviral amyloids.

Nat Commun. 2020 Jun 5;11(1):2832. doi: 10.1038/s41467-020-16721-8.

The Route from the Folded to the Amyloid State: Exploring the Potential Energy Surface of a Drug-Like Miniprotein.

Chemistry. 2020 Feb 11;26(9):1968-1978. doi: 10.1002/chem.201903826. Epub 2019 Dec 27.

Protein Aggregation in a Nutshell: The Splendid Molecular Architecture of the Dreaded Amyloid Fibrils.

Curr Protein Pept Sci. 2019;20(11):1077-1088. doi: 10.2174/1389203720666190925102832.

WALTZ-DB 2.0: an updated database containing structural information of experimentally determined amyloid-forming peptides.

Nucleic Acids Res. 2020 Jan 8;48(D1):D389-D393. doi: 10.1093/nar/gkz758.

PDB_Amyloid: an extended live amyloid structure list from the PDB.

FEBS Open Bio. 2018 Nov 22;9(1):185-190. doi: 10.1002/2211-5463.12524. eCollection 2019 Jan.
