ANuPP: A Versatile Tool to Predict Aggregation Nucleating Regions in Peptides and Proteins.

Author information

Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India.

Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceutical Inc., Ridgefield, CT, USA.

Publication information

J Mol Biol. 2021 May 28;433(11):166707. doi: 10.1016/j.jmb.2020.11.006. Epub 2020 Nov 12.

DOI:10.1016/j.jmb.2020.11.006

PMID:33972019

Abstract

Short aggregation-prone sequence motifs can trigger aggregation in peptides and proteins. Most algorithms developed so far to identify potential aggregation-prone regions (APRs) use amino acid residue composition and/or sequence pattern features. In this work, we investigated the importance of atomic-level, rather than residue-level, characteristics for understanding the initiation of aggregation in proteins and peptides. Using atomic-level features, we developed an ensemble classifier, ANuPP, to predict aggregation-nucleating regions in peptides and proteins. On a dataset of 1279 hexapeptides, ANuPP achieved an area under the curve (AUC) of 0.831 with 77% accuracy in 10-fold cross-validation, and an AUC of 0.883 with 83% accuracy on a blind test dataset of 142 hexapeptides. Further, it achieved an average segment overlap (SOV) score of 48.7% in identifying APRs in 37 proteins. ANuPP performs better than other methods reported in the literature on both amyloidogenic hexapeptide prediction and APR identification. A web server for ANuPP is available at https://web.iitm.ac.in/bioinfo2/ANuPP/. Insights gained from this work demonstrate the importance of atomic and functional group characteristics for understanding the diverse atomic-level origins and mechanisms of protein aggregation.
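The abstract describes classifying hexapeptides and then identifying APRs in full-length proteins. One common way to connect the two is to slide a six-residue window along the sequence and merge the windows flagged as aggregation-prone. The sketch below illustrates only that general idea; the abstract does not state how ANuPP maps hexapeptide predictions onto proteins, and `toy_classifier` is a hypothetical stand-in based on residue counts, not ANuPP's atomic-level ensemble classifier.

```python
from typing import Callable, List, Tuple

WINDOW = 6  # hexapeptide length, as used in the abstract's datasets


def hexapeptide_windows(sequence: str) -> List[Tuple[int, str]]:
    """Return (start_index, hexapeptide) for every overlapping 6-residue window."""
    return [(i, sequence[i:i + WINDOW]) for i in range(len(sequence) - WINDOW + 1)]


def merge_positive_windows(
    sequence: str, is_prone: Callable[[str], bool]
) -> List[Tuple[int, int]]:
    """Merge overlapping or adjacent positive windows into regions.

    Regions are reported as 0-based (start, end) pairs with an exclusive end.
    """
    regions: List[Tuple[int, int]] = []
    for start, hexamer in hexapeptide_windows(sequence):
        if not is_prone(hexamer):
            continue
        end = start + WINDOW
        if regions and start <= regions[-1][1]:
            # Extend the previous region when this window touches or overlaps it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def toy_classifier(hexamer: str) -> bool:
    """Hypothetical rule: flag windows with at least 4 of V, I, L, F, Y, W.

    This is an illustrative assumption only, not the ANuPP model.
    """
    return sum(residue in "VILFYW" for residue in hexamer) >= 4


if __name__ == "__main__":
    # Made-up sequence for illustration; prints [(1, 10)]
    print(merge_positive_windows("GGGVIVIFGGG", toy_classifier))
```

Any real predictor could be passed in place of `toy_classifier`, as long as it takes a six-residue string and returns a boolean.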


Similar articles

AbsoluRATE: An in-silico method to predict the aggregation kinetics of native proteins.

Biochim Biophys Acta Proteins Proteom. 2021 Sep;1869(9):140682. doi: 10.1016/j.bbapap.2021.140682. Epub 2021 Jun 6.

GAP: towards almost 100 percent prediction for β-strand-mediated aggregating peptides with distinct morphologies.

Bioinformatics. 2014 Jul 15;30(14):1983-90. doi: 10.1093/bioinformatics/btu167. Epub 2014 Mar 28.

Evaluation of in silico tools for the prediction of protein and peptide aggregation on diverse datasets.

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab240.

CPAD 2.0: a repository of curated experimental data on aggregating proteins and peptides.

Amyloid. 2020 Jun;27(2):128-133. doi: 10.1080/13506129.2020.1715363. Epub 2020 Jan 24.

Aggregation prone regions in human proteome: Insights from large-scale data analyses.

Proteins. 2017 Jun;85(6):1099-1118. doi: 10.1002/prot.25276. Epub 2017 Mar 24.

AggreProt: a web server for predicting and engineering aggregation prone regions in proteins.

Nucleic Acids Res. 2024 Jul 5;52(W1):W159-W169. doi: 10.1093/nar/gkae420.

Exploring the sequence features determining amyloidosis in human antibody light chains.

Sci Rep. 2021 Jul 2;11(1):13785. doi: 10.1038/s41598-021-93019-9.

Exploiting heterogeneous features to improve in silico prediction of peptide status - amyloidogenic or non-amyloidogenic.

BMC Bioinformatics. 2011;12 Suppl 13(Suppl 13):S21. doi: 10.1186/1471-2105-12-S13-S21. Epub 2011 Nov 30.

CPAD, Curated Protein Aggregation Database: A Repository of Manually Curated Experimental Data on Protein and Peptide Aggregation.

PLoS One. 2016 Apr 4;11(4):e0152949. doi: 10.1371/journal.pone.0152949. eCollection 2016.

Cited by

Chemical Evolution of Early Macromolecules: From Prebiotic Oligopeptides to Self-Organizing Biosystems via Amyloid Formation.

Chemistry. 2025 May 22;31(29):e202404669. doi: 10.1002/chem.202404669. Epub 2025 May 2.

Proteolysis-Based Biomarker Repertoire of the Neurofilament Proteome.

J Neurochem. 2025 Mar;169(3):e70023. doi: 10.1111/jnc.70023.

Predicting amyloid proteins using attention-based long short-term memory.

PeerJ Comput Sci. 2025 Feb 7;11:e2660. doi: 10.7717/peerj-cs.2660. eCollection 2025.

Investigating Local Sequence-Structural Attributes of Amyloidogenic Light Chain Variable Domains.

Proteins. 2025 Mar 4. doi: 10.1002/prot.26815.

AggNet: Advancing protein aggregation analysis through deep learning and protein language model.

Protein Sci. 2025 Feb;34(2):e70031. doi: 10.1002/pro.70031.

Prediction and Evaluation of Protein Aggregation with Computational Methods.

Methods Mol Biol. 2025;2867:299-314. doi: 10.1007/978-1-0716-4196-5_17.

iAmyP: A Multi-view Learning for Amyloidogenic Hexapeptides Identification Based on Sequence Least Squares Programming.

Interdiscip Sci. 2025 Jun;17(2):277-292. doi: 10.1007/s12539-024-00666-3. Epub 2024 Nov 15.

Proteomic Evidence for Amyloidogenic Cross-Seeding in Fibrinaloid Microclots.

Int J Mol Sci. 2024 Oct 8;25(19):10809. doi: 10.3390/ijms251910809.

Unravelling aggregation propensity of rotavirus A VP6 expressed as E. coli inclusion bodies through in silico prediction.

Sci Rep. 2024 Sep 13;14(1):21464. doi: 10.1038/s41598-024-69896-1.

AggreProt: a web server for predicting and engineering aggregation prone regions in proteins.

Nucleic Acids Res. 2024 Jul 5;52(W1):W159-W169. doi: 10.1093/nar/gkae420.
