ANuPP: A Versatile Tool to Predict Aggregation Nucleating Regions in Peptides and Proteins.

Affiliations

Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India.

Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceutical Inc., Ridgefield, CT, USA.

Publication information

J Mol Biol. 2021 May 28;433(11):166707. doi: 10.1016/j.jmb.2020.11.006. Epub 2020 Nov 12.

Abstract

Short aggregation-prone sequence motifs can trigger aggregation in peptide and protein sequences. Most algorithms developed so far to identify potential aggregation-prone regions (APRs) use amino acid residue composition and/or sequence pattern features. In this work, we have investigated the importance of atomic-level rather than residue-level characteristics for understanding the initiation of aggregation in proteins and peptides. Using atomic-level features, an ensemble classifier, ANuPP, has been developed to predict aggregation-nucleating regions in peptides and proteins. On a dataset of 1279 hexapeptides, ANuPP achieved an area under the curve (AUC) of 0.831 with 77% accuracy in 10-fold cross-validation, and an AUC of 0.883 with 83% accuracy on a blind test dataset of 142 hexapeptides. Further, it showed an average segment overlap (SOV) score of 48.7% in identifying APRs in 37 proteins. ANuPP performs better than other methods reported in the literature on both amyloidogenic hexapeptide prediction and APR identification. We have developed a web server for ANuPP, available at https://web.iitm.ac.in/bioinfo2/ANuPP/. Insights gained from this work demonstrate the importance of atomic and functional-group characteristics for understanding the diverse atomic-level origins and mechanisms of protein aggregation.
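For readers less familiar with the two headline metrics, the following is a minimal sketch of how an AUC and an accuracy value can be computed from classifier scores. It is not the ANuPP implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are unrelated to the paper's datasets or results.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at a fixed score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


# Hypothetical toy data: 1 = aggregating hexapeptide, 0 = non-aggregating.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_acc = accuracy(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}, accuracy = {toy_acc:.3f}")
```

AUC does not depend on a decision threshold while accuracy does, which is why the abstract reports the two together.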
