UniDL4BioPep: a universal deep learning architecture for binary classification in peptide bioactivity.

Affiliations

Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA.

Department of Computer Science, Kansas State University, Manhattan, KS 66506, USA.

Publication information

Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad135.

DOI: 10.1093/bib/bbad135
PMID: 37020337

Abstract

Identification of potent peptides through model prediction can reduce benchwork in wet experiments. However, the conventional model-building process can be complex and time-consuming due to challenges such as peptide representation, feature selection, model selection and hyperparameter tuning. Recently, advanced pretrained deep learning-based language models (LMs) have been released for protein sequence embedding and applied to structure and function prediction. Based on these developments, we have developed UniDL4BioPep, a universal deep-learning model architecture for transfer learning in bioactive peptide binary classification modeling. It can directly assist users in training a high-performance deep-learning model with a fixed architecture and can achieve cutting-edge performance to meet the demands of efficient novel bioactive peptide discovery. To the best of our knowledge, this is the first time that a pretrained biological language model has been utilized for peptide embeddings and has successfully predicted peptide bioactivities through large-scale evaluations of those peptide embeddings. The model was also validated through uniform manifold approximation and projection analysis. By combining the LM with a convolutional neural network, UniDL4BioPep achieved better performance than the respective state-of-the-art models for 15 out of 20 different bioactivity dataset prediction tasks. The accuracy, Matthews correlation coefficient and area under the curve were 0.7-7, 1.23-26.7 and 0.3-25.6% higher, respectively. A user-friendly web server of UniDL4BioPep for the tested bioactivities has been established and is freely accessible at https://nepc2pvmzy.us-east-1.awsapprunner.com. The source code, datasets and templates of UniDL4BioPep for other bioactivity fitting and prediction tasks are available at https://github.com/dzjxzyd/UniDL4BioPep.
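
The three metrics named in the abstract (accuracy, Matthews correlation coefficient and area under the ROC curve) can be illustrated with a minimal pure-Python sketch for a binary classifier. This is not the authors' evaluation code: the function names, the 0.5 decision threshold and the toy labels and scores below are assumptions made purely for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the confusion matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This is an O(n_pos * n_neg) pairwise version,
    adequate for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = bioactive peptide, 0 = non-bioactive peptide.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]          # hypothetical model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"ACC = {accuracy(y_true, y_pred):.3f}")           # ACC = 0.667
print(f"MCC = {matthews_corrcoef(y_true, y_pred):.3f}")  # MCC = 0.333
print(f"AUC = {roc_auc(y_true, scores):.3f}")            # AUC = 0.889
```

Accuracy and MCC depend on the chosen threshold, whereas AUC is computed from the raw scores, which is one reason the three are usually reported together.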

Similar articles

1
UniDL4BioPep: a universal deep learning architecture for binary classification in peptide bioactivity.
Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad135.
2
pLM4Alg: Protein Language Model-Based Predictors for Allergenic Proteins and Peptides.
J Agric Food Chem. 2024 Jan 10;72(1):752-760. doi: 10.1021/acs.jafc.3c07143. Epub 2023 Dec 19.
3
Anticancer peptides prediction with deep representation learning features.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab008.
4
mACPpred 2.0: Stacked Deep Learning for Anticancer Peptide Prediction with Integrated Spatial and Probabilistic Feature Representations.
J Mol Biol. 2024 Sep 1;436(17):168687. doi: 10.1016/j.jmb.2024.168687. Epub 2024 Jun 25.
5
Convolutional neural networks with image representation of amino acid sequences for protein function prediction.
Comput Biol Chem. 2021 Jun;92:107494. doi: 10.1016/j.compbiolchem.2021.107494. Epub 2021 Apr 24.
6
Sequence representation approaches for sequence-based protein prediction tasks that use deep learning.
Brief Funct Genomics. 2021 Mar 2;20(1):61-73. doi: 10.1093/bfgp/elaa030.
7
A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance.
BMC Med Res Methodol. 2022 Jul 2;22(1):181. doi: 10.1186/s12874-022-01665-y.
8
ACP-Dnnel: anti-coronavirus peptides' prediction based on deep neural network ensemble learning.
Amino Acids. 2023 Sep;55(9):1121-1136. doi: 10.1007/s00726-023-03300-6. Epub 2023 Jul 4.
9
pLM4ACE: A protein language model based predictor for antihypertensive peptide screening.
Food Chem. 2024 Jan 15;431:137162. doi: 10.1016/j.foodchem.2023.137162. Epub 2023 Aug 14.
10
ChampKit: A framework for rapid evaluation of deep neural networks for patch-based histopathology classification.
Comput Methods Programs Biomed. 2023 Sep;239:107631. doi: 10.1016/j.cmpb.2023.107631. Epub 2023 May 30.

Cited by

1
AIP-TranLAC: A Transformer-Based Method Integrating LSTM and Attention Mechanism for Predicting Anti-inflammatory Peptides.
Interdiscip Sci. 2025 Aug 19. doi: 10.1007/s12539-025-00761-z.
2
PepLand: a large-scale pre-trained peptide representation model for a comprehensive landscape of both canonical and non-canonical amino acids.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf367.
3
Comprehensive comparison of potential flavor-active peptides, amino acids and pigments accumulation in different altitudes cultivated albino teas.
Food Chem X. 2025 Jul 3;29:102722. doi: 10.1016/j.fochx.2025.102722. eCollection 2025 Jul.
4
PepBERT: Lightweight language models for bioactive peptide representation.
bioRxiv. 2025 Jul 4:2025.04.08.647838. doi: 10.1101/2025.04.08.647838.
5
Predicting Peptide Bioactivity Using the Unified Model Architecture UniDL4BioPep.
Methods Mol Biol. 2025;2941:279-292. doi: 10.1007/978-1-0716-4623-6_17.
6
AOPxSVM: A Support Vector Machine for Identifying Antioxidant Peptides Using a Block Substitution Matrix and Amino Acid Composition, Transformation, and Distribution Embeddings.
Foods. 2025 Jun 6;14(12):2014. doi: 10.3390/foods14122014.
7
NeuroPpred-MSN: A Neuropeptide Prediction Model Based on Multi-feature Fusion and Siamese Networks.
Interdiscip Sci. 2025 Jun 3. doi: 10.1007/s12539-025-00730-6.
8
GRU4ACE: Enhancing ACE inhibitory peptide prediction by integrating gated recurrent unit with multi-source feature embeddings.
Protein Sci. 2025 Jun;34(6):e70026. doi: 10.1002/pro.70026.
9
Preparation and Encapsulation of DPP-IV Inhibitory Peptides: Challenges and Strategies for Functional Food Development.
Foods. 2025 Apr 24;14(9):1479. doi: 10.3390/foods14091479.
10
pNPs-CapsNet: Predicting Neuropeptides Using Protein Language Models and FastText Encoding-Based Weighted Multi-View Feature Integration with Deep Capsule Neural Network.
ACS Omega. 2025 Mar 18;10(12):12403-12416. doi: 10.1021/acsomega.4c11449. eCollection 2025 Apr 1.