
BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides.

Authors

Charoenkwan Phasit, Nantasenamat Chanin, Hasan Md Mehedi, Manavalan Balachandran, Shoombuatong Watshara

Affiliations

Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand.

Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.

Publication information

Bioinformatics. 2021 Sep 9;37(17):2556-2562. doi: 10.1093/bioinformatics/btab133.

DOI: 10.1093/bioinformatics/btab133
PMID: 33638635

Abstract

MOTIVATION

The identification of bitter peptides through experimental approaches is an expensive and time-consuming endeavor. Due to the huge number of newly available peptide sequences in the post-genomic era, the development of automated computational models for the identification of novel bitter peptides is highly desirable.

RESULTS

In this work, we present BERT4Bitter, a bidirectional encoder representations from transformers (BERT)-based model that predicts bitter peptides directly from their amino acid sequences, without using any structural information. To the best of our knowledge, this is the first time a BERT-based model has been employed to identify bitter peptides. Compared with widely used machine learning models, BERT4Bitter achieved the best performance, with accuracies of 0.861 and 0.922 on cross-validation and independent tests, respectively. Furthermore, extensive empirical benchmarking experiments on the independent dataset demonstrated that BERT4Bitter clearly outperformed the existing method, with improvements of 8.0% in accuracy and 16.0% in Matthews correlation coefficient (MCC), highlighting the effectiveness and robustness of BERT4Bitter. We believe that the BERT4Bitter method proposed herein will be a useful tool for rapidly screening and identifying novel bitter peptides for drug development and nutritional research.
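
The two metrics reported above, accuracy and the Matthews correlation coefficient, can both be computed from the confusion counts of a binary bitter / non-bitter classifier. The following is a minimal sketch of those standard formulas only; it is not the authors' evaluation code, and the function names and the toy labels are hypothetical, chosen purely for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = bitter and 0 = non-bitter."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples to evaluate")
    return (tp + tn) / total


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when the denominator is zero, a common convention
    when one row or column of the confusion matrix is empty.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels invented for this example; they are not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("TP, TN, FP, FN:", counts)
print("accuracy:", accuracy(*counts))
print("MCC:", matthews_corrcoef(*counts))
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it is less easily inflated when one class dominates the dataset.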

AVAILABILITY AND IMPLEMENTATION

The user-friendly web server of the proposed BERT4Bitter is freely accessible at http://pmlab.pythonanywhere.com/BERT4Bitter.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
