BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides.

Author information

Charoenkwan Phasit, Nantasenamat Chanin, Hasan Md Mehedi, Manavalan Balachandran, Shoombuatong Watshara

Affiliations

Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand.

Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.

Publication information

Bioinformatics. 2021 Sep 9;37(17):2556-2562. doi: 10.1093/bioinformatics/btab133.

DOI:10.1093/bioinformatics/btab133

PMID:33638635

Abstract

MOTIVATION

The identification of bitter peptides through experimental approaches is an expensive and time-consuming endeavor. Due to the huge number of newly available peptide sequences in the post-genomic era, the development of automated computational models for the identification of novel bitter peptides is highly desirable.

RESULTS

In this work, we present BERT4Bitter, a bidirectional encoder representations from transformers (BERT)-based model for predicting bitter peptides directly from their amino acid sequences without using any structural information. To the best of our knowledge, this is the first time a BERT-based model has been employed to identify bitter peptides. Compared to widely used machine learning models, BERT4Bitter achieved the best performance, with accuracies of 0.861 and 0.922 on cross-validation and independent tests, respectively. Furthermore, extensive empirical benchmarking experiments on the independent dataset demonstrated that BERT4Bitter clearly outperformed the existing method, with improvements of 8.0% in accuracy and 16.0% in Matthews correlation coefficient, highlighting the effectiveness and robustness of BERT4Bitter. We believe that the BERT4Bitter method proposed herein will be a useful tool for rapidly screening and identifying novel bitter peptides for drug development and nutritional research.
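
The two metrics reported above, accuracy and the Matthews correlation coefficient (MCC), can both be computed from the four cells of a binary confusion matrix. The following is a minimal sketch of the standard definitions only; it is not the authors' evaluation code, and the counts used in the example are hypothetical values chosen for illustration, not results from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts (NOT taken from the paper):
# bitter peptides are the positive class.
tp, tn, fp, fn = 45, 40, 10, 5

acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"accuracy = {acc_value:.3f}, MCC = {mcc_value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the bitter and non-bitter classes are imbalanced.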

AVAILABILITY AND IMPLEMENTATION

The user-friendly web server of the proposed BERT4Bitter is freely accessible at http://pmlab.pythonanywhere.com/BERT4Bitter.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
