Food Laboratory of Zhongyuan, Beijing Technology and Business University, No. 11/33, Fucheng Road, Haidian District, Beijing 100048, China; Key Laboratory of Flavor Science of China General Chamber of Commerce, Beijing Technology and Business University, No. 11/33, Fucheng Road, Haidian District, Beijing 100048, China.
National Engineering Research Centre for Agri-product Quality Traceability, Beijing Technology and Business University, No. 11/33, Fucheng Road, Haidian District, Beijing 100048, China.
Food Res Int. 2023 Oct;172:113142. doi: 10.1016/j.foodres.2023.113142. Epub 2023 Jun 16.
Umami peptides have received extensive attention because they enhance flavor and provide nutritional benefits. The growing demand for novel umami peptides, together with the vast number of peptides present in food, calls for more efficient screening methods. The purpose of this study was therefore to develop a deep learning (DL) model for rapid screening of umami peptides. The Umami-BERT model was built with a novel two-stage training strategy that combines Bidirectional Encoder Representations from Transformers (BERT) with an inception network. In the pre-training stage, attention mechanisms were applied to a large number of bioactive peptide sequences to acquire high-dimensional generalized features. In the re-training stage, umami peptide prediction was carried out on the UMP789 dataset, which was compiled from the latest research. The model achieved an accuracy (ACC) of 93.23% and a Matthews correlation coefficient (MCC) of 0.78 on the balanced dataset, and an ACC of 95.00% and an MCC of 0.85 on the unbalanced dataset. These results demonstrate that Umami-BERT can predict umami peptides directly from their amino acid sequences and that it outperformed other models. Furthermore, Umami-BERT enabled analysis of the attention patterns learned by the model. The amino acids alanine (A), cysteine (C), aspartate (D), and glutamic acid (E) were found to be the most significant contributors to umami peptides. In addition, sequence patterns of umami peptides involving A, C, D, and E were summarized based on the learned attention weights. Umami-BERT thus shows great potential for large-scale screening of candidate peptides and offers novel insight for the further exploration of umami peptides.
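The abstract reports both accuracy and MCC, and the two can diverge when classes are unbalanced. As a minimal sketch of how these two metrics are conventionally computed from binary predictions, the Python below uses the standard confusion-matrix definitions. The function names and the toy labels are illustrative assumptions; they are not taken from the paper and do not reproduce its results.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = umami, 0 = non-umami)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical toy labels (not data from the study): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_all_negative = [0] * 10                       # a trivial "always non-umami" predictor
y_one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]      # finds one of the two positives

for name, y_pred in [("all negative", y_all_negative), ("one hit", y_one_hit)]:
    counts = confusion_counts(y_true, y_pred)
    print(name, "ACC =", round(accuracy(*counts), 4), "MCC =", round(mcc(*counts), 4))
```

On the toy labels, the trivial predictor still scores a high accuracy while its MCC is zero, which is why MCC is a useful companion to ACC on an unbalanced dataset.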