FAD-BERT: Improved prediction of FAD binding sites using pre-training of deep bidirectional transformers.

Affiliations

Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan; College of Information & Communication Technology, Can Tho University, Viet Nam.

Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan.

Publication information

Comput Biol Med. 2021 Apr;131:104258. doi: 10.1016/j.compbiomed.2021.104258. Epub 2021 Feb 8.

Abstract

The electron transport chain is a series of protein complexes embedded in the process of cellular respiration, which is an important process for transferring electrons and other macromolecules throughout the cell. Identifying flavin adenine dinucleotide (FAD) binding sites in the electron transport chain is vital because it helps biological researchers understand precisely how electrons are produced and transported in cells. This study distills and analyzes contextualized word embeddings from pre-trained BERT models to explore similarities between natural language and protein sequences. We therefore propose a new approach based on pre-training of Bidirectional Encoder Representations from Transformers (BERT), position-specific scoring matrix (PSSM) profiles, and the Amino Acid Index database (AAIndex) to predict FAD-binding sites in transport proteins recently found in nature. Our proposed approach achieves 85.14% accuracy, improving accuracy by 11%, with a Matthews correlation coefficient of 0.39, compared to the previous method on the same independent set. We also deploy a web server that identifies FAD-binding sites in electron transporters, available for academic use at http://140.138.155.216/fadbert/.
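The abstract reports both accuracy and the Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics are computed from a binary confusion matrix, the Python below uses the standard definitions. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset. The example shows why MCC is often reported alongside accuracy when binding residues are much rarer than non-binding ones.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all residues that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced test set (NOT the paper's data):
    # 60 binding residues and 940 non-binding residues.
    tp, tn, fp, fn = 30, 820, 120, 30
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

With imbalanced classes such as these, accuracy can be high while MCC stays modest, because MCC also penalizes false positives and false negatives on the minority class.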
