ActTRANS: Functional classification in active transport proteins based on transfer learning and contextual representations.

Affiliation

Department of Computer Science & Engineering, Yuan Ze University, Chungli, 32003, Taiwan.

Publication information

Comput Biol Chem. 2021 Aug;93:107537. doi: 10.1016/j.compbiolchem.2021.107537. Epub 2021 Jun 29.

DOI:10.1016/j.compbiolchem.2021.107537
PMID:34217007
Abstract

MOTIVATION

Primary and secondary active transport are the two types of active transport; both use energy to move substances. Active transport mechanisms rely on proteins to assist in transport and play essential roles in regulating the traffic of ions or small molecules across a cell membrane against the concentration gradient. In this study, the two main types of proteins involved in such transport are classified from transmembrane transport proteins. We propose a Support Vector Machine (SVM) that uses contextualized word embeddings from Bidirectional Encoder Representations from Transformers (BERT) to represent protein sequences. BERT, a deep learning language representation model developed by Google, is a powerful model for transfer learning and one of the highest-performing pre-trained models for Natural Language Processing (NLP) tasks. Transfer learning with a pre-trained BERT model is applied to extract fixed feature vectors from the hidden layers and to learn contextual relations between amino acids in the protein sequence. The contextualized word representations of proteins are thus introduced to effectively model complex structures of amino acids in the sequence and the variations of these amino acids in context. By generating context information, we capture multiple meanings for the same amino acid to reveal the importance of specific residues in the protein sequence.
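The idea of a contextual representation pooled into a fixed-length feature vector can be illustrated with a small sketch. Everything here is an assumption made for illustration: `toy_residue_vector` is a hypothetical hash-based stand-in for a contextual encoder (it is not BERT), `DIM` is an arbitrary toy width, and mean pooling is only one plausible way to obtain a fixed vector; the abstract does not state which pooling the authors used.

```python
import hashlib
from typing import List

DIM = 8  # toy embedding width (assumption); real BERT hidden states are much wider


def toy_residue_vector(left: str, residue: str, right: str) -> List[float]:
    """Hypothetical stand-in for a contextual encoder (NOT BERT).

    The vector depends on the residue and its two neighbours, so the same
    amino acid receives different vectors in different contexts.
    """
    digest = hashlib.sha256(f"{left}|{residue}|{right}".encode()).digest()
    return [byte / 255.0 for byte in digest[:DIM]]


def contextual_vectors(sequence: str) -> List[List[float]]:
    """Return one context-dependent vector per residue."""
    padded = "-" + sequence + "-"  # '-' marks the sequence boundaries
    return [
        toy_residue_vector(padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Average per-residue vectors into a single fixed-length vector."""
    if not vectors:
        raise ValueError("cannot pool an empty sequence")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def sequence_features(sequence: str) -> List[float]:
    """Fixed-length feature vector for a protein sequence of any length."""
    return mean_pool(contextual_vectors(sequence))
```

In this sketch, the two alanines in `"MKAAL"` get different vectors because their neighbours differ, while `sequence_features` returns `DIM` values regardless of sequence length, which is the shape a classifier such as an SVM expects as input.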

RESULTS

The performance of the proposed method is evaluated using five-fold cross-validation and an independent test. The proposed method achieves accuracies of 85.44 %, 88.74 %, and 92.84 % for Class-1, Class-2, and Class-3, respectively. Experimental results show that, by using context information, this approach can outperform other feature extraction methods, effectively classify the two types of active transport, and improve overall performance.
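For readers unfamiliar with the evaluation protocol, five-fold cross-validation with accuracy as the metric can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: a nearest-centroid classifier is swapped in for the SVM to keep the example dependency-free, the data come from a hypothetical `make_toy_data` helper, and all function names are invented for this sketch.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def nearest_centroid_fit(X: Sequence[Vector], y: Sequence[int]) -> Callable[[Vector], int]:
    """Stand-in classifier (NOT an SVM): predict the class with the closest mean."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for vector, label in zip(X, y):
        total = sums.setdefault(label, [0.0] * len(vector))
        for j, value in enumerate(vector):
            total[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}

    def predict(vector: Vector) -> int:
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(vector, centroids[lab])),
        )

    return predict


def cross_val_accuracy(X: Sequence[Vector], y: Sequence[int], k: int = 5, seed: int = 0) -> List[float]:
    """Train on k-1 folds, test on the held-out fold, and return per-fold accuracy."""
    scores = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(y)) if i not in held]
        predict = nearest_centroid_fit(train_X, train_y)
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


def make_toy_data(per_class: int = 20, seed: int = 1) -> Tuple[List[List[float]], List[int]]:
    """Hypothetical two-class data: well-separated Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
            y.append(label)
    return X, y
```

Calling `cross_val_accuracy(*make_toy_data())` returns five per-fold accuracies; their mean is the kind of figure a cross-validated accuracy summarizes. An independent test, as mentioned above, would additionally score the model on sequences never used in any fold.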
