DeepPLM_mCNN: An approach for enhancing ion channel and ion transporter recognition by multi-window CNN based on features from pre-trained language models.

Affiliations

Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan.

Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan; Department of Computer Science and Engineering, Karakoram International University, Pakistan.

Publication information

Comput Biol Chem. 2024 Jun;110:108055. doi: 10.1016/j.compbiolchem.2024.108055. Epub 2024 Mar 20.

DOI:10.1016/j.compbiolchem.2024.108055

PMID:38555810

Abstract

Accurate classification of membrane proteins such as ion channels and ion transporters is critical for elucidating cellular processes and for drug development. We present DeepPLM_mCNN, a novel framework that combines pre-trained language models (PLMs) with multi-window convolutional neural networks (mCNNs) to classify membrane proteins into ion channels and ion transporters. Our approach extracts informative features from protein sequences using several PLMs: TAPE, ProtT5_XL_U50, ESM-1b, ESM-2_480, and ESM-2_1280. These PLM-derived features are then fed into an mCNN architecture that learns conserved motifs important for classification. On ion transporters, our best-performing model, which uses ProtT5 features, achieved 90% sensitivity, 95.8% specificity, and 95.4% overall accuracy. On ion channels, we obtained 88.3% sensitivity, 95.7% specificity, and 95.2% overall accuracy using ESM-1b features. The proposed DeepPLM_mCNN framework demonstrates significant improvements over previous methods on unseen test data. This study illustrates the potential of combining PLMs and deep learning for accurate computational identification of membrane proteins from sequence data alone. Our findings have important implications for membrane protein research and for drug development targeting ion channels and transporters. The data and source code for this study are publicly available at https://github.com/s1129108/DeepPLM_mCNN.
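The multi-window idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window sizes, filter counts, embedding dimension, random weights, and the helper names `conv1d_max`, `multi_window_features`, and `random_kernels` are all assumptions made for illustration, and a toy random matrix stands in for real PLM embeddings. The sketch scans a per-residue embedding matrix with kernels of several window lengths, applies ReLU and global max pooling to each, and concatenates the results into one fixed-length feature vector that a classifier could consume.

```python
import random


def conv1d_max(embeddings, kernel, bias=0.0):
    """Slide one kernel (window x dim) along the sequence, apply ReLU,
    and keep the largest activation (global max pooling)."""
    window = len(kernel)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(embeddings) - window + 1):
        total = bias
        for offset in range(window):
            row = embeddings[start + offset]
            total += sum(x * w for x, w in zip(row, kernel[offset]))
        best = max(best, total)
    return best


def multi_window_features(embeddings, kernels_by_window):
    """Concatenate the pooled outputs of every kernel of every window size."""
    features = []
    for window in sorted(kernels_by_window):
        for kernel in kernels_by_window[window]:
            features.append(conv1d_max(embeddings, kernel))
    return features


def random_kernels(window_sizes, filters_per_window, dim, seed=0):
    """Hypothetical untrained kernels; a real model would learn these weights."""
    rng = random.Random(seed)
    return {
        w: [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(w)]
            for _ in range(filters_per_window)]
        for w in window_sizes
    }


# Toy stand-in for a PLM embedding: 12 residues, 8 dimensions each (assumed sizes).
rng = random.Random(1)
seq_len, dim = 12, 8
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]

kernels = random_kernels(window_sizes=(2, 4, 8), filters_per_window=3, dim=dim)
features = multi_window_features(embeddings, kernels)
print(len(features))  # 3 window sizes x 3 filters = 9
```

Because each kernel is max-pooled over all positions, the output length depends only on the number of kernels, not on the protein length, which is what lets sequences of different lengths share one downstream classifier.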

Similar articles

Integrating Pre-Trained protein language model and multiple window scanning deep learning networks for accurate identification of secondary active transporters in membrane proteins.

Methods. 2023 Dec;220:11-20. doi: 10.1016/j.ymeth.2023.10.008. Epub 2023 Oct 21.

Exploiting protein language models for the precise classification of ion channels and ion transporters.

Proteins. 2024 Aug;92(8):998-1055. doi: 10.1002/prot.26694. Epub 2024 Apr 24.

MCNN: Multiple Convolutional Neural Networks for RNA-Protein Binding Sites Prediction.

IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1180-1187. doi: 10.1109/TCBB.2022.3170367. Epub 2023 Apr 3.

MCNN_MC: Computational Prediction of Mitochondrial Carriers and Investigation of Bongkrekic Acid Toxicity Using Protein Language Models and Convolutional Neural Networks.

J Chem Inf Model. 2024 Dec 23;64(24):9125-9134. doi: 10.1021/acs.jcim.4c00961. Epub 2024 Aug 12.

PreDBP-PLMs: Prediction of DNA-binding proteins based on pre-trained protein language models and convolutional neural networks.

Anal Biochem. 2024 Nov;694:115603. doi: 10.1016/j.ab.2024.115603. Epub 2024 Jul 8.

NCSP-PLM: An ensemble learning framework for predicting non-classical secreted proteins based on protein language models and deep learning.

Math Biosci Eng. 2024 Jan;21(1):1472-1488. doi: 10.3934/mbe.2024063. Epub 2022 Dec 28.

DeepNeoAG: Neoantigen epitope prediction from melanoma antigens using a synergistic deep learning model combining protein language models and multi-window scanning convolutional neural networks.

Int J Biol Macromol. 2024 Nov;281(Pt 1):136252. doi: 10.1016/j.ijbiomac.2024.136252. Epub 2024 Oct 2.

mCNN-ETC: identifying electron transporters and their functional families by using multiple windows scanning techniques in convolutional neural networks with evolutionary information of protein sequences.

Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab352.

Multi-Branch-CNN: Classification of ion channel interacting peptides using multi-branch convolutional neural network.

Comput Biol Med. 2022 Aug;147:105717. doi: 10.1016/j.compbiomed.2022.105717. Epub 2022 Jun 8.

Cited by

CaBind_MCNN: Identifying Potential Calcium Channel Blocker Targets by Predicting Calcium-Binding Sites in Ion Channels and Ion Transporters Using Protein Language Models and Multiscale Feature Extraction.

J Chem Inf Model. 2025 Feb 24;65(4):2145-2157. doi: 10.1021/acs.jcim.4c02252. Epub 2025 Feb 6.

MCNN_MC: Computational Prediction of Mitochondrial Carriers and Investigation of Bongkrekic Acid Toxicity Using Protein Language Models and Convolutional Neural Networks.

J Chem Inf Model. 2024 Dec 23;64(24):9125-9134. doi: 10.1021/acs.jcim.4c00961. Epub 2024 Aug 12.
