DeepPLM_mCNN: An approach for enhancing ion channel and ion transporter recognition by multi-window CNN based on features from pre-trained language models.

Author information

Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan.

Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan; Department of Computer Science and Engineering, Karakoram International University, Pakistan.

Publication information

Comput Biol Chem. 2024 Jun;110:108055. doi: 10.1016/j.compbiolchem.2024.108055. Epub 2024 Mar 20.

Abstract

Accurate classification of membrane proteins such as ion channels and ion transporters is critical for elucidating cellular processes and for drug development. We present DeepPLM_mCNN, a novel framework that combines pre-trained language models (PLMs) with multi-window convolutional neural networks (mCNNs) to classify membrane proteins into ion channels and ion transporters. Our approach extracts informative features from protein sequences using several PLMs: TAPE, ProtT5_XL_U50, ESM-1b, ESM-2_480, and ESM-2_1280. These PLM-derived features are then fed into an mCNN architecture that learns conserved motifs important for classification. For ion transporters, our best-performing model, which uses ProtT5 features, achieved 90% sensitivity, 95.8% specificity, and 95.4% overall accuracy. For ion channels, we obtained 88.3% sensitivity, 95.7% specificity, and 95.2% overall accuracy using ESM-1b features. The proposed DeepPLM_mCNN framework demonstrates significant improvements over previous methods on unseen test data. This study illustrates the potential of combining PLMs and deep learning to identify membrane proteins computationally from sequence data alone. Our findings have important implications for membrane protein research and for drug development targeting ion channels and transporters. The data and source code for this study are publicly available at https://github.com/s1129108/DeepPLM_mCNN.
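To make the two technical ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation (their code is in the repository linked above). It assumes that per-residue PLM embeddings are already available as a list of vectors, that a "multi-window" CNN means filters of several window sizes slid along the sequence and reduced by global max pooling, and that sensitivity, specificity, and accuracy follow their standard confusion-matrix definitions. All function names, parameters, and the ReLU and pooling choices are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]  # shape: (sequence_length, embedding_dim)


def conv1d_max_pool(embeddings: Matrix, kernel: Matrix, bias: float = 0.0) -> float:
    """Slide one filter spanning len(kernel) residues along the sequence,
    apply ReLU, and return the global maximum activation.
    Returns 0.0 if the sequence is shorter than the window."""
    window = len(kernel)
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(embeddings) - window + 1):
        total = bias
        for offset in range(window):
            row = embeddings[start + offset]
            total += sum(x * w for x, w in zip(row, kernel[offset]))
        best = max(best, total)
    return best


def multi_window_features(
    embeddings: Matrix, filters_by_window: Dict[int, List[Matrix]]
) -> List[float]:
    """Concatenate the pooled outputs of filters with different window sizes
    (hypothetical layout: window size -> list of kernels of that height)."""
    features: List[float] = []
    for window in sorted(filters_by_window):
        for kernel in filters_by_window[window]:
            features.append(conv1d_max_pool(embeddings, kernel))
    return features


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP),
    accuracy = (TP+TN)/N, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }
```

In a trained model the pooled feature vector would feed a dense classification layer and the kernels would be learned. Here the kernels are supplied by the caller so that the sliding-window arithmetic stays visible.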
