Using deep learning and large protein language models to predict protein-membrane interfaces of peripheral membrane proteins.

Author information

Paranou Dimitra, Chatzigoulas Alexios, Cournia Zoe

Affiliations

Biomedical Research Foundation, Academy of Athens, Athens 11527, Greece.

Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens 15784, Greece.

Publication information

Bioinform Adv. 2024 May 28;4(1):vbae078. doi: 10.1093/bioadv/vbae078. eCollection 2024.

Abstract

MOTIVATION

Characterizing interactions at the protein-membrane interface is crucial as abnormal peripheral protein-membrane attachment is involved in the onset of many diseases. However, a limiting factor in studying and understanding protein-membrane interactions is that the membrane-binding domains of peripheral membrane proteins (PMPs) are typically unknown. By applying artificial intelligence techniques in the context of natural language processing (NLP), the accuracy and prediction time for protein-membrane interface analysis can be significantly improved compared to existing methods. Here, we assess whether NLP and protein language models (pLMs) can be used to predict membrane-interacting amino acids for PMPs.

RESULTS

We use available experimental data and generate protein embeddings from two pLMs (ProtTrans and ESM) to train classifier models. Overall, the results provide a first proof of concept and show the promising potential of deep learning and pLMs to predict protein-membrane interfaces for PMPs faster than existing tools, with similar accuracy, and without the need for 3D structural data.
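The general idea of training a per-residue classifier on embeddings can be illustrated with a minimal sketch. This is not the authors' model or pipeline (see the linked repository for that). It assumes each residue is represented by a fixed-length vector and labeled 1 (membrane-interacting) or 0. The vectors here are synthetic stand-ins produced by a hypothetical helper, make_toy_embeddings, not real ProtTrans or ESM output, and the classifier is a plain logistic regression chosen only for brevity.

```python
import math
import random


def make_toy_embeddings(n_residues, dim, seed):
    """Hypothetical stand-in for per-residue pLM embeddings.

    Real embeddings would come from ProtTrans or ESM; these are random
    vectors whose first coordinate is shifted by the label so that the
    toy task is learnable.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_residues):
        label = 1 if rng.random() < 0.3 else 0  # assumed class balance
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] += 3.0 if label else -3.0
        X.append(vec)
        y.append(label)
    return X, y


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logreg(X, y, lr=0.1, epochs=20):
    """Fit logistic regression with stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(score(w, b, xi)) - yi  # gradient of log loss w.r.t. the score
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """Return 1 if the residue is predicted as membrane-interacting."""
    return 1 if sigmoid(score(w, b, x)) >= 0.5 else 0


# Train on one synthetic "protein set", evaluate on another.
X_train, y_train = make_toy_embeddings(200, dim=8, seed=0)
X_test, y_test = make_toy_embeddings(100, dim=8, seed=1)

w, b = train_logreg(X_train, y_train)
correct = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test))
accuracy = correct / len(y_test)
print(f"toy per-residue accuracy: {accuracy:.2f}")
```

In a real setting the synthetic vectors would be replaced by per-residue embeddings extracted from a pLM, and the labels by experimentally determined membrane-interacting residues; because such labels are typically imbalanced, accuracy alone would not be an adequate metric.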

AVAILABILITY AND IMPLEMENTATION

The code is available at https://github.com/zoecournia/pLM-PMI. All data are available in the Supplementary material.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/11572487/e1b21c929fad/vbae078f1.jpg
