Paranou Dimitra, Chatzigoulas Alexios, Cournia Zoe
Biomedical Research Foundation, Academy of Athens, Athens 11527, Greece.
Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens 15784, Greece.
Bioinform Adv. 2024 May 28;4(1):vbae078. doi: 10.1093/bioadv/vbae078. eCollection 2024.
Characterizing interactions at the protein-membrane interface is crucial, as abnormal peripheral protein-membrane attachment is involved in the onset of many diseases. However, a limiting factor in studying and understanding protein-membrane interactions is that the membrane-binding domains of peripheral membrane proteins (PMPs) are typically unknown. Applying artificial intelligence techniques from natural language processing (NLP) can significantly improve both the accuracy and the prediction time of protein-membrane interface analysis compared to existing methods. Here, we assess whether NLP and protein language models (pLMs) can be used to predict membrane-interacting amino acids for PMPs.
We use available experimental data and generate protein embeddings from two pLMs (ProtTrans and ESM) to train classifier models. Overall, this study provides a first proof of concept and demonstrates the promising potential of deep learning and pLMs to predict protein-membrane interfaces for PMPs faster than existing tools, with similar accuracy, and without the need for 3D structural data.
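The general idea of turning each residue into a vector and training a per-residue classifier can be illustrated with a minimal sketch. It is not the authors' implementation: `embed_residues` is a hypothetical stand-in that returns one-hot vectors where a real pipeline would return ProtTrans or ESM embeddings, the classifier is a plain logistic regression chosen only for brevity, and the sequences and labels are invented toy data rather than experimental annotations.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed_residues(sequence):
    """Hypothetical stand-in for a pLM: return one vector per residue.

    Assumption: a real pipeline would return per-residue ProtTrans or ESM
    embeddings here; one-hot vectors keep the sketch self-contained.
    """
    vectors = []
    for aa in sequence:
        vec = [0.0] * len(AMINO_ACIDS)
        vec[AMINO_ACIDS.index(aa)] = 1.0
        vectors.append(vec)
    return vectors


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, weights, bias):
    """Probability that a residue vector belongs to the interface class."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(features, labels, epochs=200, lr=0.5):
    """Fit a per-residue binary classifier with stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = score(x, weights, bias) - y  # gradient of the log loss
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(sequence, weights, bias, threshold=0.5):
    """Label each residue 1 (membrane-interacting) or 0 (not)."""
    return [int(score(x, weights, bias) >= threshold)
            for x in embed_residues(sequence)]


# Invented toy data: 1 marks a residue treated as membrane-interacting.
toy_training_set = [
    ("MKWLAGDERF", "0111000011"),
    ("GSKLFDETAW", "0011100001"),
]

features, labels = [], []
for seq, marks in toy_training_set:
    features.extend(embed_residues(seq))
    labels.extend(int(m) for m in marks)

weights, bias = train_logistic(features, labels)
predictions = predict("AKDWG", weights, bias)
print(list(zip("AKDWG", predictions)))
```

Because the input is only the amino acid sequence, a sketch of this shape needs no 3D structure. Swapping the one-hot stand-in for real pLM embeddings changes the feature dimension but not the training loop.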
The code is available at https://github.com/zoecournia/pLM-PMI. All data are available in the Supplementary material.