Advances in Language-Model-Informed Protein-Nucleic Acid Binding Site Prediction.

Author information

Tarafder Sumit, Wang Xinyu, Roche Rahmatullah, Bhattacharya Debswapna

Affiliations

Department of Computer Science, Virginia Tech, Blacksburg, VA, USA.

TSYS School of Computer Science, Columbus State University, Columbus, GA, USA.

Publication information

Methods Mol Biol. 2025;2941:139-151. doi: 10.1007/978-1-0716-4623-6_9.

Abstract

Interactions between proteins and nucleic acids are essential for understanding a wide range of cellular and evolutionary processes. Recent advancements in protein language models (pLMs), trained on vast protein sequence data, have revolutionized various predictive modeling tasks, offering unprecedented scalability and generalizability. Consequently, a number of computational methods have been developed in the recent past for protein-nucleic acid binding site prediction powered by pLMs. To this end, we recently developed the EquiPNAS method that integrates pLM embeddings with E(3) equivariant deep graph neural networks for enhancing accuracy and robustness in predicting protein-DNA and protein-RNA binding sites, thereby reducing the dependency on evolutionary information. Here we present an overview of the recent protein-nucleic acid binding site prediction methods, emphasizing the recent advances in harnessing the potential of pLMs, and provide a detailed description of the EquiPNAS methodology as well as the necessary materials and procedures for the computational prediction of protein-DNA and protein-RNA binding sites.
