EquiPNAS: improved protein-nucleic acid binding site prediction using protein-language-model-informed equivariant deep graph neural networks.

Affiliation

Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA.

Publication information

Nucleic Acids Res. 2024 Mar 21;52(5):e27. doi: 10.1093/nar/gkae039.

Abstract

Protein language models (pLMs) trained on a large corpus of protein sequences have shown unprecedented scalability and broad generalizability in a wide range of predictive modeling tasks, but their power has not yet been harnessed for predicting protein-nucleic acid binding sites, critical for characterizing the interactions between proteins and nucleic acids. Here, we present EquiPNAS, a new pLM-informed E(3) equivariant deep graph neural network framework for improved protein-nucleic acid binding site prediction. By combining the strengths of pLM and symmetry-aware deep graph learning, EquiPNAS consistently outperforms the state-of-the-art methods for both protein-DNA and protein-RNA binding site prediction on multiple datasets across a diverse set of predictive modeling scenarios ranging from using experimental input to AlphaFold2 predictions. Our ablation study reveals that the pLM embeddings used in EquiPNAS are sufficiently powerful to dramatically reduce the dependence on the availability of evolutionary information without compromising on accuracy, and that the symmetry-aware nature of the E(3) equivariant graph-based neural architecture offers remarkable robustness and performance resilience. EquiPNAS is freely available at https://github.com/Bhattacharya-Lab/EquiPNAS.
