Energy-based graph convolutional networks for scoring protein docking models.

Affiliations

Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas.

TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, Texas.

Publication information

Proteins. 2020 Aug;88(8):1091-1099. doi: 10.1002/prot.25888. Epub 2020 Mar 16.

DOI:10.1002/prot.25888

PMID:32144844

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7374013/

Abstract

Structural information about protein-protein interactions, often missing at the interactome scale, is important for mechanistic understanding of cells and rational discovery of therapeutics. Protein docking provides a computational alternative for obtaining such information. However, ranking near-native docked models high among a large number of candidates, often known as the scoring problem, remains a critical challenge. Moreover, estimating model quality, also known as the quality assessment problem, is rarely addressed in protein docking. In this study, these two challenging problems in protein docking are regarded as relative and absolute scoring, respectively, and addressed in one physics-inspired deep learning framework. We represent protein and complex structures as intra- and inter-molecular residue contact graphs with atom-resolution node and edge features. We also propose a novel graph convolutional kernel that aggregates interacting nodes' features through edges so that generalized interaction energies can be learned directly from 3D data. The resulting energy-based graph convolutional networks (EGCN) with multihead attention are trained to predict intra- and inter-molecular energies, binding affinities, and quality measures (interface RMSD) for encounter complexes. Compared to a state-of-the-art scoring function for model ranking, EGCN significantly improves ranking on a Critical Assessment of PRedicted Interactions (CAPRI) test set involving homology docking, and is comparable or slightly better on Score_set, a CAPRI benchmark set generated by diverse community-wide docking protocols unseen in the training data. For Score_set quality assessment, EGCN shows about 27% improvement over our previous efforts. Learning directly from 3D structure data in a graph representation, EGCN represents the first successful development of graph convolutional networks for protein docking.
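To make the idea of aggregating interacting nodes' features through edges more concrete, here is a minimal sketch in plain Python. It is not the EGCN kernel from the paper: the bilinear form, the half-and-half split of each pair term between its two residues, the function names (`bilinear`, `pairwise_energy`), and all feature and weight values are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def bilinear(x: Vector, w: Matrix, y: Vector) -> float:
    """Return x^T W y for plain Python lists."""
    return sum(x[a] * w[a][b] * y[b] for a in range(len(x)) for b in range(len(y)))


def pairwise_energy(
    node_feats: List[Vector],
    edges: Dict[Tuple[int, int], Vector],
    weights: List[Matrix],
) -> Tuple[List[float], float]:
    """Aggregate node-pair interactions through edge features (hypothetical form).

    For each edge (i, j) with edge feature vector e_ij, the pair term is
    sum_k e_ij[k] * x_i^T W_k x_j. Each pair term is credited half to each
    endpoint, and the graph-level score is the sum over all nodes.
    """
    per_node = [0.0] * len(node_feats)
    for (i, j), e_ij in edges.items():
        term = sum(
            e_k * bilinear(node_feats[i], w_k, node_feats[j])
            for e_k, w_k in zip(e_ij, weights)
        )
        per_node[i] += 0.5 * term
        per_node[j] += 0.5 * term
    return per_node, sum(per_node)


# Toy contact graph: 3 residues, 2 node features, 2 edge channels.
# All numbers are made up; in a trained model the weights would be learned.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = {(0, 1): [1.0, 0.0], (1, 2): [0.0, 1.0]}
weights = [
    [[0.0, -1.0], [-1.0, 0.0]],  # channel 0: favorable for unlike feature pairs
    [[0.5, 0.0], [0.0, 0.5]],    # channel 1: unfavorable for like feature pairs
]
per_node, total = pairwise_energy(nodes, edges, weights)
print(per_node, total)
```

In this toy setup, the per-residue values play the role of local energy contributions and their sum is a single graph-level score. Such a scalar could in principle be used either to rank candidate models against each other (relative scoring) or be regressed against a quality measure such as interface RMSD (absolute scoring).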

