Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA.
Proteins. 2022 Mar;90(3):720-731. doi: 10.1002/prot.26269. Epub 2021 Nov 2.
Predicting the quaternary structure of protein complexes is an important problem. Inter-chain residue-residue contact prediction can provide useful information to guide the ab initio reconstruction of quaternary structures. However, few methods have been developed to build quaternary structures from predicted inter-chain contacts. Here, we develop the first method based on gradient descent optimization (GD) to build quaternary structures of protein dimers using inter-chain contacts as distance restraints. We evaluate GD on several datasets of homodimers and heterodimers, using true or predicted contacts together with monomer structures as input. GD consistently performs better than both simulated annealing and Markov Chain Monte Carlo simulation. Starting from an arbitrary quaternary structure randomly initialized from the tertiary structures of the protein chains, and using true inter-chain contacts as input, GD can reconstruct high-quality structural models for homodimers and heterodimers, with average TM-scores ranging from 0.92 to 0.99 and average interface root mean square distances ranging from 0.72 Å to 1.64 Å. On a dataset of 115 homodimers, using predicted inter-chain contacts as restraints, the average TM-score of the structural models built by GD is 0.76. For 46% of the homodimers, high-quality structural models with TM-score ≥ 0.9 are reconstructed from predicted contacts. The quality of the reconstructed models correlates strongly with the precision and recall of the predicted contacts. Only moderate precision or recall of inter-chain contact prediction is needed to build good structural models for most homodimers. Moreover, GD improves the quality of quaternary structures predicted by AlphaFold2 on a CASP-CAPRI (Critical Assessment of Techniques for Protein Structure Prediction / Critical Assessment of Predicted Interactions) dataset.
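The idea of using inter-chain contacts as distance restraints for gradient descent can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows the general technique under several assumptions that do not come from the abstract: each residue is reduced to one 3D point, a contact is treated as satisfied when the two points lie within 8 Å, chain B moves as a rigid body described by an axis-angle rotation plus a translation, the gradient is estimated by central finite differences, and the step size is halved until the loss decreases. All function names and the toy coordinates are hypothetical.

```python
import numpy as np

CONTACT_CUTOFF = 8.0  # angstroms; assumed contact threshold, not taken from the abstract


def rotation_matrix(rotvec):
    """Rodrigues formula: axis-angle vector -> 3x3 rotation matrix."""
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rotvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def place(coords, pose):
    """Rotate coords about their centroid (pose[:3]), then translate (pose[3:])."""
    centroid = coords.mean(axis=0)
    rot = rotation_matrix(pose[:3])
    return (coords - centroid) @ rot.T + centroid + pose[3:]


def restraint_loss(pose, coords_a, coords_b, contacts):
    """Sum of squared violations: only contact pairs farther apart than the cutoff are penalized."""
    moved_b = place(coords_b, pose)
    dist = np.linalg.norm(coords_a[contacts[:, 0]] - moved_b[contacts[:, 1]], axis=1)
    return float(np.sum(np.maximum(0.0, dist - CONTACT_CUTOFF) ** 2))


def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        delta = np.zeros_like(x)
        delta[k] = eps
        grad[k] = (f(x + delta) - f(x - delta)) / (2.0 * eps)
    return grad


def dock_by_gradient_descent(coords_a, coords_b, contacts, n_steps=200, lr=1e-3):
    """Minimize the restraint loss over the 6 rigid-body parameters of chain B."""
    def f(p):
        return restraint_loss(p, coords_a, coords_b, contacts)

    pose = np.zeros(6)
    loss = f(pose)
    history = [loss]
    for _ in range(n_steps):
        if loss == 0.0:
            break  # every restraint is satisfied
        grad = numerical_gradient(f, pose)
        step, accepted = lr, False
        while step > 1e-12:  # backtracking: halve the step until the loss decreases
            candidate = pose - step * grad
            candidate_loss = f(candidate)
            if candidate_loss < loss:
                pose, loss, accepted = candidate, candidate_loss, True
                break
            step *= 0.5
        if not accepted:
            break  # no improving step found
        history.append(loss)
    return pose, history


# Toy example with synthetic coordinates (not real protein data).
rng = np.random.default_rng(0)
chain_a = rng.normal(scale=5.0, size=(30, 3))
chain_b_native = rng.normal(scale=5.0, size=(30, 3)) + np.array([8.0, 0.0, 0.0])
pair_dist = np.linalg.norm(chain_a[:, None, :] - chain_b_native[None, :, :], axis=2)
contacts = np.argwhere(pair_dist < CONTACT_CUTOFF)  # "true" contacts as (i in A, j in B)

# Displace chain B to mimic an arbitrary starting orientation, then optimize.
chain_b_start = place(chain_b_native, np.array([0.5, -0.3, 0.8, 20.0, -15.0, 10.0]))
pose, history = dock_by_gradient_descent(chain_a, chain_b_start, contacts)
print(f"restraint loss: {history[0]:.1f} -> {history[-1]:.3f}")
```

The loss in this sketch is flat once a restraint is satisfied, so it penalizes only contacts that are too far apart. A practical scoring function would presumably also need terms that prevent the chains from overlapping; that is omitted here.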