Jing Xiaoyang, Xu Jinbo
Toyota Technological Institute at Chicago, Chicago, IL 60637, USA.
Nat Comput Sci. 2021 Jul;1(7):462-469. doi: 10.1038/s43588-021-00098-9. Epub 2021 Jul 15.
Protein model refinement is the last step applied to improve the quality of a predicted protein model. Currently, the most successful refinement methods rely on extensive conformational sampling and thus take hours or days to refine even a single protein model. Here we propose a fast and effective model refinement method that applies graph neural networks (GNNs) to predict a refined inter-atom distance probability distribution from an initial model and then rebuilds 3D models from the predicted distance distribution. Tested on the Critical Assessment of Structure Prediction (CASP) refinement targets, our method has accuracy comparable to that of two leading human groups, Feig and Baker, but runs substantially faster. Our method can refine one protein model within ~11 minutes on 1 CPU, while Baker needs ~30 hours on 60 CPUs and Feig needs ~16 hours on 1 GPU. Finally, our study shows that GNNs outperform convolutional residual neural networks (ResNets) for model refinement when only very limited conformational sampling is allowed.
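The abstract describes predicting a distance probability distribution from an initial model. As a rough illustration of the data representation involved, and not the authors' implementation, the sketch below builds a residue graph from coordinates and discretizes inter-residue distances into bins, the form in which a distance distribution is commonly expressed. The function names, the 10 Å edge cutoff, the 2 to 20 Å range, and the 36 bins are all assumptions chosen for the example; the abstract does not specify them.

```python
import numpy as np

def build_residue_graph(coords, cutoff=10.0):
    """Return directed edges (i, j) and their lengths for all residue
    pairs closer than `cutoff` angstroms (cutoff value is an assumption)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    not_self = ~np.eye(len(coords), dtype=bool)
    i, j = np.where((dist < cutoff) & not_self)
    return np.stack([i, j], axis=1), dist[i, j]

def distances_to_bins(distances, d_min=2.0, d_max=20.0, n_bins=36):
    """One-hot encode each distance into equal-width bins.
    Distances outside [d_min, d_max) are clipped to the first or last bin."""
    bin_edges = np.linspace(d_min, d_max, n_bins + 1)
    idx = np.clip(np.digitize(distances, bin_edges) - 1, 0, n_bins - 1)
    one_hot = np.zeros((len(distances), n_bins))
    one_hot[np.arange(len(distances)), idx] = 1.0
    return one_hot

def expected_distance(prob, d_min=2.0, d_max=20.0):
    """Collapse a per-edge distribution over bins into a mean distance,
    using bin centers as representative values."""
    n_bins = prob.shape[-1]
    bin_edges = np.linspace(d_min, d_max, n_bins + 1)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return prob @ centers

# Toy input: four pseudo-residues on a line, 3.8 angstroms apart.
coords = np.array([[0.0, 0.0, 0.0],
                   [3.8, 0.0, 0.0],
                   [7.6, 0.0, 0.0],
                   [11.4, 0.0, 0.0]])
edges, lengths = build_residue_graph(coords)
dist_bins = distances_to_bins(lengths)
recovered = expected_distance(dist_bins)
```

In a refinement setting, a network would output a full probability vector per edge rather than a one-hot vector, and a separate step would rebuild 3D coordinates that satisfy the predicted distances; both of those steps are outside this sketch.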