Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA.
Proteins. 2021 Dec;89(12):1870-1887. doi: 10.1002/prot.26161. Epub 2021 Jun 29.
Protein structure refinement is the last step in protein structure prediction pipelines. Physics-based refinement via molecular dynamics (MD) simulations has made significant progress in recent years. During CASP14, we tested a new refinement protocol based on an improved sampling strategy via MD simulations. MD simulations were carried out at an elevated temperature (360 K). Optimized use of biasing restraints and the use of multiple starting models led to enhanced sampling. The new protocol generally improved model quality and showed clear improvements over our previous protocols. Our approach was successful with most initial models, many of which were based on deep learning methods. However, we found that our approach was not able to refine machine learning models from the AlphaFold2 group, often decreasing their already high initial quality. To better understand the role of refinement given new types of models based on machine learning, a detailed analysis via MD simulations and Markov state modeling is presented here. We continue to find that MD-based refinement has the potential to improve AI predictions. We also identified several practical issues that make it difficult to realize that potential. Increasingly important is the consideration of inter-domain and oligomeric contacts in simulations; the presence of large kinetic barriers in refinement pathways also continues to present challenges. Finally, we provide a perspective on how physics-based refinement could continue to play a role in improving initial predictions from machine learning-based methods.