Adiyaman Recep, Edmunds Nicholas S, Genc Ahmet G, Alharbi Shuaa M A, McGuffin Liam J
School of Biological Sciences, University of Reading, Reading RG6 6EX, UK.
Bioinform Adv. 2023 Jun 14;3(1):vbad078. doi: 10.1093/bioadv/vbad078. eCollection 2023.
The accuracy gap between predicted and experimental structures has been significantly reduced following the development of AlphaFold2 (AF2). However, for many targets, AF2 models still have room for improvement. In previous CASP experiments, computationally intensive molecular dynamics (MD) simulation-based methods have been widely used to improve the accuracy of single 3D models. Here, our ReFOLD pipeline was adapted to refine AF2 predictions while maintaining high model accuracy at a modest computational cost. Furthermore, the AF2 recycling process was used to improve 3D models by supplying them as custom template inputs for tertiary and quaternary structure predictions.
According to the MolProbity score, 94% of the 3D models generated by ReFOLD were improved. As measured by the average change in lDDT, AF2 recycling showed an improvement rate of 87.5% (using multiple sequence alignments, MSAs) and 81.25% (using single sequences) for monomeric AF2 models, and 100% (MSA) and 97.8% (single sequence) for monomeric non-AF2 models. By the same measure, the recycling of multimeric models showed an improvement rate of as much as 80% for AF2-Multimer (AF2M) models and 94% for non-AF2M models.
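As a rough illustration of how an improvement rate based on the average change in a score could be computed, the sketch below counts a target as improved when the mean lDDT of its refined models exceeds the lDDT of the starting model. This per-target definition is an assumption, not a procedure stated in the abstract; the function name `improvement_rate`, the target IDs and all score values are hypothetical.

```python
from statistics import mean


def improvement_rate(baseline, refined):
    """Return the percentage of targets whose mean refined score beats the baseline.

    baseline: dict mapping target ID -> score of the starting model
    refined:  dict mapping target ID -> list of scores for the refined models

    Assumes a higher score is better (as with lDDT).
    """
    if not baseline:
        raise ValueError("no targets supplied")
    improved = sum(
        1
        for target, start in baseline.items()
        if mean(refined[target]) - start > 0  # positive average change
    )
    return 100.0 * improved / len(baseline)


# Hypothetical lDDT values, for illustration only (not data from the paper).
baseline = {"T1": 0.80, "T2": 0.65, "T3": 0.90, "T4": 0.72}
refined = {
    "T1": [0.82, 0.83],
    "T2": [0.66, 0.70],
    "T3": [0.89, 0.88],
    "T4": [0.75, 0.74],
}

rate = improvement_rate(baseline, refined)
print(f"{rate:.1f}% of targets improved")
```

For a score where lower is better, such as MolProbity, the comparison would need to be reversed.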
Refinement using AlphaFold2-Multimer recycling is available as part of the MultiFOLD Docker package (https://hub.docker.com/r/mcguffin/multifold). The ReFOLD server is available at https://www.reading.ac.uk/bioinf/ReFOLD/ and the modified scripts can be downloaded from https://www.reading.ac.uk/bioinf/downloads/.
Supplementary data are available at Bioinformatics Advances online.