Liu Jun, Zhao Kai-Long, He Guang-Xing, Wang Liu-Jing, Zhou Xiao-Gen, Zhang Gui-Jun
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China.
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA.
Bioinformatics. 2021 Dec 22;38(1):99-107. doi: 10.1093/bioinformatics/btab620.
Deep learning has greatly improved inter-residue contact and distance prediction, but the discrete conformational space formed by fragment assembly cannot satisfy the predicted distance constraints well, so the optimal solution of the continuous space may not be reached. An effective closed-loop continuous dihedral angle optimization strategy that complements discrete fragment assembly is therefore crucial to improving the performance of distance-assisted fragment assembly methods.
In this article, we propose IPTDFold, a de novo protein structure prediction method based on closed-loop iterative partition sampling, topology adjustment and residue-level distance deviation optimization. First, local dihedral angle crossover and mutation operators explore the conformational space extensively and exchange information between the conformations in the population. Then, a dihedral angle rotation model of the loop regions is constructed under partial inter-residue distance constraints, and a differential evolution algorithm finds rotation angles that satisfy the constraints, thereby adjusting the relative spatial positions of the secondary structures. Finally, the distance deviation of each residue is evaluated from the difference between the conformation and the predicted distances, and the dihedral angles of residues are optimized with biased probability. The final model is generated by iterating these three steps. IPTDFold was tested on 462 benchmark proteins, 24 FM targets of CASP13 and 20 FM targets of CASP14. The results show that IPTDFold is significantly superior to the distance-assisted fragment assembly method Rosetta_D (Rosetta with distance). In particular, the prediction accuracy of IPTDFold does not decrease as protein length increases. With the same FastRelax protocol, the prediction accuracy of IPTDFold is significantly superior to that of trRosetta without orientation constraints and is equivalent to that of the full version of trRosetta.
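The third step (residue-level distance deviation with biased selection) can be illustrated with a minimal sketch. The abstract does not give the deviation formula or the selection rule, so everything here is an assumption for illustration: deviation is taken as the mean absolute difference between model and predicted distances per residue, selection probability is taken as proportional to that deviation, and the function names `residue_deviation`, `selection_probabilities` and `pick_residue` are hypothetical, not part of the IPTDFold code.

```python
import numpy as np


def residue_deviation(coords, pred_dist, mask):
    """Per-residue mean absolute deviation from predicted distances.

    coords:    (N, 3) array of one representative atom per residue (assumed).
    pred_dist: (N, N) array of predicted inter-residue distances.
    mask:      (N, N) array, 1 where a predicted distance is available.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    model_dist = np.linalg.norm(diff, axis=-1)
    err = np.abs(model_dist - pred_dist) * mask
    # Avoid division by zero for residues with no constrained pairs.
    counts = np.maximum(mask.sum(axis=1), 1)
    return err.sum(axis=1) / counts


def selection_probabilities(deviation, eps=1e-9):
    """Assumed bias: probability proportional to deviation (eps avoids 0/0)."""
    weights = deviation + eps
    return weights / weights.sum()


def pick_residue(deviation, rng):
    """Sample the index of the residue whose dihedral angles to perturb next."""
    p = selection_probabilities(deviation)
    return int(rng.choice(len(p), p=p))
```

Under these assumptions, residues whose distances disagree most with the prediction are sampled most often, so the dihedral angle moves concentrate on the poorly satisfied regions of the conformation.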
The source code and executable are freely available at https://github.com/iobio-zjut/IPTDFold.
Supplementary data are available at Bioinformatics online.