Wang Junlin, Wang Wenbo, Shang Yi
IEEE/ACM Trans Comput Biol Bioinform. 2023 Sep-Oct;20(5):3306-3313. doi: 10.1109/TCBB.2023.3264899. Epub 2023 Oct 9.
The functions of proteins are largely determined by their three-dimensional (3D) structures. Loop modeling tries to predict the conformation of a relatively short stretch of the protein backbone and side chains. It is a difficult problem because of conformational variability. Recently, AlphaFold2 has achieved outstanding results in 3D protein structure prediction and is expected to perform well on loop modeling. In this paper, we investigate the performance of AlphaFold2 variants on popular loop modeling benchmark datasets and propose an efficient protocol for using AlphaFold2 for loop modeling, called IAFLoop. To predict the structure of a loop region, IAFLoop gives a moderately extended segment of the target loop region as input to AlphaFold2, runs a fast version of AlphaFold2 using a reduced database without ensembling, and uses RMSD-based consensus scores to select the final output models. Our experimental results on benchmark datasets show that IAFLoop generates highly accurate loop models. It achieves performance comparable to the original application of AlphaFold2 in terms of RMSD error, and much better results on some targets, while using only half the time. Compared to the best previous methods, IAFLoop reduces the RMSD error by almost half on the 8-residue loop dataset and by more than 70% on the 12-residue loop dataset.
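The abstract names two steps that are easy to illustrate: extending the target loop segment before prediction, and picking the final model by an RMSD-based consensus score. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names, the `flank` parameter, and the scoring rule (mean pairwise RMSD to the other candidates, lowest wins) are assumptions; the abstract does not specify the extension length or the exact consensus formula. The sketch also assumes all candidate models list the same loop atoms in the same order and are already superimposed in a common frame.

```python
import numpy as np


def extend_segment(loop_start, loop_end, flank, seq_len):
    """Widen a loop by `flank` residues on each side, clipped to the chain.

    Indices are 0-based and inclusive. `flank` is a hypothetical parameter;
    the abstract only says the segment is "moderately extended".
    """
    return max(0, loop_start - flank), min(seq_len - 1, loop_end + flank)


def pairwise_rmsd(models):
    """Return the (n_models, n_models) matrix of RMSDs between models.

    `models` has shape (n_models, n_atoms, 3). No superposition is done
    here: the coordinates are assumed to share one reference frame.
    """
    models = np.asarray(models, dtype=float)
    diff = models[:, None, :, :] - models[None, :, :, :]
    # Squared distance per atom, averaged over atoms, then square root.
    return np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))


def consensus_scores(models):
    """Mean RMSD of each model to all other models (assumed scoring rule)."""
    rmsd = pairwise_rmsd(models)
    n = rmsd.shape[0]
    if n < 2:
        raise ValueError("need at least two candidate models")
    # The diagonal is zero, so the row sum covers only the other n - 1 models.
    return rmsd.sum(axis=1) / (n - 1)


def select_by_consensus(models):
    """Index of the model closest, on average, to the other candidates."""
    return int(np.argmin(consensus_scores(models)))
```

Calling `select_by_consensus` on a stack of candidate loop coordinates returns the index of the candidate with the lowest average RMSD to the rest, so an outlier prediction far from the others will not be chosen.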