Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, M13 9PL, United Kingdom.
Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom.
Sci Rep. 2018 Sep 12;8(1):13694. doi: 10.1038/s41598-018-31891-8.
Difficulty in sampling large and complex conformational spaces remains a key limitation in fragment-based de novo prediction of protein structure. Our previous work has shown that even for small-to-medium-sized proteins, some current methods inadequately sample alternative structures. We have developed two new conformational sampling techniques, one employing a bilevel optimisation framework and the other employing iterated local search. We combine strategies of forced structural perturbation (where some fragment insertions are accepted regardless of their impact on scores) and greedy local optimisation, allowing greater exploration of the available conformational space. Comparisons against the Rosetta Abinitio method indicate that our protocols more frequently generate native-like predictions for many targets, even following the low-resolution phase, using a given set of fragment libraries. By contrasting results across two different fragment sets, we show that our methods are able to better take advantage of high-quality fragments. These improvements can also translate into more reliable identification of near-native structures in a simple clustering-based model selection procedure. We show that when fragment libraries are sufficiently well-constructed, improved breadth of exploration within runs improves prediction accuracy. Our results also suggest that in benchmarking scenarios, a total exclusion of fragments drawn from homologous templates can make performance differences between methods appear less pronounced.
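The combination of forced perturbation and greedy local optimisation described in the abstract can be illustrated with a generic iterated local search loop. The sketch below is a toy under stated assumptions, not the authors' protocol: a conformation is a list of integers standing in for fragment choices, and `toy_score`, `perturb`, `greedy_local_search`, and all parameter names and defaults are hypothetical. Nothing here uses Rosetta or real fragment libraries.

```python
import random
from typing import Callable, List, Tuple

Conformation = List[int]  # assumption: one integer "fragment choice" per position


def toy_score(conf: Conformation) -> int:
    """Hypothetical stand-in for an energy function; lower is better."""
    per_site = sum((x - 5) ** 2 for x in conf)
    # Neighbour coupling so that positions are not scored independently.
    coupling = sum(abs(a - b) % 4 for a, b in zip(conf, conf[1:]))
    return per_site + coupling


def greedy_local_search(conf: Conformation, score: Callable[[Conformation], int],
                        n_choices: int, rng: random.Random,
                        n_trials: int) -> Tuple[Conformation, int]:
    """Try random single-position replacements; keep only non-worsening ones."""
    best, best_s = list(conf), score(conf)
    for _ in range(n_trials):
        cand = list(best)
        cand[rng.randrange(len(cand))] = rng.randrange(n_choices)
        cand_s = score(cand)
        if cand_s <= best_s:
            best, best_s = cand, cand_s
    return best, best_s


def perturb(conf: Conformation, n_choices: int, rng: random.Random,
            n_forced: int) -> Conformation:
    """Forced perturbation: apply replacements regardless of their score impact."""
    out = list(conf)
    for _ in range(n_forced):
        out[rng.randrange(len(out))] = rng.randrange(n_choices)
    return out


def iterated_local_search(length: int, n_choices: int,
                          score: Callable[[Conformation], int],
                          n_iters: int = 50, n_forced: int = 3,
                          n_trials: int = 200,
                          seed: int = 0) -> Tuple[Conformation, int]:
    """Alternate forced perturbation and greedy refinement; return the best seen."""
    rng = random.Random(seed)
    start = [rng.randrange(n_choices) for _ in range(length)]
    current, current_s = greedy_local_search(start, score, n_choices, rng, n_trials)
    best, best_s = list(current), current_s
    for _ in range(n_iters):
        cand = perturb(current, n_choices, rng, n_forced)
        cand, cand_s = greedy_local_search(cand, score, n_choices, rng, n_trials)
        # Assumed acceptance rule: always continue from the new local optimum,
        # and track the best conformation separately.
        current, current_s = cand, cand_s
        if cand_s < best_s:
            best, best_s = list(cand), cand_s
    return best, best_s


if __name__ == "__main__":
    best, best_s = iterated_local_search(length=20, n_choices=10, score=toy_score)
    print(best_s, best)
```

Continuing from every new local optimum, rather than only from improved ones, is one simple way to favour breadth of exploration within a run; other acceptance rules are equally valid in an iterated local search and may be closer to what the paper uses.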