Thachuk Chris, Shmygelska Alena, Hoos Holger H
Department of Computer Science, University of British Columbia, BC, V6T 1Z4, Canada.
BMC Bioinformatics. 2007 Sep 17;8:342. doi: 10.1186/1471-2105-8-342.
The ab initio protein folding problem consists of predicting protein tertiary structure from a given amino acid sequence by minimizing an energy function; it is one of the most important and challenging problems in biochemistry, molecular biology and biophysics. The ab initio protein folding problem is computationally challenging and has been shown to be NuRho -hard even when conformations are restricted to a lattice. In this work, we implement and evaluate the replica exchange Monte Carlo (REMC) method, which has already been applied very successfully to more complex protein models and other optimization problems with complex energy landscapes, in combination with the highly effective pull move neighbourhood in two widely studied Hydrophobic Polar (HP) lattice models.
We demonstrate that REMC is highly effective for solving instances of the square (2D)and cubic (3D) HP protein folding problem. When using the pull move neighbourhood, REMCoutperforms current state-of-the-art algorithms for most benchmark instances. Additionally, we show that this new algorithm provides a larger ensemble of ground-state structures than the existing state-of-the-art methods. Furthermore, it scales well with sequence length, and it finds significantly better conformations on long biological sequences and sequences with a provably unique ground-state structure, which is believed to be a characteristic of real proteins. We also present evidence that our REMC algorithm can fold sequences which exhibit significant interaction between termini in the hydrophobic core relatively easily.
We demonstrate that REMC utilizing the pull move neighbourhood significantly outperforms current state-of-the-art methods for protein structure prediction in the HP model on 2D and 3D lattices. This is particularly noteworthy, since so far, the state-of-the-art methods for2D and 3D HP protein folding - in particular, the pruned-enriched Rosenbluth method (PERM) and,to some extent, Ant Colony Optimisation (ACO) - were based on chain growth mechanisms. To the best of our knowledge, this is the first application of REMC to HP protein folding on the cubic lattice, and the first extension of the pull move neighbourhood to a 3D lattice.
从头算蛋白质折叠问题是指通过最小化能量函数,从给定的氨基酸序列预测蛋白质三级结构;它是生物化学、分子生物学和生物物理学中最重要且最具挑战性的问题之一。从头算蛋白质折叠问题在计算上具有挑战性,并且已被证明即使构象限制在晶格上也是NP难的。在这项工作中,我们实现并评估了复制交换蒙特卡罗(REMC)方法,该方法已非常成功地应用于更复杂的蛋白质模型和其他具有复杂能量景观的优化问题,并结合了两种广泛研究的疏水-极性(HP)晶格模型中的高效拉动移动邻域。
我们证明REMC对于解决二维(2D)和三维(3D)HP蛋白质折叠问题的实例非常有效。当使用拉动移动邻域时,对于大多数基准实例,REMC的性能优于当前的最先进算法。此外,我们表明这种新算法比现有的最先进方法提供了更大的基态结构集合。此外,它与序列长度具有良好的扩展性,并且在长生物序列和具有可证明唯一基态结构的序列上能找到明显更好的构象,这被认为是真实蛋白质的一个特征。我们还提供证据表明,我们的REMC算法可以相对轻松地折叠在疏水核心中末端之间表现出显著相互作用的序列。
我们证明,在二维和三维晶格上的HP模型中,利用拉动移动邻域的REMC在蛋白质结构预测方面显著优于当前的最先进方法。这一点特别值得注意,因为到目前为止,二维和三维HP蛋白质折叠的最先进方法——特别是剪枝富集罗森布鲁斯方法(PERM)以及在某种程度上的蚁群优化(ACO)——都是基于链增长机制的。据我们所知,这是REMC首次应用于立方晶格上的HP蛋白质折叠,也是拉动移动邻域首次扩展到三维晶格。