Liu Jingfa, Song Beibei, Yao Yonglei, Xue Yu, Liu Wenjie, Liu Zhaoxia
Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing, 210044, China and School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing, 210044, China and Network Information Center, Nanjing University of Information Science & Technology, Nanjing 210044, China.
Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing, 210044, China and School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing, 210044, China.
Phys Rev E Stat Nonlin Soft Matter Phys. 2014 Oct;90(4):042715. doi: 10.1103/PhysRevE.90.042715. Epub 2014 Oct 15.
Finding the global minimum-energy structure is one of the main problems of protein structure prediction. The face-centered-cubic (fcc) hydrophobic-hydrophilic (HP) lattice model can achieve high approximation ratios for real protein structures, which makes it a good choice for predicting protein structures. The lack of an effective global optimization method is the key obstacle to solving this problem. The Wang-Landau sampling method is especially useful for complex systems with a rough energy landscape and has been applied successfully to many optimization problems. We predict protein structures on the fcc HP lattice model with an improved Wang-Landau (IWL) sampling method, which adds two components to Wang-Landau sampling: generation of the initial conformation by a greedy strategy, and a neighborhood strategy based on pull moves. Unlike conventional Monte Carlo simulations, which generate a probability distribution at a given temperature, the Wang-Landau sampling method estimates the density of states accurately via a random walk that produces a flat histogram in energy space. We test 12 general benchmark instances on both two-dimensional and three-dimensional (3D) fcc HP lattice models. For all instances, the lowest energies found by the IWL sampling method are as good as or better than those of other methods in the literature. We then test five sets of larger-scale instances, denoted the S, R, F90, F180, and CASP target instances, on the 3D fcc HP lattice model. The numerical results show that our algorithm outperforms five other methods from the literature in both the lowest energy and the average lowest energy over all runs. The IWL sampling method proves to be a powerful tool for studying structure prediction of proteins on the fcc HP lattice model.
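The flat-histogram random walk described above can be illustrated with a minimal sketch of plain Wang-Landau sampling. This is not the authors' IWL method: it omits the greedy initial conformation and the pull moves, and it runs on a toy system (assumed here purely for illustration) of n independent two-state sites whose energy is the number of "up" sites, so that the exact density of states is the binomial coefficient C(n, E). The function name `wang_landau` and all of its parameters are hypothetical.

```python
import math
import random


def wang_landau(n_sites=8, ln_f_final=1e-3, flatness=0.8, seed=0):
    """Estimate ln g(E) for a toy system: E = number of 'up' sites."""
    rng = random.Random(seed)
    state = [0] * n_sites            # all sites down, so E = 0
    energy = 0
    ln_g = [0.0] * (n_sites + 1)     # running estimate of ln g(E)
    hist = [0] * (n_sites + 1)       # visits per energy in this stage
    ln_f = 1.0                       # logarithm of the modification factor
    steps = 0
    while ln_f > ln_f_final:
        # Propose flipping one randomly chosen site.
        i = rng.randrange(n_sites)
        new_energy = energy + (1 - 2 * state[i])
        # Accept with probability min(1, g(E_old) / g(E_new)).
        delta = ln_g[energy] - ln_g[new_energy]
        if delta >= 0 or rng.random() < math.exp(delta):
            state[i] ^= 1
            energy = new_energy
        # Update the current energy level whether or not the move was accepted.
        ln_g[energy] += ln_f
        hist[energy] += 1
        steps += 1
        # When the histogram is flat enough, refine ln f and start a new stage.
        if steps % 1000 == 0 and min(hist) >= flatness * sum(hist) / len(hist):
            ln_f /= 2.0
            hist = [0] * len(hist)
    # Normalise so that ln g(0) = 0, since exactly one state has E = 0.
    return [value - ln_g[0] for value in ln_g]


if __name__ == "__main__":
    estimate = wang_landau()
    for e, value in enumerate(estimate):
        exact = math.log(math.comb(8, e))
        print(f"E={e}  ln g estimated={value:.3f}  exact={exact:.3f}")
```

In an HP lattice setting the state would instead be a chain conformation, the energy would count contacts between non-bonded hydrophobic residues, and the proposal step would be a conformational move such as a pull move; none of those parts are shown here.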