Shmygelska Alena, Hoos Holger H
Department of Computer Science, University of British Columbia, Vancouver, Canada.
BMC Bioinformatics. 2005 Feb 14;6:30. doi: 10.1186/1471-2105-6-30.
The protein folding problem is a fundamental problems in computational molecular biology and biochemical physics. Various optimisation methods have been applied to formulations of the ab-initio folding problem that are based on reduced models of protein structure, including Monte Carlo methods, Evolutionary Algorithms, Tabu Search and hybrid approaches. In our work, we have introduced an ant colony optimisation (ACO) algorithm to address the non-deterministic polynomial-time hard (NP-hard) combinatorial problem of predicting a protein's conformation from its amino acid sequence under a widely studied, conceptually simple model - the 2-dimensional (2D) and 3-dimensional (3D) hydrophobic-polar (HP) model.
We present an improvement of our previous ACO algorithm for the 2D HP model and its extension to the 3D HP model. We show that this new algorithm, dubbed ACO-HPPFP-3, performs better than previous state-of-the-art algorithms on sequences whose native conformations do not contain structural nuclei (parts of the native fold that predominantly consist of local interactions) at the ends, but rather in the middle of the sequence, and that it generally finds a more diverse set of native conformations.
The application of ACO to this bioinformatics problem compares favourably with specialised, state-of-the-art methods for the 2D and 3D HP protein folding problem; our empirical results indicate that our rather simple ACO algorithm scales worse with sequence length but usually finds a more diverse ensemble of native states. Therefore the development of ACO algorithms for more complex and realistic models of protein structure holds significant promise.
蛋白质折叠问题是计算分子生物学和生物化学物理学中的一个基本问题。各种优化方法已应用于基于蛋白质结构简化模型的从头折叠问题的公式化,包括蒙特卡罗方法、进化算法、禁忌搜索和混合方法。在我们的工作中,我们引入了一种蚁群优化(ACO)算法,以解决在一个广泛研究且概念简单的模型——二维(2D)和三维(3D)疏水-极性(HP)模型下,从氨基酸序列预测蛋白质构象的非确定性多项式时间难(NP-hard)组合问题。
我们展示了对先前用于二维HP模型的ACO算法的改进及其向三维HP模型的扩展。我们表明,这种名为ACO-HPPFP-3的新算法,在天然构象在序列末端不包含结构核(主要由局部相互作用组成的天然折叠部分)而是在序列中间的序列上,比先前的最先进算法表现更好,并且它通常能找到更多样化的天然构象集。
将ACO应用于这个生物信息学问题与用于二维和三维HP蛋白质折叠问题的专门的、最先进的方法相比具有优势;我们的实证结果表明,我们相当简单的ACO算法随序列长度的扩展性较差,但通常能找到更多样化的天然状态集合。因此,为更复杂和现实的蛋白质结构模型开发ACO算法具有重大前景。