Chen Yan, Shang Yi, Xu Dong
Proc Congr Evol Comput. 2014 Jul;2014:1038-1045. doi: 10.1109/CEC.2014.6900443.
Protein structure prediction, i.e., computationally predicting the three-dimensional structure of a protein from its primary sequence, is one of the most important and challenging problems in bioinformatics. Model refinement is a key step in the prediction process, in which improved structures are constructed from a pool of initially generated models. Since the refinement category was added to the biennial Critical Assessment of Structure Prediction (CASP) in 2008, CASP results have shown that existing model refinement methods struggle to improve model quality consistently. This paper presents three evolutionary algorithms for protein model refinement, in which multidimensional scaling (MDS), the MODELLER software, and a hybrid of both are used as crossover operators, respectively. The MDS-based method takes a purely geometrical approach and generates a child model by combining the contact maps of multiple parents. The MODELLER-based method takes a statistical and energy minimization approach, and uses the remodeling module of the MODELLER program to generate new models from multiple parents. The hybrid method first generates models using the MDS-based method and then runs them through the MODELLER-based method, aiming to combine the strengths of both. Promising results have been obtained in experiments using CASP datasets. The MDS-based method improved on the best of a pool of predicted models in terms of the global distance test score (GDT-TS) in 9 out of 16 test targets.
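The geometric idea behind an MDS-based crossover can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract speaks of combining contact maps, whereas the sketch below assumes, for simplicity, that each parent is an array of C-alpha coordinates, that the parents' pairwise distance matrices are combined by a weighted average, and that classical (Torgerson) MDS is used to embed the result in three dimensions. The function names `pairwise_distances`, `classical_mds`, and `mds_crossover` are hypothetical.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the (n, n) Euclidean distance matrix for (n, 3) coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(dist, dim=3):
    """Embed a distance matrix in `dim` dimensions via classical MDS."""
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the `dim` largest eigenpairs; clip small negatives caused by
    # averaged matrices that are not exactly Euclidean.
    idx = np.argsort(eigvals)[::-1][:dim]
    top_vals = np.clip(eigvals[idx], 0.0, None)
    return eigvecs[:, idx] * np.sqrt(top_vals)

def mds_crossover(parents, weights=None):
    """Assumed crossover: average parent distance matrices, then embed."""
    dists = np.stack([pairwise_distances(p) for p in parents])
    if weights is None:
        weights = np.full(len(parents), 1.0 / len(parents))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Weighted average over the parent axis gives one (n, n) target matrix.
    child_dist = np.tensordot(weights, dists, axes=1)
    return classical_mds(child_dist, dim=3)
```

Classical MDS recovers coordinates only up to rotation, translation, and reflection, so a child produced this way may be a mirror image of the intended fold. A practical pipeline would need a chirality check and a superposition step before scoring, which this sketch omits.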