Hung Chun-Min, Huang Yueh-Min, Chang Ming-Shi
Department of Engineering Science, National Cheng Kung University, No.1, Ta-Hsueh Road, Tainan 701, Taiwan, ROC.
Department of Biochemistry, National Cheng Kung University, No.1, Ta-Hsueh Road, Tainan 701, Taiwan, ROC.
Nonlinear Anal Theory Methods Appl. 2006 Sep 1;65(5):1070-1093. doi: 10.1016/j.na.2005.09.048. Epub 2005 Nov 28.
A hybrid evolutionary model is used to propose a hierarchical homology of protein sequences so that protein functions can be identified systematically. The proposed model offers considerable potential, given the inconsistency of existing methods for predicting novel proteins. Because some novel proteins may align without meaningful conserved domains, maximizing the sequence alignment score is not the best criterion for predicting protein functions. This work presents a decision model that uses the hierarchical homologies to minimize the cost of a decision when predicting protein functions. In particular, the model has three characteristics: (i) it is a hybrid evolutionary model with multiple fitness functions that uses genetic programming to predict protein functions in a distantly related protein family; (ii) it incorporates modified robust point matching to compare all feature points accurately using the moment invariant and thin-plate spline theorems; and (iii) the hierarchical homologies that support a novel protein sequence, expressed as a causal tree, can effectively show the relationships between proteins. This work describes comparisons, made with the model, of nucleocapsid proteins from the putative polyprotein SARS virus and from other coronaviruses in other hosts.
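The contrast the abstract draws between maximizing an alignment score and minimizing the cost of a decision can be illustrated with a generic minimum-expected-cost rule. This is only a sketch under stated assumptions, not the authors' model: the abstract does not give the cost function, so the function name `min_cost_label`, the candidate labels, the probabilities, and the cost values below are all hypothetical.

```python
def min_cost_label(posteriors, cost):
    """Return the candidate label with the lowest expected decision cost.

    posteriors: dict mapping each candidate function label to the
                (assumed) probability that it is the true function.
    cost:       dict mapping (decided_label, true_label) to the cost of
                deciding `decided_label` when `true_label` is correct.
    """
    def expected_cost(decided):
        # Weight the cost of each possible truth by its probability.
        return sum(p * cost[(decided, true)] for true, p in posteriors.items())

    return min(posteriors, key=expected_cost)


# Hypothetical numbers, chosen only to show that the two criteria can disagree.
posteriors = {"nucleocapsid": 0.6, "other": 0.4}
cost = {
    ("nucleocapsid", "nucleocapsid"): 0.0,
    ("nucleocapsid", "other"): 5.0,   # a false assignment is assumed expensive
    ("other", "nucleocapsid"): 1.0,   # a missed assignment is assumed cheaper
    ("other", "other"): 0.0,
}

best_by_score = max(posteriors, key=posteriors.get)  # highest-score criterion
best_by_cost = min_cost_label(posteriors, cost)      # minimum-cost criterion
print(best_by_score, best_by_cost)
```

With these illustrative values the highest-probability label and the lowest-cost label differ, which is the kind of situation in which a cost-based criterion and a score-based criterion lead to different predictions.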