Peyravi Farzad, Latif Alimohammad, Moshtaghioun Seyed Mohammad
* Department of Computer Engineering, Yazd University, Yazd, Iran.
† Department of Biology, Yazd University, Yazd, Iran.
J Bioinform Comput Biol. 2019 Apr;17(2):1950007. doi: 10.1142/S0219720019500070.
Predicting a protein's structure from its amino acid sequence is one of the most prominent problems in computational biology. The biological function of a protein depends on its tertiary structure, which is determined by its amino acid sequence through the process of protein folding. We propose a novel fold recognition method for protein tertiary structure prediction based on a hidden Markov model (HMM) and the 3D coordinates of amino acid residues. The method defines its states from the basis vectors of Bravais cubic lattices in order to learn the path traced by the amino acids of the proteins in each fold. Three hidden Markov models are considered, based on the simple cubic, body-centered cubic (BCC) and face-centered cubic (FCC) lattices. A 10-fold cross-validation was performed on a SCOP dataset covering 42 folds. The proposed composite methodology is compared with HMM-based fold recognition methods that rely only on amino acid sequence or only on secondary structure. In the overall experiment, the model based on the face-centered cubic lattice is more accurate than SAM, optimized 3-HMM and optimized Markov chain. The large amount of information available in 3D space helps the model outperform methods that use only primary structure or only secondary structure.
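The lattice-based states described above can be illustrated with a minimal sketch. The abstract does not give the exact state definition, so this is one plausible reading rather than the authors' implementation: each step between consecutive residue coordinates is mapped to the nearest-neighbour direction of the chosen cubic lattice that it most closely aligns with, which turns a 3D path into a discrete symbol sequence that an HMM could be trained on. The function names `lattice_directions` and `encode_path` and the sample coordinates are hypothetical.

```python
from itertools import product
import math


def lattice_directions(kind):
    """Nearest-neighbour direction vectors of a cubic Bravais lattice.

    Assumption: states correspond to these directions (6 for simple
    cubic, 8 for BCC, 12 for FCC). The abstract does not confirm this.
    """
    grid = product((-1, 0, 1), repeat=3)
    if kind == "SC":    # axis-aligned steps such as (1, 0, 0)
        return [v for v in grid if sum(abs(c) for c in v) == 1]
    if kind == "BCC":   # cube-diagonal steps such as (1, 1, 1)
        return list(product((-1, 1), repeat=3))
    if kind == "FCC":   # face-diagonal steps such as (1, 1, 0)
        return [v for v in grid if sum(abs(c) for c in v) == 2]
    raise ValueError(f"unknown lattice kind: {kind}")


def encode_path(coords, kind):
    """Map each step between consecutive 3D points to a direction index."""
    dirs = lattice_directions(kind)
    # Normalise directions so the dot product compares angles only.
    units = []
    for d in dirs:
        norm = math.sqrt(sum(c * c for c in d))
        units.append(tuple(c / norm for c in d))

    states = []
    for a, b in zip(coords, coords[1:]):
        step = tuple(q - p for p, q in zip(a, b))
        # The largest dot product marks the smallest angle to the step.
        best = max(
            range(len(units)),
            key=lambda i: sum(s * u for s, u in zip(step, units[i])),
        )
        states.append(best)
    return states


# Hypothetical coordinates of three consecutive residues (in angstroms).
path = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
sc_states = encode_path(path, "SC")
fcc_states = encode_path(path, "FCC")
```

Under this reading, the three lattices differ only in how finely they discretise direction, and the resulting index sequences would serve as training sequences for one HMM per fold. Training and scoring the models are outside the scope of this sketch.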