Xiao Xuan, Wang Pu, Chou Kuo-Chen
Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 33300, China.
J Theor Biol. 2008 Oct 7;254(3):691-6. doi: 10.1016/j.jtbi.2008.06.016. Epub 2008 Jun 24.
A novel approach was developed for predicting the structural classes of proteins based on their sequences. It was assumed that proteins belonging to the same structural class must bear some sort of similar texture on the images generated by the cellular automaton evolving rule [Wolfram, S., 1984. Cellular automata as models of complexity. Nature 311, 419-424]. Based on this, two geometric invariant moment factors derived from the image functions were used as the pseudo amino acid components [Chou, K.C., 2001. Prediction of protein cellular attributes using pseudo amino acid composition. Proteins: Struct., Funct., Genet. (Erratum: ibid., 2001, vol. 44, 60) 43, 246-255] to formulate the protein samples for statistical prediction. The success rates thus obtained on a previously constructed benchmark dataset are quite promising, implying that the cellular automaton image can help to reveal some inherent and subtle features deeply hidden in a pile of long and complicated amino acid sequences.
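The pipeline outlined in the abstract (sequence, then cellular automaton image, then two invariant moment factors appended to the amino acid composition) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the residue encoding, the automaton rule, the number of evolution steps, or which two invariants were used. Here the 5-bit index encoding, Rule 184, 50 steps, periodic boundaries, Hu's first two moment invariants, and the `weight` parameter are all assumptions chosen for illustration, and every function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode(sequence):
    """Map each residue to a 5-bit code (its index in AMINO_ACIDS).

    Assumption: this index-based code is for illustration only.
    """
    bits = []
    for residue in sequence:
        index = AMINO_ACIDS.index(residue)
        bits.extend((index >> shift) & 1 for shift in range(4, -1, -1))
    return bits


def evolve(row, rule=184, steps=50):
    """Evolve an elementary cellular automaton with periodic boundaries.

    Returns the image as a list of rows; row 0 is the encoded sequence.
    Assumption: the rule number and step count are placeholders.
    """
    image = [list(row)]
    n = len(row)
    for _ in range(steps):
        prev = image[-1]
        image.append([
            (rule >> (4 * prev[(i - 1) % n] + 2 * prev[i] + prev[(i + 1) % n])) & 1
            for i in range(n)
        ])
    return image


def invariant_moments(image):
    """Return Hu's first two moment invariants of a binary image.

    Assumption: these stand in for the 'two geometric invariant moment factors'.
    """
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        return 0.0, 0.0
    x_bar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    y_bar = sum(y * v for y, row in enumerate(image) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalised by m00 ** (1 + (p + q) / 2).
        mu = sum(
            (x - x_bar) ** p * (y - y_bar) ** q * v
            for y, row in enumerate(image)
            for x, v in enumerate(row)
        )
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2


def pseudo_aac(sequence, weight=1.0, rule=184, steps=50):
    """Build a 22-component vector: 20 residue frequencies plus 2 moment factors."""
    freqs = [sequence.count(a) / len(sequence) for a in AMINO_ACIDS]
    extras = [weight * phi for phi in invariant_moments(evolve(encode(sequence), rule, steps))]
    norm = sum(freqs) + sum(extras)
    return [value / norm for value in freqs + extras]
```

Calling `pseudo_aac("MKTAYIAKQR")` on a made-up ten-residue sequence returns a 22-component vector whose entries sum to one. Vectors of this kind would then be passed to whatever statistical classifier is used for the structural class prediction, a step the abstract does not describe.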