Xiao X, Shao S, Ding Y, Huang Z, Chen X, Chou K-C
Bio-Informatics Research Center, Donghua University, Shanghai, China.
Amino Acids. 2005 Feb;28(1):29-35. doi: 10.1007/s00726-004-0154-9. Epub 2005 Feb 10.
A novel approach to visualizing biological sequences is developed based on cellular automata (Wolfram, S. Nature 1984, 311, 419-424), a class of discrete dynamical systems in which both space and time are discrete. By transforming the symbolic sequence codes into digital codes and applying some optimal space-time evolution rules of cellular automata, a biological sequence can be represented by a unique image, the so-called cellular automata image. Many important features that are originally hidden in a long and complicated biological sequence can be clearly revealed through its cellular automata image. With the number of biological sequences entering databanks rapidly increasing in the post-genomic era, it is anticipated that the cellular automata image will become a very useful vehicle for investigating their key features, identifying their function, and revealing their "fingerprint". It is also anticipated that, by using the concept of pseudo amino acid composition (Chou, K.C. Proteins: Structure, Function, and Genetics, 2001, 43, 246-255), the cellular automata image approach can be used to improve the quality of predicting protein attributes, such as structural class and subcellular location.
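The two steps named in the abstract (symbol-to-digit coding, then evolution under a cellular automaton rule) can be illustrated with a minimal Python sketch. The abstract does not state the digital codes or the evolution rules actually used, so everything specific here is an assumption made for illustration: the 2-bit nucleotide table `NUCLEOTIDE_BITS`, the choice of elementary rule 90, the periodic boundary, and the function names `encode`, `step`, and `ca_image`.

```python
# Hypothetical 2-bit code per nucleotide; the abstract does not give
# the digital coding used in the paper.
NUCLEOTIDE_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def encode(sequence):
    """Turn a symbolic sequence into a flat list of 0/1 cells."""
    bits = []
    for symbol in sequence.upper():
        bits.extend(NUCLEOTIDE_BITS[symbol])
    return bits


def step(cells, rule):
    """Apply one elementary cellular automaton update (periodic boundary).

    `rule` is a Wolfram rule number in 0..255; bit k of the number gives
    the new state for the neighbourhood whose three cells spell k in binary.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> neighbourhood) & 1)
    return new_cells


def ca_image(sequence, rule=90, steps=16):
    """Stack successive generations into rows: space across, time down."""
    rows = [encode(sequence)]
    for _ in range(steps):
        rows.append(step(rows[-1], rule))
    return rows


if __name__ == "__main__":
    # Render each generation as one line of text; '#' marks a live cell.
    for row in ca_image("ACGTTGCA", rule=90, steps=8):
        print("".join("#" if cell else "." for cell in row))
```

Each row of the result is one time step, so the stacked rows form a two-dimensional image determined entirely by the input sequence and the chosen rule. A protein sequence would need a wider code per residue (at least 5 bits for 20 amino acids), which this sketch does not attempt.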