Shepherd Adrian J, Gorse Denise, Thornton Janet M
Department of Biochemistry and Molecular Biology, University College London, London, United Kingdom.
Proteins. 2003 Feb 1;50(2):290-302. doi: 10.1002/prot.10290.
A novel method is presented for the prediction of protein architecture from sequence using neural networks. The method preprocesses protein sequence data by encoding it numerically and then applying a Fourier transform. The encoded and transformed data are then used to train a neural network to recognize a number of different protein architectures. The method proved significantly better than comparable alternative strategies such as percentage dipeptide frequency, but is still limited by the size of the data set and the input demands of a neural network. Its main potential is as a complement to existing fold recognition techniques; its greatest strength is its ability to identify global symmetries within protein structures.
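The preprocessing step described above (numerical encoding followed by a Fourier transform) can be illustrated with a minimal sketch. The abstract does not state which encoding, which transform variant, or which input layout the authors used, so everything here is an assumption: the Kyte-Doolittle hydropathy scale stands in for the numerical encoding, a naive discrete Fourier transform gives the power spectrum, and averaging the spectrum into a fixed number of bands is one possible way to meet a network's need for fixed-length input. The function names `encode`, `power_spectrum`, and `fixed_length_features` are hypothetical.

```python
import cmath
import math

# ASSUMPTION: Kyte-Doolittle hydropathy values stand in for the paper's
# (unspecified) numerical encoding of residues.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map each residue to a number; unknown residues map to 0.0."""
    return [HYDROPATHY.get(res, 0.0) for res in sequence.upper()]


def power_spectrum(signal):
    """Naive O(n^2) DFT power spectrum for frequencies 0..n//2."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def fixed_length_features(sequence, n_bins=8):
    """Average the non-DC spectrum into n_bins frequency bands.

    ASSUMPTION: band averaging is only one way to obtain a fixed-length
    vector from sequences of different lengths.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    signal = encode(sequence)
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]  # remove the constant offset
    spec = power_spectrum(centered)[1:]    # drop the zero-frequency term
    features = []
    for b in range(n_bins):
        lo = b * len(spec) // n_bins
        hi = (b + 1) * len(spec) // n_bins
        band = spec[lo:hi]
        features.append(sum(band) / len(band) if band else 0.0)
    return features
```

For a sequence with a strict two-residue repeat, such as `"IR" * 16`, the spectral power concentrates in the highest-frequency band, which shows how periodic patterns in a sequence can surface as features for a classifier. A production version would more likely use an FFT routine from a numerical library than this quadratic-time loop.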