Taguchi Y-h, Gromiha M Michael
Department of Physics, Faculty of Science and Technology, Chuo University, 1-13-27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan.
BMC Bioinformatics. 2007 Oct 22;8:404. doi: 10.1186/1471-2105-8-404.
Predicting the three-dimensional structure of a protein from its amino acid sequence is a long-standing goal in computational and molecular biology. Discriminating between structural classes and folding types is an intermediate step in protein structure prediction.
In this work, we propose a method based on linear discriminant analysis (LDA) for discriminating 30 different folding types of globular proteins using amino acid occurrence. The method was tested on a non-redundant set of 1612 proteins and discriminated them with an accuracy of 38%, which is comparable to or better than other methods in the literature. A web server that discriminates the folding type of a query protein from its amino acid sequence is available at http://granular.com/PROLDA/.
Amino acid occurrence has been successfully used to discriminate different folding types of globular proteins. The discrimination accuracy obtained with amino acid occurrence is better than that obtained with amino acid composition and/or amino acid properties. In addition, the method produces results very quickly.
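
The distinction between amino acid occurrence and amino acid composition can be illustrated with a minimal sketch. It assumes that "occurrence" means the raw count of each of the 20 standard residues in a sequence and that "composition" means those counts divided by the number of counted residues; the abstract does not spell out either definition. The function names and the example sequence are hypothetical, and the sketch covers only feature extraction, not the LDA step or the PROLDA server.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes) in a fixed order,
# so every sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def occurrence(sequence):
    """Raw count of each standard residue (assumed meaning of 'occurrence').

    Characters outside the 20 standard codes are ignored.
    """
    counts = Counter(sequence.upper())
    return [counts[aa] for aa in AMINO_ACIDS]


def composition(sequence):
    """Counts normalized by the number of counted residues
    (assumed meaning of 'composition')."""
    occ = occurrence(sequence)
    total = sum(occ)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [n / total for n in occ]


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # hypothetical example sequence
    print(dict(zip(AMINO_ACIDS, occurrence(seq))))
    print(dict(zip(AMINO_ACIDS, composition(seq))))
```

Under these assumptions, the occurrence vector keeps information about chain length that the composition vector discards. Either 20-dimensional vector could be passed to any standard LDA implementation as the feature representation of a protein.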