Li Z R, Lin H H, Han L Y, Jiang L, Chen X, Chen Y Z
Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Blk SOC1, Level 7, 3 Science Drive 2, Singapore 117543.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W32-7. doi: 10.1093/nar/gkl305.
Sequence-derived structural and physicochemical features are frequently used to develop statistical learning models that predict proteins and peptides with different structural, functional and interaction profiles. PROFEAT (Protein Features) is a web server for computing commonly used structural and physicochemical features of proteins and peptides from their amino acid sequences. It computes six feature groups comprising ten features, which together include 51 descriptors and 1447 descriptor values. The computed features include amino acid composition, dipeptide composition, normalized Moreau-Broto autocorrelation, Moran autocorrelation, Geary autocorrelation, sequence-order-coupling number, quasi-sequence-order descriptors, and the composition, transition and distribution of various structural and physicochemical properties. In addition, it can compute the autocorrelation descriptors listed above from user-defined properties. Our computational algorithms were extensively tested, and the computed protein features have been used in a number of published works for predicting protein functional classes, protein-protein interactions and MHC-binding peptides. PROFEAT is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/prof/prof.cgi.
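The two simplest features named in the abstract, amino acid composition and dipeptide composition, can be illustrated with a minimal Python sketch. This is not PROFEAT's code. The function names are hypothetical, and reporting values as fractions (rather than percentages) and skipping non-standard residues are assumptions made here for illustration.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values).

    Assumption: non-standard symbols (e.g. X, B, Z) are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Return the fraction of each ordered residue pair in seq (400 values).

    Assumption: overlapping pairs are counted, and any pair containing a
    non-standard symbol is ignored.
    """
    seq = seq.upper()
    valid = {a + b for a, b in product(AMINO_ACIDS, repeat=2)}
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(p for p in pairs if p in valid)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return {p: counts[p] / total for p in sorted(valid)}


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # arbitrary illustrative peptide, not from the paper
    aac = amino_acid_composition(example)
    dpc = dipeptide_composition(example)
    print(len(aac), len(dpc))
```

Together these two functions yield 20 + 400 = 420 values per sequence. The autocorrelation, sequence-order and composition/transition/distribution descriptors additionally depend on residue property tables that are not given in this abstract.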