Laboratory of DNA Information Analysis, University of Tokyo, Minato-ku, Tokyo, Japan.
BMC Bioinformatics. 2013 Jul 24;14:233. doi: 10.1186/1471-2105-14-233.
Assigning a protein to one of its folds is a transitional step in discovering three-dimensional protein structure, which is a challenging task in biomolecular (biological) science. Current research focuses on: 1) the development of classifiers, and 2) the development of feature extraction techniques based on syntactic and/or physicochemical properties.
Apart from the above two main categories of research, we show that the selection of physicochemical attributes of the amino acids is an important step in protein fold recognition that has not been explored adequately. We present a multi-dimensional successive feature selection (MD-SFS) approach to select attributes systematically. The proposed method is applied to protein sequence data, and an improvement of around 24% in fold recognition is observed when attributes are selected appropriately.
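As an illustration of what selecting attributes successively can look like, the sketch below shows a generic greedy forward selection loop. It is not the MD-SFS procedure itself, whose details are not given here; the function `forward_select`, the attribute names, and the `TOY_GAIN` values are hypothetical and made up for the example. In practice the scoring function would presumably be something like the cross-validated accuracy of a fold classifier trained on features derived from the chosen attributes.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    max_features: int,
) -> Tuple[List[str], float]:
    """Greedy forward selection (illustrative, not the authors' MD-SFS).

    At each round, try adding every remaining attribute to the current
    subset and keep the one that gives the highest score. Stop when no
    addition improves the score or max_features is reached.
    """
    selected: List[str] = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining and len(selected) < max_features:
        # Score each remaining attribute when added to the current subset.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(trials, key=lambda t: t[0])
        if top_score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best_score = top_score
    return selected, best_score


# Hypothetical per-attribute contributions, invented for this example only.
TOY_GAIN = {
    "hydrophobicity": 0.30,
    "polarity": 0.20,
    "volume": 0.05,
    "charge": -0.10,
}


def toy_score(subset: List[str]) -> float:
    """Stand-in for a real evaluation such as cross-validated accuracy."""
    return sum(TOY_GAIN[a] for a in subset)


if __name__ == "__main__":
    chosen, value = forward_select(list(TOY_GAIN), toy_score, max_features=4)
    print(chosen, round(value, 2))
```

With the toy scores, the loop stops before adding the attribute that would lower the score, which is the behaviour that makes the choice of attributes matter for the final recognition rate.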
MD-SFS has been applied successfully to select physicochemical attributes of the amino acids. The selected attributes yield improved protein fold recognition performance.