Chen Chao, Zhou Xibin, Tian Yuanxin, Zou Xiaoyong, Cai Peixiang
School of Chemistry and Chemical Engineering, Sun Yat-Sen University, Guangzhou 510275, PR China.
Anal Biochem. 2006 Oct 1;357(1):116-21. doi: 10.1016/j.ab.2006.07.022. Epub 2006 Aug 7.
Because a priori knowledge of a protein's structural class provides useful information about its overall structure, determining that class is an important problem in protein science. However, with newly found protein sequences entering databanks at a rapidly increasing rate, determining structural class by experimental techniques alone is both time-consuming and expensive. A computational method that predicts protein structural class quickly and accurately is therefore needed. To address this challenge, this article presents a dual-layer support vector machine (SVM) fusion network characterized by its use of a different pseudo-amino acid composition (PseAA). The PseAA used here carries information about the sequence order of a protein and the distribution of hydrophobic amino acids along its chain. As a demonstration, the rigorous jackknife cross-validation test was performed on the two benchmark data sets constructed by Zhou. A significant improvement in success rates was observed, indicating that the approach may serve as a powerful complement to existing methods in this area.
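The abstract does not give the exact form of the PseAA used, so the following is only a minimal sketch of one common formulation: the 20 amino acid frequencies are extended with a few sequence-order correlation factors computed from a hydrophobicity scale. The choice of the Kyte-Doolittle scale, the function name `pseudo_aa_composition`, and the parameters `lam` and `weight` are assumptions made for illustration, not details taken from the article.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed scale; the article's own
# hydrophobicity encoding is not stated in the abstract).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardized_scale(scale):
    """Rescale a property to zero mean and unit standard deviation."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (value - mu) / sigma for aa, value in scale.items()}


def pseudo_aa_composition(seq, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition.

    The first 20 components reflect amino acid frequencies; the last
    `lam` components reflect sequence-order correlation at gaps 1..lam.
    """
    seq = seq.upper()
    if any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    h = standardized_scale(KYTE_DOOLITTLE)
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]

    # theta_k: mean squared hydrophobicity difference between residues
    # that are k positions apart along the chain.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(len(seq) - k)]
        thetas.append(sum(diffs) / len(diffs))

    # Normalize so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

A vector produced this way could be passed to any standard classifier, including an SVM. How the article's two SVM layers are arranged and fused is not described in the abstract and is not sketched here.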