

CyclinPred: a SVM-based method for predicting cyclin protein sequences.

Author information

Kalita Mridul K, Nandal Umesh K, Pattnaik Ansuman, Sivalingam Anandhan, Ramasamy Gowthaman, Kumar Manish, Raghava Gajendra P S, Gupta Dinesh

Affiliation

Structural and Computational Biology Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi, India.

Publication information

PLoS One. 2008 Jul 2;3(7):e2605. doi: 10.1371/journal.pone.0002605.

Abstract

Functional annotation of protein sequences with low similarity to well-characterized protein sequences is a major challenge of computational biology in the post-genomic era. The cyclin protein family is one such important family: its members share low sequence similarity, which makes discovering novel cyclins and establishing orthologous relationships among them difficult. The currently identified cyclin motifs and cyclin-associated domains do not represent all of the identified and characterized cyclin sequences. We describe a Support Vector Machine (SVM) based classifier, CyclinPred, which can predict cyclin sequences with high efficiency. The SVM classifier was trained with features of selected cyclin and non-cyclin protein sequences. The training features include amino acid composition, dipeptide composition, secondary structure composition and PSI-BLAST-generated Position Specific Scoring Matrix (PSSM) profiles. Results from leave-one-out cross-validation (the jackknife test), self-consistency and holdout tests show that the SVM classifier trained with PSSM profile features was more accurate than classifiers based on any of the other features alone or on hybrids of these features. A cyclin prediction server, CyclinPred, has been set up based on the SVM model trained with PSSM profiles. The CyclinPred prediction results show that the method may be used as a cyclin prediction tool, complementing conventional cyclin prediction methods.
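
Two of the sequence features named in the abstract, amino acid composition and dipeptide composition, can be illustrated with a minimal Python sketch. It uses the conventional definitions (the fraction of each of the 20 standard residues, and the fraction of each of the 400 ordered residue pairs); the abstract does not give the exact formulas used in CyclinPred, so the function names, the handling of non-standard residues and the toy sequence below are assumptions made for illustration only.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each ordered residue pair ("AA", "AC", ...) to a slot in a 400-element vector.
DIPEPTIDE_INDEX = {a + b: k for k, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def amino_acid_composition(seq):
    """Return a 20-element list with the fraction of each standard residue.

    Non-standard symbols (e.g. 'X') are skipped; this is an assumption of the sketch.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-element list with the fraction of each adjacent residue pair.

    Pairs that contain a non-standard symbol are skipped (assumption of the sketch).
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if p in DIPEPTIDE_INDEX]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = [0] * len(DIPEPTIDE_INDEX)
    for p in pairs:
        counts[DIPEPTIDE_INDEX[p]] += 1
    return [c / len(pairs) for c in counts]


if __name__ == "__main__":
    toy_sequence = "MKVLAAGIVALLC"  # hypothetical sequence, not a real cyclin
    aac = amino_acid_composition(toy_sequence)
    dpc = dipeptide_composition(toy_sequence)
    print(len(aac), len(dpc))
```

Fixed-length vectors of this kind (20 and 400 values per protein, regardless of sequence length) are what allow sequences of different lengths to be passed to a classifier such as an SVM; PSSM profiles and secondary structure composition would need external tools and are not sketched here.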


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3477/2435623/d1342dbeb316/pone.0002605.g001.jpg
