Two-stage support vector regression approach for predicting accessible surface areas of amino acids.

Author information

Nguyen Minh N, Rajapakse Jagath C

Affiliation

BioInformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore.

Publication information

Proteins. 2006 May 15;63(3):542-50. doi: 10.1002/prot.20883.

Abstract

We address the problem of predicting solvent accessible surface area (ASA) of amino acid residues in protein sequences, without classifying them into buried and exposed types. A two-stage support vector regression (SVR) approach is proposed to predict real values of ASA from the position-specific scoring matrices generated from PSI-BLAST profiles. By adding SVR as the second stage to capture the influences on the ASA value of a residue by those of its neighbors, the two-stage SVR approach achieves improvements of mean absolute errors up to 3.3%, and correlation coefficients of 0.66, 0.68, and 0.67 on the Manesh dataset of 215 proteins, the Barton dataset of 502 nonhomologous proteins, and the Carugo dataset of 338 proteins, respectively, which are better than the scores published earlier on these datasets. A Web server for protein ASA prediction by using a two-stage SVR method has been developed and is available (http://birc.ntu.edu.sg/~pas0186457/asa.html).
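The data flow of the two-stage scheme described in the abstract can be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation. The names `windows`, `two_stage_predict`, `stage1`, and `stage2` are hypothetical, and so are the window half-widths and the zero-padding at the sequence ends, since the abstract specifies none of them. Each stage is modelled as a plain callable standing in for a fitted SVR model's prediction function. The two helper metrics correspond to the mean absolute error and correlation coefficient reported above.

```python
from math import sqrt
from typing import Callable, List, Sequence

Vector = List[float]
Regressor = Callable[[Vector], float]  # stand-in for a fitted SVR's predict


def windows(rows: Sequence[Sequence[float]], half_width: int) -> List[Vector]:
    """Concatenate each row with its neighbours; zero-pad past the ends (assumption)."""
    pad = [0.0] * len(rows[0])
    out: List[Vector] = []
    for i in range(len(rows)):
        feat: Vector = []
        for j in range(i - half_width, i + half_width + 1):
            feat.extend(rows[j] if 0 <= j < len(rows) else pad)
        out.append(feat)
    return out


def two_stage_predict(
    pssm: Sequence[Sequence[float]],
    stage1: Regressor,
    stage2: Regressor,
    half_width1: int = 1,  # hypothetical window size
    half_width2: int = 1,  # hypothetical window size
) -> Vector:
    # Stage 1: window of PSSM rows -> initial ASA estimate per residue.
    first = [stage1(f) for f in windows(pssm, half_width1)]
    # Stage 2: window of stage-1 estimates -> refined ASA estimate,
    # so each residue's value is informed by its neighbours' estimates.
    return [stage2(f) for f in windows([[v] for v in first], half_width2)]


def mean_absolute_error(pred: Sequence[float], true: Sequence[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def pearson(pred: Sequence[float], true: Sequence[float]) -> float:
    n = len(true)
    mp, mt = sum(pred) / n, sum(true) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    vp = sum((p - mp) ** 2 for p in pred)
    vt = sum((t - mt) ** 2 for t in true)
    return cov / sqrt(vp * vt)


if __name__ == "__main__":
    # Toy PSSM with 3 residues and 2 columns; real PSI-BLAST PSSMs have 20 columns.
    toy_pssm = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    # Toy callables in place of trained SVR models.
    refined = two_stage_predict(toy_pssm, lambda f: sum(f), lambda f: sum(f) / len(f))
    print(refined)
```

In practice the two callables would wrap regressors trained separately: the first on PSSM windows against observed ASA values, the second on windows of the first stage's outputs against the same targets.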
