PhosPred-RF: A Novel Sequence-Based Predictor for Phosphorylation Sites Using Sequential Information Only.

Author information

Wei Leyi, Xing Pengwei, Tang Jijun, Zou Quan

Publication information

IEEE Trans Nanobioscience. 2017 Jun;16(4):240-247. doi: 10.1109/TNB.2017.2661756. Epub 2017 Jan 31.

Abstract

Many recent efforts have gone into developing machine learning-based methods for fast and accurate phosphorylation site prediction. Currently, most well-performing methods build their prediction models on hybrid information, such as evolutionary information, disorder information, and so on. Unfortunately, methods of this type suffer from two major limitations: first, they offer little help for protein phosphorylation site prediction when no obvious homology is detected; second, computing such complicated information is time-consuming, which probably limits the use of these predictors in practical applications. In this paper, we present a simple, fast, and powerful feature representation algorithm that explores sequential information from multiple perspectives based only on primary sequences, and that successfully captures the differences between true phosphorylation sites and non-phosphorylation sites. Using the proposed features, we build a random forest-based predictor named PhosPred-RF for predicting phosphorylation sites in proteins. We evaluate the proposed predictor and compare it with state-of-the-art predictors on several benchmark data sets. The experimental results show that PhosPred-RF outperforms other existing predictors, demonstrating its potential to be a useful tool for protein phosphorylation site prediction. The proposed PhosPred-RF is freely accessible to the public through the user-friendly webserver http://server.malab.cn/PhosPred-RF.
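
To make the idea of sequence-only features concrete, here is a minimal Python sketch. It is not the feature representation algorithm of PhosPred-RF, which the abstract does not specify. It assumes a common generic setup: a fixed-width window is cut around each candidate S/T/Y residue and summarized by amino-acid composition. The function names, the window half-width, the `X` padding symbol, and the demo sequence are all hypothetical choices made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # assumption: padding symbol for windows that run past the sequence ends


def candidate_windows(sequence, half_width=7, targets="STY"):
    """Yield (position, window) for every candidate residue in the sequence.

    position is 0-based; each window has length 2 * half_width + 1 and is
    centered on the candidate residue, padded with PAD near the termini.
    """
    seq = sequence.upper()
    padded = PAD * half_width + seq + PAD * half_width
    for i, residue in enumerate(seq):
        if residue in targets:
            yield i, padded[i:i + 2 * half_width + 1]


def composition_features(window):
    """Return the 20 amino-acid frequencies of a window, ignoring padding."""
    counts = Counter(ch for ch in window if ch in AMINO_ACIDS)
    total = sum(counts.values()) or 1  # avoid division by zero on all-pad windows
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    demo = "MKSPTAYLLRSE"  # made-up sequence, for illustration only
    for pos, win in candidate_windows(demo, half_width=3):
        print(pos, win, composition_features(win))
```

Feature vectors of this kind, paired with known site labels, could then be passed to any random forest implementation; that training step is omitted here because the abstract gives no details about it.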

