DNA binding protein identification by combining pseudo amino acid composition and profile-based protein representation.

Author information

Liu Bin, Wang Shanyi, Wang Xiaolong

Affiliations

School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China.

Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China.

Publication information

Sci Rep. 2015 Oct 20;5:15479. doi: 10.1038/srep15479.

Abstract

DNA-binding proteins play an important role in most cellular processes, so an efficient predictor that identifies them from protein sequence information alone is needed. The bottleneck in constructing a useful predictor is finding features that capture the characteristics of DNA-binding proteins. We applied pseudo amino acid composition (PseAAC) to DNA-binding protein identification and further improved it by incorporating evolutionary information through profile-based protein representation. Combining this representation with Support Vector Machines (SVMs), we propose a predictor called iDNAPro-PseAAC. Experimental results on an updated benchmark dataset showed that iDNAPro-PseAAC outperformed some state-of-the-art approaches and achieved stable performance on an independent dataset. Using an ensemble learning approach to incorporate more negative samples (non-DNA-binding proteins) during training further improved the performance of iDNAPro-PseAAC. The web server of iDNAPro-PseAAC is available at http://bioinformatics.hitsz.edu.cn/iDNAPro-PseAAC/.
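To make the feature idea concrete, the following is a minimal sketch of a classic type-1 PseAAC vector: 20 normalized amino acid frequencies followed by a few sequence-order correlation terms. It is not the authors' exact pipeline. The function name `pseaac`, the single Kyte-Doolittle hydropathy property, and the defaults `lam=2` and `weight=0.05` are all assumptions chosen for illustration, and the profile-based step described in the abstract is not reproduced here.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index. Using this single property is an
# illustrative assumption, not the property set used by iDNAPro-PseAAC.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = list(scale.values())
    mu, sigma = mean(values), pstdev(values)
    return {aa: (v - mu) / sigma for aa, v in scale.items()}


def pseaac(sequence, lam=2, weight=0.05, scale=KYTE_DOOLITTLE):
    """Return a (20 + lam)-dimensional type-1 PseAAC vector.

    lam    : number of sequence-order correlation terms (hypothetical default)
    weight : weight given to the correlation terms (hypothetical default)
    """
    seq = sequence.upper()
    if any(aa not in scale for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    h = standardize(scale)
    n = len(seq)

    # Conventional amino acid composition: occurrence frequency of each residue.
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]

    # Sequence-order terms: mean squared property difference between
    # residues that are `lag` positions apart.
    thetas = []
    for lag in range(1, lam + 1):
        diffs = [(h[seq[i]] - h[seq[i + lag]]) ** 2 for i in range(n - lag)]
        thetas.append(sum(diffs) / (n - lag))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Vectors of this kind have a fixed length regardless of protein length, which is what lets them be passed to a standard classifier such as an SVM. In the approach the abstract describes, evolutionary information from a profile-based representation is incorporated before PseAAC is computed; that step would sit upstream of a function like this one.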
