Department of Bioengineering/Bioinformatics, University of Illinois at Chicago, Chicago, IL, USA.
Nucleic Acids Res. 2010 Jul;38(Web Server issue):W431-5. doi: 10.1093/nar/gkq361. Epub 2010 May 16.
Nucleic acid (NA)-binding proteins are involved in a great number of cellular processes. Understanding the mechanisms underlying these proteins first requires identifying the specific residues involved in nucleic acid binding. Prediction of NA-binding residues can provide practical assistance in the functional annotation of NA-binding proteins. Predictions can also expedite mutagenesis experiments by guiding researchers to the correct binding residues in these proteins. Here, we present a method for identifying amino acid residues involved in DNA and RNA binding using sequence-based attributes. The method combines the C4.5 decision tree algorithm with bootstrap aggregation (bagging) and cost-sensitive learning. Our DNA-binding model achieved 79.1% accuracy, while the RNA-binding model reached 73.2% accuracy. The NAPS web server is freely available at http://proteomics.bioengr.uic.edu/NAPS.
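As a rough illustration of how bootstrap aggregation and cost-sensitive learning fit together, the following minimal sketch trains an ensemble of weighted classifiers on bootstrap resamples and combines them by majority vote. It is not the NAPS implementation: a one-level decision stump stands in for C4.5, the window feature (fraction of lysine/arginine around a residue) is a hypothetical stand-in for the sequence-based attributes, and the toy data, `fn_cost`, and all function names are assumptions made for illustration only.

```python
import random

def window_feature(seq, i, half=2):
    """Hypothetical attribute: fraction of K/R residues in a window around position i."""
    window = seq[max(0, i - half): i + half + 1]
    return sum(ch in "KR" for ch in window) / len(window)

def predict_stump(stump, row):
    """Apply a (feature index, threshold, polarity) stump to one feature row."""
    j, t, pol = stump
    return 1 if pol * (row[j] - t) > 0 else 0

def fit_stump(X, y, w):
    """Choose the stump with the lowest weighted error (stand-in for C4.5)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if predict_stump((j, t, pol), row) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best[1:]

def fit_bagged(X, y, n_models=25, fn_cost=3.0, seed=0):
    """Bagging: fit one cost-weighted stump per bootstrap resample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        # Cost-sensitive step: missing a binding residue (label 1) costs
        # fn_cost times as much as a false alarm. The value is an assumption.
        wb = [fn_cost if yi == 1 else 1.0 for yi in yb]
        models.append(fit_stump(Xb, yb, wb))
    return models

def predict_bagged(models, row):
    """Majority vote over the ensemble."""
    votes = sum(predict_stump(m, row) for m in models)
    return 1 if 2 * votes > len(models) else 0

# Toy data (invented): one feature per residue, label 1 = binding.
X = [[0.10], [0.20], [0.15], [0.30], [0.60], [0.70], [0.80], [0.65]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

models = fit_bagged(X, y)
print(predict_bagged(models, [window_feature("AAKRKAA", 3, half=1)]))
```

Raising the weight on binding residues is one simple way to counter class imbalance, since binding residues are typically a small minority of all residues in a protein; the actual cost scheme used by the server is not specified in this abstract.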