Wang Y, Xue Z, Shen G, Xu J
Institute of Biophysics and Biochemistry, School of Life Science, Huazhong University of Science and Technology, Wuhan City, China.
Amino Acids. 2008 Aug;35(2):295-302. doi: 10.1007/s00726-007-0634-9. Epub 2008 Jan 31.
Protein-RNA interactions play a key role in a number of biological processes such as protein synthesis, mRNA processing, and the assembly and function of ribosomes and eukaryotic spliceosomes. Reliable identification of RNA-binding sites in RNA-binding proteins is important for functional annotation and site-directed mutagenesis. We developed a novel method for predicting protein residues that interact with RNA using a support vector machine (SVM) and position-specific scoring matrices (PSSMs). Two cases were considered in the prediction of protein residues at RNA-binding surfaces. In the first, the sequence information of a protein chain known to interact with RNA is given; in the second, the structural information is given. Across these two cases, five different inputs were tested. Coupled with PSI-BLAST profiles and predicted secondary structure, the present approach yields a Matthews correlation coefficient (MCC) of 0.432 in 7-fold cross-validation, the best among all previously reported RNA-binding site prediction methods. When given the structural information, we obtained an MCC of 0.457, with PSSMs, observed secondary structure, and solvent accessibility information assigned by DSSP as input. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/printr/ .
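Since the results above are reported as Matthews correlation coefficients, a minimal sketch of how an MCC is computed from per-residue binary labels may help in reading them. It uses the standard definition, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function names and the toy labels are illustrative assumptions, not the authors' code or data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = RNA-binding, 0 = non-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the paper does not state how it handles this case).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical per-residue labels for one short chain (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
score = mcc(*counts)
print(counts, round(score, 3))
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction. Because it uses all four confusion-matrix counts, it remains informative when binding residues are much rarer than non-binding ones, which is typical for per-residue binding-site prediction.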