Improve the prediction of RNA-binding residues using structural neighbours.

Author information

Li Quan, Cao Zanxia, Liu Haiyan

Affiliation

School of Life Sciences, Hefei National Laboratory for Physical Sciences at Microscale, University of Science and Technology of China (USTC), Hefei, Anhui, 230027, China.

Publication information

Protein Pept Lett. 2010 Mar;17(3):287-96. doi: 10.2174/092986610790780279.

Abstract

Interactions between RNA-binding proteins (RBPs) and RNA play key roles in managing some of the cell's basic functions. Identifying and predicting RNA-binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues from sequence- or structure-derived features, but current methods need to improve to achieve higher prediction accuracy. We found that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary information (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves predictions over existing methods. Using a multiple linear regression approach and 6-fold cross-validation on a dataset of 107 non-homologous RNA-binding proteins, our best model achieves an overall accuracy of 87.8% and an MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues. Compared with existing methods, including the amino acid compositions of structural neighbors leads to a clear improvement. A web server for predicting RNA-binding residues in a protein sequence (or structure) is available at http://mcgill.3322.org/RNA/.
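To make the two central ideas of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows one plausible way to compute the amino acid composition of a residue's structural neighbors, and how the reported figures (overall accuracy, sensitivity, specificity, MCC) are derived from a confusion matrix. The abstract does not define "structural neighbor", so the single-point-per-residue coordinates, the 10 Å distance cutoff, and the function names `neighbour_composition` and `binary_metrics` are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def neighbour_composition(coords, sequence, index, cutoff=10.0):
    """Fraction of each amino acid type among the spatial neighbours of one residue.

    coords   -- one (x, y, z) point per residue (e.g. C-alpha atoms; an assumption)
    sequence -- one-letter residue codes, same length as coords
    index    -- position of the residue whose neighbourhood is described
    cutoff   -- neighbour distance threshold in angstroms (hypothetical value)
    """
    centre = coords[index]
    counts = Counter()
    for j, (xyz, aa) in enumerate(zip(coords, sequence)):
        # Skip the residue itself; keep everything within the cutoff sphere.
        if j != index and math.dist(centre, xyz) <= cutoff:
            counts[aa] += 1
    total = sum(counts.values())
    # 20-dimensional feature vector; all zeros if the residue has no neighbours.
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of true binding residues that were predicted.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of non-binding residues correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Toy four-residue "structure"; the coordinates are made up for illustration.
    toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    features = neighbour_composition(toy_coords, "ARKG", index=0, cutoff=5.0)
    print(dict(zip(AMINO_ACIDS, features)))
    print(binary_metrics(tp=50, tn=900, fp=50, fn=50))
```

In a pipeline like the one the abstract describes, a vector of this kind would be concatenated with PSSM, secondary-structure, and solvent-accessibility features before fitting the regression model. Because non-binding residues greatly outnumber binding ones, accuracy alone can look high even when sensitivity is modest, which is why the MCC is reported alongside it.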
