Identification of RNA-binding sites in proteins by integrating various sequence information.

Affiliation

Key Laboratory of Green Chemistry and Technology, College of Chemistry, Ministry of Education, Sichuan University, Chengdu, 610064, China.

Publication information

Amino Acids. 2011 Jan;40(1):239-48. doi: 10.1007/s00726-010-0639-7. Epub 2010 Jun 12.

DOI:10.1007/s00726-010-0639-7

PMID:20549269

Abstract

RNA-protein interactions play a pivotal role in various biological processes, such as mRNA processing, protein synthesis, and the assembly and function of the ribosome. In this work, we introduce a computational method for predicting RNA-binding sites in proteins. The method is based on support vector machines and uses a variety of features derived from amino acid sequence information, including position-specific scoring matrix (PSSM) profiles, physicochemical properties, and predicted solvent accessibility. To account for the influence of the residues surrounding an amino acid and the dependency effect from neighboring amino acids, a sliding window and a smoothing window are used to encode the PSSM profiles. The method is evaluated by outer fivefold cross-validation on a data set of 77 RNA-binding proteins (RBP77), where it achieves an overall accuracy of 88.66% with a Matthews correlation coefficient (MCC) of 0.69. On an independent data set of 39 RNA-binding proteins (RBP39), it achieves an overall accuracy of 82.36% with an MCC of 0.44. These results indicate that the method generalizes well when predicting RNA-binding sites in novel proteins. Compared with previous methods, our method performs well on the same data set. The prediction results suggest that the features used are effective for predicting RNA-binding sites in proteins. The code and all data sets used in this article are freely available at http://cic.scu.edu.cn/bioinformatics/Predict_RBP.rar .
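
The abstract names a smoothing window and a sliding window for encoding PSSM profiles but does not give their details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that smoothing sums the PSSM rows in a centered window, that positions beyond the sequence ends are zero-padded, and that both window sizes are odd; the function names and the window sizes used here are hypothetical. The SVM training step, the physicochemical properties, and the predicted solvent accessibility are left out. A small MCC helper is included because the abstract reports MCC alongside accuracy.

```python
from math import sqrt


def smooth_pssm(pssm, smooth_w):
    """Assumed smoothing: sum the PSSM rows inside a centered window.

    Rows that would fall outside the sequence contribute nothing (zero padding).
    """
    half = smooth_w // 2
    n = len(pssm)
    width = len(pssm[0]) if n else 0
    smoothed = []
    for i in range(n):
        row = [0.0] * width
        for j in range(max(0, i - half), min(n, i + half + 1)):
            for k in range(width):
                row[k] += pssm[j][k]
        smoothed.append(row)
    return smoothed


def sliding_window_features(rows, slide_w):
    """Concatenate the rows of a centered window into one vector per residue."""
    half = slide_w // 2
    n = len(rows)
    width = len(rows[0]) if n else 0
    zero = [0.0] * width  # padding row for positions beyond the termini
    features = []
    for i in range(n):
        vec = []
        for j in range(i - half, i + half + 1):
            vec.extend(rows[j] if 0 <= j < n else zero)
        features.append(vec)
    return features


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy profile: 4 residues x 2 columns (a real PSSM has 20 columns per residue).
toy_pssm = [[1, 0], [2, 1], [3, -1], [4, 2]]
toy_smoothed = smooth_pssm(toy_pssm, smooth_w=3)
toy_features = sliding_window_features(toy_smoothed, slide_w=3)
```

Each residue ends up with a vector of `slide_w` times the number of PSSM columns, which is the kind of fixed-length input a support vector machine expects. MCC uses all four confusion-matrix counts, so it is generally less flattered than accuracy when one class dominates, which is one plausible reason both numbers are reported.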

Similar articles

Prediction of protein-RNA binding sites by a random forest method with combined features.

Bioinformatics. 2010 Jul 1;26(13):1616-22. doi: 10.1093/bioinformatics/btq253. Epub 2010 May 18.

Prediction of RNA-binding residues in proteins from primary sequence using an enriched random forest model with a novel hybrid feature.

Proteins. 2011 Apr;79(4):1230-9. doi: 10.1002/prot.22958. Epub 2011 Jan 25.

PRINTR: prediction of RNA binding sites in proteins using SVM and profiles.

Amino Acids. 2008 Aug;35(2):295-302. doi: 10.1007/s00726-007-0634-9. Epub 2008 Jan 31.

SVM based prediction of RNA-binding proteins using binding residues and evolutionary information.

J Mol Recognit. 2011 Mar-Apr;24(2):303-13. doi: 10.1002/jmr.1061.

Prediction of RNA binding sites in a protein using SVM and PSSM profile.

Proteins. 2008 Apr;71(1):189-94. doi: 10.1002/prot.21677.

Design of accurate predictors for DNA-binding sites in proteins using hybrid SVM-PSSM method.

Biosystems. 2007 Jul-Aug;90(1):234-41. doi: 10.1016/j.biosystems.2006.08.007. Epub 2006 Aug 23.

Applying the Naïve Bayes classifier with kernel density estimation to the prediction of protein-protein interaction sites.

Bioinformatics. 2010 Aug 1;26(15):1841-8. doi: 10.1093/bioinformatics/btq302. Epub 2010 Jun 6.

RISP: a web-based server for prediction of RNA-binding sites in proteins.

Comput Methods Programs Biomed. 2008 May;90(2):148-53. doi: 10.1016/j.cmpb.2007.12.003. Epub 2008 Feb 7.

Predicting rRNA-, RNA-, and DNA-binding proteins from primary structure with support vector machines.

J Theor Biol. 2006 May 21;240(2):175-84. doi: 10.1016/j.jtbi.2005.09.018. Epub 2005 Nov 7.

Cited by

Twenty years of advances in prediction of nucleic acid-binding residues in protein sequences.

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf016.

A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond.

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae162.

EPDRNA: A Model for Identifying DNA-RNA Binding Sites in Disease-Related Proteins.

Protein J. 2024 Jun;43(3):513-521. doi: 10.1007/s10930-024-10183-3. Epub 2024 Mar 16.

PRIP: A Protein-RNA Interface Predictor Based on Semantics of Sequences.

Life (Basel). 2022 Feb 18;12(2):307. doi: 10.3390/life12020307.

Comprehensive Survey and Comparative Assessment of RNA-Binding Residue Predictions with Analysis by RNA Type.

Int J Mol Sci. 2020 Sep 19;21(18):6879. doi: 10.3390/ijms21186879.

RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites.

Sci Rep. 2017 Apr 4;7(1):614. doi: 10.1038/s41598-017-00795-4.

Prediction of redox-sensitive cysteines using sequential distance and other sequence-based features.

BMC Bioinformatics. 2016 Aug 24;17(1):316. doi: 10.1186/s12859-016-1185-4.

Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors.

BMC Bioinformatics. 2016 Jun 7;17(1):231. doi: 10.1186/s12859-016-1110-x.

A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs.

PLoS Comput Biol. 2015 Dec 17;11(12):e1004639. doi: 10.1371/journal.pcbi.1004639. eCollection 2015 Dec.

Different motif requirements for the localization zipcode element of β-actin mRNA binding by HuD and ZBP1.

Nucleic Acids Res. 2015 Sep 3;43(15):7432-46. doi: 10.1093/nar/gkv699. Epub 2015 Jul 7.
