RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.

Authors

Walia Rasna R, Xue Li C, Wilkins Katherine, El-Manzalawy Yasser, Dobbs Drena, Honavar Vasant

Affiliations

Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa, United States of America; Department of Computer Science, Iowa State University, Ames, Iowa, United States of America.

College of Information Sciences and Technology, Pennsylvania State University, University Park, Pennsylvania, United States of America.

Publication information

PLoS One. 2014 May 20;9(5):e97725. doi: 10.1371/journal.pone.0097725. eCollection 2014.

DOI:10.1371/journal.pone.0097725
PMID:24846307
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4028231/
Abstract

Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. 
An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
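
The abstract reports performance as the Matthews correlation coefficient (MCC) and refers to the tradeoff between precision and recall. As a minimal sketch of how these per-residue metrics are computed from binary labels (1 = RNA-binding residue, 0 = non-binding), the following uses the standard definitions. The function names and the toy labels are illustrative assumptions and are not taken from the paper or its webserver.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary residue labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def precision_recall(tp, fp, fn):
    """Precision and recall; each falls back to 0.0 when its denominator is 0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example (hypothetical labels for a 10-residue protein).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("MCC:", round(mcc(tp, fp, tn, fn), 3))
print("precision, recall:", precision_recall(tp, fp, fn))
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which makes it a common choice when binding residues are a small minority of all residues.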

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1860/4028231/3ec4ab7471f4/pone.0097725.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1860/4028231/da65a77c59ea/pone.0097725.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1860/4028231/9d9221f0330b/pone.0097725.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1860/4028231/2e8ebe33b75b/pone.0097725.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1860/4028231/238e163506ba/pone.0097725.g005.jpg
