Predicting RNA-binding sites of proteins using support vector machines and evolutionary information.

Authors

Cheng Cheng-Wei, Su Emily Chia-Yu, Hwang Jenn-Kang, Sung Ting-Yi, Hsu Wen-Lian

Affiliation

Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan.

Publication

BMC Bioinformatics. 2008 Dec 12;9 Suppl 12(Suppl 12):S6. doi: 10.1186/1471-2105-9-S12-S6.

DOI:10.1186/1471-2105-9-S12-S6
PMID:19091029
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2638146/
Abstract

BACKGROUND

RNA-protein interaction plays an essential role in several biological processes, such as protein synthesis, gene expression, posttranscriptional regulation and viral infectivity. Identification of RNA-binding sites in proteins provides valuable insights for biologists. However, experimental determination of RNA-protein interaction remains time-consuming and labor-intensive. Thus, computational approaches for prediction of RNA-binding sites in proteins have become highly desirable. Extensive studies of RNA-binding site prediction have led to the development of several methods. However, these methods can yield low sensitivity as the price of high specificity.
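The sensitivity/specificity trade-off mentioned above can be made concrete with a small sketch. The function name `binary_metrics` and the confusion-matrix counts below are hypothetical illustrations, not values from the paper; the formulas are the standard definitions of sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC).

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard per-residue metrics from confusion-matrix counts."""
    # Sensitivity: fraction of true binding residues that were predicted as binding.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-binding residues predicted as non-binding.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # MCC stays informative when binding residues are a small minority.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts: 100 binding residues, 900 non-binding residues.
# A conservative predictor finds few binders but rarely raises false alarms.
metrics = binary_metrics(tp=30, fp=50, tn=850, fn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy and specificity look strong while sensitivity is low, which is the pattern the background describes for earlier predictors.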

RESULTS

We propose a method, RNAProB, which incorporates a new smoothed position-specific scoring matrix (PSSM) encoding scheme with a support vector machine model to predict RNA-binding sites in proteins. Besides incorporating evolutionary information from standard PSSM profiles, the proposed smoothed PSSM encoding scheme also considers, for each amino acid in a protein, the correlation with and dependency on its neighboring residues. Experimental results show that smoothed PSSM encoding significantly enhances the prediction performance, especially for sensitivity. Using five-fold cross-validation, our method performs better than the state-of-the-art systems by 4.90%-6.83%, 0.88%-5.33%, and 0.10-0.23 in terms of overall accuracy, specificity, and Matthews correlation coefficient, respectively. Most notably, compared to other approaches, RNAProB significantly improves sensitivity by 7.0%-26.9% over the benchmark data sets. To prevent overfitting, a three-way data split procedure is incorporated to estimate the prediction performance. Moreover, physicochemical properties and amino acid preferences of RNA-binding proteins are examined and analyzed.
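As a minimal sketch of what a smoothed PSSM encoding could look like: the abstract does not spell out the formula, so the version below assumes that each residue's PSSM row is replaced by the sum of the rows inside an odd-sized window centred on it, with positions beyond either terminus contributing zeros. The function name `smooth_pssm`, the default window size, and the toy matrix are all hypothetical.

```python
def smooth_pssm(pssm, window=5):
    """Sum each residue's PSSM row with its neighbours inside a centred window.

    pssm   -- list of rows, one per residue (a real PSSM has 20 columns per row)
    window -- odd smoothing window size (an assumed parameter)
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(pssm)
    width = len(pssm[0]) if n else 0
    smoothed = []
    for i in range(n):
        row = [0.0] * width
        # Skipping out-of-range neighbours is equivalent to zero padding
        # at the N- and C-termini (an assumption of this sketch).
        for j in range(max(0, i - half), min(n, i + half + 1)):
            for k in range(width):
                row[k] += pssm[j][k]
        smoothed.append(row)
    return smoothed

# Toy 4-residue "PSSM" with 3 columns instead of 20, to keep the output readable.
toy = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
for row in smooth_pssm(toy, window=3):
    print(row)
```

Each smoothed row now carries information from adjacent residues, so a classifier such as an SVM sees a residue's local evolutionary context rather than a single position in isolation.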

CONCLUSION

Our results demonstrate that the smoothed PSSM encoding scheme significantly enhances the performance of RNA-binding site prediction in proteins. This also supports our assumption that smoothed PSSM encoding can better resolve the ambiguity in discriminating between interacting and non-interacting residues by modelling the dependency on surrounding residues. The proposed method can be used in other research areas, such as DNA-binding site prediction, protein-protein interaction, and prediction of posttranslational modification sites.

Similar articles

1
Predicting RNA-binding sites of proteins using support vector machines and evolutionary information.
BMC Bioinformatics. 2008 Dec 12;9 Suppl 12(Suppl 12):S6. doi: 10.1186/1471-2105-9-S12-S6.
2
Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art.
BMC Bioinformatics. 2012 May 10;13:89. doi: 10.1186/1471-2105-13-89.
3
Identification of RNA-binding sites in proteins by integrating various sequence information.
Amino Acids. 2011 Jan;40(1):239-48. doi: 10.1007/s00726-010-0639-7. Epub 2010 Jun 12.
4
Prediction of lipid-binding sites based on support vector machine and position specific scoring matrix.
Protein J. 2010 Aug;29(6):427-31. doi: 10.1007/s10930-010-9269-x.
5
SNB-PSSM: A spatial neighbor-based PSSM used for protein-RNA binding site prediction.
J Mol Recognit. 2021 Jun;34(6):e2887. doi: 10.1002/jmr.2887. Epub 2021 Jan 14.
6
PSSM-based prediction of DNA binding sites in proteins.
BMC Bioinformatics. 2005 Feb 19;6:33. doi: 10.1186/1471-2105-6-33.
7
SVM based prediction of RNA-binding proteins using binding residues and evolutionary information.
J Mol Recognit. 2011 Mar-Apr;24(2):303-13. doi: 10.1002/jmr.1061.
8
Real value prediction of protein solvent accessibility using enhanced PSSM features.
BMC Bioinformatics. 2008 Dec 12;9 Suppl 12(Suppl 12):S12. doi: 10.1186/1471-2105-9-S12-S12.
9
BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features.
BMC Syst Biol. 2010 May 28;4 Suppl 1(Suppl 1):S3. doi: 10.1186/1752-0509-4-S1-S3.
10
Design of accurate predictors for DNA-binding sites in proteins using hybrid SVM-PSSM method.
Biosystems. 2007 Jul-Aug;90(1):234-41. doi: 10.1016/j.biosystems.2006.08.007. Epub 2006 Aug 23.

Cited by

1
Enhancing the Feature Representation of Protein Sequence Descriptors in Protein-Protein Interaction Prediction.
Interdiscip Sci. 2025 Jun 2. doi: 10.1007/s12539-025-00723-5.
2
Twenty years of advances in prediction of nucleic acid-binding residues in protein sequences.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf016.
3
A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae162.
4
ALDELE: All-Purpose Deep Learning Toolkits for Predicting the Biocatalytic Activities of Enzymes.
J Chem Inf Model. 2024 Apr 22;64(8):3123-3139. doi: 10.1021/acs.jcim.4c00058. Epub 2024 Apr 4.
5
RNAincoder: a deep learning-based encoder for RNA and RNA-associated interaction.
Nucleic Acids Res. 2023 Jul 5;51(W1):W509-W519. doi: 10.1093/nar/gkad404.
6
HyperCys: A Structure- and Sequence-Based Predictor of Hyper-Reactive Druggable Cysteines.
Int J Mol Sci. 2023 Mar 22;24(6):5960. doi: 10.3390/ijms24065960.
7
HybridRNAbind: prediction of RNA interacting residues across structure-annotated and disorder-annotated proteins.
Nucleic Acids Res. 2023 Mar 21;51(5):e25. doi: 10.1093/nar/gkac1253.
8
Artificial intelligence methods enhance the discovery of RNA interactions.
Front Mol Biosci. 2022 Oct 7;9:1000205. doi: 10.3389/fmolb.2022.1000205. eCollection 2022.
9
PSSMCOOL: a comprehensive R package for generating evolutionary-based descriptors of protein sequences from PSSM profiles.
Biol Methods Protoc. 2022 Mar 30;7(1):bpac008. doi: 10.1093/biomethods/bpac008. eCollection 2022.
10
PRIP: A Protein-RNA Interface Predictor Based on Semantics of Sequences.
Life (Basel). 2022 Feb 18;12(2):307. doi: 10.3390/life12020307.

References

1
PSLDoc: Protein subcellular localization prediction based on gapped-dipeptides and probabilistic latent semantic analysis.
Proteins. 2008 Aug;72(2):693-710. doi: 10.1002/prot.21944.
2
Prediction of RNA-binding residues in protein sequences using support vector machines.
Conf Proc IEEE Eng Med Biol Soc. 2006;2006:5830-3. doi: 10.1109/IEMBS.2006.260025.
3
Prediction of RNA binding sites in a protein using SVM and PSSM profile.
Proteins. 2008 Apr;71(1):189-94. doi: 10.1002/prot.21677.
4
Protein subcellular localization prediction based on compartment-specific features and structure conservation.
BMC Bioinformatics. 2007 Sep 8;8:330. doi: 10.1186/1471-2105-8-330.
5
Functional specialization of domains tandemly duplicated within 16S rRNA methyltransferase RsmC.
Nucleic Acids Res. 2007;35(13):4264-74. doi: 10.1093/nar/gkm411. Epub 2007 Jun 18.
6
RNABindR: a server for analyzing and predicting RNA-binding sites in proteins.
Nucleic Acids Res. 2007 Jul;35(Web Server issue):W578-84. doi: 10.1093/nar/gkm294. Epub 2007 May 5.
7
Fragile X related protein 1 isoforms differentially modulate the affinity of fragile X mental retardation protein for G-quartet RNA structure.
Nucleic Acids Res. 2007;35(1):299-306. doi: 10.1093/nar/gkl1021. Epub 2006 Dec 14.
8
BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W243-8. doi: 10.1093/nar/gkl298.
9
Prediction of RNA binding sites in proteins from amino acid sequence.
RNA. 2006 Aug;12(8):1450-62. doi: 10.1261/rna.2197306. Epub 2006 Jun 21.
10
Prediction of protein subcellular localization.
Proteins. 2006 Aug 15;64(3):643-51. doi: 10.1002/prot.21018.