Predicting protein-binding RNA nucleotides with consideration of binding partners.

Affiliation

Department of Computer Science and Engineering, Inha University, Incheon, South Korea.

Publication information

Comput Methods Programs Biomed. 2015 Jun;120(1):3-15. doi: 10.1016/j.cmpb.2015.03.010. Epub 2015 Apr 8.

DOI:10.1016/j.cmpb.2015.03.010

Abstract

In recent years, several computational methods have been developed to predict RNA-binding sites in proteins. Most of these methods do not consider the interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. The converse problem, predicting protein-binding sites in RNA, has received little attention, mainly because it is much more difficult and yields lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. To improve the prediction accuracy and usefulness of that method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model, which used both protein and RNA sequence data, achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and a Matthews correlation coefficient (MCC) of 0.69 in 10-fold cross-validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, an NPV of 84.2% and an MCC of 0.48 in independent testing. For comparison, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In 10-fold cross-validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, an NPV of 92.2% and an MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, an NPV of 85.2% and an MCC of 0.45.
In both cross-validation and independent testing, the new model that used both RNA and protein sequences performed better than the model that used RNA sequence data alone on most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA that considers the RNA's binding partner. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA of unknown structure.
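
For readers less familiar with the five performance measures quoted in the abstract, the following is a minimal sketch of how they are conventionally computed from the counts of a binary confusion matrix, where a "positive" is a protein-binding nucleotide. The function name `binary_metrics` and the toy counts are illustrative assumptions; they are not taken from the paper, whose per-nucleotide counts are not given here.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return standard binary-classification measures from confusion counts.

    tp/fn: binding nucleotides predicted as binding / non-binding.
    tn/fp: non-binding nucleotides predicted as non-binding / binding.
    """
    def ratio(num, den):
        # Avoid ZeroDivisionError when a row or column of the matrix is empty.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),   # fraction of true binders recovered
        "specificity": ratio(tn, tn + fp),   # fraction of non-binders rejected
        "ppv": ratio(tp, tp + fp),           # precision of "binding" calls
        "npv": ratio(tn, tn + fn),           # precision of "non-binding" calls
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts, chosen only to exercise the formulas.
toy = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in toy.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV, NPV and MCC depend on the proportion of binding nucleotides in the dataset, so the same classifier can score differently on these three measures when the class balance changes between datasets.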


Similar articles

Predicting protein-binding RNA nucleotides with consideration of binding partners.

Comput Methods Programs Biomed. 2015 Jun;120(1):3-15. doi: 10.1016/j.cmpb.2015.03.010. Epub 2015 Apr 8.

Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

Comput Methods Programs Biomed. 2014 Nov;117(2):158-67. doi: 10.1016/j.cmpb.2014.07.009. Epub 2014 Aug 1.

Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

BMC Syst Biol. 2017 Mar 14;11(Suppl 2):16. doi: 10.1186/s12918-017-0386-4.

Predicting protein-binding RNA nucleotides using the feature-based removal of data redundancy and the interaction propensity of nucleotide triplets.

Comput Biol Med. 2013 Nov;43(11):1687-97. doi: 10.1016/j.compbiomed.2013.08.011. Epub 2013 Aug 21.

PNImodeler: web server for inferring protein-binding nucleotides from sequence data.

BMC Genomics. 2015;16 Suppl 3(Suppl 3):S6. doi: 10.1186/1471-2164-16-S3-S6. Epub 2015 Jan 29.

SVM based prediction of RNA-binding proteins using binding residues and evolutionary information.

J Mol Recognit. 2011 Mar-Apr;24(2):303-13. doi: 10.1002/jmr.1061.

Prediction of RNA-binding amino acids from protein and RNA sequences.

BMC Bioinformatics. 2011;12 Suppl 13(Suppl 13):S7. doi: 10.1186/1471-2105-12-S13-S7. Epub 2011 Nov 30.

PRINTR: prediction of RNA binding sites in proteins using SVM and profiles.

Amino Acids. 2008 Aug;35(2):295-302. doi: 10.1007/s00726-007-0634-9. Epub 2008 Jan 31.

Prediction of RNA binding sites in a protein using SVM and PSSM profile.

Proteins. 2008 Apr;71(1):189-94. doi: 10.1002/prot.21677.

Identification of protein-interacting nucleotides in a RNA sequence using composition profile of tri-nucleotides.

Genomics. 2015 Apr;105(4):197-203. doi: 10.1016/j.ygeno.2015.01.005. Epub 2015 Jan 30.

Cited by

Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

BMC Syst Biol. 2017 Mar 14;11(Suppl 2):16. doi: 10.1186/s12918-017-0386-4.

PMID: 25907142
