Protein-Specific Prediction of RNA-Binding Sites Based on Information Entropy.

Author information

College of Chemistry, Sichuan University, Chengdu, Sichuan 610064, China.

Product R&D and Testing Center, Shilin Xingdian Agricultural Products Development Co., Ltd., Kunming, Yunnan 652200, China.

Publication information

Comput Intell Neurosci. 2022 Oct 3;2022:8626628. doi: 10.1155/2022/8626628. eCollection 2022.

DOI:10.1155/2022/8626628

PMID:36225547

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9550406/

Abstract

Understanding the mechanism of protein-RNA interaction helps us explore a wide range of biological processes. Experimental techniques still have limitations, such as high cost in money and time, so predicting protein-RNA binding sites computationally is a valuable research tool. Here, we developed a universal method for predicting protein-specific RNA-binding sites: for each protein, one general model was constructed on a fixed dataset by fusing data from different experimental techniques. Information theory was employed to characterize the sequence conservation of RNA-binding segments. Conservation difference profiles between binding and nonbinding segments were constructed using information entropy (IE), and they show a significant difference between the two classes. Finally, 19 protein-specific models were built with random forest (RF) on the IE encoding. Performance on the independent datasets demonstrates that our method obtains competitive results compared with the current best prediction model.
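
The abstract does not spell out how the conservation profiles are computed, so the following is only a minimal sketch of one plausible reading: the standard Shannon entropy, H = -Σ p(b) log2 p(b), taken per position over equal-length RNA segments, with the binding profile subtracted from the nonbinding one. The function names `position_entropy` and `conservation_difference`, the four-letter A/C/G/U alphabet, and the toy segments are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

ALPHABET = "ACGU"  # assumed RNA alphabet; other symbols are ignored


def position_entropy(segments, alphabet=ALPHABET):
    """Return the Shannon entropy (in bits) of each column of equal-length segments."""
    if not segments:
        raise ValueError("need at least one segment")
    length = len(segments[0])
    if any(len(s) != length for s in segments):
        raise ValueError("segments must all have the same length")
    profile = []
    for i in range(length):
        counts = Counter(s[i] for s in segments)
        total = sum(counts[b] for b in alphabet)
        h = 0.0
        for b in alphabet:
            if counts[b]:
                p = counts[b] / total
                h -= p * math.log2(p)
        profile.append(h)
    return profile


def conservation_difference(binding, nonbinding):
    """Nonbinding entropy minus binding entropy, position by position.

    Lower entropy means higher conservation, so a positive value marks a
    position that is more conserved in the binding segments.
    """
    h_bind = position_entropy(binding)
    h_non = position_entropy(nonbinding)
    if len(h_bind) != len(h_non):
        raise ValueError("binding and nonbinding segments differ in length")
    return [hn - hb for hb, hn in zip(h_bind, h_non)]


# Toy data (hypothetical): the first three positions are fixed in the binding set.
binding = ["ACGU", "ACGA", "ACGC", "ACGG"]
nonbinding = ["ACGU", "CGUA", "GUAC", "UACG"]
print(conservation_difference(binding, nonbinding))  # [2.0, 2.0, 2.0, 0.0]
```

A per-position vector like this could serve as the input encoding for a classifier such as a random forest, although the exact feature construction used for the 19 models is not described in the abstract.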

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37a8/9550406/e7d4f259e72d/CIN2022-8626628.001.jpg
