Deep neural networks for inferring binding sites of RNA-binding proteins by using distributed representations of RNA primary sequence and secondary structure.

Affiliations

School of Computer Science and Engineering, Central South University, Changsha, 410075, China.

Aliyun School of Big Data, Changzhou University, Changzhou, 213164, China.

Publication information

BMC Genomics. 2020 Dec 17;21(Suppl 13):866. doi: 10.1186/s12864-020-07239-w.

DOI: 10.1186/s12864-020-07239-w
PMID: 33334313
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7745412/
Abstract

BACKGROUND

RNA-binding proteins (RBPs) play a vital role in post-transcriptional processes in all eukaryotes, such as splicing regulation, mRNA transport, and modulation of mRNA translation and decay. Identifying RBP binding sites is a crucial step in understanding the biological mechanisms of post-transcriptional gene regulation. However, determining RBP binding sites on a large scale is challenging because of the high cost of biochemical assays. Many studies have therefore used machine learning methods to predict binding sites. In particular, deep learning is increasingly used in bioinformatics because of its ability to learn generalized representations from DNA and protein sequences.

RESULTS

In this paper, we implemented a novel deep neural network model, DeepRKE, which combines primary RNA sequence and secondary structure information to predict RBP binding sites. Specifically, we used a word embedding algorithm to extract features of RNA sequences and secondary structures, i.e., distributed representations of k-mer sequences rather than the traditional one-hot encoding. The distributed representations are taken as input to convolutional neural networks (CNN) and bidirectional long short-term memory networks (BiLSTM) to identify RBP binding sites. Our results show that DeepRKE outperforms existing counterpart methods on two large-scale benchmark datasets.
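
As a rough illustration of the first step described above, the sketch below splits an RNA sequence into overlapping k-mers and maps each k-mer to an integer id, which is the form a word embedding layer would consume. The abstract does not state the values DeepRKE uses, so `k=3`, `stride=1`, the example sequence, and the helper names `kmers` and `build_vocab` are all assumptions made for illustration; the embedding training itself and the CNN/BiLSTM layers are not shown.

```python
from itertools import product


def kmers(seq, k=3, stride=1):
    """Split a sequence into overlapping k-mers (window k, step stride).

    k and stride are illustrative defaults, not values taken from the paper.
    """
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(alphabet, k):
    """Assign an integer id to every possible k-mer over the alphabet."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}


# Hypothetical example sequence, chosen only for demonstration.
rna = "AUGGCUAC"
tokens = kmers(rna, k=3)
vocab = build_vocab("ACGU", 3)

# Integer ids that an embedding lookup table would turn into dense vectors.
ids = [vocab[t] for t in tokens]

print(tokens)
print(ids)
```

The same tokenization could in principle be applied to a secondary-structure annotation string by passing a different alphabet to `build_vocab`; which structure alphabet the authors use is not given in the abstract.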

CONCLUSIONS

Our extensive experimental results show that DeepRKE is an efficacious tool for predicting RBP binding sites. The distributed representations of RNA sequences and secondary structures can effectively detect the latent relationship and similarity between k-mers, and thus improve the predictive performance. The source code of DeepRKE is available at https://github.com/youzhiliu/DeepRKE/.
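
The point about similarity between k-mers can be made concrete with a small sketch. Under one-hot encoding, any two distinct k-mers are orthogonal, so their cosine similarity is always zero; dense vectors can place related k-mers close together. The dense vectors below are hand-picked, hypothetical numbers used only to show the contrast; they are not trained embeddings and do not come from DeepRKE.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# One-hot codes: each k-mer gets its own axis, so distinct k-mers are orthogonal.
one_hot = {
    "AUG": [1, 0, 0],
    "AUA": [0, 1, 0],
    "CCC": [0, 0, 1],
}

# Hypothetical 2-d dense vectors (illustrative values, NOT learned from data).
dense = {
    "AUG": [0.9, 0.1],
    "AUA": [0.8, 0.3],
    "CCC": [-0.7, 0.6],
}

# One-hot cannot express that AUG and AUA share two nucleotides.
print(cosine(one_hot["AUG"], one_hot["AUA"]))

# Dense vectors can encode graded similarity between k-mers.
print(cosine(dense["AUG"], dense["AUA"]))
print(cosine(dense["AUG"], dense["CCC"]))
```

In a trained embedding, which k-mers end up close together depends on the corpus and the training objective, so the relationships shown here should be read as a toy picture of the idea rather than as expected values.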
