ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data.

Authors

Heller David, Krestel Ralf, Ohler Uwe, Vingron Martin, Marsico Annalisa

Affiliations

Max Planck Institute for Molecular Genetics, Ihnestr. 63-73 14195 Berlin, Germany.

Hasso Plattner Institute, Prof.-Dr.-Helmert-Str. 2-3 14482 Potsdam, Germany.

Publication information

Nucleic Acids Res. 2017 Nov 2;45(19):11004-11018. doi: 10.1093/nar/gkx756.

DOI: 10.1093/nar/gkx756
PMID: 28977546
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5737366/

Abstract

RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM's model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, recovers known RBP motifs from CLIP-Seq data, and scales linearly with input size; on large datasets it is considerably faster than MEMERIS and RNAcontext while being on par with GraphProt. It is freely available on GitHub and as a Docker image.
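
To make the idea of a combined sequence-structure motif more concrete, the following is a minimal toy sketch, not ssHMM's actual code or API. It assumes that every nucleotide carries a structural context label from a hypothetical five-letter alphabet (`E`, `H`, `I`, `M`, `S`), that each motif position has one HMM state per context, and that the state path is therefore observed. All function names, the uniform initialization, and the data layout are illustrative assumptions. The sketch shows only how one candidate motif window is scored and how a Gibbs-style step could draw a motif start position; re-estimating the model from the sampled windows is omitted.

```python
import math
import random

NUCLEOTIDES = "ACGU"
CONTEXTS = "EHIMS"  # assumed structural context labels, one per nucleotide


def uniform_model(length):
    """Build a toy motif model of the given length with uniform probabilities."""
    p_ctx = 1.0 / len(CONTEXTS)
    return {
        # probability of the structural context at the first motif position
        "start": {c: p_ctx for c in CONTEXTS},
        # emissions[i][context][nucleotide]
        "emissions": [
            {c: {n: 0.25 for n in NUCLEOTIDES} for c in CONTEXTS}
            for _ in range(length)
        ],
        # transitions[i][context at i][context at i + 1]
        "transitions": [
            {a: {b: p_ctx for b in CONTEXTS} for a in CONTEXTS}
            for _ in range(length - 1)
        ],
    }


def window_log_likelihood(model, seq, struct):
    """Joint log-probability of a sequence window and its observed context path."""
    if len(seq) != len(struct) or len(seq) != len(model["emissions"]):
        raise ValueError("window, structure, and model lengths must match")
    ll = math.log(model["start"][struct[0]])
    for i, (nuc, ctx) in enumerate(zip(seq, struct)):
        ll += math.log(model["emissions"][i][ctx][nuc])
        if i > 0:
            ll += math.log(model["transitions"][i - 1][struct[i - 1]][ctx])
    return ll


def sample_motif_start(model, seq, struct, rng):
    """Gibbs-style step: draw a start position with probability
    proportional to the likelihood of the window beginning there."""
    length = len(model["emissions"])
    starts = list(range(len(seq) - length + 1))
    lls = [
        window_log_likelihood(model, seq[s:s + length], struct[s:s + length])
        for s in starts
    ]
    top = max(lls)  # subtract the maximum before exponentiating for stability
    weights = [math.exp(ll - top) for ll in lls]
    return rng.choices(starts, weights=weights, k=1)[0]
```

In this sketch, calling `sample_motif_start(uniform_model(3), "ACGUA", "SSHHS", random.Random(0))` returns one of the three possible start positions. Because each emission table is indexed by both motif position and structural context, sequence and structure preferences sit in a single model rather than in two separate logos.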

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bc5/5737366/cd93de4c0cda/gkx756fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bc5/5737366/0e406d269589/gkx756fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bc5/5737366/fbde25ce6ac0/gkx756fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bc5/5737366/4a287bacbbd9/gkx756fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bc5/5737366/c89903318d89/gkx756fig5.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bc5/5737366/2a3939e4e213/gkx756fig6.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bc5/5737366/970807c9a3c1/gkx756fig7.jpg
