RNAelem: an algorithm for discovering sequence-structure motifs in RNA bound by RNA-binding proteins.

Author information

Miyake Hiroshi, Kawaguchi Risa Karakida, Kiryu Hisanori

Affiliations

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Chiba 277-8561, Japan.

Department of Life Science Frontiers, Center for iPS Cell Research and Application (CiRA), Kyoto University, Sakyo-ku 606-8507, Japan.

Publication information

Bioinform Adv. 2024 Sep 28;4(1):vbae144. doi: 10.1093/bioadv/vbae144. eCollection 2024.

Abstract

MOTIVATION

RNA-binding proteins (RBPs) play a crucial role in the post-transcriptional regulation of RNA. Given their importance, analyzing the specific RNA patterns recognized by RBPs has become a significant research focus in bioinformatics. Deep Neural Networks have enhanced the accuracy of prediction for RBP-binding sites, yet understanding the structural basis of RBP-binding specificity from these models is challenging due to their limited interpretability. To address this, we developed RNAelem, which combines profile context-free grammar and the Turner energy model for RNA secondary structure to predict sequence-structure motifs in RBP-binding regions.

RESULTS

RNAelem exhibited superior detection accuracy compared to existing tools for RNA sequences with structural motifs. Upon applying RNAelem to the eCLIP database, we were not only able to reproduce many known primary sequence motifs in the absence of secondary structures, but also discovered many secondary structural motifs that contained sequence-nonspecific insertion regions. Furthermore, the high interpretability of RNAelem yielded insightful findings such as long-range base-pairing interactions in the binding region of the U2AF protein.

AVAILABILITY AND IMPLEMENTATION

The code is available at https://github.com/iyak/RNAelem.
