ASSA: Fast identification of statistically significant interactions between long RNAs.

Authors

Antonov Ivan, Marakhonov Andrey, Zamkova Maria, Medvedeva Yulia

Affiliations

* Institute of Bioengineering, Federal Research Center Fundamentals of Biotechnology RAS, Moscow 117312, Russia.

† Department of Molecular and Biological Physics & Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia.

Publication information

J Bioinform Comput Biol. 2018 Feb;16(1):1840001. doi: 10.1142/S0219720018400012. Epub 2018 Jan 29.

Abstract

The discovery of thousands of long noncoding RNAs (lncRNAs) in mammals raises the question of their functionality. Some of them have been shown to participate in post-transcriptional regulation of other RNAs and to form intermolecular duplexes with their targets. Sequence alignment tools have been used for transcriptome-wide prediction of RNA-RNA interactions, but such approaches have poor prediction accuracy because they ignore RNA secondary structure. Applying thermodynamics-based algorithms to long transcripts, in turn, is not computationally feasible on a large scale. Here, we describe ASSA, a new computational pipeline that combines sequence alignment and thermodynamics-based tools for efficient prediction of RNA-RNA interactions between long transcripts. To measure hybridization strength, the summed energy of all putative duplexes is computed. The main novelty of ASSA is its ability to quickly estimate the statistical significance of the observed interaction energies. Most of the functional hybridizations between long RNAs were classified as statistically significant. ASSA outperformed 11 other tools in terms of the Area Under the Curve on two out of four test sets. Additionally, our results emphasized a unique property of the [Formula: see text] repeats with respect to RNA-RNA interactions in the human transcriptome. ASSA is available at https://sourceforge.net/projects/assa/.
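The abstract names two quantities: a summed duplex energy as the measure of hybridization strength, and a statistical significance for that energy. It does not say how ASSA computes the significance, so the following Python is only a minimal sketch under stated assumptions, not ASSA's actual procedure. It swaps in a generic empirical p-value against a null sample of energies. The function names, the example energies, and the Gaussian null are all hypothetical.

```python
import random


def total_energy(duplex_energies):
    """Sum the free energies (kcal/mol) of all putative duplexes.

    More negative totals mean stronger predicted hybridization.
    """
    return sum(duplex_energies)


def empirical_p_value(observed, null_energies):
    """Fraction of null energies at least as strong as the observed one.

    "At least as strong" means less than or equal to the observed energy.
    A pseudocount of one keeps the estimate strictly above zero.
    This is a generic empirical estimate, assumed here for illustration;
    the abstract does not describe ASSA's own (faster) estimator.
    """
    hits = sum(1 for energy in null_energies if energy <= observed)
    return (hits + 1) / (len(null_energies) + 1)


# Hypothetical energies of three putative duplexes for one RNA pair.
duplexes = [-12.4, -8.1, -15.0]
observed = total_energy(duplexes)

# Hypothetical null sample: summed energies one might obtain for
# unrelated or shuffled RNA pairs (drawn from a Gaussian here purely
# so the sketch is self-contained).
rng = random.Random(0)
null = [rng.gauss(-20.0, 5.0) for _ in range(999)]

p = empirical_p_value(observed, null)
print(f"summed energy = {observed:.1f} kcal/mol, empirical p = {p:.3f}")
```

In practice the null energies would come from a model of background interactions, not from a fixed Gaussian. Generating such a null by brute force is costly for long transcripts, which is presumably why the abstract stresses that ASSA estimates significance quickly.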
