GraphClust: alignment-free structural clustering of local RNA secondary structures.

Affiliation

Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany.

Publication information

Bioinformatics. 2012 Jun 15;28(12):i224-32. doi: 10.1093/bioinformatics/bts224.

Abstract

MOTIVATION

Clustering according to sequence-structure similarity has now become a generally accepted scheme for ncRNA annotation. Its application to complete genomic sequences as well as whole transcriptomes is therefore desirable but hindered by extremely high computational costs.

RESULTS

We present a novel linear-time, alignment-free method for comparing and clustering RNAs according to sequence and structure. The approach scales to datasets of hundreds of thousands of sequences. The quality of the retrieved clusters has been benchmarked against known ncRNA datasets and is comparable to that of state-of-the-art sequence-structure methods, while achieving speedups of several orders of magnitude. A selection of applications aimed at detecting novel structural ncRNAs is presented. As an example, we predicted local structural elements specific to lincRNAs that likely associate the transcripts involved functionally with vital processes of the human nervous system. In total, we predicted 349 local structural RNA elements.
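To make the idea of alignment-free comparison concrete, the following is a minimal sketch and not the GraphClust implementation: the abstract does not describe the features or hashing scheme used. It assumes plain sequence k-mers as features and a MinHash signature for comparison; the function names (`kmers`, `minhash_signature`, `estimated_similarity`) and the parameters `k` and `num_hashes` are hypothetical. Each sequence is reduced to a fixed-size signature in time linear in its length, and two signatures are compared without aligning the sequences. Structural information is ignored here and would need additional features.

```python
import hashlib


def kmers(seq, k=4):
    """Return the set of overlapping k-mers in seq (illustrative feature choice)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(token, seed):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, k=4, num_hashes=64):
    """Fixed-size signature: for each seed, the minimum hash over all k-mers."""
    feats = kmers(seq, k)
    if not feats:
        raise ValueError("sequence is shorter than k")
    return [min(_hash(f, seed) for f in feats) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature entries, an estimate of Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    # Made-up toy sequences, not data from the paper.
    a = minhash_signature("GGGAAACUUCGGUUUCCC")
    b = minhash_signature("GGGAAACUUCGGUUUCCA")
    print(estimated_similarity(a, b))
```

Because every signature has the same length, signatures can be bucketed or compared in constant time per pair, which is what makes this style of comparison attractive for very large sequence sets.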

AVAILABILITY

The GraphClust pipeline is available on request.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ce9/3371856/31fb2e29dab4/bts224f1.jpg
