RNAscClust: clustering RNA sequences using structure conservation and graph based motifs.

Affiliations

Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg im Breisgau, Germany.

Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark.

Publication information

Bioinformatics. 2017 Jul 15;33(14):2089-2096. doi: 10.1093/bioinformatics/btx114.

Abstract

MOTIVATION

Clustering RNA sequences with common secondary structure is an essential step towards studying RNA function. Whereas structural RNA alignment strategies typically identify common structure for orthologous structured RNAs, clustering seeks to group paralogous RNAs based on structural similarities. However, existing approaches for clustering paralogous RNAs do not take into account the compensatory base pair changes obtained from structure conservation in orthologous sequences.

RESULTS

Here, we present RNAscClust, the implementation of a new algorithm to cluster a set of structured RNAs taking their respective structural conservation into account. For a set of multiple structural alignments of RNA sequences, each containing a paralog sequence included in a structural alignment of its orthologs, RNAscClust computes minimum free-energy structures for each sequence using conserved base pairs as prior information for the folding. The paralogs are then clustered using a graph kernel-based strategy, which identifies common structural features. We show that the clustering accuracy clearly benefits from an increasing degree of compensatory base pair changes in the alignments.
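
To make the notion of compensatory base pair changes concrete, the following minimal Python sketch counts them in a toy alignment. It is not taken from RNAscClust and does not reproduce its algorithm. The function names, the dot-bracket consensus structure, the set of canonical pairs (including G-U wobble) and the example sequences are all assumptions made for illustration.

```python
from itertools import combinations

# Assumption: Watson-Crick pairs plus G-U wobble count as canonical.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def pairs_from_dot_bracket(structure):
    """Return sorted (i, j) column pairs for matching brackets."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.append((stack.pop(), pos))
    return sorted(pairs)


def compensatory_changes(alignment, structure):
    """Count compensatory changes for each paired column pair.

    A compensatory change is counted for two aligned sequences when
    both bases of the pair differ between them and each sequence
    still forms a canonical pair at those columns.
    """
    counts = {}
    for i, j in pairs_from_dot_bracket(structure):
        n = 0
        for a, b in combinations(alignment, 2):
            pair_a, pair_b = (a[i], a[j]), (b[i], b[j])
            both_pair = pair_a in CANONICAL and pair_b in CANONICAL
            if both_pair and a[i] != b[i] and a[j] != b[j]:
                n += 1
        counts[(i, j)] = n
    return counts


# Hypothetical toy alignment of three orthologs (illustrative only).
toy_alignment = ["GGCAAAGCC", "GACAAAGUC", "GGGAAACCC"]
toy_structure = "(((...)))"
print(compensatory_changes(toy_alignment, toy_structure))
```

In this toy input the outermost pair is identical in all three sequences, so it contributes no compensatory changes, while the two inner pairs each switch between different canonical pairs.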

AVAILABILITY AND IMPLEMENTATION

RNAscClust is available at http://www.bioinf.uni-freiburg.de/Software/RNAscClust.

CONTACT

gorodkin@rth.dk or backofen@informatik.uni-freiburg.de.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
