Sequence-based heuristics for faster annotation of non-coding RNA families.

Authors

Weinberg Zasha, Ruzzo Walter L

Affiliation

Department of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA.

Publication

Bioinformatics. 2006 Jan 1;22(1):35-9. doi: 10.1093/bioinformatics/bti743. Epub 2005 Nov 2.

Abstract

MOTIVATION

Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool to find new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy, while making searches significantly faster for virtually all ncRNA families. However, these rigorous filters make searches slower than heuristics could be.

RESULTS

In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, whereas our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that, unlike family-specific solutions, can scale to hundreds of ncRNA families.
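
The general filter-then-search pattern behind such heuristics can be sketched as follows. This is a minimal illustration, not the authors' implementation: it swaps in an ungapped position-specific log-odds score for a real profile HMM (which would also model insertions and deletions), and the profile values, the threshold, and all function names (`filter_score`, `candidate_windows`, `filtered_search`) are hypothetical. The slow scorer, standing in for a CM, is passed in as a plain function, and sequences are assumed to contain only A, C, G and U.

```python
import math

# Hypothetical toy profile: per-column nucleotide probabilities for a
# 4-column motif. A real profile HMM also has insert and delete states;
# this ungapped version is a simplification for illustration only.
PROFILE = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "U": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "U": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "U": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background frequency


def filter_score(window):
    """Cheap log-odds score (in bits) of a window against the profile."""
    return sum(
        math.log2(column[base] / BACKGROUND)
        for column, base in zip(PROFILE, window)
    )


def candidate_windows(sequence, threshold):
    """Return start positions whose filter score reaches the threshold."""
    width = len(PROFILE)
    return [
        start
        for start in range(len(sequence) - width + 1)
        if filter_score(sequence[start:start + width]) >= threshold
    ]


def filtered_search(sequence, threshold, expensive_scorer):
    """Run the expensive scorer only on windows that pass the filter."""
    width = len(PROFILE)
    return {
        start: expensive_scorer(sequence[start:start + width])
        for start in candidate_windows(sequence, threshold)
    }
```

Raising the threshold sends fewer windows to the slow scorer and so speeds up the search, but a heuristic threshold may also discard true family members. That trade-off is what separates a heuristic filter from a rigorous one.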

AVAILABILITY

The source code is available under the GNU General Public License at the supplementary web site.
