HISAT: a fast spliced aligner with low memory requirements.

Author information

Kim Daehwan, Langmead Ben, Salzberg Steven L

Affiliations

[1] Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA. [2] Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA.

[1] Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA. [2] Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA. [3] Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA.

Publication information

Nat Methods. 2015 Apr;12(4):357-60. doi: 10.1038/nmeth.3317. Epub 2015 Mar 9.

Abstract

HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ∼64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.
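The two-level scheme described in the abstract can be illustrated with a toy sketch. This is not HISAT's implementation: the class names `FMIndex` and `HierarchicalIndex`, the method `local_find`, the naive suffix-array construction, and the fixed non-overlapping windows are all assumptions made for illustration. It only shows the general idea of one FM index over the whole sequence plus many small FM indexes over fixed-size regions, with the small indexes used to search near a known position.

```python
class FMIndex:
    """Toy FM index built from a naive suffix array (illustration only)."""

    def __init__(self, text):
        self.text = text + "$"  # "$" is a sentinel that sorts before A, C, G, T
        n = len(self.text)
        # Naive suffix array: fine for short strings, far too slow for genomes.
        self.sa = sorted(range(n), key=lambda i: self.text[i:])
        # Burrows-Wheeler transform: the character preceding each sorted suffix.
        bwt = [self.text[i - 1] for i in self.sa]
        alphabet = sorted(set(self.text))
        # C[c]: number of characters in the text that are strictly smaller than c.
        self.C = {}
        total = 0
        for c in alphabet:
            self.C[c] = total
            total += self.text.count(c)
        # occ[c][i]: number of occurrences of c in bwt[:i].
        self.occ = {c: [0] * (n + 1) for c in alphabet}
        for i, ch in enumerate(bwt):
            for c in alphabet:
                self.occ[c][i + 1] = self.occ[c][i] + (1 if ch == c else 0)

    def find(self, pattern):
        """Return sorted start positions of exact matches (backward search)."""
        lo, hi = 0, len(self.sa)
        for c in reversed(pattern):
            if c not in self.C:
                return []
            lo = self.C[c] + self.occ[c][lo]
            hi = self.C[c] + self.occ[c][hi]
            if lo >= hi:
                return []
        return sorted(self.sa[i] for i in range(lo, hi))


class HierarchicalIndex:
    """Hypothetical two-level index: one global FM index plus local ones."""

    def __init__(self, genome, window):
        self.window = window
        self.global_index = FMIndex(genome)
        # One small FM index per fixed-size, non-overlapping window (assumption).
        self.local_indexes = [
            FMIndex(genome[start:start + window])
            for start in range(0, len(genome), window)
        ]

    def local_find(self, position, pattern):
        """Search only the local index whose window contains `position`."""
        k = position // self.window
        offset = k * self.window
        return [offset + p for p in self.local_indexes[k].find(pattern)]


if __name__ == "__main__":
    genome = "GATTACAGATTACA"
    index = HierarchicalIndex(genome, window=7)
    # Anchor a longer seed anywhere in the genome with the global index.
    print(index.global_index.find("GATTACA"))
    # Place a short remainder using only the local index around position 8.
    print(index.local_find(8, "TTA"))
```

In this sketch a short pattern that would match in many places genome-wide is searched only inside one small window, which is the intuition behind using local indexes to extend an anchored alignment. As a rough consistency check on the figures in the abstract, 48,000 regions of about 64,000 bp come to about 3.07 billion bp if any overlap between regions is ignored, which is on the order of the length of the human genome.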

Similar articles

Fast and accurate long-read alignment with Burrows-Wheeler transform.
Bioinformatics. 2010 Mar 1;26(5):589-95. doi: 10.1093/bioinformatics/btp698. Epub 2010 Jan 15.

Centrifuge: rapid and sensitive classification of metagenomic sequences.
Genome Res. 2016 Dec;26(12):1721-1729. doi: 10.1101/gr.210641.116. Epub 2016 Oct 17.
