Improving read mapping using additional prefix grams.

Affiliation

Department of Computer Science, University of California, Irvine, USA.

Publication information

BMC Bioinformatics. 2014 Feb 5;15:42. doi: 10.1186/1471-2105-15-42.

DOI:10.1186/1471-2105-15-42

PMID:24499321

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3927682/

Abstract

BACKGROUND

Next-generation sequencing (NGS) enables rapid production of billions of bases at a relatively low cost. Mapping reads from next-generation sequencers to a given reference genome is an important first step in many sequencing applications. Popular read mappers, such as Bowtie and BWA, are optimized to return only the top one or a few candidate locations for each read. However, identifying all mapping locations of each read is also important in some sequencing applications, such as ChIP-seq for discovering binding sites in repeat regions and RNA-seq for transcript abundance estimation.

RESULTS

Here we present Hobbes2, a software package designed for fast and accurate alignment of NGS reads and specialized in identifying all mapping locations of each read. Hobbes2 efficiently identifies all mapping locations of reads using a novel technique that utilizes additional prefix q-grams to improve filtering. We extensively compare Hobbes2 with state-of-the-art read mappers, and show that Hobbes2 can be an order of magnitude faster than other read mappers while consuming less memory space and achieving similar accuracy.

CONCLUSIONS

We propose Hobbes2 to improve the accuracy of read mapping; it specializes in identifying all mapping locations of each read. Hobbes2 is implemented in C++, and the source code is freely available for download at http://hobbes.ics.uci.edu.
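
The abstract does not spell out how additional prefix q-grams improve filtering, so the following is only a minimal sketch of the general pigeonhole idea, not Hobbes2's actual implementation. It assumes Hamming distance (substitutions only) and non-overlapping q-grams taken from the start of the read: if a read is cut into k + 2 such q-grams, k mismatches can destroy at most k of them, so every true mapping location is supported by at least two exact q-gram hits, instead of the single hit guaranteed by k + 1 q-grams. All function names, the dictionary-based index, and the toy sequences are illustrative assumptions.

```python
from collections import defaultdict


def build_index(reference, q):
    """Map every q-gram of the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - q + 1):
        index[reference[pos:pos + q]].append(pos)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def candidates(read, index, q, num_grams, min_hits):
    """Read start positions supported by at least `min_hits` of the first
    `num_grams` non-overlapping q-grams of the read."""
    votes = defaultdict(int)
    for i in range(num_grams):
        offset = i * q
        gram = read[offset:offset + q]
        for pos in index.get(gram, ()):
            start = pos - offset  # where the read would begin in the reference
            if start >= 0:
                votes[start] += 1
    return {start for start, count in votes.items() if count >= min_hits}


def map_read(read, reference, index, q, k, extra=1):
    """Return (all positions with <= k mismatches, number of candidates verified).

    extra=0 is the classic filter: k + 1 grams, at least 1 must match.
    extra=1 adds one prefix gram: k + 2 grams, at least 2 must match.
    """
    num_grams = k + 1 + extra
    assert num_grams * q <= len(read), "read too short for this many q-grams"
    cands = candidates(read, index, q, num_grams, min_hits=1 + extra)
    hits = []
    for start in sorted(cands):
        window = reference[start:start + len(read)]
        # Verification step: only candidates that pass the filter are compared.
        if len(window) == len(read) and hamming(read, window) <= k:
            hits.append(start)
    return hits, len(cands)


if __name__ == "__main__":
    # Toy data (assumed for illustration only).
    reference = "ACGTACGTAGCTAGCTAGGATCCGATCGATCGTACGATCGTAGCTAGCATCGA"
    q, k = 3, 1
    index = build_index(reference, q)
    read = reference[10:22]
    for extra in (0, 1):
        hits, verified = map_read(read, reference, index, q, k, extra=extra)
        print(f"extra={extra}: hits={hits}, candidates verified={verified}")
```

In this sketch both filters report the same mapping locations, but the stricter two-hit requirement can only shrink the set of candidates that reach the verification step, which is where the time is spent. Handling insertions and deletions would require tolerating shifted q-gram positions, which is left out here.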
