Fast and accurate read alignment for resequencing.

Author information

Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.

Publication information

Bioinformatics. 2012 Sep 15;28(18):2366-73. doi: 10.1093/bioinformatics/bts450. Epub 2012 Jul 18.

Abstract

MOTIVATION

Next-generation sequence analysis has become an important task in both laboratory and clinical settings. A key stage in the majority of sequence analysis workflows, such as resequencing, is the alignment of genomic reads to a reference genome. The accurate alignment of reads with large indels is a computationally challenging task for researchers.

RESULTS

We introduce SeqAlto, a new algorithm for read alignment. For reads of 100 bp or longer, SeqAlto is up to 10× faster than existing algorithms while retaining high accuracy and the ability to align reads with large (up to 50 bp) indels. This improvement in efficiency is particularly important for the analysis of future sequencing data, where the number of reads approaches many billions. Furthermore, SeqAlto uses less than 8 GB of memory to align against the human genome. SeqAlto is benchmarked against several existing tools on both real and simulated data.

AVAILABILITY

Linux and Mac OS X binaries free for academic use are available at http://www.stanford.edu/group/wonglab/seqalto

CONTACT

whwong@stanford.edu.
