
Short Read Mapping: An Algorithmic Tour.

Authors

Canzar Stefan, Salzberg Steven L

Publication information

Proc IEEE Inst Electr Electron Eng. 2017 Mar;105(3):436-458. doi: 10.1109/JPROC.2015.2455551. Epub 2015 Sep 7.

DOI:10.1109/JPROC.2015.2455551
PMID:28502990
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5425171/

Abstract

Ultra-high-throughput next-generation sequencing (NGS) technology allows us to determine the sequence of nucleotides of many millions of DNA molecules in parallel. Accompanied by a dramatic reduction in cost since its introduction in 2004, NGS technology has provided a new way of addressing a wide range of biological and biomedical questions, from the study of human genetic disease to the analysis of gene expression, protein-DNA interactions, and patterns of DNA methylation. The data generated by NGS instruments comprise huge numbers of very short DNA sequences, or 'reads', that carry little information by themselves. These reads therefore have to be pieced together by well-engineered algorithms to reconstruct biologically meaningful measurements, such as the level of expression of a gene. To solve this complex, high-dimensional puzzle, reads must be mapped back to a reference genome to determine their origin. Due to sequencing errors and to genuine differences between the reference genome and the individual being sequenced, this mapping process must be tolerant of mismatches, insertions, and deletions. Although optimal alignment algorithms to solve this problem have long been available, the practical requirements of aligning hundreds of millions of short reads to the 3 billion base pair long human genome have stimulated the development of new, more efficient methods, which today are used routinely throughout the world for the analysis of NGS data.
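
To make the "optimal alignment" baseline concrete, here is a minimal sketch of the textbook dynamic-programming approach to approximate matching, in which a read may start anywhere in the reference and each mismatch, insertion, or deletion costs one edit. The function name `best_approximate_match` and the unit-cost scoring are assumptions chosen for illustration. They are not taken from the article, and this is not one of the indexed methods the article surveys.

```python
def best_approximate_match(read: str, reference: str) -> tuple[int, int]:
    """Return (edit_distance, end) for the best placement of `read` in `reference`.

    `end` is the exclusive end offset in `reference` of the best-matching
    substring. Hypothetical helper for illustration only.
    """
    # prev[j] = fewest edits needed to align the read prefix processed so far
    # to some substring of the reference that ends at offset j.
    # Row 0 is all zeros: an empty read matches anywhere at no cost, which is
    # what lets the alignment start at any reference position.
    prev = [0] * (len(reference) + 1)
    for i in range(1, len(read) + 1):
        # Column 0: aligning i read bases to an empty reference costs i edits.
        curr = [i] + [0] * len(reference)
        for j in range(1, len(reference) + 1):
            mismatch = prev[j - 1] + (read[i - 1] != reference[j - 1])
            extra_read_base = prev[j] + 1      # read base absent from reference
            skipped_ref_base = curr[j - 1] + 1  # reference base absent from read
            curr[j] = min(mismatch, extra_read_base, skipped_ref_base)
        prev = curr
    # The best end position is the smallest entry in the final row.
    end = min(range(len(prev)), key=prev.__getitem__)
    return prev[end], end


if __name__ == "__main__":
    print(best_approximate_match("ACGT", "TTACGTTT"))
```

For the read `ACGT` and the reference `TTACGTTT`, the sketch returns `(0, 6)`: an exact match ending at offset 6. The running time is proportional to the read length times the reference length, which illustrates why applying this directly to hundreds of millions of reads and a 3 billion base pair genome is impractical.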


