
Fast and accurate short-read alignment with hybrid hash-tree data structure.

Author information

Makino Junichiro, Ebisuzaki Toshikazu, Himeno Ryutaro, Hayashizaki Yoshihide

Affiliations

Advanced Accelerating Systems Co. Ltd, Deiki 1-28, B1312, Kanazawa-ku, Yokohama, Kanagawa, 236-0021, Japan.

Department of Planetology, Graduate School of Science, Kobe University, 1-1, Rokkodai-cho, Nada-ku, Kobe, 657-8051, Japan.

Publication information

Genomics Inform. 2024 Oct 29;22(1):19. doi: 10.1186/s44342-024-00012-5.

DOI:10.1186/s44342-024-00012-5
PMID:39472988
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11520436/
Abstract

The rapidly increasing amount of short-read data generated by NGSs (new-generation sequencers) calls for the development of fast and accurate read alignment programs. Programs based on the hash table (BLAST) and the Burrows-Wheeler transform (bwa-mem) are in use, and the latter is known to give superior performance. We here present a new algorithm, a hybrid of hash table and suffix tree, which we designed to speed up the alignment of short reads against large reference sequences such as the human genome. The total turnaround time for processing one human genome sample (read depth of 30) is just 31 min with our system, compared with more than 25 h with bwa-mem/gatk. The aligner alone takes 28 min with our system but around 2 h with bwa-mem. Our new algorithm is 4.4 times faster than bwa-mem while achieving similar accuracy. Variant calling and other downstream analyses after the alignment can be done with open-source tools such as SAMtools and the Genome Analysis Toolkit (gatk), as well as with our own fast variant caller, which is well parallelized and much faster than gatk.
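
The abstract does not spell out how the hybrid structure works. As a rough illustration of only the hash-table half of the idea (exact k-mer seed lookup followed by verification against the reference), here is a minimal Python sketch. It is not the authors' algorithm: the function names, the ungapped Hamming-distance verification, and the parameters `k` and `max_mismatches` are all assumptions made for illustration, and the suffix-tree component is not modelled at all.

```python
from collections import defaultdict


def build_kmer_index(reference: str, k: int) -> dict[str, list[int]]:
    """Map every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def align_read(read, reference, index, k, max_mismatches=2):
    """Seed with exact k-mer hits, then verify each candidate placement.

    Illustrative only: verification is ungapped, so indels are not handled.
    Returns a sorted list of (reference_start, mismatches).
    """
    candidates = set()
    for offset in range(len(read) - k + 1):
        # .get() avoids inserting empty entries into the defaultdict.
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)
    hits = []
    for start in sorted(candidates):
        mismatches = hamming(read, reference[start:start + len(read)])
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy usage: a 10-base read with one substitution relative to the reference.
toy_reference = "GATTACAGGCCTTAACGTTGCA"
toy_index = build_kmer_index(toy_reference, k=4)
print(align_read("CAGGCGTTAA", toy_reference, toy_index, k=4))  # [(5, 1)]
```

A hash index of this kind gives constant-time seed lookup, at the cost of storing every k-mer position. Real aligners add gapped extension, quality-aware scoring, and reverse-complement handling, none of which are shown here.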


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/04da19a681fe/44342_2024_12_Figb_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/16a6a14a87e5/44342_2024_12_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/d3869d62e69a/44342_2024_12_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/35889543ef37/44342_2024_12_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/7296ea81feae/44342_2024_12_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/7bfac0beaa87/44342_2024_12_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/94f33e4fd22b/44342_2024_12_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/addcdea8ab6d/44342_2024_12_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/456c/11520436/f82e504a8c9a/44342_2024_12_Figa_HTML.jpg

Similar articles

1. Fast and accurate short-read alignment with hybrid hash-tree data structure.
   Genomics Inform. 2024 Oct 29;22(1):19. doi: 10.1186/s44342-024-00012-5.
2. Fast and accurate short read alignment with Burrows-Wheeler transform.
   Bioinformatics. 2009 Jul 15;25(14):1754-60. doi: 10.1093/bioinformatics/btp324. Epub 2009 May 18.
3. Calling known variants and identifying new variants while rapidly aligning sequence data.
   J Dairy Sci. 2019 Apr;102(4):3216-3229. doi: 10.3168/jds.2018-15172. Epub 2019 Feb 14.
4. Evaluation of variant calling tools for large plant genome re-sequencing.
   BMC Bioinformatics. 2020 Aug 17;21(1):360. doi: 10.1186/s12859-020-03704-1.
5. An efficient Burrows-Wheeler transform-based aligner for short read mapping.
   Comput Biol Chem. 2024 Jun;110:108050. doi: 10.1016/j.compbiolchem.2024.108050. Epub 2024 Mar 5.
6. YOABS: yet other aligner of biological sequences--an efficient linearly scaling nucleotide aligner.
   Bioinformatics. 2012 Apr 15;28(8):1070-7. doi: 10.1093/bioinformatics/bts102. Epub 2012 Mar 7.
7. HIA: a genome mapper using hybrid index-based sequence alignment.
   Algorithms Mol Biol. 2015 Dec 23;10:30. doi: 10.1186/s13015-015-0062-4. eCollection 2015.
8. A fast read alignment method based on seed-and-vote for next generation sequencing.
   BMC Bioinformatics. 2016 Dec 23;17(Suppl 17):466. doi: 10.1186/s12859-016-1329-6.
9. Performance evaluation of pipelines for mapping, variant calling and interval padding, for the analysis of NGS germline panels.
   BMC Bioinformatics. 2021 Apr 28;22(1):218. doi: 10.1186/s12859-021-04144-1.
10. Faster single-end alignment generation utilizing multi-thread for BWA.
    Biomed Mater Eng. 2015;26 Suppl 1:S1791-6. doi: 10.3233/BME-151480.

Cited by

1. A chromosome-scale genome assembly of Giardia duodenalis by long-read sequencing of ten trophozoites.
   Sci Data. 2025 Jul 1;12(1):1079. doi: 10.1038/s41597-025-05405-x.
2. Fine mapping and candidate gene analysis of major QTLs for number of seeds per pod in Arachis hypogaea L.
   BMC Genomics. 2025 Apr 15;26(1):376. doi: 10.1186/s12864-025-11560-7.
3. Integrated RNA-Seq and Metabolomics Analyses of Biological Processes and Metabolic Pathways Involved in Seed Development in L.
   Genes (Basel). 2025 Feb 28;16(3):300. doi: 10.3390/genes16030300.
