
Evaluation and assessment of read-mapping by multiple next-generation sequencing aligners based on genome-wide characteristics.

Author information

Thankaswamy-Kosalai Subazini, Sen Partho, Nookaew Intawat

Affiliations

Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96 Göteborg, Sweden.

Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE-412 96 Göteborg, Sweden; Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.

Publication information

Genomics. 2017 Jul;109(3-4):186-191. doi: 10.1016/j.ygeno.2017.03.001. Epub 2017 Mar 9.

DOI: 10.1016/j.ygeno.2017.03.001
PMID: 28286147

Abstract

The massive volume of data produced by next-generation sequencing (NGS) technology is widely used in biological research and medical diagnosis. The crucial step in NGS analysis is read alignment, or mapping, which is computationally intensive and complex. Mapping bias tends to affect downstream analysis, including the detection of polymorphisms. To give biologists guidelines for selecting a suitable aligner, we evaluated and benchmarked 5 different aligners (BWA, Bowtie2, NovoAlign, Smalt and Stampy) and their mapping bias based on the characteristics of 5 microbial genomes. Two million simulated read pairs of various sizes (36bp, 50bp, 72bp, 100bp, 125bp, 150bp, 200bp, 250bp and 300bp) were aligned. Specific alignment features were evaluated: sensitivity of mapping, percentage of properly paired reads, alignment time, and the effect of tandem repeats on incorrectly mapped reads. BWA was the fastest aligner, followed by Bowtie2 and Smalt. NovoAlign and Stampy were comparatively slower. Most of the aligners showed high sensitivity when mapping long reads (>100bp). NovoAlign, on the other hand, showed higher sensitivity for both short reads (36bp, 50bp, 72bp) and long reads (>100bp); it also showed higher sensitivity when mapping a complex genome such as Plasmodium falciparum. The percentage of properly paired reads aligned by NovoAlign, BWA and Stampy was markedly higher. No aligner outperformed the others across the whole benchmark; however, the aligners performed differently depending on genome characteristics. We expect that the results of this study will help end users choose an aligner and thus enhance the accuracy of read mapping.
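The abstract names mapping sensitivity and the percentage of properly paired reads as evaluation features but does not define them. The following is a minimal sketch of how such metrics might be computed for simulated reads, not the authors' method. It assumes that sensitivity means the fraction of simulated reads placed within a small tolerance of their true origin, and that "properly paired" follows the SAM FLAG bit 0x2. The `Read` record, the `tolerance` parameter, and the example data are all hypothetical.

```python
from dataclasses import dataclass

PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned according to the aligner
UNMAPPED = 0x4     # SAM FLAG bit: segment unmapped


@dataclass
class Read:
    """Hypothetical record joining simulator truth with the aligner's output."""
    name: str
    true_ref: str  # reference sequence the simulator drew the read from
    true_pos: int  # 1-based leftmost true position
    flag: int      # SAM FLAG reported by the aligner
    ref: str       # reference name reported by the aligner
    pos: int       # 1-based leftmost reported position


def is_correct(read, tolerance=5):
    """A read counts as correct if it is mapped to its true reference
    within `tolerance` bases of its true position (an assumed criterion)."""
    if read.flag & UNMAPPED:
        return False
    return read.ref == read.true_ref and abs(read.pos - read.true_pos) <= tolerance


def summarize(reads, tolerance=5):
    """Return sensitivity (0..1) and properly paired percentage (0..100)."""
    total = len(reads)
    if total == 0:
        return {"sensitivity": 0.0, "properly_paired_pct": 0.0}
    correct = sum(1 for r in reads if is_correct(r, tolerance))
    proper = sum(
        1 for r in reads
        if (r.flag & PROPER_PAIR) and not (r.flag & UNMAPPED)
    )
    return {
        "sensitivity": correct / total,
        "properly_paired_pct": 100.0 * proper / total,
    }


# Made-up example: one properly paired pair, one pair with a mismapped
# first mate (flag 73) and an unmapped second mate (flag 133).
example_reads = [
    Read("r1/1", "chr1", 100, 99, "chr1", 100),
    Read("r1/2", "chr1", 300, 147, "chr1", 302),
    Read("r2/1", "chr1", 500, 73, "chr1", 9000),
    Read("r2/2", "chr1", 700, 133, "*", 0),
]
example_summary = summarize(example_reads)
```

For the four made-up reads, `example_summary` gives a sensitivity of 0.5 and a properly paired percentage of 50.0. A real benchmark would take the truth positions from the read simulator and the reported positions from each aligner's SAM/BAM output.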


Similar articles

1
Evaluation and assessment of read-mapping by multiple next-generation sequencing aligners based on genome-wide characteristics.
Genomics. 2017 Jul;109(3-4):186-191. doi: 10.1016/j.ygeno.2017.03.001. Epub 2017 Mar 9.
2
Short Sequence Aligner Benchmarking for Chromatin Research.
Int J Mol Sci. 2023 Sep 14;24(18):14074. doi: 10.3390/ijms241814074.
3
CUSHAW3: sensitive and accurate base-space and color-space short-read alignment with hybrid seeding.
PLoS One. 2014 Jan 22;9(1):e86869. doi: 10.1371/journal.pone.0086869. eCollection 2014.
4
Accelerating the Next Generation Long Read Mapping with the FPGA-Based System.
IEEE/ACM Trans Comput Biol Bioinform. 2014 Sep-Oct;11(5):840-52. doi: 10.1109/TCBB.2014.2326876.
5
Systematic benchmark of ancient DNA read mapping.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab076.
6
AlignerBoost: A Generalized Software Toolkit for Boosting Next-Gen Sequencing Mapping Accuracy Using a Bayesian-Based Mapping Quality Framework.
PLoS Comput Biol. 2016 Oct 5;12(10):e1005096. doi: 10.1371/journal.pcbi.1005096. eCollection 2016 Oct.
7
Review of alignment and SNP calling algorithms for next-generation sequencing data.
J Appl Genet. 2016 Feb;57(1):71-9. doi: 10.1007/s13353-015-0292-7. Epub 2015 Jun 9.
8
RF: a method for filtering short reads with tandem repeats for genome mapping.
Genomics. 2013 Jul;102(1):35-7. doi: 10.1016/j.ygeno.2013.03.002. Epub 2013 Mar 29.
9
Aligner optimization increases accuracy and decreases compute times in multi-species sequence data.
Microb Genom. 2017 Jul 8;3(9):e000122. doi: 10.1099/mgen.0.000122. eCollection 2017 Sep.
10
Long read alignment based on maximal exact match seeds.
Bioinformatics. 2012 Sep 15;28(18):i318-i324. doi: 10.1093/bioinformatics/bts414.

Cited by

1
Variation of and associations with the depth and evenness of sequencing coverage in archived plastid genomes.
Res Sq. 2025 Jul 14:rs.3.rs-5784537. doi: 10.21203/rs.3.rs-5784537/v1.
2
Variation of and associations with the depth and evenness of sequencing coverage in archived plastid genomes.
Sci Rep. 2025 Jul 19;15(1):26294. doi: 10.1038/s41598-025-11568-9.
3
Machine Learning Approach and Bioinformatics Analysis Discovered Key Genomic Signatures for Hepatitis B Virus-Associated Hepatocyte Remodeling and Hepatocellular Carcinoma.
Cancer Inform. 2025 Apr 16;24:11769351251333847. doi: 10.1177/11769351251333847. eCollection 2025.
4
Benchmarking of five NGS mapping tools for the reference alignment of bacterial outer membrane vesicles-associated small RNAs.
Front Microbiol. 2024 Jul 19;15:1401985. doi: 10.3389/fmicb.2024.1401985. eCollection 2024.
5
The somatic mutation profile of estrogen receptor-positive HER2-negative metastatic breast cancer in Brazilian patients.
Front Oncol. 2024 Jun 17;14:1372947. doi: 10.3389/fonc.2024.1372947. eCollection 2024.
6
ChimericFragments: computation, analysis and visualization of global RNA networks.
NAR Genom Bioinform. 2024 Apr 17;6(2):lqae035. doi: 10.1093/nargab/lqae035. eCollection 2024 Jun.
7
Molecular pathology as basis for timely cancer diagnosis and therapy.
Virchows Arch. 2024 Feb;484(2):155-168. doi: 10.1007/s00428-023-03707-2. Epub 2023 Nov 28.
8
Target capture and genome skimming for plant diversity studies.
Appl Plant Sci. 2023 Aug 10;11(4):e11537. doi: 10.1002/aps3.11537. eCollection 2023 Jul-Aug.
9
Genome assembly composition of the String "ACGT" array: a review of data structure accuracy and performance challenges.
PeerJ Comput Sci. 2023 Jul 13;9:e1180. doi: 10.7717/peerj-cs.1180. eCollection 2023.
10
Improved eukaryotic detection compatible with large-scale automated analysis of metagenomes.
Microbiome. 2023 Apr 10;11(1):72. doi: 10.1186/s40168-023-01505-1.