Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data.

Authors

Schilbert Hanna Marie, Rempel Andreas, Pucker Boas

Affiliations

Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany.

Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Faculty of Technology, Bielefeld University, 33615 Bielefeld, Germany.

Publication information

Plants (Basel). 2020 Apr 2;9(4):439. doi: 10.3390/plants9040439.

DOI: 10.3390/plants9040439
PMID: 32252268
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7238416/
Abstract

High-throughput sequencing technologies have developed rapidly in recent years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.
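The abstract names sensitivity and specificity as evaluation parameters. As a rough illustration of how these two metrics are conventionally defined, the following minimal Python sketch computes them from the four confusion counts obtained by comparing a call set against a trusted reference set. The `ConfusionCounts` class, the function names, and the example numbers are assumptions made for illustration; they are not taken from the paper and do not reproduce the authors' evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing called variants against a trusted set (hypothetical layout)."""
    true_positives: int    # real variants that were called
    false_positives: int   # calls at positions without a real variant
    true_negatives: int    # non-variant positions correctly left uncalled
    false_negatives: int   # real variants that were missed


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of real variants that were recovered: TP / (TP + FN)."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-variant positions correctly left uncalled: TN / (TN + FP)."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


# Illustrative numbers only; they are NOT results from the study.
example = ConfusionCounts(
    true_positives=90,
    false_positives=5,
    true_negatives=995,
    false_negatives=10,
)

print(f"sensitivity = {sensitivity(example):.3f}")  # 90 / (90 + 10)
print(f"specificity = {specificity(example):.3f}")  # 995 / (995 + 5)
```

In a genome-wide setting the number of true negatives is usually far larger than every other count, so specificity tends to sit very close to 1 and small differences between pipelines are easier to see in the false-positive count itself.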

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d1/7238416/c93c189d8e2d/plants-09-00439-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d1/7238416/a05863e18032/plants-09-00439-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d1/7238416/e79a43ee1a5e/plants-09-00439-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d1/7238416/e6fe8ff7d152/plants-09-00439-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d1/7238416/b27d093be2ac/plants-09-00439-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d1/7238416/00edc0de2de7/plants-09-00439-g006.jpg
