
Evaluation of variant calling tools for large plant genome re-sequencing.

Affiliations

Morden Research and Development Centre, Agriculture and Agri-Food Canada, 101 Route 100, Morden, Manitoba, R6M 1Y5, Canada.

Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, 960 Carling Avenue, Ottawa, Ontario, K1A 0C6, Canada.

Publication information

BMC Bioinformatics. 2020 Aug 17;21(1):360. doi: 10.1186/s12859-020-03704-1.

DOI: 10.1186/s12859-020-03704-1
PMID: 32807073
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7430858/
Abstract

BACKGROUND

Discovering single nucleotide polymorphisms (SNPs) from agricultural crop genome sequences is a widely used strategy for developing genetic markers for applications such as marker-assisted breeding, population diversity studies of eco-geographical adaptation, and genotyping of crop germplasm collections. Accurately detecting SNPs in large polyploid crop genomes such as wheat is crucial but challenging. Several variant calling methods have been developed, but their variant calls show low concordance. A gold-standard variant set generated from one human individual has been established for evaluating variant calling tools, but no comparable gold-standard variant set is yet available for wheat. This study evaluated seven SNP variant calling tools (FreeBayes, GATK, Platypus, Samtools/mpileup, SNVer, VarScan, VarDict) in combination with the two most popular mapping tools (BWA-mem and Bowtie2) on whole exome capture (WEC) re-sequencing data from allohexaploid wheat.

RESULTS

We found that the BWA-mem mapping tool had both a higher mapping rate and a higher accuracy rate than Bowtie2. With the same mapping quality (MQ) cutoff, BWA-mem detected more variant bases in mapped reads than Bowtie2. Preprocessing the reads with quality trimming or duplicate removal did not significantly affect the final mapping performance in terms of mapped reads. Based on concordance and receiver operating characteristic (ROC) analysis, the Samtools/mpileup variant calling tool with BWA-mem mapping of raw sequence reads outperformed the other combinations in terms of specificity and sensitivity, followed by FreeBayes and GATK. VarDict and VarScan were the poorest-performing variant calling tools on the wheat WEC sequence data.
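The two evaluation criteria named above, concordance and ROC-style sensitivity and specificity, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tool names, the toy SNP sites, the use of Jaccard overlap as the concordance measure, and the majority-vote reference set (sites called by at least two tools). The abstract does not say how the study defined its reference set, so this should not be read as the authors' procedure.

```python
from itertools import combinations


def jaccard(a, b):
    """Concordance of two call sets: shared sites / sites called by either tool."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def majority_sites(calls, min_tools=2):
    """Hypothetical proxy reference: sites reported by at least `min_tools` callers."""
    counts = {}
    for sites in calls.values():
        for site in sites:
            counts[site] = counts.get(site, 0) + 1
    return {site for site, n in counts.items() if n >= min_tools}


def sensitivity_specificity(called, reference, evaluated):
    """One ROC point for a caller, measured against the proxy reference set."""
    tp = len(called & reference)
    fn = len(reference - called)
    fp = len(called - reference)
    tn = len(evaluated - called - reference)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Toy call sets keyed by (chromosome, position); all values are invented.
calls = {
    "tool_a": {("chr1A", 100), ("chr1A", 250), ("chr1B", 40), ("chr1B", 500)},
    "tool_b": {("chr1A", 100), ("chr1B", 40), ("chr1D", 7)},
    "tool_c": {("chr1A", 100), ("chr1A", 250), ("chr1B", 40), ("chr1D", 7)},
}

# Sites that were evaluated: every called site plus two sites no tool called.
evaluated = set().union(*calls.values()) | {("chr1A", 900), ("chr1B", 77)}
reference = majority_sites(calls)

pairwise = {
    (x, y): jaccard(calls[x], calls[y]) for x, y in combinations(sorted(calls), 2)
}
roc_points = {
    name: sensitivity_specificity(sites, reference, evaluated)
    for name, sites in calls.items()
}

for pair, score in pairwise.items():
    print(pair, round(score, 2))
for name, (sens, spec) in roc_points.items():
    print(name, round(sens, 2), round(spec, 2))
```

In practice the call sets would be read from each tool's VCF output, and a full ROC curve would come from sweeping a quality threshold rather than from a single point per tool.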

CONCLUSION

The BWA-mem and Samtools/mpileup pipeline, which requires no preprocessing of the raw read data before mapping to the reference genome, was found to be optimal for SNP calling in re-sequencing of the complex wheat genome. These results also provide useful guidelines for reliable variant identification from deep sequencing of other large polyploid crop genomes.
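As a rough sketch of what such a pipeline might look like in practice, the Python snippet below assembles the command lines for aligning raw paired-end reads with BWA-mem and calling variants with an mpileup-based caller. Several points are assumptions rather than details from the study: the file names, the thread count, the MQ cutoff of 20, and the use of `bcftools mpileup` piped into `bcftools call`, which is where current releases place the mpileup-based caller. The snippet only builds and prints the commands; it does not run them, and it presumes the reference has already been indexed (for example with `bwa index`).

```python
import shlex


def build_pipeline(ref, fq1, fq2, sample, min_mapq=20, threads=4):
    """Return pipeline steps; each step is a list of commands joined by pipes.

    All parameter defaults are illustrative, not values from the study.
    """
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # Step 1: map raw (untrimmed) reads and coordinate-sort the alignments.
        [
            ["bwa", "mem", "-t", str(threads), ref, fq1, fq2],
            ["samtools", "sort", "-o", bam, "-"],
        ],
        # Step 2: index the sorted BAM file.
        [["samtools", "index", bam]],
        # Step 3: pile up reads at or above the MQ cutoff and call variant sites.
        [
            ["bcftools", "mpileup", "-f", ref, "-q", str(min_mapq), bam],
            ["bcftools", "call", "-mv", "-Oz", "-o", vcf],
        ],
    ]


def render(steps):
    """Format each step as a single shell line for inspection or logging."""
    return [" | ".join(shlex.join(cmd) for cmd in step) for step in steps]


for line in render(build_pipeline("ref.fa", "r1.fq", "r2.fq", "s1")):
    print(line)
```

Building the commands as argument lists keeps quoting explicit and makes it easy to review the exact invocation before handing it to a job scheduler or `subprocess`.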

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/200b/7430858/20615d9222e1/12859_2020_3704_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/200b/7430858/1188b05bc124/12859_2020_3704_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/200b/7430858/679ca09f0bb5/12859_2020_3704_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/200b/7430858/b08cac069309/12859_2020_3704_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/200b/7430858/c776942ecaef/12859_2020_3704_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/200b/7430858/d9412c655e2a/12859_2020_3704_Fig6_HTML.jpg

