Evaluating the performance of tools used to call minority variants from whole genome short-read data.

Authors

Said Mohammed Khadija, Kibinge Nelson, Prins Pjotr, Agoti Charles N, Cotten Matthew, Nokes D J, Brand Samuel, Githinji George

Affiliations

Pwani University, Kilifi, Kenya.

KEMRI-Wellcome Trust Research Programme, KEMRI Centre for Geographic Medicine Research - Coast, Kilifi, Kenya.

Publication information

Wellcome Open Res. 2018 Sep 13;3:21. doi: 10.12688/wellcomeopenres.13538.2. eCollection 2018.

DOI:10.12688/wellcomeopenres.13538.2
PMID:30483597
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6234735/
Abstract

High-throughput whole genome sequencing facilitates investigation of minority virus sub-populations from virus-positive samples. Minority variants are useful in understanding within- and between-host diversity and population dynamics, and can potentially assist in elucidating person-to-person transmission pathways. Several minority variant callers have been developed to describe low frequency sub-populations from whole genome sequence data. These callers differ in the bioinformatics and statistical methods used to discriminate sequencing errors from low-frequency variants. We evaluated the diagnostic performance and concordance between published minority variant callers used in identifying minority variants from whole-genome sequence data from virus samples. We used the ART-Illumina read simulation tool to generate three artificial short-read datasets of varying coverage and error profiles from an RSV reference genome. The datasets were spiked with nucleotide variants at predetermined positions and frequencies. Variants were called using FreeBayes, LoFreq, Vardict, and VarScan2. The variant callers' agreement in identifying known variants was quantified using two measures: concordance accuracy and inter-caller concordance. The variant callers differed in the minority variants they identified from the datasets. Concordance accuracy and inter-caller concordance were positively correlated with sample coverage. FreeBayes identified the majority of variants, although it was characterised by variable sensitivity and precision and by a high false positive rate relative to the other minority variant callers, which varied with sample coverage. LoFreq was the most conservative caller. We conducted a performance and concordance evaluation of four minority variant calling tools used to identify and quantify low frequency variants. Inconsistency in the quality of sequenced samples affects the sensitivity and accuracy of minority variant callers. Our study suggests that combining at least three tools when identifying minority variants is useful in filtering errors when calling low frequency variants.
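
The abstract's closing suggestion, keeping only variants supported by at least three tools, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the variant positions and call sets are invented toy data, the helper names `precision_sensitivity` and `consensus` are hypothetical, and the metrics are plain precision and sensitivity against a known truth set rather than the paper's own "concordance accuracy" and "inter-caller concordance" measures, whose definitions the abstract does not give.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# A variant is keyed by (position, reference base, alternate base).
Variant = Tuple[int, str, str]

def precision_sensitivity(calls: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Compare one call set against the spiked-in (known) variants."""
    tp = len(calls & truth)   # called and truly present
    fp = len(calls - truth)   # called but not spiked in
    fn = len(truth - calls)   # spiked in but missed
    precision = tp / (tp + fp) if calls else 0.0
    sensitivity = tp / (tp + fn) if truth else 0.0
    return precision, sensitivity

def consensus(callsets: Dict[str, Set[Variant]], min_callers: int = 3) -> Set[Variant]:
    """Keep variants reported by at least `min_callers` tools."""
    counts = Counter(v for calls in callsets.values() for v in calls)
    return {v for v, n in counts.items() if n >= min_callers}

# Toy truth set: four variants spiked in at known positions (invented values).
truth: Set[Variant] = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (730, "T", "C")}

# Toy call sets, loosely shaped like the abstract's description
# (one permissive caller, one conservative caller); not real tool output.
callsets: Dict[str, Set[Variant]] = {
    "FreeBayes": truth | {(512, "A", "C"), (900, "G", "T")},
    "LoFreq": {(100, "A", "G"), (250, "C", "T")},
    "Vardict": {(100, "A", "G"), (250, "C", "T"), (400, "G", "A"), (512, "A", "C")},
    "VarScan2": {(100, "A", "G"), (400, "G", "A"), (730, "T", "C")},
}

for name, calls in callsets.items():
    p, s = precision_sensitivity(calls, truth)
    print(f"{name}: precision={p:.2f} sensitivity={s:.2f}")

agreed = consensus(callsets, min_callers=3)
p, s = precision_sensitivity(agreed, truth)
print(f"consensus (>=3 callers): precision={p:.2f} sensitivity={s:.2f}")
```

In this toy data the consensus step drops both false positives, at the cost of one true variant that only two callers reported. That trade-off is the kind a multi-tool filter would be expected to make, although the actual balance depends on coverage and error profile.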

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51c0/6234745/74bec7607993/wellcomeopenres-3-16071-g0000.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51c0/6234745/c3fbcb4f0f17/wellcomeopenres-3-16071-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51c0/6234745/71c6ac472693/wellcomeopenres-3-16071-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51c0/6234745/f7bcc8db8f91/wellcomeopenres-3-16071-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51c0/6234745/4b0bee553e0a/wellcomeopenres-3-16071-g0004.jpg

Similar articles

1
Evaluating the performance of tools used to call minority variants from whole genome short-read data.
Wellcome Open Res. 2018 Sep 13;3:21. doi: 10.12688/wellcomeopenres.13538.2. eCollection 2018.
2
Benchmarking UMI-aware and standard variant callers for low frequency ctDNA variant detection.
BMC Genomics. 2024 Sep 3;25(1):827. doi: 10.1186/s12864-024-10737-w.
3
Comparing the performance of selected variant callers using synthetic data and genome segmentation.
BMC Bioinformatics. 2018 Nov 19;19(1):429. doi: 10.1186/s12859-018-2440-7.
4
Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data.
Elife. 2024 Oct 10;13:RP98300. doi: 10.7554/eLife.98300.
5
Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data.
Sci Rep. 2023 Nov 22;13(1):20444. doi: 10.1038/s41598-023-47135-3.
6
Detection of minor variants in Mycobacterium tuberculosis whole genome sequencing data.
Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab541.
7
Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery.
BMC Genomics. 2022 Feb 22;23(1):155. doi: 10.1186/s12864-022-08365-3.
8
Performance of somatic structural variant calling in lung cancer using Oxford Nanopore sequencing technology.
BMC Genomics. 2024 Sep 30;25(1):898. doi: 10.1186/s12864-024-10792-3.
9
Optimized quantification of intra-host viral diversity in SARS-CoV-2 and influenza virus sequence data.
mBio. 2023 Aug 31;14(4):e0104623. doi: 10.1128/mbio.01046-23. Epub 2023 Jun 30.
10
Comparison among three variant callers and assessment of the accuracy of imputation from SNP array data to whole-genome sequence level in chicken.
BMC Genomics. 2015 Oct 21;16:824. doi: 10.1186/s12864-015-2059-2.

Cited by

1
An improved catalogue for whole-genome sequencing prediction of bedaquiline resistance in using a reproducible algorithmic approach.
Microb Genom. 2025 Jun;11(6). doi: 10.1099/mgen.0.001429.
2
UNISOM: Unified Somatic Calling and Machine Learning-based Classification Enhance the Discovery of CHIP.
Genomics Proteomics Bioinformatics. 2025 May 30;23(2). doi: 10.1093/gpbjnl/qzaf040.
3
Recommendations for Uniform Variant Calling of SARS-CoV-2 Genome Sequence across Bioinformatic Workflows.
Viruses. 2024 Mar 11;16(3):430. doi: 10.3390/v16030430.
4
Inclusion of minor alleles improves catalogue-based prediction of fluoroquinolone resistance in .
JAC Antimicrob Resist. 2023 Apr 4;5(2):dlad039. doi: 10.1093/jacamr/dlad039. eCollection 2023 Apr.
5
Long-Read Genome Assembly and Gene Model Annotations for the Rodent Malaria Parasite 17XNL.
bioRxiv. 2023 Jan 7:2023.01.06.523040. doi: 10.1101/2023.01.06.523040.
6
A general approach to identify low-frequency variants within influenza samples collected during routine surveillance.
Microb Genom. 2022 Sep;8(9). doi: 10.1099/mgen.0.000867.
7
Tool evaluation for the detection of variably sized indels from next generation whole genome and targeted sequencing data.
PLoS Comput Biol. 2022 Feb 17;18(2):e1009269. doi: 10.1371/journal.pcbi.1009269. eCollection 2022 Feb.
8
Genomic epidemiology of SARS-CoV-2 under an elimination strategy in Hong Kong.
Nat Commun. 2022 Feb 8;13(1):736. doi: 10.1038/s41467-022-28420-7.
9
Detection of minor variants in Mycobacterium tuberculosis whole genome sequencing data.
Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab541.
10
Evaluating assembly and variant calling software for strain-resolved analysis of large DNA viruses.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa123.