
Detailed Evaluation of Data Analysis Tools for Subtyping of Bacterial Isolates Based on Whole Genome Sequencing: Neisseria meningitidis as a Proof of Concept.

Authors

Saltykova Assia, Mattheus Wesley, Bertrand Sophie, Roosens Nancy H C, Marchal Kathleen, De Keersmaecker Sigrid C J

Affiliations

Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium.

IDLab, IMEC, Department of Information Technology, Ghent University, Ghent, Belgium.

Publication information

Front Microbiol. 2019 Dec 18;10:2897. doi: 10.3389/fmicb.2019.02897. eCollection 2019.

Abstract

Whole genome sequencing (WGS) is increasingly recognized as the most informative approach for characterization of bacterial isolates. Success of the routine use of this technology in public health laboratories depends on the availability of well-characterized and verified data analysis methods. However, multiple subtyping workflows are now often used for a single organism, and the differences between them are not always well described. Moreover, methodologies for comparing subtyping workflows and assessing their performance are only beginning to emerge. The current work focuses on the detailed comparison of WGS-based subtyping workflows and evaluation of their suitability for the organism and the research context in question. We evaluated the performance of pipelines used for subtyping of Neisseria meningitidis, including the currently widely applied cgMLST approach and different SNP-based methods. In addition, the impact of using different tools for detection and filtering of recombinant regions, and of different reference genomes, was tested. Our benchmarking analysis included both assessment of the technical performance of the pipelines and functional comparison of the generated genetic distance matrices and phylogenetic trees. It was carried out using replicate sequencing datasets of high and low coverage, consisting mainly of isolates belonging to clonal complex 269. We demonstrated that cgMLST and some of the SNP-based subtyping workflows showed very good performance characteristics and produced highly similar genetic distance matrices and phylogenetic trees for isolates belonging to the same clonal complex. However, only two of the tested workflows demonstrated reproducible results for a group of more closely related isolates. Additionally, results of the SNP-based subtyping workflows were to some extent dependent on the reference genome used. Interestingly, the use of recombination-filtering software generally reduced the similarity between the gene-by-gene and SNP-based methodologies for subtyping of N. meningitidis. Our study, in which N. meningitidis was taken as an example, clearly highlights the need for more comparative benchmarking studies to eventually contribute to a justified use of a specific WGS data analysis workflow within an international public health laboratory context.
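The abstract does not say how the genetic distance matrices were compared, so the following is only a minimal sketch of one plausible approach: compute pairwise allele differences from cgMLST profiles and pairwise SNP differences from an alignment, then correlate the two sets of distances. The isolate names, toy profiles, toy sequences, and all function names are hypothetical, and Pearson correlation is an assumed similarity measure rather than the authors' method.

```python
from itertools import combinations
from math import sqrt


def allele_distance(profile_a, profile_b):
    """Count loci with different allele numbers; loci missing (None) in either profile are skipped."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def snp_distance(seq_a, seq_b):
    """Count differing positions in two aligned sequences, ignoring gaps and ambiguous bases."""
    ignore = {"-", "N"}
    return sum(
        1
        for x, y in zip(seq_a, seq_b)
        if x not in ignore and y not in ignore and x != y
    )


def pairwise(items, distance):
    """Return {(name_1, name_2): distance} for every unordered pair of isolates."""
    names = sorted(items)
    return {(p, q): distance(items[p], items[q]) for p, q in combinations(names, 2)}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: three isolates, four cgMLST loci, six aligned SNP sites.
profiles = {"iso1": [1, 1, 1, 1], "iso2": [1, 1, 1, 2], "iso3": [1, 2, 3, 2]}
alignment = {"iso1": "ACGTAC", "iso2": "ACGTAT", "iso3": "ATGCGT"}

cgmlst_dist = pairwise(profiles, allele_distance)
snp_dist = pairwise(alignment, snp_distance)

# Compare the two matrices over the same ordered list of isolate pairs.
pairs = sorted(cgmlst_dist)
similarity = pearson([cgmlst_dist[p] for p in pairs], [snp_dist[p] for p in pairs])
print(f"Correlation between cgMLST and SNP distances: {similarity:.3f}")
```

Because the entries of a distance matrix are not independent, a permutation-based procedure such as a Mantel test would be needed to attach a significance level to this correlation; the sketch reports only the coefficient.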


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6b6/6930190/73fc773ef9ea/fmicb-10-02897-g001.jpg
