ReporTree: a surveillance-oriented tool to strengthen the linkage between pathogen genetic clusters and epidemiological data.

Affiliations

Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal.

National Reference Centre (NRC) for Whole Genome Sequencing of Microbial Pathogens: Database and Bioinformatics analysis (GENPAT), Istituto Zooprofilattico Sperimentale Dell'Abruzzo E del Molise "Giuseppe Caporale" (IZSAM), Teramo, Italy.

Publication information

Genome Med. 2023 Jun 15;15(1):43. doi: 10.1186/s13073-023-01196-1.

DOI: 10.1186/s13073-023-01196-1
PMID: 37322495
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10273728/

Abstract

BACKGROUND: Genomics-informed pathogen surveillance strengthens public health decision-making, playing an important role in infectious diseases' prevention and control. A pivotal outcome of genomics surveillance is the identification of pathogen genetic clusters and their characterization in terms of geotemporal spread or linkage to clinical and demographic data. This task often consists of the visual exploration of (large) phylogenetic trees and associated metadata, being time-consuming and difficult to reproduce.

RESULTS: We developed ReporTree, a flexible bioinformatics pipeline that allows diving into the complexity of pathogen diversity to rapidly identify genetic clusters at any (or all) distance threshold(s) or cluster stability regions and to generate surveillance-oriented reports based on the available metadata, such as timespan, geography, or vaccination/clinical status. ReporTree is able to maintain cluster nomenclature in subsequent analyses and to generate a nomenclature code combining cluster information at different hierarchical levels, thus facilitating the active surveillance of clusters of interest. By handling several input formats and clustering methods, ReporTree is applicable to multiple pathogens, constituting a flexible resource that can be smoothly deployed in routine surveillance bioinformatics workflows with negligible computational and time costs. This is demonstrated through a comprehensive benchmarking of (i) the cg/wgMLST workflow with large datasets of four foodborne bacterial pathogens and (ii) the alignment-based SNP workflow with a large dataset of Mycobacterium tuberculosis. To further validate this tool, we reproduced a previous large-scale study on Neisseria gonorrhoeae, demonstrating how ReporTree is able to rapidly identify the main species genogroups and characterize them with key surveillance metadata, such as antibiotic resistance data. By providing examples for SARS-CoV-2 and the foodborne bacterial pathogen Listeria monocytogenes, we show how this tool is currently a useful asset in genomics-informed routine surveillance and outbreak detection of a wide variety of species.

CONCLUSIONS: In summary, ReporTree is a pan-pathogen tool for automated and reproducible identification and characterization of genetic clusters that contributes to a sustainable and efficient public health genomics-informed pathogen surveillance. ReporTree is implemented in Python 3.8 and is freely available at https://github.com/insapathogenomics/ReporTree.
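
The general idea of identifying genetic clusters at a chosen distance threshold and then summarizing them against metadata can be illustrated with a small sketch. This is not ReporTree's code or its command-line interface: the function names (`single_linkage_clusters`, `summarize`), the toy pairwise distances, and the `country` metadata field are all hypothetical, and single-linkage grouping is assumed here only as one simple way to cluster at a threshold.

```python
from collections import defaultdict


def single_linkage_clusters(samples, dist, threshold):
    """Group samples linked, directly or transitively, by distances <= threshold.

    Hypothetical helper for illustration; uses a small union-find structure.
    """
    parent = {s: s for s in samples}

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(samples):
        for b in samples[i + 1:]:
            if dist[frozenset((a, b))] <= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for s in samples:
        groups[find(s)].append(s)
    # Sort members and clusters so the output is deterministic.
    return sorted(sorted(members) for members in groups.values())


def summarize(clusters, metadata, field):
    """Count the values of one metadata field within each cluster."""
    report = []
    for members in clusters:
        counts = defaultdict(int)
        for m in members:
            counts[metadata[m][field]] += 1
        report.append({"members": members, "n": len(members), field: dict(counts)})
    return report


# Toy data (assumed for illustration): pairwise allele/SNP distances.
samples = ["s1", "s2", "s3", "s4", "s5"]
pairs = {
    ("s1", "s2"): 2, ("s1", "s3"): 6, ("s1", "s4"): 20, ("s1", "s5"): 22,
    ("s2", "s3"): 4, ("s2", "s4"): 21, ("s2", "s5"): 20,
    ("s3", "s4"): 19, ("s3", "s5"): 18, ("s4", "s5"): 3,
}
dist = {frozenset(k): v for k, v in pairs.items()}
metadata = {
    "s1": {"country": "PT"}, "s2": {"country": "PT"}, "s3": {"country": "IT"},
    "s4": {"country": "IT"}, "s5": {"country": "IT"},
}

clusters_at_2 = single_linkage_clusters(samples, dist, threshold=2)
clusters_at_4 = single_linkage_clusters(samples, dist, threshold=4)
report_at_4 = summarize(clusters_at_4, metadata, "country")
```

Running the same function at several thresholds shows how a stricter cutoff splits a cluster (here, `s3` joins `s1` and `s2` only at threshold 4), which is the kind of multi-threshold view the abstract describes.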

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b9/10273728/3cf9a2bd4341/13073_2023_1196_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b9/10273728/f98e1fb8a68e/13073_2023_1196_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b9/10273728/919b1c391a2b/13073_2023_1196_Fig3_HTML.jpg
