Wadsworth Center, New York State Department of Health, Albany, New York, USA
J Clin Microbiol. 2021 Jan 21;59(2). doi: 10.1128/JCM.00967-20.
Legionnaires' disease, a severe lung infection caused by the bacterium Legionella pneumophila, occurs as single cases or in outbreaks that are actively tracked by public health departments. To determine the point source of an outbreak, clinical isolates need to be compared to environmental samples to find matching isolates. One confounding factor is the genome plasticity of L. pneumophila, which makes an exact sequence comparison by whole-genome sequencing (WGS) challenging. Here, we present a WGS analysis pipeline, LegioCluster, that is designed to circumvent this problem by automatically selecting the best matching reference genome prior to mapping and variant calling. This approach reduces the number of false-positive variant calls, maximizes the fraction of all genomes that are being compared, and naturally clusters the isolates according to their reference strain. Isolates that are too distant from any genome in the database are added to the list of candidate references, thereby creating a new cluster. Short insertions or deletions are considered in addition to single-nucleotide polymorphisms for increased discriminatory power. This manuscript describes the use of this automated and "locked down" bioinformatic pipeline deployed at the New York State Department of Health's Wadsworth Center for investigating relatedness between clinical and environmental isolates. A similar pipeline has not been widely available for use to support these critically important public health investigations.
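The reference-selection step described above can be illustrated with a minimal sketch. The abstract does not state how the distance between an isolate and a candidate reference is measured, so this example assumes a simple Jaccard distance over k-mer sets as a stand-in. The function names, the `max_distance` threshold, and the value of `k` are hypothetical and are not taken from LegioCluster itself.

```python
from typing import Dict, Set, Tuple


def kmers(seq: str, k: int = 5) -> Set[str]:
    """Return the set of all k-mers (substrings of length k) in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A & B| / |A | B|; 0.0 means identical k-mer content."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def assign_reference(
    isolate_id: str,
    isolate_seq: str,
    references: Dict[str, str],
    max_distance: float = 0.5,  # hypothetical cutoff, not from the paper
    k: int = 5,                 # hypothetical k-mer size, not from the paper
) -> Tuple[str, bool]:
    """Pick the closest candidate reference for an isolate.

    Returns (reference_id, is_new_cluster). If no reference is within
    max_distance, the isolate is added to `references` and becomes the
    reference of a new cluster.
    """
    query = kmers(isolate_seq, k)
    best_id, best_dist = None, float("inf")
    for ref_id, ref_seq in references.items():
        dist = jaccard_distance(query, kmers(ref_seq, k))
        if dist < best_dist:
            best_id, best_dist = ref_id, dist

    # Too distant from every known genome (or database empty): new cluster.
    if best_id is None or best_dist > max_distance:
        references[isolate_id] = isolate_seq
        return isolate_id, True
    return best_id, False
```

In this sketch, mapping and variant calling would then run against the returned reference only. An isolate that falls outside the threshold joins the candidate list, so later isolates can cluster with it.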