Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628XE, The Netherlands.
Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA, 02142, USA.
BMC Genomics. 2020 Jan 28;21(1):80. doi: 10.1186/s12864-020-6486-3.
Mixed infections of Mycobacterium tuberculosis and antibiotic heteroresistance continue to complicate tuberculosis (TB) diagnosis and treatment. Detection of mixed infections has been limited to molecular genotyping techniques, which lack the sensitivity and resolution to accurately estimate the multiplicity of TB infections. In contrast, whole genome sequencing offers a sensitive view of the genetic differences between strains of M. tuberculosis within a sample. Although metagenomic tools exist to classify strains in a metagenomic sample, most were developed for more divergent species and therefore cannot provide the sensitivity required to disentangle strains within closely related bacterial species such as M. tuberculosis. Here we present QuantTB, a method to identify and quantify individual M. tuberculosis strains in whole genome sequencing data. QuantTB uses SNP markers to determine the combination of strains that best explains the allelic variation observed in a sample. QuantTB outputs a list of identified strains, their corresponding relative abundances, and a list of drugs for which resistance-conferring mutations (or heteroresistance) have been predicted within the sample.
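To make the marker-based idea concrete, the following is a minimal sketch of how a combination of strains might be picked from SNP markers and allele counts. It is not QuantTB's actual algorithm or interface: the function name `identify_strains`, the `min_support` threshold, the greedy selection, the median-based abundance estimate, and the toy data are all assumptions made for illustration.

```python
from statistics import median


def identify_strains(markers, allele_counts, min_support=0.8):
    """Greedy toy example (hypothetical, not QuantTB itself).

    markers:       {strain_name: set of (position, alternate_base)}
    allele_counts: {position: {base: read_count}}
    Returns {strain_name: relative_abundance}.
    """

    def frequency(pos, base):
        # Fraction of reads at `pos` that carry `base` (0.0 if uncovered).
        counts = allele_counts.get(pos, {})
        total = sum(counts.values())
        return counts.get(base, 0) / total if total else 0.0

    remaining = {name: set(snps) for name, snps in markers.items()}
    explained = set()  # markers already accounted for by a chosen strain
    found = {}

    while remaining:
        scores = {}
        for name, snps in remaining.items():
            unexplained = snps - explained
            if not unexplained:
                continue
            present = [s for s in unexplained if frequency(*s) > 0]
            scores[name] = (len(present) / len(unexplained), present)
        if not scores:
            break
        # Pick the strain whose unexplained markers are best supported.
        best = max(scores, key=lambda n: scores[n][0])
        support, present = scores[best]
        if support < min_support:
            break
        # The median damps the effect of markers shared with other strains.
        found[best] = median(frequency(*s) for s in present)
        explained |= remaining.pop(best)

    total = sum(found.values())
    return {name: freq / total for name, freq in found.items()}


# Toy sample: roughly 75% strainA and 25% strainB, no strainC.
markers = {
    "strainA": {(100, "T"), (200, "G"), (300, "C")},
    "strainB": {(100, "T"), (400, "A"), (500, "G")},
    "strainC": {(600, "T"), (700, "A")},
}
allele_counts = {
    100: {"T": 40},
    200: {"G": 30, "A": 10},
    300: {"C": 30, "T": 10},
    400: {"A": 10, "C": 30},
    500: {"G": 10, "T": 30},
    600: {"C": 40},
    700: {"G": 40},
}
result = identify_strains(markers, allele_counts)
print(result)
```

In this toy input, position 100 is a marker shared by both strains, which is why the sketch scores each candidate only on markers not yet explained by an earlier pick. A real tool would also need to handle sequencing error, uneven coverage, and a much larger reference database.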
We show that QuantTB has a high degree of resolution and is capable of differentiating communities differing by fewer than 25 SNPs and identifying strains down to 1× coverage. Using simulated data, we found that QuantTB outperformed other metagenomic strain identification tools at detecting strains and quantifying strain multiplicity. In a real-world scenario, using a dataset of 50 paired clinical isolates from a study of patients with either reinfections or relapses, we found that QuantTB could detect mixed infections and reinfections at rates concordant with a manually curated approach.
QuantTB can determine infection multiplicity, identify heteroresistance patterns, enable differentiation between relapse and reinfection, and clarify transmission events across seemingly unrelated patients, even in low-coverage (1×) samples. QuantTB outperforms existing tools and promises to serve as a valuable resource for both clinicians and researchers working with clinical TB samples.