Worby Colin J, Lipsitch Marc, Hanage William P
Am J Epidemiol. 2017 Nov 15;186(10):1209-1216. doi: 10.1093/aje/kwx182.
Sequencing pathogen samples during a communicable disease outbreak is increasingly common in epidemiologic investigations. Identifying who infected whom sheds considerable light on transmission patterns, high-risk settings and subpopulations, and the effectiveness of infection control. Genomic data offer new insight into transmission dynamics and can be used to identify clusters of individuals likely to be linked by direct transmission. However, individual routes of infection inferred from single genome samples typically remain uncertain. We investigated the potential of deep sequence data to provide greater resolution on transmission routes through the identification of shared genomic variants. We assessed several easily implemented methods for identifying transmission routes using both shared variants and genetic distance, and demonstrated that shared variants can provide considerable additional information in most scenarios. Although shared-variant approaches identify relatively few links when the transmission bottleneck is small, the links they do identify are highly accurate. Furthermore, we propose a hybrid approach that also incorporates phylogenetic distance to provide greater resolution. We applied our methods to data collected during the 2014 Ebola outbreak and identified several likely routes of transmission. Our study highlights the power of data from deep sequencing of pathogens as a component of outbreak investigation and epidemiologic analyses.
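The idea of combining shared variants with genetic distance can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the algorithms, so the scoring rule, the function names (`hamming`, `infer_links`), the data layout, and the toy sequences below are all assumptions made for illustration. For each case, the sketch picks the other case sharing the most within-host minor variants and falls back to the smallest consensus-sequence distance when no variants are shared. It returns the most likely linked partner only and does not infer the direction of transmission.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical representation: a minor variant is (genome position, alternate base).
Variant = Tuple[int, str]


def hamming(a: str, b: str) -> int:
    """Number of differing sites between two aligned consensus sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def infer_links(
    consensus: Dict[str, str],
    minor_variants: Dict[str, Set[Variant]],
) -> Dict[str, Tuple[Optional[str], str]]:
    """For each case, return (most likely linked case, evidence used).

    Assumed rule: prefer the candidate sharing the most minor variants;
    break ties (including zero shared variants) by smallest genetic
    distance, then by case name so the result is deterministic.
    """
    links: Dict[str, Tuple[Optional[str], str]] = {}
    for case, seq in consensus.items():
        own = minor_variants.get(case, set())
        best_key = None
        best_other: Optional[str] = None
        best_shared = 0
        for other, other_seq in consensus.items():
            if other == case:
                continue
            shared = len(own & minor_variants.get(other, set()))
            key = (-shared, hamming(seq, other_seq), other)
            if best_key is None or key < best_key:
                best_key, best_other, best_shared = key, other, shared
        if best_other is None:
            links[case] = (None, "none")
        else:
            evidence = "shared_variants" if best_shared > 0 else "distance"
            links[case] = (best_other, evidence)
    return links


# Toy data (invented for illustration, not from the study).
consensus = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGTTCGA",
    "D": "TCGTACGT",
}
minor_variants = {
    "A": {(3, "T"), (5, "G")},
    "B": {(3, "T")},
    "C": set(),
    "D": set(),
}

links = infer_links(consensus, minor_variants)
for case, (partner, evidence) in sorted(links.items()):
    print(case, "->", partner, "via", evidence)
```

In this toy example, cases A and B are linked through a shared minor variant, while C and D carry no shared variants and are linked to their nearest neighbor by consensus distance alone. This mirrors the trade-off described in the abstract: shared variants support few links but strong ones, and distance fills in the rest.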