Thibaut Jombart, Anne Cori, Xavier Didelot, Simon Cauchemez, Christophe Fraser, Neil Ferguson
MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom.
PLoS Comput Biol. 2014 Jan;10(1):e1003457. doi: 10.1371/journal.pcbi.1003457. Epub 2014 Jan 23.
Recent years have seen progress in the development of statistically rigorous frameworks for inferring outbreak transmission trees ("who infected whom") from epidemiological and genetic data. Making use of pathogen genome sequences in such analyses remains a challenge, however, and a variety of heuristic approaches have been explored to date. We introduce a statistical method that exploits both pathogen sequences and collection dates to unravel the dynamics of densely sampled outbreaks. Our approach identifies likely transmission events and infers dates of infection, unobserved cases, and separate introductions of the disease. It also proves useful for inferring numbers of secondary infections and for identifying heterogeneous infectivity and super-spreaders. After testing our approach using simulations, we illustrate the method with an analysis of the beginning of the 2003 Singaporean outbreak of Severe Acute Respiratory Syndrome (SARS), providing new insights into the early stage of this epidemic. Our approach is the first tool for disease outbreak reconstruction from genetic data to be widely available as free software, in the form of the R package outbreaker. It is applicable to various densely sampled epidemics and improves on previous approaches by detecting unobserved and imported cases and by allowing multiple introductions of the pathogen. Because of its generality, we believe this method will become a tool of choice for the analysis of densely sampled disease outbreaks and will form a rigorous framework for subsequent methodological developments.