Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA 98109.
HHMI, Seattle, WA 98109.
Proc Natl Acad Sci U S A. 2024 Apr 9;121(15):e2305299121. doi: 10.1073/pnas.2305299121. Epub 2024 Apr 3.
Quantifying transmission intensity and heterogeneity is crucial to ascertain the threat posed by infectious diseases and to inform the design of interventions. Methods that jointly estimate the reproduction number and the dispersion parameter have, however, mostly been limited to the analysis of epidemiological clusters or contact tracing data, which are often difficult to collect. Here, we show that clusters of identical sequences are imprinted by the pathogen offspring distribution, and we derive an analytical formula for the distribution of the size of these clusters. We develop and evaluate an inference framework to jointly estimate the reproduction number and the dispersion parameter from the size distribution of clusters of identical sequences. We then illustrate its application across a range of epidemiological situations. Finally, we develop a hypothesis testing framework relying on clusters of identical sequences to determine whether a given pathogen genetic subpopulation is associated with increased or reduced transmissibility. Our work provides tools to estimate the reproduction number and transmission heterogeneity from pathogen sequences without building a phylogenetic tree, which makes them easily scalable to large pathogen genome datasets.
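As an illustration of how cluster sizes can carry information about the reproduction number and the dispersion parameter, the sketch below uses the standard final-size distribution of a branching process with negative binomial offspring. It is a minimal sketch under stated assumptions, not necessarily the formula derived in the paper: `p` is a hypothetical name for the probability that an offspring carries the same sequence as its infector, so that the number of identical-sequence offspring is taken to be negative binomial with mean `p * R` and the same dispersion `k`. It also assumes that every infection is sequenced, and the grid search stands in for a proper optimizer.

```python
import math


def cluster_size_logpmf(j, R, k, p):
    """Log-probability that a cluster of identical sequences has size j.

    Assumption: identical-sequence offspring are negative binomial with
    mean p * R and dispersion k (hypothetical parameterization).
    """
    if j < 1 or R <= 0 or k <= 0 or not (0 < p <= 1):
        raise ValueError("need j >= 1, R > 0, k > 0 and 0 < p <= 1")
    ratio = p * R / k
    return (
        math.lgamma(k * j + j - 1)
        - math.lgamma(k * j)
        - math.lgamma(j + 1)
        + (j - 1) * math.log(ratio)
        - (k * j + j - 1) * math.log1p(ratio)
    )


def log_likelihood(sizes, R, k, p):
    """Log-likelihood of independently observed cluster sizes."""
    return sum(cluster_size_logpmf(j, R, k, p) for j in sizes)


def fit_grid(sizes, p, R_grid, k_grid):
    """Return the (R, k) pair on the grid with the highest log-likelihood."""
    return max(
        ((R, k) for R in R_grid for k in k_grid),
        key=lambda rk: log_likelihood(sizes, rk[0], rk[1], p),
    )


if __name__ == "__main__":
    # Made-up cluster sizes, for illustration only.
    sizes = [1] * 60 + [2] * 15 + [3] * 8 + [5] * 3 + [12]
    R_grid = [0.5 + 0.1 * i for i in range(20)]
    k_grid = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
    print(fit_grid(sizes, p=0.5, R_grid=R_grid, k_grid=k_grid))
```

With `p` held fixed, many singletons alongside a few large clusters pull the estimate toward a small `k` (high transmission heterogeneity), while the mean cluster size mostly constrains `p * R`.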