D'Agostino McGowan Lucy, Wohl Shirlee, Lessler Justin
Department of Statistical Sciences, Wake Forest University, Winston-Salem, NC 27109, United States.
Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, United States.
Am J Epidemiol. 2025 Aug 5;194(8):2367-2375. doi: 10.1093/aje/kwae378.
The quality of the inferences we make from pathogen sequence data is determined by the number and composition of pathogen sequences that make up the sample used to drive that inference. However, there remains limited guidance on how to best structure and power studies when the end goal is phylogenetic inference. One question that we can attempt to answer with molecular data is whether some people are more likely to transmit a pathogen than others. Here we present an estimator to quantify differential transmission, as measured by the ratio of reproductive numbers between people with different characteristics, using transmission pairs linked by molecular data, along with a sample size calculation for this estimator. We also provide extensions to our method to correct for imperfect identification of transmission-linked pairs, overdispersion in the transmission process, and group imbalance. We validate this method via simulation and provide tools to implement it in an R package, phylosamp.
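As a rough illustration of what a ratio of reproductive numbers measures, the sketch below computes a naive estimate from linked pairs. It is not the estimator from the paper and does not use the phylosamp API. It assumes (hypothetically) that the infector in each pair can be identified, that pairs are sampled uniformly and identified without error, and that the share of infected people in group A, `prop_a`, is known. The function name `relative_r` and the group labels "A" and "B" are illustrative only.

```python
from collections import Counter


def relative_r(infector_groups, prop_a):
    """Naive estimate of R_A / R_B from directed transmission pairs.

    infector_groups: iterable of "A" or "B", the group of the inferred
        infector in each linked pair (assumes direction is known).
    prop_a: assumed fraction of the infected population in group A.

    Under uniform sampling, the expected share of pairs with an infector
    in group g is proportional to prop_g * R_g, so dividing each count by
    its group share and taking the ratio recovers R_A / R_B.
    """
    if not 0 < prop_a < 1:
        raise ValueError("prop_a must be strictly between 0 and 1")
    counts = Counter(infector_groups)
    n_a, n_b = counts["A"], counts["B"]
    if n_b == 0:
        raise ValueError("need at least one pair with a group-B infector")
    return (n_a / prop_a) / (n_b / (1 - prop_a))


# Hypothetical data: 60 pairs with a group-A infector and 40 with a
# group-B infector, with both groups equally common among the infected.
example_ratio = relative_r(["A"] * 60 + ["B"] * 40, prop_a=0.5)
```

The corrections the abstract lists (imperfect identification of linked pairs, overdispersion, and group imbalance) are exactly what this naive version ignores, so it should be read only as an intuition aid.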