Spouge John L
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
Theor Popul Biol. 2019 Jun;127:7-15. doi: 10.1016/j.tpb.2019.03.001. Epub 2019 Mar 12.
If viruses or other pathogens infect a single host, the outcome of infection often hinges on the fate of the initial invaders. The initial basic reproduction number R, the expected number of cells infected by a single infected cell, helps determine whether the initial viruses can establish a successful beachhead. To determine R, the Kingman coalescent or continuous-time birth-and-death process can be used to infer the rate of exponential growth in an historical population. Given M sequences sampled in the present, the two models can make the inference from the site frequency spectrum (SFS), the count of mutations that appear in exactly k sequences (k=1,2,…,M). In the case of viruses, however, if R is large and an infected cell bursts while propagating virus, the two models are suspect, because they are Markovian with only binary branching. Accordingly, this article develops an approximation for the SFS of a discrete-time branching process with synchronous generations (i.e., a Galton-Watson process). When evaluated in simulations with an asynchronous, non-Markovian model (a Bellman-Harris process) with parameters intended to mimic the bursting viral reproduction of HIV, the approximation proved superior to approximations derived from the Kingman coalescent or continuous-time birth-and-death process. This article demonstrates that in analogy to methods in human genetics, the SFS of viral sequences sampled well after latent infection can remain informative about the initial R. Thus, it suggests the utility of analyzing the SFS of sequences derived from patient and animal trials of viral therapies, because in some cases, the initial R may be able to indicate subtle therapeutic progress, even in the absence of statistically significant differences in the infection of treatment and control groups.
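To make the definition of the SFS concrete, the following is a minimal sketch of how the spectrum could be tabulated from sampled sequences. It assumes the M sequences have already been aligned and encoded as rows of 0/1 values (1 marking a mutation relative to the ancestral state). That encoding, the function name `site_frequency_spectrum`, and the toy data are illustrative assumptions, not part of the article. The sketch covers only the counting step, not the Galton-Watson approximation itself.

```python
from collections import Counter


def site_frequency_spectrum(alignment):
    """Return [s_1, ..., s_M], where s_k is the number of sites whose
    mutation appears in exactly k of the M sampled sequences.

    `alignment` is a hypothetical encoding: a list of M equal-length rows,
    each a list of 0/1 flags (1 = mutated relative to the ancestor).
    """
    m = len(alignment)
    if m == 0:
        return []
    n_sites = len(alignment[0])
    if any(len(row) != n_sites for row in alignment):
        raise ValueError("all sequences must have the same number of sites")

    # For each site (column), count how many sequences carry the mutation.
    carriers_per_site = (
        sum(row[site] for row in alignment) for site in range(n_sites)
    )
    counts = Counter(carriers_per_site)

    # Sites with zero carriers are not mutations in the sample, so k starts at 1.
    return [counts.get(k, 0) for k in range(1, m + 1)]


# Toy data: M = 4 sequences, 5 sites (made-up values for illustration only).
sample = [
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1],
]
sfs = site_frequency_spectrum(sample)
print(sfs)  # [1, 2, 0, 1]
```

In this toy sample, one mutation appears in a single sequence, two appear in exactly two sequences, none appear in three, and one appears in all four. Indexing k from 1 to M follows the range given in the abstract.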