Finney Elizabeth E, Lee Brian, Ahmed Syed Faraz, Sohail Muhammad Saqib, Quadeer Ahmed Abdul, McKay Matthew R, Barton John P
bioRxiv. 2025 Jul 1:2025.06.29.662219. doi: 10.1101/2025.06.29.662219.
Highly transmissible SARS-CoV-2 variants have emerged throughout the COVID-19 pandemic, driving new waves of infections. Genomic surveillance data can provide insights into the virus's evolution and biology. However, delayed and limited regional data can introduce biases in epidemiological models, potentially obscuring transmission patterns. To address this issue, we used a novel, variant-specific back-projection model to estimate a distribution of likely infection times from sample collection times. We combined this approach with epidemiological modeling to estimate selection for increased transmission in a way that accounts for the uncertainty in infection times. Tests in simulations demonstrated that our method can make the inference of selection more reliable. We also applied our approach to SARS-CoV-2 data, where it excelled in smoothing and extending data from geographic regions or times with poor sampling. Overall, our method can aid in the reliable identification of mutations and variants with higher transmission rates.
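As a rough illustration of the back-projection idea only, and not the authors' variant-specific model, the sketch below spreads each day's sample count over earlier days using an assumed delay distribution between infection and sample collection. The function name `back_project`, the parameter `delay_pmf`, and the example numbers are hypothetical; the abstract does not specify the delay distribution or the estimation procedure.

```python
import numpy as np

def back_project(counts, delay_pmf):
    """Redistribute collection-day counts onto likely infection days.

    counts[t]    : number of samples collected on day t (t = 0, 1, ...)
    delay_pmf[d] : ASSUMED probability that a sample is collected d days
                   after infection (d = 0, 1, ...); normalized here.

    Returns an array of expected infections per day. Index n of the result
    corresponds to day n - (len(delay_pmf) - 1), so the series starts
    before the first collection day.
    """
    counts = np.asarray(counts, dtype=float)
    delay_pmf = np.asarray(delay_pmf, dtype=float)
    delay_pmf = delay_pmf / delay_pmf.sum()
    # infection day = collection day - delay, which is a convolution of the
    # counts with the reversed delay distribution
    return np.convolve(counts, delay_pmf[::-1])

# Hypothetical example: 10 samples collected on day 2, with an assumed
# 20% chance of same-day collection and 80% chance of a one-day delay.
infections = back_project([0, 0, 10], [0.2, 0.8])
```

This simple redistribution preserves the total number of samples while smoothing sparse collection-day counts over plausible infection days. A fuller treatment would also need to handle right-truncation for recent dates, which this sketch ignores.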