The Impacts of Low Diversity Sequence Data on Phylodynamic Inference during an Emerging Epidemic.

Affiliation

Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3010, Australia.

Publication information

Viruses. 2021 Jan 8;13(1):79. doi: 10.3390/v13010079.

DOI:10.3390/v13010079

PMID:33430050

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7826997/

Abstract

Phylodynamic inference is a pivotal tool for understanding the transmission dynamics of viral outbreaks. These analyses are strongly guided by the epidemiological model supplied as input and by the sequence data, which must contain sufficient inter-sequence variability to be informative. These criteria, however, may not be met during the early stages of an outbreak. Here we investigate the impact of low-diversity sequence data on phylodynamic inference using the birth-death and coalescent exponential models. Our simulation study shows that estimating the molecular evolutionary rate requires sufficient sequence diversity and is an essential first step for any phylodynamic inference. Once this is done, the birth-death model outperforms the coalescent exponential model in estimating epidemiological parameters from low-diversity sequence data, because it explicitly exploits the sampling times. In contrast, the coalescent model requires additional samples, and therefore more variability in the sequence data, before accurate estimates can be obtained. These findings were also supported by our empirical analyses of two SARS-CoV-2 cluster outbreaks, one in Australia and one in New Zealand. Overall, the birth-death model is more robust when applied to datasets with low sequence diversity, provided that sampling is specified, and this should be considered in future viral outbreak investigations.
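As a rough illustration of what "low diversity" means for an alignment, the sketch below computes two common summary statistics: the number of segregating (variable) sites and the mean number of pairwise differences. It is not the authors' method. The function names and the toy sequences are hypothetical and chosen only for illustration, and the code assumes an already aligned set of equal-length sequences in which only A, C, G and T count as informative.

```python
from itertools import combinations

VALID_BASES = set("ACGT")  # assumption: gaps and ambiguity codes are ignored


def segregating_sites(alignment):
    """Count columns that contain at least two distinct unambiguous bases."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    count = 0
    for column in zip(*alignment):
        bases = {b.upper() for b in column if b.upper() in VALID_BASES}
        if len(bases) > 1:
            count += 1
    return count


def mean_pairwise_differences(alignment):
    """Average number of differing sites over all pairs of sequences."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(alignment, 2))
    if not pairs:
        return 0.0
    total = 0
    for seq_a, seq_b in pairs:
        for a, b in zip(seq_a.upper(), seq_b.upper()):
            # Only compare sites where both bases are unambiguous.
            if a in VALID_BASES and b in VALID_BASES and a != b:
                total += 1
    return total / len(pairs)


# Hypothetical toy alignment: four short sequences, two of them identical.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGAAC",
    "ACGTTCGAAC",
]

print(segregating_sites(toy_alignment))
print(mean_pairwise_differences(toy_alignment))
```

In an early outbreak dataset both statistics are typically small relative to genome length. That is the regime in which, according to the abstract, the evolutionary rate becomes hard to estimate from the sequences alone.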
