Cambridge Infectious Diseases Consortium, Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.
PLoS Comput Biol. 2011 Mar;7(3):e1002027. doi: 10.1371/journal.pcbi.1002027. Epub 2011 Mar 31.
The development of modern and affordable sequencing technologies has allowed the study of viral populations to an unprecedented depth. This is of particular interest for the study of within-host RNA viral populations, where variation due to error-prone polymerases can lead to immune escape, antiviral resistance and adaptation to new host species. Methods to sequence RNA virus genomes include reverse transcription (RT) and polymerase chain reaction (PCR). RT-PCR is a molecular biology technique widely used to amplify DNA from an RNA template. The method relies on the in vitro synthesis of complementary DNA (cDNA) from RNA, followed by multiple cycles of DNA amplification. However, this method introduces artefactual errors that can act as confounding factors when the sequence data are analysed. Although there is a growing number of published studies exploring the intra- and inter-host evolutionary dynamics of RNA viruses, the complexity of the methods used to generate sequences makes it difficult to produce probabilistic statements about the likely sources of observed sequence variants. This complexity is further compounded as both the depth of sequencing and the length of the genome segment of interest increase. Here we develop a Bayesian method to characterise and differentiate between likely structures for the background viral population. This approach can then be used to identify nucleotide sites that show evidence of change in the within-host viral population structure, either over time or relative to a reference sequence (e.g. an inoculum or another source of infection), or both, without having to build complex evolutionary models. Identification of these sites can help to inform the design of more focussed experiments using molecular biology tools, such as site-directed mutagenesis, to assess the function of specific amino acids.
We illustrate the method by applying it to datasets from experimental transmission of equine influenza and from a pre-clinical vaccine trial for HIV-1.
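To give a sense of what a per-site Bayesian comparison could look like, the sketch below computes a log Bayes factor for whether the variant frequency at one nucleotide site differs between a reference (e.g. an inoculum) and a later sample. It is not the model developed in the paper: it is a generic Beta-binomial comparison that ignores RT-PCR and sequencing error, and the function names, the uniform Beta(1, 1) prior and the read counts in the usage note are all illustrative assumptions.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(k, n, a=1.0, b=1.0):
    """Log marginal likelihood of k variant reads out of n.

    Assumes a Beta(a, b) prior on the variant frequency (an illustrative
    choice). The binomial coefficient is omitted because it cancels in
    the Bayes factor below.
    """
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def log_bayes_factor_change(k_ref, n_ref, k_obs, n_obs, a=1.0, b=1.0):
    """Log Bayes factor for H1 (frequencies differ) over H0 (shared frequency).

    Positive values favour a change at the site between the reference
    and the observed sample; negative values favour no change.
    """
    # H1: each sample has its own, independently drawn variant frequency.
    log_h1 = log_marginal(k_ref, n_ref, a, b) + log_marginal(k_obs, n_obs, a, b)
    # H0: both samples share one frequency, so the counts are pooled.
    log_h0 = log_marginal(k_ref + k_obs, n_ref + n_obs, a, b)
    return log_h1 - log_h0
```

With hypothetical counts, `log_bayes_factor_change(1, 1000, 400, 1000)` is strongly positive (evidence of change), whereas `log_bayes_factor_change(5, 100, 5, 100)` is negative (no evidence of change). A realistic analysis would also need to account for the artefactual errors introduced by RT-PCR, which this sketch deliberately leaves out.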