Henke David, Piedra Felipe-Andrés, Avadhanula Vasanthi, Doddapaneni Harsha, Muzny Donna M, Menon Vipin K, Hoffman Kristi L, Ross Matthew C, Javornik Cregeen Sara J, Metcalf Ginger, Gibbs Richard A, Petrosino Joseph F, Piedra Pedro A
Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
bioRxiv. 2024 Sep 3:2023.05.17.541198. doi: 10.1101/2023.05.17.541198.
Every viral infection entails an evolving population of viral genomes. High-throughput sequencing technologies can be used to characterize such populations, but to date there are few published examples of such work. In addition, mixed sequencing data are sometimes used to infer properties of infecting genomes without discriminating between genome-derived reads and reads from transcripts, which are much more abundant in a typical active viral infection. Here we apply capture probe-based, short-read, high-throughput sequencing to nasal wash samples taken from a previously described group of adult hematopoietic cell transplant (HCT) recipients naturally infected with respiratory syncytial virus (RSV). We separately analyzed reads from genomes and transcripts for the levels and distribution of genetic variation by calculating per-position Shannon entropies. Our analysis reveals a low level of genetic variation within the RSV infections analyzed here, but with interesting differences between genomes and transcripts in 1) average per-sample Shannon entropies; 2) the genomic distribution of variation 'hotspots'; and 3) the genomic distribution of hotspots encoding alternative amino acids. In all, our results suggest the importance of separately analyzing reads from genomes and transcripts when interpreting high-throughput sequencing data for insight into intra-host viral genome replication, expression, and evolution.
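As an illustration of the per-position Shannon entropy mentioned above, the following is a minimal Python sketch, not the authors' pipeline. It assumes that reads have already been aligned and split into genome-derived and transcript-derived sets by some upstream step (the abstract does not say how), that entropy is measured in bits (log base 2), and that every read spans the same positions. The function names and the toy reads are hypothetical. At each position the entropy is H = sum over observed bases of p * log2(1/p), where p is the fraction of reads carrying that base.

```python
import math
from collections import Counter


def shannon_entropy(bases):
    """Shannon entropy, in bits, of the bases observed at one position."""
    counts = Counter(bases)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over each base actually observed at this position
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def per_position_entropy(reads):
    """Entropy at each column of equal-length, pre-aligned reads."""
    return [shannon_entropy(column) for column in zip(*reads)]


# Hypothetical toy pileups (NOT data from the study); the two read sets are
# assumed to have been separated upstream.
genome_reads = ["ACGT", "ACGT", "ACGT", "ACGA"]
transcript_reads = ["ACGT", "ACTT", "ACGT", "ACTA"]

genome_h = per_position_entropy(genome_reads)
transcript_h = per_position_entropy(transcript_reads)

# Average per-sample entropy, computed separately for each read class
genome_mean = sum(genome_h) / len(genome_h)
transcript_mean = sum(transcript_h) / len(transcript_h)
```

A position where every read agrees has entropy 0, and a position split evenly between two bases has entropy 1 bit. Keeping the two read sets separate means that variation seen only in the more abundant transcripts is not attributed to genomes.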