Center for Applied Plant Sciences, The Ohio State University, Columbus, Ohio, USA.
Infectious Diseases Institute, The Ohio State University, Columbus, Ohio, USA.
Appl Environ Microbiol. 2022 Nov 22;88(22):e0087422. doi: 10.1128/aem.00874-22. Epub 2022 Oct 26.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)/coronavirus disease 2019 (COVID-19) pandemic has highlighted the importance of efficient surveillance of microbial pathogens. High-throughput sequencing technologies provide valuable surveillance tools, offering opportunities for high-resolution monitoring of diverse sample types, including environmental sources. However, because such genomic data sets are large and may contain mixtures of lineages within samples, they can be challenging to analyze, and the results can be difficult to communicate to diverse stakeholders. Here, we report MixviR, an R package for exploring, analyzing, and visualizing genomic data from potentially mixed samples of a target microbial group. MixviR characterizes variation at both the nucleotide and amino acid levels and offers an interactive RShiny dashboard for exploring data. We demonstrate MixviR's utility with validation studies using mixtures of known lineages from both SARS-CoV-2 and Mycobacterium tuberculosis and with a case study analyzing lineages of SARS-CoV-2 in wastewater samples over time at a sampling location in Ohio, USA.
High-throughput sequencing technologies hold great potential for contributing to genome-based surveillance of microbial diversity in environmental samples. However, the size of the data sets, along with the potential for environmental samples to contain multiple evolutionary lineages of interest, presents challenges for analyzing these data sets and effectively communicating inferences from them. The software described here provides a novel and valuable tool for exploring such data. Though originally designed and used for monitoring SARS-CoV-2 lineages in wastewater, it can also be applied to analyses of genomic diversity in other microbial groups.