Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB, Basel, Switzerland.
Virus Res. 2017 Jul 15;239:17-32. doi: 10.1016/j.virusres.2016.09.016. Epub 2016 Sep 28.
Rapidly evolving RNA viruses prevail within a host as a collection of closely related variants, referred to as viral quasispecies. Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of such virus populations at an unprecedented level of detail. However, analysis of HTS data from virus populations is challenging because the reads are short and error-prone. To account for the uncertainties originating from these limitations, several computational and statistical methods have been developed for studying the genetic heterogeneity of virus populations. Here, we review methods for the analysis of HTS reads, including approaches to local diversity estimation and global haplotype reconstruction. Challenges posed by aligning reads, as well as the impact of reference biases on diversity estimates, are also discussed. In addition, we address some of the experimental approaches designed to improve the biological signal-to-noise ratio. In the future, computational methods for the analysis of heterogeneous virus populations are likely to continue being complemented by technological developments.