Guang August, Howison Mark, Ledingham Lauren, D'Antuono Matthew, Chan Philip A, Lawrence Charles, Dunn Casey W, Kantor Rami
Center for Computational Biology of Human Disease, Brown University, Providence, RI, United States.
Center for Computation and Visualization, Brown University, Providence, RI, United States.
Front Microbiol. 2022 Feb 17;12:803190. doi: 10.3389/fmicb.2021.803190. eCollection 2021.
Phylogenetic analyses of HIV sequences are used to detect clusters and inform public health interventions. Conventional approaches summarize within-host HIV diversity with a single per-host consensus sequence of the gene, obtained from Sanger or next-generation sequencing (NGS). There is growing recognition that this approach discards potentially important information about within-host sequence variation, which can affect phylogenetic inference. However, it is unknown whether alternative summary methods that incorporate intra-host variation change the phylogenetic inference of transmission network features.
We introduce a method to incorporate within-host NGS sequence diversity into phylogenetic HIV cluster inference. We compare this approach to Sanger- and NGS-derived consensus sequences and to near-whole-genome consensus sequences, and evaluate its potential benefits in identifying molecular clusters among all individuals newly diagnosed with HIV over six months at the largest HIV center in Rhode Island.
Cluster inference with this method demonstrated that within-host viral diversity impacts phylogenetic inference across individuals, and that consensus sequence approaches can obscure both the magnitude and the effect of these impacts. Clustering differed between Sanger-derived consensus sequences, NGS-derived consensus sequences, and sequences from this method, and across gene regions.
This method can incorporate within-host HIV diversity captured by NGS into phylogenetic analyses. This additional information can improve the robustness of cluster detection.