Duke Human Vaccine Institute, Duke University Medical Center, Durham, North Carolina, USA.
Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC, USA.
mSphere. 2020 Oct 14;5(5):e00551-20. doi: 10.1128/mSphere.00551-20.
High-throughput sequencing (HTS) has been widely used to characterize HIV-1 genome sequences. However, no current algorithm can directly determine genotypes and quasispecies populations from the short HTS reads generated from long genome sequences without additional software. To establish a robust subpopulation, subtype, and recombination analysis workflow, we amplified the HIV-1 3'-half genome from plasma samples of 65 HIV-1-infected individuals and sequenced the entire amplicon (∼4,500 bp) by HTS. By directly analyzing raw reads with HIVE-hexahedron, we showed that 48% of samples harbored 2 to 13 subpopulations. We identified various subtypes (17 A1s, 4 Bs, 27 Cs, 6 CRF02_AGs, and 11 unique recombinant forms) and defined the recombination breakpoints of 10 recombinants. These results were validated with viral genome sequences generated by single genome sequencing (SGS) or by analysis of the consensus sequence of the HTS reads. The HIVE-hexahedron workflow is more sensitive and accurate than evaluating the consensus sequence alone and more cost-effective than SGS.

The highly recombinogenic nature of human immunodeficiency virus type 1 (HIV-1) leads to recombination and the emergence of quasispecies. It is important to reliably identify subpopulations in order to understand the complexity of a viral population for drug resistance surveillance and vaccine development. HTS provides improved resolution over Sanger sequencing for the analysis of heterogeneous viral subpopulations. However, current methods for analyzing HTS reads cannot fully achieve accurate population reconstruction. Hence, there is a pressing need for a more sensitive, accurate, user-friendly, and cost-effective method to analyze viral quasispecies. For this purpose, we have improved the HIVE-hexahedron algorithm, which we previously developed with short sequences, to analyze raw HTS short reads. The significance of this study is that our standalone algorithm enables streamlined analysis of quasispecies, subtype, and recombination patterns from long HIV-1 genome regions without the need for additional sequence analysis tools. Distinct viral populations and recombination patterns identified by HIVE-hexahedron are further validated by comparison with sequences obtained by SGS.
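As a quick consistency check on the figures reported above, the subtype counts can be tallied against the 65 sampled individuals. The following is a minimal sketch, not part of the HIVE-hexahedron workflow. The variable names are illustrative, and the number of samples with multiple subpopulations is only an estimate derived from the reported 48%, since the abstract does not state it directly.

```python
# Subtype counts as reported in the abstract (URF = unique recombinant form).
subtype_counts = {"A1": 17, "B": 4, "C": 27, "CRF02_AG": 6, "URF": 11}

# The counts should account for every sampled individual.
total_samples = sum(subtype_counts.values())

# Share of each subtype in the cohort.
proportions = {name: count / total_samples for name, count in subtype_counts.items()}

# Assumption: 48% is a rounded figure, so this is an estimate, not a reported value.
estimated_multi_population_samples = round(0.48 * total_samples)

for name, share in proportions.items():
    print(f"{name}: {subtype_counts[name]} samples ({share:.1%})")
print(f"Total samples: {total_samples}")
print(f"Estimated samples with 2 or more subpopulations: {estimated_multi_population_samples}")
```

The tally equals the 65 individuals described in the abstract, which suggests that each sample was assigned exactly one subtype or recombinant category.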