IAME, UMR 1137, INSERM, Université Paris Diderot, Sorbonne Paris Cité, AP-HP, Laboratoire de Virologie, Hôpital Bichat, AP-HP, Paris, France.
Sorbonne University, UPMC Univ Paris 06, INSERM, Institut Pierre Louis d'épidémiologie et de Santé Publique (IPLESP UMRS 1136), Paris, France.
PLoS One. 2018 Jun 1;13(6):e0198334. doi: 10.1371/journal.pone.0198334. eCollection 2018.
Reliable detection of HIV minority resistant variants (MRVs) requires bioinformatics analysis with specific algorithms to obtain good-quality alignments. The aim of this study was to analyze ultra-deep sequencing (UDS) data using different analysis pipelines.
HIV-1 protease, reverse transcriptase (RT) and integrase sequences from antiretroviral-naïve patients were obtained using GS-Junior® (Roche) and MiSeq® (Illumina) platforms. MRVs were defined as variants harbouring a resistance mutation present at a frequency of 1%-20%. Reads were analyzed using two alignment algorithms, Amplicon Variant Analyzer® (AVA®) and Geneious®, each compared with the SmartGene® NGS HIV-1 module.
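The MRV definition above can be sketched as a simple frequency filter. This is only an illustration: the `Variant` type, the `is_mrv` helper, the example mutations and frequencies, and the choice of inclusive bounds are all assumptions, since the abstract does not say how the 1% and 20% limits are treated.

```python
from dataclasses import dataclass

# Frequency bounds taken from the abstract (1%-20%), expressed as fractions.
# Treating both bounds as inclusive is an assumption.
MRV_MIN = 0.01
MRV_MAX = 0.20


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one resistance mutation seen in one sample."""
    gene: str         # e.g. "protease", "RT", "integrase"
    mutation: str     # e.g. "M46I"
    frequency: float  # fraction of reads carrying the mutation (0.0 to 1.0)


def is_mrv(variant: Variant, lower: float = MRV_MIN, upper: float = MRV_MAX) -> bool:
    """Return True if the variant frequency falls in the minority range."""
    return lower <= variant.frequency <= upper


# Illustrative values only, not data from the study.
variants = [
    Variant("protease", "M46I", 0.015),  # low-level MRV
    Variant("RT", "K103N", 0.25),        # above 20%: not a minority variant
    Variant("RT", "M184V", 0.005),       # below 1%: not reported
]
mrvs = [v for v in variants if is_mrv(v)]
```

Passing `lower=0.02` to `is_mrv` applies a stricter 2% detection threshold to the same records.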
A total of 101 protease and 51 RT MRVs identified in 139 protease and 124 RT sequences generated with a GS-Junior® platform were analyzed using AVA® and SmartGene® software. The correlation coefficients for the MRVs were R² = 0.974 for protease and R² = 0.972 for RT. Discordances (n = 13 in protease and n = 15 in RT) mainly concerned low-level MRVs (i.e., with frequencies of 1%-2%, n = 18/28) and they were located in homopolymeric regions (n = 10/15). Geneious® and SmartGene® software were used to analyze 143 protease, 45 RT and 26 integrase MRVs identified in 172 protease, 69 RT, and 72 integrase sequences generated with a MiSeq® platform. The correlation coefficients for the MRVs were R² = 0.987 for protease, R² = 0.995 for RT and R² = 0.993 for integrase. Discordances (n = 9 in protease, n = 3 in RT, and n = 3 in integrase) mainly concerned low-level MRVs (n = 13/15).
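A comparison of this kind could be sketched as follows. The abstract does not state how the correlation coefficient was calculated or how a discordance was defined, so this sketch assumes R² is the squared Pearson correlation of paired MRV frequencies and that a discordance is an MRV reaching the detection threshold in only one of the two pipelines. The function names and example values are hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between paired frequency measurements."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


def discordant_mrvs(freqs_a, freqs_b, threshold=0.01):
    """Return mutations that reach `threshold` in exactly one pipeline.

    `freqs_a` and `freqs_b` map a mutation name to the frequency reported
    by each pipeline; a missing mutation is treated as frequency 0.
    """
    keys = set(freqs_a) | set(freqs_b)
    return sorted(
        k for k in keys
        if (freqs_a.get(k, 0.0) >= threshold) != (freqs_b.get(k, 0.0) >= threshold)
    )


# Illustrative values only, not data from the study.
pipeline_a = {"M46I": 0.015, "K103N": 0.050, "L90M": 0.120}
pipeline_b = {"M46I": 0.008, "K103N": 0.048, "L90M": 0.125}

shared = sorted(set(pipeline_a) & set(pipeline_b))
r2 = r_squared([pipeline_a[k] for k in shared], [pipeline_b[k] for k in shared])
at_1_percent = discordant_mrvs(pipeline_a, pipeline_b, threshold=0.01)
at_2_percent = discordant_mrvs(pipeline_a, pipeline_b, threshold=0.02)
```

In this toy input the only disagreement is a variant near 1%, so it is flagged at a 1% threshold and disappears at a 2% threshold.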
We found an excellent correlation between the various UDS analysis pipelines that we tested. However, our results indicate that specific attention should be paid to low-level MRVs, for which the use of two different analysis pipelines and visual inspection of sequence alignments might be beneficial. Thus, our results argue for the use of a 2% threshold for MRV detection, rather than the 1% threshold, to minimize misalignments and to avoid the time-consuming visual inspection steps that are essential to ensure accurate results for MRV frequencies below 2%.