Department of Laboratory Medicine, Karolinska Institutet, 14186 Huddinge, Sweden.
Infect Genet Evol. 2013 Aug;18:125-31. doi: 10.1016/j.meegid.2013.03.050. Epub 2013 Apr 11.
Identification of recent HIV infection within populations is a public health priority for accurate estimation of HIV incidence rates and transmitted drug resistance at the population level. Determining HIV incidence rates by prospective follow-up of HIV-uninfected individuals is challenging, and serological assays have important limitations. HIV diversity within an infected host increases with duration of infection. We explore a simple bioinformatics approach to assess viral diversity by determining the percentage of ambiguous base calls in sequences derived from standard genotyping of HIV-1 protease and reverse transcriptase. Sequences from 691 recently infected (≤1 year) and chronically infected (>1 year) individuals from Sweden, Vietnam and Ethiopia were analyzed for ambiguity. A significant difference (p<0.0001) in the proportion of ambiguous bases was observed between sequences from individuals with recent and chronic infection in both HIV-1 subtype B and non-B infection, consistent with previous studies. In our analysis, a cutoff of <0.47% ambiguous base calls identified recent infection with a sensitivity of 88.8% and a specificity of 74.6%. We then analyzed for ambiguity 1,728 protease and reverse transcriptase sequences from 36 surveys of transmitted HIV drug resistance performed following World Health Organization guidance. The 0.47% ambiguity cutoff was applied, and survey sequences were classified as likely derived from recently or chronically infected individuals. Overall, 71% of patients were classified as likely to have been infected within one year of genotyping, but results varied considerably among surveys. This bioinformatics approach may provide supporting population-level information to identify recent infection, but its application is limited by infection with more than one viral variant, decreasing viral diversity in advanced disease, and technical aspects of population-based sequencing. Standardization of sequencing techniques and base calling, and the addition of other parameters such as CD4 cell count, may address some of the technical limitations and increase the usefulness of the approach.
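The ambiguity measure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 0.47% cutoff comes from the abstract; the function names, the choice to count `N` as an ambiguous call, and the choice to exclude gaps from the denominator are assumptions made here for illustration only.

```python
# Illustrative sketch only; not the pipeline used in the study.
# Assumption: all IUPAC mixture codes, including N, count as ambiguous.
AMBIGUOUS = set("RYKMSWBDHVN")
UNAMBIGUOUS = set("ACGT")
CUTOFF_PERCENT = 0.47  # cutoff reported in the abstract


def ambiguity_percent(sequence: str) -> float:
    """Return the percentage of ambiguous base calls in a nucleotide sequence.

    Assumption: gaps and any other non-nucleotide characters are ignored,
    so the denominator is the number of called bases.
    """
    bases = [b for b in sequence.upper() if b in AMBIGUOUS or b in UNAMBIGUOUS]
    if not bases:
        raise ValueError("sequence contains no nucleotide characters")
    n_ambiguous = sum(1 for b in bases if b in AMBIGUOUS)
    return 100.0 * n_ambiguous / len(bases)


def classify(sequence: str, cutoff: float = CUTOFF_PERCENT) -> str:
    """Label a sequence 'recent' if its ambiguity is below the cutoff, else 'chronic'."""
    return "recent" if ambiguity_percent(sequence) < cutoff else "chronic"
```

For a hypothetical 1,000-base sequence, four ambiguous calls (0.4%) would fall below the cutoff and be labeled likely recent, while five (0.5%) would be labeled likely chronic. As the abstract notes, such a label is supporting population-level information rather than a diagnosis for an individual.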