Huang Austin, Hogan Joseph W, Istrail Sorin, Delong Allison, Katzenstein David A, Kantor Rami
Division of Infectious Diseases, Brown University, Providence, RI, USA.
Future Virol. 2012 May;7(5):505-517. doi: 10.2217/fvl.12.37.
AIM: HIV-1 sequence diversity can affect host immune responses and phenotypic characteristics such as antiretroviral drug resistance. Current HIV-1 sequence diversity classification uses phylogeny-based methods to identify subtypes and recombinants, which may overlook distinct subpopulations within subtypes. While local epidemic studies have characterized sequence-level clustering within subtypes using phylogeny, identification of new genotype-phenotype associations is based on mutational correlations at individual sequence positions. We perform a systematic, global analysis of position-specific pol gene sequence variation across geographic regions within HIV-1 subtypes to characterize subpopulation differences that may be missed by standard subtyping methods and sequence-level phylogenetic clustering analyses.
MATERIALS & METHODS: Analysis was performed on a large, globally diverse, cross-sectional pol sequence dataset. Sequences were partitioned into subtypes and into geographic subpopulations within subtypes. For each subtype, we identified positions that varied according to geography using two approaches: VESPA (viral epidemiology signature pattern analysis) to identify sequence signature differences, and a likelihood ratio test adjusted for multiple comparisons to characterize differences in amino acid (AA) frequencies, including minority mutations. The Synonymous Non-synonymous Analysis Program (SNAP) was used to explore the role of evolutionary selection within subtype C.
RESULTS: In 7693 protease (PR) and reverse transcriptase (RT) sequences from untreated patients in multiple geographic regions, 11 PR and 11 RT positions exhibited sequence signature differences within subtypes. Thirty-six PR and 80 RT positions exhibited within-subtype, geography-dependent differences in AA distributions, including minority mutations, at both conserved and variable loci. Among subtype C samples from India and South Africa, nine PR and nine RT positions had significantly different AA distributions, including one PR and five RT positions that differed in consensus AA between regions. A selection analysis of subtype C using SNAP demonstrated that estimated rates of nonsynonymous and synonymous mutations are consistent with the possibility of positive selection across geographic subpopulations within subtypes.
CONCLUSION: We characterized systematic genotypic pol differences across geographic regions within subtypes that are not captured by the subtyping nomenclature. Awareness of such differences may improve the interpretation of future studies determining the phenotypic consequences of genetic backgrounds.
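The two position-level comparisons described in the methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline and it does not call VESPA or SNAP. It assumes two groups of aligned, equal-length amino acid sequences, uses a simple consensus comparison as a stand-in for a signature-pattern analysis, and uses a G-test as one common form of likelihood ratio test. The abstract does not say which multiple-comparison adjustment was used, so the Bonferroni correction here is an assumption, and all function names are hypothetical.

```python
import math
from collections import Counter


def consensus_differences(seqs_a, seqs_b):
    """Return (position, consensus_a, consensus_b) where the most common residue differs.

    Simplified stand-in for a signature-pattern analysis; positions are 1-based.
    """
    diffs = []
    for i, (col_a, col_b) in enumerate(zip(zip(*seqs_a), zip(*seqs_b)), start=1):
        top_a = Counter(col_a).most_common(1)[0][0]
        top_b = Counter(col_b).most_common(1)[0][0]
        if top_a != top_b:
            diffs.append((i, top_a, top_b))
    return diffs


def g_statistic(counts_a, counts_b):
    """Likelihood ratio (G) statistic and degrees of freedom for a 2 x k count table."""
    residues = set(counts_a) | set(counts_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    total = n_a + n_b
    g = 0.0
    for aa in residues:
        col = counts_a.get(aa, 0) + counts_b.get(aa, 0)
        for observed, n in ((counts_a.get(aa, 0), n_a), (counts_b.get(aa, 0), n_b)):
            if observed > 0:  # zero cells contribute nothing to G
                expected = n * col / total
                g += 2.0 * observed * math.log(observed / expected)
    return g, len(residues) - 1


def chi2_sf(x, df):
    """Upper-tail chi-square probability from a series for the lower incomplete gamma.

    Adequate for illustration; p-values below about 1e-15 underflow to 0.
    """
    if df <= 0 or x <= 0:
        return 1.0
    a, half = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= half / (a + n)
        total += term
    lower = total * math.exp(-half + a * math.log(half) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def scan_positions(seqs_a, seqs_b, alpha=0.05):
    """G-test at every alignment position with a Bonferroni adjustment (assumed)."""
    columns = list(zip(zip(*seqs_a), zip(*seqs_b)))
    results = []
    for i, (col_a, col_b) in enumerate(columns, start=1):
        g, df = g_statistic(Counter(col_a), Counter(col_b))
        p_adj = min(1.0, chi2_sf(g, df) * len(columns))
        results.append((i, g, p_adj, p_adj < alpha))
    return results
```

Because the test compares whole residue distributions rather than only the most common residue, a position can be flagged by `scan_positions` even when `consensus_differences` reports the same consensus in both groups. This mirrors the abstract's distinction between signature differences and AA distribution differences that include minority mutations.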