Pérez-Losada Marcos, Castel Amanda D, Lewis Brittany, Kharfen Michael, Cartwright Charles P, Huang Bruce, Maxwell Taylor, Greenberg Alan E, Crandall Keith A
Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Ashburn, VA, United States of America.
CIBIO-InBIO, Universidade do Porto, Campus Agrário de Vairão, Vairão, Portugal.
PLoS One. 2017 Sep 29;12(9):e0185644. doi: 10.1371/journal.pone.0185644. eCollection 2017.
Washington, DC has a high burden of HIV, with a prevalence of 2.0%. The city is a national and international hub that potentially contains a broad diversity of HIV variants, yet few sequences from DC are available on GenBank to assess the evolutionary history of HIV in the US capital. To address this gap, we analyze extensive sequence data and investigate HIV diversity, phylodynamics, and drug resistance mutations (DRM) in DC.
HIV-1 sequences were collected from participants infected through 2015 as part of the DC Cohort, a longitudinal observational study of HIV+ patients receiving care at 13 DC clinics. Sequences were paired with Cohort demographic, risk, and clinical data and analyzed using maximum likelihood, Bayesian, and coalescent approaches to phylogenetic, network, and population genetic inference. We analyzed 601 integrase (int; 864 bp) sequences from 223 participants and 2,810 protease/reverse transcriptase (PR/RT; 1,497 bp) sequences from 1,659 participants.
Subtype B accounted for 99% of the int sequences and 94% of the PR/RT sequences; 14 non-B subtypes were also detected. Phylodynamic analyses of US-born infected individuals showed that HIV population size varied little over time, with no significant decline in diversity. Phylogenetic analyses grouped 13.5% of the int sequences into 14 clusters of 2 or 3 sequences, and 39.0% of the PR/RT sequences into 203 clusters of 2-32 sequences. Network analyses grouped 3.6% of the int sequences into 4 clusters of 2 sequences, and 10.6% of the PR/RT sequences into 76 clusters of 2-7 sequences. All network clusters were also detected in our phylogenetic analyses. Higher proportions of clustered sequences were found in zip codes where HIV prevalence is highest (r = 0.607; P<0.00001). We detected a high prevalence of DRM for both int (17.1%) and PR/RT (39.1%), but only 8 int and 12 PR/RT amino acids were identified as under adaptive selection. We observed a significant (P<0.0001) association between main risk factors (men who have sex with men and heterosexuals) and genotypes in the five well-supported clusters with sufficient sample size for testing.
Pairing molecular data with clinical and demographic data provided novel insights into HIV population dynamics in Washington, DC. Identification of populations and geographic locations where clustering occurs can inform and complement active surveillance efforts to interrupt HIV transmission.
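The network clustering reported above can be illustrated with a minimal sketch. The abstract does not state which software, distance model, or threshold was used, so everything here is an assumption: the function names `p_distance` and `network_clusters` are hypothetical, a simple proportion-of-differences distance stands in for a model-based genetic distance, and the default 1.5% threshold is only a commonly used illustrative value. Sequences are assumed to be pre-aligned and of equal length.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Only positions where both sequences carry an unambiguous base
    (A, C, G or T) are compared; gaps and ambiguity codes are skipped.
    """
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            diffs += x != y
    # With no comparable sites, treat the pair as infinitely distant.
    return diffs / compared if compared else float("inf")


def network_clusters(seqs: dict[str, str], threshold: float = 0.015) -> list[set[str]]:
    """Group sequences into clusters by linking close pairs.

    Two sequences are linked when their distance is at or below
    `threshold` (an assumed value, not one taken from the study).
    Clusters are the connected components with at least 2 members.
    """
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) >= 2]


if __name__ == "__main__":
    # Toy 20-bp sequences, invented purely for illustration.
    toy = {
        "p1": "ACGTACGTACGTACGTACGT",
        "p2": "ACGTACGTACGTACGTACGA",
        "p3": "TTTTACGTAAAAACGTCCCC",
    }
    # A loose threshold is used because the toy sequences are so short.
    print(network_clusters(toy, threshold=0.1))
```

Because clusters are connected components, a sequence joins a cluster if it is close to any one member, which is why a stricter threshold tends to yield fewer and smaller clusters than a tree-based approach.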