Dampier William, Evans Perry, Ungar Lyle, Tozeren Aydin
Center for Integrated Bioinformatics, Drexel University, Bossone Research Center 711, 3120 Market Street, Philadelphia, PA 19104, USA.
BMC Med Genomics. 2009 Jul 23;2:47. doi: 10.1186/1755-8794-2-47.
The HIV genome mutates at a high rate and poses a significant long-term health risk even in the presence of combination antiretroviral therapy. Current methods for predicting a patient's response to therapy rely on site-directed mutagenesis experiments and in vitro resistance assays. In this bioinformatics study we treat response to antiretroviral therapy as a two-body problem: response to therapy is considered to be a function of both the host and pathogen proteomes. We set out to identify potential responders based on the presence or absence of host protein and DNA motifs on the HIV proteome.
An alignment of thousands of HIV-1 sequences showed extensive variation in nucleotide sequence but also conservation of eukaryotic short linear motifs in the protein-coding regions. The reduction in viral load of patients in the Stanford HIV Drug Resistance Database exhibited a bimodal distribution after 24 weeks of antiretroviral therapy, with a cutoff of 2,000 copies/ml. Similarly, patients allocated into responder/non-responder categories based on consistent viral load reduction during a 24-week period showed clear separation. Under both phenotype definitions, a set of features composed of short linear motifs in the reverse transcriptase region of the HIV sequence accurately predicted a patient's response to therapy. Motifs that overlap resistance sites were highly predictive of responder status in single-drug regimens, but these features lost importance in defining responders in multi-drug therapies.
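The viral-load phenotype described above can be illustrated with a minimal sketch. It assumes the 2,000 copies/ml cutoff is applied to the drop in viral load between baseline and week 24; the abstract does not spell out the exact rule, so this interpretation, the `Patient` record, and its field names are hypothetical.

```python
from dataclasses import dataclass

# Cutoff quoted in the abstract; how it is applied below is an assumption.
CUTOFF_COPIES_PER_ML = 2000.0


@dataclass
class Patient:
    patient_id: str       # hypothetical identifier
    baseline_load: float  # viral load at therapy start, copies/ml
    week24_load: float    # viral load after 24 weeks, copies/ml


def is_responder(p: Patient, cutoff: float = CUTOFF_COPIES_PER_ML) -> bool:
    """Assumed rule: a responder's viral load fell by at least `cutoff`."""
    return (p.baseline_load - p.week24_load) >= cutoff


if __name__ == "__main__":
    # Made-up values, purely to exercise the function.
    cohort = [Patient("a", 50000.0, 400.0), Patient("b", 3000.0, 2500.0)]
    for p in cohort:
        print(p.patient_id, is_responder(p))
```

Keeping the cutoff as a parameter makes it easy to test how sensitive the responder labels are to the chosen threshold.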
The HIV sequence mutates in a way that preferentially preserves peptide sequence motifs that are also found in the human proteome. The presence or absence of such motifs at specific regions of the HIV sequence is highly predictive of response to therapy. Some of these predictive motifs overlap with known HIV-1 resistance sites. These motifs are well established in bioinformatics databases and hence do not require identification via in vitro mutation experiments.
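As a rough illustration of motif presence/absence features, the sketch below scans a protein sequence (or a region of it) with regular expressions and returns a 0/1 vector. The two patterns are simplified, illustrative stand-ins (a PxxP-style pattern and an N-glycosylation-style pattern), not the motif set used in the study, and the function and parameter names are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Illustrative patterns only; a real analysis would load curated motif
# definitions from a motif database rather than hard-code them.
MOTIFS = {
    "PxxP_like": re.compile(r"P..P"),
    "N_glyc_like": re.compile(r"N[^P][ST][^P]"),
}


def motif_features(
    protein_seq: str, window: Optional[Tuple[int, int]] = None
) -> Dict[str, int]:
    """Return {motif_name: 1 if the motif occurs in the region, else 0}.

    `window` is an optional (start, end) slice of 0-based residue indices,
    e.g. to restrict the scan to one region of the protein.
    """
    region = protein_seq if window is None else protein_seq[window[0]:window[1]]
    return {name: int(bool(pat.search(region))) for name, pat in MOTIFS.items()}


if __name__ == "__main__":
    toy_seq = "MAPLLPKNGSA"  # made-up sequence, not an HIV protein
    print(motif_features(toy_seq))
    print(motif_features(toy_seq, window=(0, 6)))
```

Feature vectors of this kind, one per patient sequence, could then be passed to any standard classifier together with responder labels.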