CONICET-Universidad de Buenos Aires, Instituto de Investigaciones Biomédicas en Retrovirus y SIDA (INBIRS), Universidad de Buenos Aires-CONICET, Paraguay 2155 Piso 11, Buenos Aires C1121ABG, Argentina.
Fundación Huésped, Buenos Aires C1202ABB, Argentina.
Viruses. 2018 Jan 13;10(1):34. doi: 10.3390/v10010034.
Progression of HIV infection is variable among individuals, and biomarkers of disease progression still need to be defined. Here, we aimed to categorize the predictive potential of several variables using feature selection methods and decision trees. A total of seventy-five treatment-naïve subjects were enrolled during acute/early HIV infection. CD4⁺ T-cell counts (CD4TC) and viral load (VL) levels were determined at enrollment and during one year of follow-up. Immune activation, HIV-specific immune response, Human Leukocyte Antigen (HLA) and C-C chemokine receptor type 5 (CCR5) genotypes, and plasma levels of 39 cytokines were determined. Data were analyzed by machine learning and non-parametric methods. Variable hierarchization was performed by Weka correlation-based feature selection and J48 decision tree. Plasma interleukin (IL)-10, interferon gamma-induced protein (IP)-10, soluble IL-2 receptor alpha (sIL-2Rα) and tumor necrosis factor alpha (TNF-α) levels correlated directly with baseline VL, whereas IL-2, TNF-α, fibroblast growth factor (FGF)-2 and macrophage inflammatory protein (MIP)-1β correlated directly with CD4⁺ T-cell activation (p < 0.05). However, none of these cytokines had good predictive value for distinguishing "progressors" from "non-progressors". Similarly, immune activation, HIV-specific immune responses and HLA/CCR5 genotypes had low discrimination power. Baseline CD4TC was the most powerful discriminating variable, with a cut-off of 438 cells/μL (accuracy = 0.93, Cohen's κ = 0.85). The limited discriminating power of the other factors might be related to frequency, variability and/or sampling time. Future studies based on decision trees to identify biomarkers of post-treatment control are warranted.
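The abstract summarizes the baseline CD4TC cut-off with two figures, accuracy and Cohen's κ. As a rough illustration of how those two figures are computed for a single-threshold rule, the Python sketch below applies the reported 438 cells/μL cut-off to a handful of invented values. The counts and labels are toy data, not study data; the function names are hypothetical; and the direction of the rule (counts at or below the cut-off classed as progressors) is an assumption, since the abstract does not state it. The sketch does not reproduce the Weka J48 analysis.

```python
from typing import Sequence

CUTOFF = 438  # cells/µL, baseline CD4TC cut-off reported in the abstract


def predict_progressor(cd4_counts: Sequence[float], cutoff: float = CUTOFF) -> list:
    """Assumed rule: a count at or below the cut-off is classed as a progressor."""
    return [count <= cutoff for count in cd4_counts]


def accuracy(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Fraction of subjects whose predicted class matches the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Cohen's kappa for two binary labelings: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    p_true = sum(y_true) / n        # share of observed progressors
    p_pred = sum(y_pred) / n        # share of predicted progressors
    # Agreement expected by chance from the two marginal distributions
    p_e = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    if p_e == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


# Toy data, invented for illustration only (NOT from the study)
cd4 = [250, 310, 400, 450, 520, 610, 700, 430]
observed = [True, True, True, True, False, False, False, False]

predicted = predict_progressor(cd4)
acc = accuracy(observed, predicted)
kappa = cohen_kappa(observed, predicted)
print(f"accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

On this toy set the rule misclassifies two of eight subjects, giving an accuracy of 0.75 and κ of 0.50. Because κ discounts the agreement expected by chance, it is lower than raw accuracy whenever chance agreement is non-zero.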