Altmann André, Rosen-Zvi Michal, Prosperi Mattia, Aharoni Ehud, Neuvirth Hani, Schülter Eugen, Büch Joachim, Struck Daniel, Peres Yardena, Incardona Francesca, Sönnerborg Anders, Kaiser Rolf, Zazzi Maurizio, Lengauer Thomas
Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany.
PLoS One. 2008;3(10):e3470. doi: 10.1371/journal.pone.0003470. Epub 2008 Oct 21.
Analysis of the viral genome for drug resistance mutations is state-of-the-art for guiding treatment selection for human immunodeficiency virus type 1 (HIV-1)-infected patients. These mutations alter the structure of viral target proteins and reduce or, in the worst case, completely inhibit the effect of antiretroviral compounds, while the virus retains its ability to replicate effectively. Modern anti-HIV-1 regimens comprise multiple drugs in order to prevent or at least delay the development of resistance mutations. However, commonly used HIV-1 genotype interpretation systems provide classifications only for single drugs. The EuResist initiative has collected data from about 18,500 patients to train three classifiers for predicting response to combination antiretroviral therapy, given the viral genotype and further information. In this work we compare different classifier fusion methods for combining the individual classifiers.
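As a minimal sketch of what classifier fusion can look like, the snippet below averages the predicted response probabilities of three classifiers and scores the result by AUC. The mean rule is an assumption chosen for simplicity; the abstract does not say which fusion methods were compared. The names `fuse_mean` and `auc` and all numbers are hypothetical illustrations, not EuResist code or data.

```python
from statistics import mean

def fuse_mean(prob_lists):
    """Average the per-instance probabilities across classifiers."""
    return [mean(ps) for ps in zip(*prob_lists)]

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up toy data: 1 = therapy success, 0 = therapy failure.
labels = [1, 0, 1, 0]
clf_a = [0.9, 0.6, 0.4, 0.2]  # hypothetical outputs of classifier A
clf_b = [0.6, 0.3, 0.7, 0.8]  # hypothetical outputs of classifier B
clf_c = [0.7, 0.5, 0.8, 0.1]  # hypothetical outputs of classifier C

fused = fuse_mean([clf_a, clf_b, clf_c])
for name, scores in [("A", clf_a), ("B", clf_b), ("C", clf_c), ("fused", fused)]:
    print(name, auc(labels, scores))
```

In this toy case the averaged scores rank every success above every failure even though classifiers A and B each misrank some pairs, which is the kind of error cancellation that fusion relies on.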
The individual classifiers yielded similar performance, and all the combination approaches considered performed equally well. On the complete training set, the gain in performance from combining methods did not reach statistical significance compared with the single best individual classifier. However, with smaller training sets (200 to 1,600 instances, compared with 2,700) the combination significantly outperformed the individual classifiers (p<0.01; paired one-sided Wilcoxon test). Together with a consistent reduction of the standard deviation compared with the individual prediction engines, this shows that the combined system behaves more robustly. Moreover, using the combined system we were able to identify a class of therapy courses that led to a consistent underestimation (about 0.05 AUC) of the system performance. Discovery of these therapy courses is a further indication of the robustness of the combined system.
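The paired one-sided Wilcoxon test mentioned above can be sketched as follows. This is an exact signed-rank test written in plain Python for small samples, not the authors' analysis code. The function name `wilcoxon_one_sided` and the AUC values are made up for illustration and do not come from the study.

```python
from itertools import product

def wilcoxon_one_sided(x, y):
    """Exact paired Wilcoxon signed-rank test of H1: x tends to exceed y.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences receive average ranks. All 2**n sign patterns
    are enumerated, so this is only practical for small n.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # Test statistic: sum of the ranks of the positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under H0 every rank is positive or negative with probability 1/2.
    at_least = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
    )
    return w_plus, at_least / 2 ** n

# Hypothetical AUCs from eight repeated runs (illustrative values only).
combined = [0.75, 0.77, 0.74, 0.78, 0.76, 0.79, 0.73, 0.80]
best_single = [0.74, 0.75, 0.71, 0.74, 0.71, 0.73, 0.66, 0.72]

w_plus, p_value = wilcoxon_one_sided(combined, best_single)
print(w_plus, p_value)  # 36.0 0.00390625
```

Because the combined system wins in all eight made-up pairs, only one of the 256 sign patterns is as extreme as the observed one, giving p = 1/256. For larger samples a library routine such as `scipy.stats.wilcoxon` with `alternative="greater"` would be the more practical choice.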
The combined EuResist prediction engine is freely available at http://engine.euresist.org.