Zhang Hao, Lundegaard Claus, Nielsen Morten
Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby 2800, Denmark.
Bioinformatics. 2009 Jan 1;25(1):83-9. doi: 10.1093/bioinformatics/btn579. Epub 2008 Nov 7.
MHC:peptide binding plays a central role in activating immune surveillance. Computational approaches to determine T-cell epitopes restricted to any given major histocompatibility complex (MHC) molecule are of special practical value in the development of, for instance, vaccines with broad population coverage against emerging pathogens. Methods have recently been published that are able to predict peptide binding to any human MHC class I molecule. In contrast to conventional allele-specific methods, these methods allow for extrapolation to uncharacterized MHC molecules. These pan-specific human leukocyte antigen (HLA) predictors have not previously been compared using independent evaluation sets.
A diverse set of quantitative peptide binding affinity measurements was collected from the Immune Epitope Database (IEDB), together with a large set of HLA class I ligands from the SYFPEITHI database. Based on these datasets, three different pan-specific, web-accessible HLA predictors were evaluated: NetMHCpan, adaptive double threading (ADT) and the kernel-based inter-allele peptide binding prediction system (KISS). The performance of the pan-specific predictors was also compared with a well-performing allele-specific MHC class I predictor, NetMHC, as well as with a consensus approach integrating the predictions from the NetMHC and NetMHCpan methods.
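A benchmark on both quantitative (affinity) and binary (ligand) data needs one score for each kind of data. The sketch below is an illustration only: the abstract does not name the metrics used, so the choice of a Pearson correlation for affinity data and a rank-based AUC for ligand data is an assumption, and the function names `pearson` and `auc` are hypothetical.

```python
from math import sqrt


def pearson(predicted, measured):
    """Pearson correlation between predicted and measured affinities.

    Assumed metric for quantitative data; not stated in the abstract.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / sqrt(var_p * var_m)


def auc(scores, labels):
    """Area under the ROC curve for binary data (1 = ligand, 0 = non-ligand).

    Computed as the fraction of (ligand, non-ligand) pairs in which the
    ligand receives the higher score; ties count as half a win.
    Assumed metric for binary data; not stated in the abstract.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Both functions expect higher scores to mean stronger predicted binding, so affinities reported as IC50 values would have to be transformed first (for example, negated or log-transformed) before being passed in.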
The benchmark demonstrated that pan-specific methods provide accurate predictions even for previously uncharacterized MHC molecules. The NetMHCpan method, trained to predict actual binding affinities, consistently ranked at the top on both quantitative (affinity) and binary (ligand) data. The KISS method, trained to predict binary data, was nevertheless one of the best-performing methods when benchmarked on binary data. Finally, a consensus method integrating predictions from the two best-performing methods was shown to improve the prediction accuracy.
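One simple way to integrate predictions from two methods is to average their per-peptide scores. The sketch below assumes exactly that; the abstract does not describe how the consensus was actually formed, so the averaging rule and the function name `consensus` are hypothetical, as is the assumption that both methods report scores on the same scale.

```python
def consensus(scores_a, scores_b):
    """Average two predictors' scores peptide by peptide.

    Assumption: both lists cover the same peptides in the same order and
    use the same score scale. This is an illustrative combination rule,
    not necessarily the one used in the benchmark.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both predictors must score the same peptides")
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


# Hypothetical scores for three peptides from two predictors.
combined = consensus([0.25, 0.75, 0.5], [0.75, 0.25, 1.0])
```

Averaging can only help when the two methods make partly independent errors; if one method is much weaker on a given allele, a weighted combination may be preferable.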