Nielsen Morten, Lundegaard Claus, Blicher Thomas, Peters Bjoern, Sette Alessandro, Justesen Sune, Buus Søren, Lund Ole
Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark.
PLoS Comput Biol. 2008 Jul 4;4(7):e1000107. doi: 10.1371/journal.pcbi.1000107.
CD4 positive T helper cells control many aspects of specific immunity. These cells are specific for peptides derived from protein antigens and presented by molecules of the extremely polymorphic major histocompatibility complex (MHC) class II system. The identification of peptides that bind to MHC class II molecules is therefore of pivotal importance for rational discovery of immune epitopes. HLA-DR is a prominent example of a human MHC class II molecule. Here, we present a method, NetMHCIIpan, that allows for pan-specific predictions of peptide binding to any HLA-DR molecule of known sequence. The method is derived from a large compilation of quantitative HLA-DR binding events covering 14 of the more than 500 known HLA-DR alleles. Taking both peptide and HLA sequence information into account, the method can generalize and predict peptide binding also for HLA-DR molecules for which experimental data are absent. Validation of the method includes identification of endogenously derived HLA class II ligands, cross-validation, leave-one-molecule-out, and binding motif identification for hitherto uncharacterized HLA-DR molecules. The validation shows that the method can successfully predict binding for HLA-DR molecules, even in the absence of specific data for the particular molecule in question. Moreover, when compared to TEPITOPE, currently the only other publicly available prediction method aiming at providing broad HLA-DR allelic coverage, NetMHCIIpan performs equivalently for alleles included in the training of TEPITOPE while outperforming TEPITOPE on novel alleles. We propose that the method can be used to identify those hitherto uncharacterized alleles that should be addressed experimentally in future updates of the method to cover the polymorphism of HLA-DR most efficiently.
We thus conclude that the presented method meets the challenge of keeping up with the MHC polymorphism discovery rate and that it can be used to sample the MHC "space," enabling a highly efficient iterative process for improving MHC class II binding predictions.
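The leave-one-molecule-out validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the function name `leave_one_molecule_out`, the peptide strings, and the binding values are all hypothetical, chosen only to show how every measurement for one HLA-DR molecule is withheld from training and used solely for testing.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

# Hypothetical record layout: (allele name, peptide sequence, binding value).
Record = Tuple[str, str, float]


def leave_one_molecule_out(
    records: List[Record],
) -> Iterator[Tuple[str, List[Record], List[Record]]]:
    """Yield (held_out_allele, train, test), holding out one allele at a time."""
    # Group all measurements by the molecule they were measured on.
    by_allele: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        by_allele[rec[0]].append(rec)

    # Each allele in turn becomes the test set; all others form the training set.
    for held_out in sorted(by_allele):
        test = by_allele[held_out]
        train = [
            rec
            for allele, recs in by_allele.items()
            if allele != held_out
            for rec in recs
        ]
        yield held_out, train, test


# Made-up toy records for illustration only (not data from the paper).
toy: List[Record] = [
    ("DRB1*0101", "AAAKLMNPQRSTV", 0.80),
    ("DRB1*0101", "GGGHIKLMNPQRS", 0.15),
    ("DRB1*0301", "AAAKLMNPQRSTV", 0.30),
    ("DRB1*0401", "TTTVWYACDEFGH", 0.55),
    ("DRB1*0401", "GGGHIKLMNPQRS", 0.05),
]

for held_out, train, test in leave_one_molecule_out(toy):
    print(held_out, "train:", len(train), "test:", len(test))
```

In a pan-specific setting, a model trained on `train` would be scored on `test` using only the held-out molecule's sequence, which mimics predicting for an allele with no binding data.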