Kumar Narendra, Mohanty Debasisa
National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi 110067, India.
Mol Biosyst. 2010 Dec;6(12):2508-20. doi: 10.1039/c0mb00013b. Epub 2010 Oct 18.
Identification of MHC-binding peptides is essential for understanding the molecular mechanism of the immune response. However, most prediction methods use motifs or profiles derived from experimental peptide binding data for specific MHC alleles, which limits their applicability to alleles for which such data are available. In this work we have developed a structure-based method that does not require experimental peptide binding data for training. Our method models MHC-peptide complexes using crystal structures of 170 MHC-peptide complexes and evaluates the binding energies using two well-known residue-based statistical pair potentials, namely the Betancourt-Thirumalai (BT) and Miyazawa-Jernigan (MJ) matrices. Extensive benchmarking of prediction accuracy on a data set of 1654 epitopes from class I and class II alleles available in the SYFPEITHI database indicates that the BT pair potential can predict more than 60% of the known binders for 14 MHC alleles, with AUC values for ROC curves ranging from 0.6 to 0.9. Similar benchmarking on 29,522 class I and class II MHC-binding peptides with known IC50 values in the IEDB database showed AUC values higher than 0.6 for 10 class I alleles and 9 class II alleles when classifying a peptide as a binder or non-binder. Comparison with recently available benchmarking studies indicated that the prediction accuracy of our method for many class I and class II MHC alleles was comparable to that of sequence-based methods, even though it does not use any experimental data for training. It is also encouraging to note that the ranks of true binding peptides could be further improved when high-scoring peptides obtained from the pair potentials were re-ranked using an all-atom force field and the MM/PBSA method.
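As a rough illustration of the scoring idea described above, the following Python sketch sums residue-residue pair energies over peptide-MHC contacts and computes an AUC from binder and non-binder scores. It is a minimal sketch under stated assumptions, not the authors' implementation: the energy values in `TOY_POTENTIAL`, the contact map `CONTACTS`, and the function names are all hypothetical, and the numbers are not taken from the BT or MJ matrices. It also assumes that a lower (more negative) score means more favorable binding.

```python
from itertools import product

# Hypothetical toy pair energies (NOT real BT or MJ values).
# Assumption: lower energy means a more favorable contact.
TOY_POTENTIAL = {
    ("L", "V"): -1.0,
    ("L", "D"): 0.5,
    ("K", "D"): -0.8,
    ("K", "V"): 0.3,
}

# Hypothetical contact map: (peptide position, contacting MHC residue type).
# In a structure-based method this would come from a modeled complex.
CONTACTS = [(0, "V"), (1, "D")]


def pair_energy(a, b, potential=TOY_POTENTIAL):
    """Symmetric lookup of a residue pair; unknown pairs contribute 0."""
    return potential.get((a, b), potential.get((b, a), 0.0))


def binding_score(peptide, contacts=CONTACTS, potential=TOY_POTENTIAL):
    """Sum pair energies over all peptide-MHC contacts."""
    return sum(pair_energy(peptide[i], mhc_res, potential)
               for i, mhc_res in contacts)


def auc(binder_scores, nonbinder_scores):
    """Fraction of (binder, non-binder) pairs in which the binder has the
    lower (better) score; ties count as half."""
    wins = 0.0
    for b, n in product(binder_scores, nonbinder_scores):
        if b < n:
            wins += 1.0
        elif b == n:
            wins += 0.5
    return wins / (len(binder_scores) * len(nonbinder_scores))
```

With these toy values, `binding_score("LK")` is lower than `binding_score("KL")`, so the first peptide would be ranked as the more likely binder. An AUC of 0.5 from `auc` corresponds to random ranking and 1.0 to perfect separation of binders from non-binders.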