Wang Peng, Sidney John, Dow Courtney, Mothé Bianca, Sette Alessandro, Peters Bjoern
La Jolla Institute for Allergy and Immunology, La Jolla, California, United States of America.
PLoS Comput Biol. 2008 Apr 4;4(4):e1000048. doi: 10.1371/journal.pcbi.1000048.
The identification of MHC class II restricted peptide epitopes is an important goal in immunological research. A number of computational tools have been developed for this purpose, but there is a lack of large-scale systematic evaluation of their performance. Herein, we used a comprehensive dataset consisting of more than 10,000 previously unpublished MHC-peptide binding affinities, 29 peptide/MHC crystal structures, and 664 peptides experimentally tested for CD4+ T cell responses to systematically evaluate the performance of publicly available MHC class II binding prediction tools. While in selected instances the best tools were associated with AUC values up to 0.86, in general, class II predictions did not perform as well as historically noted for class I predictions. It appears that the ability of MHC class II molecules to bind variable length peptides, which requires the correct assignment of peptide binding cores, is a critical factor limiting the performance of existing prediction tools. To improve performance, we implemented a consensus prediction approach that combines the top-performing methods. We show that this consensus approach achieved the best overall performance. Finally, we make the large datasets used publicly available as a benchmark to facilitate further development of MHC class II binding peptide prediction methods.
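The evaluation and the consensus idea described above can be illustrated with a minimal Python sketch. The abstract does not state how the individual methods are combined, so the median-of-ranks rule used here is an assumption chosen for illustration, and the function names (`ranks`, `consensus_scores`, `auc`) and the toy scores are hypothetical. The AUC is computed as the fraction of binder/non-binder pairs that a predictor orders correctly, with ties counted as half.

```python
from statistics import median


def ranks(scores):
    """Rank peptides so rank 1 is the highest score; ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of peptides tied with position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def consensus_scores(method_scores):
    """Combine several methods by the median of their per-peptide ranks.

    Assumption: median rank is used only as an illustrative combination rule.
    The result is negated so that a higher value still means a stronger
    predicted binder.
    """
    per_method_ranks = [ranks(scores) for scores in method_scores]
    n_peptides = len(method_scores[0])
    return [-median(r[i] for r in per_method_ranks) for i in range(n_peptides)]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of binders and non-binders."""
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one binder and one non-binder")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = measured binder, 0 = non-binder.
labels = [1, 1, 0, 0]
method_scores = [
    [0.9, 0.4, 0.5, 0.1],
    [0.8, 0.7, 0.2, 0.1],
    [0.3, 0.9, 0.6, 0.2],
]
combined = consensus_scores(method_scores)
print([auc(labels, s) for s in method_scores], auc(labels, combined))
```

Working on ranks rather than raw scores lets methods with different output scales be combined without calibration. On a toy set this small the consensus is not guaranteed to beat every individual method.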