School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.
PLoS One. 2012;7(2):e30483. doi: 10.1371/journal.pone.0030483. Epub 2012 Feb 23.
Accurate identification of peptides that bind to specific Major Histocompatibility Complex Class II (MHC-II) molecules is of great importance for elucidating the underlying mechanism of immune recognition, as well as for developing effective epitope-based vaccines and promising immunotherapies for many severe diseases. Because MHC-II alleles are extremely polymorphic and biochemical experiments are costly, the development of computational methods for accurately predicting the binding peptides of MHC-II molecules, particularly those with few or no experimental data, has become a topic of increasing interest. TEPITOPE is a widely used computational approach because of its good interpretability and relatively high performance. However, TEPITOPE can be applied to only 51 of the more than 700 known HLA-DR molecules.
We have developed a new method, called TEPITOPEpan, by extrapolating from the binding specificities of the HLA-DR molecules characterized by TEPITOPE to those that are uncharacterized. First, each HLA-DR binding pocket is represented by the amino acid residues that are in close contact with the corresponding peptide binding core residues. Then the pocket similarity between two HLA-DR molecules is calculated as the sequence similarity of these residues. Finally, for an uncharacterized HLA-DR molecule, the binding specificity of each pocket is computed as a weighted average of the pocket binding specificities of the HLA-DR molecules characterized by TEPITOPE.
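The three steps above can be illustrated with a minimal Python sketch. It is not the published implementation: the function names, the use of fractional residue identity as the similarity measure, and the use of the similarities themselves as averaging weights are all assumptions made for illustration, and the residue strings and profiles are toy values.

```python
from typing import Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

Profile = Dict[str, float]  # amino acid -> binding score for one pocket


def pocket_similarity(residues_a: str, residues_b: str) -> float:
    """Fraction of identical positions between two pocket residue strings.

    Assumption: a simple identity score stands in for the sequence
    similarity measure, which the text does not specify.
    """
    if len(residues_a) != len(residues_b):
        raise ValueError("pocket residue strings must have equal length")
    if not residues_a:
        return 0.0
    matches = sum(a == b for a, b in zip(residues_a, residues_b))
    return matches / len(residues_a)


def extrapolate_pocket_profile(
    target_residues: str,
    characterized: List[Tuple[str, Profile]],
) -> Profile:
    """Weighted average of characterized pocket profiles.

    Assumption: each characterized pocket is weighted by its similarity
    to the target pocket, and the weights are normalized to sum to one.
    """
    weights = [pocket_similarity(target_residues, res) for res, _ in characterized]
    total = sum(weights)
    if total == 0:
        raise ValueError("no characterized pocket is similar to the target")
    return {
        aa: sum(w * prof[aa] for w, (_, prof) in zip(weights, characterized)) / total
        for aa in AMINO_ACIDS
    }


# Toy example: two hypothetical characterized pockets with flat profiles.
profile_high = {aa: 1.0 for aa in AMINO_ACIDS}
profile_low = {aa: 0.0 for aa in AMINO_ACIDS}
toy_characterized = [("EFYH", profile_high), ("EFYA", profile_low)]
toy_profile = extrapolate_pocket_profile("EFYH", toy_characterized)
```

In this toy case the target pocket is identical to the first characterized pocket and differs from the second at one of four positions, so the extrapolated profile lies closer to the first.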
The performance of TEPITOPEpan has been extensively evaluated on various data sets from different viewpoints: predicting MHC binding peptides, identifying HLA ligands and T-cell epitopes, and recognizing binding cores. Among the four state-of-the-art competing pan-specific methods, for predicting the binding specificities of unknown HLA-DR molecules, TEPITOPEpan was roughly the second best method, next to NETMHCIIpan-2.0. Additionally, TEPITOPEpan achieved the best performance in recognizing binding cores. We further analyzed the motifs detected by TEPITOPEpan and checked them against the corresponding immunology literature. The online server and its PSSMs are available at http://www.biokdd.fudan.edu.cn/Service/TEPITOPEpan/.
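As a rough illustration of how a PSSM can be used to recognize a binding core, the sketch below slides a nine-residue window over a peptide and keeps the highest-scoring window. The function name, the additive scoring, the zero default for missing entries, and the toy matrix are assumptions for illustration only; they are not taken from the TEPITOPEpan server or its PSSM files.

```python
from typing import Dict, List, Tuple


def best_binding_core(
    peptide: str,
    pssm: List[Dict[str, float]],
    core_length: int = 9,
) -> Tuple[str, float]:
    """Return the highest-scoring core window and its score.

    `pssm` holds one dict per core position, mapping amino acid to score.
    Assumption: the window score is the sum of per-position scores, and
    amino acids missing from a position's dict contribute 0.
    """
    if len(pssm) != core_length:
        raise ValueError("PSSM must have one column per core position")
    if len(peptide) < core_length:
        raise ValueError("peptide is shorter than the core length")

    best_start, best_score = 0, float("-inf")
    for start in range(len(peptide) - core_length + 1):
        core = peptide[start:start + core_length]
        score = sum(pssm[i].get(aa, 0.0) for i, aa in enumerate(core))
        if score > best_score:
            best_start, best_score = start, score
    return peptide[best_start:best_start + core_length], best_score


# Toy matrix (hypothetical values): reward F at position 1 and K at position 9.
toy_pssm: List[Dict[str, float]] = [{} for _ in range(9)]
toy_pssm[0] = {"F": 2.0}
toy_pssm[8] = {"K": 1.0}
toy_core, toy_score = best_binding_core("AAFAAAAAAAKAA", toy_pssm)
```

A real PSSM would carry scores for all twenty amino acids at each pocket position; the sparse toy matrix only keeps the example short.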