Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium.
Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium.
Immunogenetics. 2018 Mar;70(3):159-168. doi: 10.1007/s00251-017-1023-5. Epub 2017 Aug 4.
Current T cell epitope prediction tools are a valuable resource in designing targeted immunogenicity experiments. They typically focus on, and are able to accurately predict, peptide binding and presentation by major histocompatibility complex (MHC) molecules on the surface of antigen-presenting cells. However, recognition of the peptide-MHC complex by a T cell receptor (TCR) is often not included in these tools. We developed a classification approach based on random forest classifiers to predict recognition of a peptide by a T cell receptor and to discover patterns that contribute to recognition. We considered two approaches to solve this problem: (1) distinguishing between two sets of TCRs that each bind to a known peptide and (2) retrieving TCRs that bind to a given peptide from a large pool of TCRs. Evaluation of the models on two HIV-1, B*08-restricted epitopes reveals good performance and hints at structural CDR3 features that can determine peptide immunogenicity. These results are of particular importance as they show that predicting T cell epitopes and T cell epitope recognition from sequence data is a feasible approach. In addition, the validity of our models not only serves as a proof of concept for the prediction of immunogenic T cell epitopes but also paves the way for more general and high-performing models.
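As a rough illustration of what a sequence-based input to such a classifier could look like, the sketch below encodes CDR3 amino acid sequences as fixed-length numeric vectors. The feature choice (sequence length plus amino acid composition), the function name `cdr3_features`, and the example sequences are assumptions made for illustration; the abstract does not specify which features the authors used.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids, in a fixed order so every vector is aligned.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cdr3_features(cdr3: str) -> List[float]:
    """Encode one CDR3 sequence as [length, fraction of each amino acid].

    Hypothetical feature set for illustration, not the published one.
    """
    seq = cdr3.strip().upper()
    if not seq:
        raise ValueError("empty CDR3 sequence")
    counts = Counter(seq)
    # Counter returns 0 for residues that do not occur in the sequence.
    return [float(len(seq))] + [counts[aa] / len(seq) for aa in AMINO_ACIDS]


# Made-up sequences for illustration only, not data from the study.
example_tcrs = ["CASSLGQAYEQYF", "CASSPGTGGYEQYF"]
feature_matrix = [cdr3_features(s) for s in example_tcrs]
```

Vectors of this kind, paired with labels indicating which of the two epitopes each TCR recognizes, could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`.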