NetMHCpan-4.2: improved prediction of CD8+ epitopes by use of transfer learning and structural features.

Author information

Nilsson Jonas Birkelund, Greenbaum Jason, Peters Bjoern, Nielsen Morten

Affiliations

Department of Health Technology, Technical University of Denmark, Lyngby, Denmark.

Center for Vaccine Innovation, La Jolla Institute for Immunology, La Jolla, CA, United States.

Publication information

Front Immunol. 2025 Aug 7;16:1616113. doi: 10.3389/fimmu.2025.1616113. eCollection 2025.

DOI:10.3389/fimmu.2025.1616113

PMID:40852704

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12367478/

Abstract

INTRODUCTION

Identification of CD8+ T cell epitopes is crucial for advancing vaccine development and immunotherapy strategies. Traditional methods for predicting T cell epitopes focus primarily on MHC presentation, leveraging immunopeptidome data. Recent advances, however, suggest that transfer learning and refinement using epitope data can significantly improve performance.

METHODS

To further investigate this, we here develop an enhanced MHC class I (MHC-I) antigen presentation predictor by integrating newly curated binding affinity and eluted ligand datasets, expanding MHC allele coverage, and incorporating novel input features related to the structural constraints of the MHC-I peptide-binding cleft. We next apply transfer learning using experimentally validated pathogen- and cancer-derived epitopes from public databases to refine our prediction method, ensuring comprehensive data partitioning to prevent performance overestimation.
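The abstract does not say how the data partitioning is carried out. As an illustration only, the sketch below shows one common way to keep near-identical peptides from landing on both sides of a train/test split: peptides that share any k-mer are clustered together, and whole clusters are then assigned to cross-validation folds. The window length `k=8`, the function names, and the greedy fold assignment are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def kmers(peptide, k):
    """Return the set of all length-k substrings of a peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def cluster_by_shared_kmer(peptides, k=8):
    """Group peptides that share any k-mer (hypothetical helper, union-find based)."""
    parent = list(range(len(peptides)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first peptide that contained it
    for idx, pep in enumerate(peptides):
        for km in kmers(pep, k):
            if km in first_seen:
                # Merge this peptide's cluster with the earlier peptide's cluster.
                parent[find(idx)] = find(first_seen[km])
            else:
                first_seen[km] = idx

    clusters = defaultdict(list)
    for idx, pep in enumerate(peptides):
        clusters[find(idx)].append(pep)
    return list(clusters.values())


def assign_folds(clusters, n_folds=5):
    """Greedily place each whole cluster into the currently smallest fold."""
    folds = [[] for _ in range(n_folds)]
    for cluster in sorted(clusters, key=len, reverse=True):
        min(folds, key=len).extend(cluster)
    return folds


if __name__ == "__main__":
    example = ["SIINFEKLA", "IINFEKLAW", "GILGFVFTL", "NLVPMVATV"]
    for number, fold in enumerate(assign_folds(cluster_by_shared_kmer(example), n_folds=2)):
        print(number, fold)
```

Because clusters are never split, two peptides that overlap by at least `k` residues always end up in the same fold, which is the property that guards against inflated test performance. Peptides shorter than `k` have no k-mers and therefore form singleton clusters.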

RESULTS

Integration of structural features results in improved predictive power and enhanced identification of peptide residues likely to interact with the MHC. However, our findings indicate that fine-tuning on epitope data yields only a minor accuracy boost. Moreover, transferability between cancer- and pathogen-derived epitopes is limited, suggesting that the two data types have distinct properties.

DISCUSSION

In conclusion, while transfer learning can enhance T cell epitope prediction, the performance gains are modest and data-type specific. Our final NetMHCpan-4.2 model is publicly accessible at https://services.healthtech.dtu.dk/services/NetMHCpan-4.2, providing a valuable resource for immunological research and therapeutic development.
